I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
I get the error FileNotFoundError: Dataset does not exist. Please check the path or dataset_id when trying to load the yfcc-10M-filter-euclidean dataset.
Expected Behavior
The dataset should load, since it is listed by list_datasets().
Steps To Reproduce
from pinecone_datasets import list_datasets, load_dataset

datasets = list_datasets()
dataset_name = "yfcc-10M-filter-euclidean"
assert dataset_name in datasets, "Dataset does not exists!"
dataset = load_dataset(dataset_name)
Relevant log output
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
Cell In[6], line 1
----> 1 load_dataset('yfcc-10M-filter-euclidean')
File ~/vector_db_benchmark/venv/lib/python3.10/site-packages/pinecone_datasets/public.py:59, in load_dataset(dataset_id, **kwargs)
57 raise FileNotFoundError(f"Dataset {dataset_id} not found in catalog")
58 else:
---> 59 return Dataset.from_catalog(dataset_id, **kwargs)
File ~/vector_db_benchmark/venv/lib/python3.10/site-packages/pinecone_datasets/dataset.py:89, in Dataset.from_catalog(cls, dataset_id, catalog_base_path, **kwargs)
83 catalog_base_path = (
84 catalog_base_path
85 if catalog_base_path
86 else os.environ.get("DATASETS_CATALOG_BASEPATH", cfg.Storage.endpoint)
87 )
88 dataset_path = os.path.join(catalog_base_path, f"{dataset_id}")
---> 89 return cls(dataset_path=dataset_path, **kwargs)
File ~/vector_db_benchmark/venv/lib/python3.10/site-packages/pinecone_datasets/dataset.py:190, in Dataset.__init__(self, dataset_path, **kwargs)
188 self._dataset_path = dataset_path
189 if not self._fs.exists(self._dataset_path):
--> 190 raise FileNotFoundError(
191 "Dataset does not exist. Please check the path or dataset_id"
192 )
193 else:
194 self._fs = None
FileNotFoundError: Dataset does not exist. Please check the path or dataset_id
Environment
Additional Context
Looking at the metadata for the datasets, the results show that the data for this dataset is not in the bucket.
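To make the mismatch easier to check, here is a minimal sketch of the path resolution shown in the traceback (dataset.py lines 83-89), paired with an existence check over a list of dataset IDs. It is not the library's code: `resolve_dataset_path`, `find_missing`, and `DEFAULT_ENDPOINT` are hypothetical names, the endpoint value is a placeholder standing in for `cfg.Storage.endpoint`, and the `exists` callable stands in for the filesystem object's `exists` method. The demo runs against a local temporary directory rather than the real bucket.

```python
import os
import tempfile
from typing import Callable, Iterable, List, Optional

# Hypothetical placeholder; the library reads its default from cfg.Storage.endpoint.
DEFAULT_ENDPOINT = "gs://example-catalog-bucket"


def resolve_dataset_path(
    dataset_id: str,
    catalog_base_path: Optional[str] = None,
    default_endpoint: str = DEFAULT_ENDPOINT,
) -> str:
    """Mirror the traceback: explicit base path, else env var, else default."""
    base = catalog_base_path or os.environ.get(
        "DATASETS_CATALOG_BASEPATH", default_endpoint
    )
    return os.path.join(base, dataset_id)


def find_missing(
    dataset_ids: Iterable[str],
    exists: Callable[[str], bool],
    catalog_base_path: Optional[str] = None,
) -> List[str]:
    """Return the IDs whose resolved path fails the `exists` check."""
    return [
        dataset_id
        for dataset_id in dataset_ids
        if not exists(resolve_dataset_path(dataset_id, catalog_base_path))
    ]


# Local demo: one dataset directory is present, one is only "listed".
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "present-dataset"))
    listed = ["present-dataset", "absent-dataset"]
    missing = find_missing(listed, os.path.exists, root)

print(missing)
```

Against the real catalog, the same idea would presumably pass the names from list_datasets() and a remote filesystem's `exists` method instead of `os.path.exists`. Any ID that comes back is listed in the catalog but has no data at its resolved path, which is the situation reported here.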