[EHN] Adding ERPCore datasets #627

Open
wants to merge 77 commits into
base: develop

Conversation

tahatt13
Collaborator

This PR adds the ERPCore datasets. The official website of the dataset is https://erpinfo.org/erp-core.

tahatt13 and others added 30 commits June 14, 2024 15:43
PierreGtch and others added 18 commits July 10, 2024 15:51
* Copy workflow changes from PR NeuroTechX#584

* Update whats_new.rst

* Remove "always deploy" used for testing

* Restore change to pytest instead of unittest

* Use mne data cache in tests too
* Add summary tables as csv

* Change BaseDataset to include the dataset's summary table in the docstring

* Try update summary table in doc

* Add test for dataset summary table

* Fix future annotations

* Prepare summary tables before doc

* Make build dir if not exist

* Add rstate table

* Replace tables with CSVs in doc

* Add PapersWithCOde column

* Fix initial white space

* Fix formatting of ints

* Remove tables from docstrings

* Add missing DemonsP300 row

* Update whats_new.rst

* Remove Shin2017B leaderboard

* Move PWC link outside of docstring table
* including Liu2024 Dataset

* [pre-commit.ci] auto fixes from pre-commit.com hooks

* Function data_path

* [pre-commit.ci] auto fixes from pre-commit.com hooks

* data_infos and get_single_subject_data functions

* [pre-commit.ci] auto fixes from pre-commit.com hooks

* Data Description

* [pre-commit.ci] auto fixes from pre-commit.com hooks

* updating get_single_subject fct and data_path & adding encoding fct

* [pre-commit.ci] auto fixes from pre-commit.com hooks

* Finishing the code

* [pre-commit.ci] auto fixes from pre-commit.com hooks

* updating docstrings for data_path

* Updating dataset_summary and updating the get_single_subject fct to handle the case of existing file in path_electrodes

* adapting the return of get_single_subject_data fct

* [pre-commit.ci] auto fixes from pre-commit.com hooks

* Adding dataset description and preload = True when reading the data in the get_single_subject fct

* fix: codespell

* fix: changing to static method the encoding

* repushing and resolving pre-commit conflicts

* [pre-commit.ci] auto fixes from pre-commit.com hooks

* fix: changing the mapping

* fix: changing the unmatching between the trigger and the events from the csv

* ehn: using pylint to improve the code (remove not used variables, and change the module);

* modifying the python version in the pre-commit file

* adding description to enhancements

* adjusting the encoding

* adjusting comments and dataset description

* solving channels types and names issues & correcting encoding

* Correcting the number of trials per class and the total trials

* Correcting data description

* Adding the possibility to exclude/include break and instructions events

* Correcting code to include/exclude instr and break events

* Remove table from docstring

Signed-off-by: PierreGtch <[email protected]>

* Add CSV row

---------

Signed-off-by: PierreGtch <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: bruAristimunha <[email protected]>
Co-authored-by: PierreGtch <[email protected]>
Co-authored-by: Pierre Guetschel <[email protected]>
Signed-off-by: PierreGtch <[email protected]>
Signed-off-by: Bru <[email protected]>
…to Adding-ERPCore
Signed-off-by: Bru <[email protected]>
Collaborator

@PierreGtch PierreGtch left a comment


I tried to load data from subject 1 with the latest version of the code using:

data = ErpCore2021_ERN().get_data([1])

but got:

requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='files.osf.io', port=443): Read timed out. (read timeout=15)

I tried on eduroam and on a mobile connection.
Am I the only one having problems downloading this dataset?
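A read timeout like the one above is often transient, so one possible mitigation is to retry the call with an increasing delay. The helper below is only a sketch of that idea: `retry_with_backoff` and its parameters are hypothetical names, not part of moabb, and whether retrying actually helps with `files.osf.io` is an assumption.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    action: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError,),
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action` until it succeeds, doubling the wait after each failure.

    Hypothetical helper for illustration; not a moabb function.
    """
    for attempt in range(attempts):
        try:
            return action()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error unchanged
            # wait base_delay, then 2x, 4x, ... before the next try
            sleep(base_delay * 2**attempt)
    raise ValueError("attempts must be at least 1")
```

With `requests` installed, the failing call could be wrapped as `retry_with_backoff(lambda: ErpCore2021_ERN().get_data([1]), retry_on=(requests.exceptions.ReadTimeout,))`. If the server is consistently slow rather than intermittently unavailable, raising the read timeout is probably the more useful change.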

moabb/datasets/erpcore2021.py (outdated, resolved)
@bruAristimunha changed the title from "Adding ERPCore datasets" to "[EHN] Adding ERPCore datasets" on Feb 13, 2025
@sebVelut
Collaborator

I installed the moabb version that includes ERPCORE2021 and tried to get the data, but I get this warning:
InsecureRequestWarning: Unverified HTTPS request is being made to host 'files.osf.io'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings
and this error:
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='files.osf.io', port=443): Read timed out. (read timeout=5)

@PierreGtch
Collaborator

With the latest version, the download works on my side.
But the local file path is strange:

Downloading data from 'https://files.osf.io/v1/resources/q6gwp/providers/osfstorage/600df65e75226b017d517f6d/?zip=' to file '/Users/Pierre.Guetschel/mne_data/MNE-erpcoreern2021-data/MNE-erpcoreern2021-data/v1/resources/q6gwp/providers/osfstorage/?zip='.

And the file can't be loaded:

Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/pydevconsole.py", line 364, in runcode
    coro = func()
  File "<input>", line 1, in <module>
  File "/Users/Pierre.Guetschel/Projects/moabb/moabb/datasets/base.py", line 433, in get_data
    data[subject] = self._get_single_subject_data_using_cache(
  File "/Users/Pierre.Guetschel/Projects/moabb/moabb/datasets/base.py", line 527, in _get_single_subject_data_using_cache
    sessions_data = self._get_single_subject_data(subject)
  File "/Users/Pierre.Guetschel/Projects/moabb/moabb/datasets/erpcore2021.py", line 213, in _get_single_subject_data
    file_path = self.data_path(subject)[0]
  File "/Users/Pierre.Guetschel/Projects/moabb/moabb/datasets/erpcore2021.py", line 271, in data_path
    dataset_path = self.download_and_extract(path=path, force_update=force_update)
  File "/Users/Pierre.Guetschel/Projects/moabb/moabb/datasets/erpcore2021.py", line 324, in download_and_extract
    zip_ref = z.ZipFile(path_zip, "r")
  File "/Users/Pierre.Guetschel/miniforge3/envs/moabb/lib/python3.9/zipfile.py", line 1268, in __init__
    self._RealGetContents()
  File "/Users/Pierre.Guetschel/miniforge3/envs/moabb/lib/python3.9/zipfile.py", line 1335, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
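A `BadZipFile` at this point suggests that the downloaded file is not a real archive, for example an error page or a partial download. As a rough sketch, the extraction step could check the file first and fail with a more descriptive message. `extract_if_valid_zip` is a hypothetical helper, not the code in this PR.

```python
import zipfile
from pathlib import Path


def extract_if_valid_zip(path_zip: Path, target_dir: Path) -> list[str]:
    """Extract `path_zip` into `target_dir`, refusing files that are not zips.

    Hypothetical helper for illustration; returns the archive member names.
    """
    path_zip = Path(path_zip)
    if not path_zip.is_file():
        raise FileNotFoundError(f"No downloaded file at {path_zip}")
    # is_zipfile checks the zip signature instead of trusting the file name
    if not zipfile.is_zipfile(path_zip):
        size = path_zip.stat().st_size
        raise ValueError(
            f"{path_zip} is not a zip archive ({size} bytes); "
            "the download may be incomplete or an error page"
        )
    # the context manager closes the archive even if extraction fails
    with zipfile.ZipFile(path_zip, "r") as zip_ref:
        zip_ref.extractall(target_dir)
        return zip_ref.namelist()
```

Reporting the file size in the error makes it easier to tell an empty or truncated download apart from a wrong file type.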

I also get the same warning as @sebVelut:

InsecureRequestWarning: Unverified HTTPS request is being made to host 'files.osf.io'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings
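urllib3 emits this warning when an HTTPS request is made with certificate verification turned off. The thread does not show where that happens in the download code, so the following is only a standard-library sketch of what a verified request looks like. `make_verifying_context` and `open_verified` are hypothetical names, and the 60-second timeout is an arbitrary assumption.

```python
import ssl
import urllib.request


def make_verifying_context() -> ssl.SSLContext:
    """Return a TLS context that checks both the certificate and the hostname."""
    # create_default_context() loads the system CA bundle and enables
    # CERT_REQUIRED and check_hostname by default
    return ssl.create_default_context()


def open_verified(url: str, timeout: float = 60.0):
    """Open `url` with certificate verification on (hypothetical helper)."""
    return urllib.request.urlopen(
        url, timeout=timeout, context=make_verifying_context()
    )
```

With `requests`, verification is on by default, so the equivalent fix is usually to stop passing `verify=False` rather than to add anything.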

Comment on lines +303 to +325
if path is not None:
    path = Path(path) / DATASET_PARAMS[self.task]["folder_name"]
else:
    # The Default path is in the user's home directory under 'mne_data'
    path = Path.home() / "mne_data" / DATASET_PARAMS[self.task]["folder_name"]

path_zip = path / DATASET_PARAMS[self.task]["zipfile"]

# checking if the zip of the dataset is already downloaded
if not Path(path_zip).exists() or force_update:
    # Download and extract the dataset
    path_zip = dl.data_dl(
        DATASET_PARAMS[self.task]["url"],
        f"erpcore{self.task.lower()}2021",
        path,
        force_update,
    )

metainformation = path / "participants.tsv"
# check if it has to unzip
if not Path(metainformation).exists():
    zip_ref = z.ZipFile(path_zip, "r")
    zip_ref.extractall(path)
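The destination reported earlier in this thread ends in `?zip=`, which suggests the local file name is being derived from the URL, query string included. One way to avoid that, sketched here under the assumption that the last path segment of the URL is a usable identifier, is to build the file name from the parsed URL path only. `local_zip_path` is a hypothetical helper and says nothing about how `dl.data_dl` works internally.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlsplit


def local_zip_path(url: str, dest_dir: Path, fallback_stem: str = "download") -> Path:
    """Build a local zip path from the URL path, ignoring the query string.

    Hypothetical helper for illustration only.
    """
    # urlsplit separates the path from the "?zip=" query, and PurePosixPath
    # drops the trailing slash so the last segment becomes the name
    name = PurePosixPath(urlsplit(url).path).name or fallback_stem
    if not name.lower().endswith(".zip"):
        name += ".zip"
    return Path(dest_dir) / name
```

For the OSF URL in the log above, this yields a file named `600df65e75226b017d517f6d.zip` inside `dest_dir`, with no `?` in the name.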

4 participants