diff --git a/.gitignore b/.gitignore
index aa7b6b684b..920ec97f05 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,4 +1,4 @@
site/
.DS_Store
-src/.DS_Store
-src/04-modality-specific-files/.DS_Store
\ No newline at end of file
+.idea
+venvs
diff --git a/.travis.yml b/.travis.yml
index dfa507b50b..c8e31367e5 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -1,10 +1,18 @@
-language: node_js
-node_js:
- - "10"
-cache:
- directories:
- - node_modules # NPM packages
-before_script:
- - npm install `cat npm-requirements.txt`
-script:
- - remark src/*.md src/*/*.md --frail
+matrix:
+ include:
+ - language: node_js
+ node_js:
+ - "10"
+ cache:
+ directories:
+ - node_modules # NPM packages
+ before_script:
+ - npm install `cat npm-requirements.txt`
+ script:
+ - remark src/*.md src/*/*.md --frail
+ - language: python
+ python: 3.7
+ install:
+ - pip install yamllint
+ script:
+ - yamllint -f standard src/schema/
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 5df6000e14..58e23d892b 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -16,8 +16,9 @@ Been here before? Already know what you're looking for in this guide? Jump to th
* [Understanding issues](#understanding-issues)
* [Commenting on a pull request](#commenting-on-a-pull-request)
* [Writing in markdown](#writing-in-markdown)
-* [Make a change with a pull request](#making-a-change-with-a-pull-request)
+* [Making a change with a pull request](#making-a-change-with-a-pull-request)
* [Example pull request](#example-pull-request)
+* [Updating the specification schema](#updating-the-schema)
* [Fixing Remark errors from Travis](#fixing-travis-remark-errors)
* [Recognizing contributions](#recognizing-contributions)
@@ -224,6 +225,56 @@ GitHub has a [nice introduction](https://help.github.com/articles/github-flow/)
+## Updating the schema
+
+Portions of the BIDS specification are defined using YAML files, in order to make the specification machine-readable.
+Currently, the only portion of the specification that relies on this schema is the Entity Table, but any changes to the specification should be mirrored in the schema.
+
+### The format of the schema
+
+The schema reflects the files and objects in the specification, as well as associations between these objects.
+Here is a list of the files and subfolders of the schema, roughly in order of importance:
+
+- `datatypes/*.yaml`:
+ Data types supported by the specification.
+ Each datatype may support many suffixes.
+ These suffixes are divided into groups based on what extensions and entities are allowed for each.
+ Data types correspond to subfolders (e.g., `anat`, `func`) in the BIDS structure.
+- `auxdatatypes/*.yaml`:
+ Auxiliary (not directly imaging or data-containing) data types supported by the specification.
+ Each auxiliary data type is associated with a set of data types, and these auxiliary data types are grouped based on what data types, extensions, and entities are allowed for each.
+ Examples of auxiliary data types include `channels`, `electrodes`, and `photo`.
+- `entities.yaml`:
+ A list of entities (key/value pairs in folder and filenames) with associated descriptions and formatting rules.
+ The order of the entities in the file determines the order in which entities must appear in filenames.
+- `top_level_files.yaml`:
+ Modality-agnostic files stored at the top level of a BIDS dataset.
+ The schema specifies whether these files are required or optional, as well as acceptable extensions for each.
+- `modalities.yaml`:
+ Modalities supported by the specification, along with a list of associated data types.
+ Modalities are not reflected directly in the BIDS structure, but data types are modality-specific.
+- `associated_data.yaml`:
+ Folders that are commonly contained within the same folder as a BIDS dataset, but which do not follow the BIDS structure internally, such as `code` or `sourcedata`.
+ The schema specifies which folders are accepted and whether they are required or optional.
+
+### Making a change to the schema
+
+#### 1. Ensure that changes to the specification are matched in the schema
+
+The schema formalizes the rules described in the specification text, so you must ensure that any changes which impact the rules of the specification (including, but not limited to, adding new entities, suffixes, datatypes, modalities, etc.) are reflected in the schema as well.
+
+#### 2. Generate an updated entity table before pushing your changes
+
+Run the Python script `tools/bids_schema.py` (it imports `numpy`, `pandas`, `yaml` (PyYAML), and `tabulate`, so these must be installed):
+
+```bash
+python tools/bids_schema.py entity src/schema/ src/99-appendices/04-entity-table.md
+```
+
+#### 3. Push your changes
+
+For more information on making general changes with a pull request, please review [Making a change with a pull request](#making-a-change-with-a-pull-request).
+
## Fixing Travis Remark errors
We use a linter called [Remarkjs](https://github.com/remarkjs/remark-lint) to ensure all of
diff --git a/src/99-appendices/04-entity-table.md b/src/99-appendices/04-entity-table.md
index e460e28488..704e6020a9 100644
--- a/src/99-appendices/04-entity-table.md
+++ b/src/99-appendices/04-entity-table.md
@@ -1,33 +1,48 @@
# Appendix IV: Entity table
This section compiles the entities (key-value pairs) described throughout this
-specification, and establishes a common order within a filename. For example, if
-a file has an acquisition and reconstruction label, the acquisition entity must
-precede the reconstruction entity. REQUIRED and OPTIONAL entities for a given
-file type are denoted. Entity formats indicate whether the value is alphanumeric
+specification, and establishes a common order within a filename.
+For example, if a file has an acquisition and reconstruction label, the
+acquisition entity must precede the reconstruction entity.
+REQUIRED and OPTIONAL entities for a given file type are denoted.
+Entity formats indicate whether the value is alphanumeric
(`<label>`) or numeric (`<index>`).
A general introduction to entities is given in the section on
[file name structure](../02-common-principles.md#file-name-structure)
-| Entity | Subject | Session | Task | Acquisition | Contrast Enhancing Agent | Reconstruction | Phase-Encoding Direction | Run | Corresponding modality | Echo | Recording | Processed (on device) | Space | Split |
-| :--------------------------------------------------------------------------------------------- | :------------ | :------------ | :------------- | :------------ | :----------------------- | :------------- | :----------------------- | :------------ | :--------------------- | :------------- | :------------------ | :-------------------- | :---------------| :-------------- |
-| Format | `sub-<label>` | `ses-<label>` | `task-<label>` | `acq-<label>` | `ce-<label>` | `rec-<label>` | `dir-<label>` | `run-<index>` | `mod-<label>` | `echo-<index>` | `recording-<label>` | `proc-<label>` | `space-<label>` | `split-<index>` |
-| anat (T1w T2w T1rho T1map T2map T2star FLAIR FLASH PD PDmap PDT2 inplaneT1 inplaneT2 angio) | REQUIRED | OPTIONAL | | OPTIONAL | OPTIONAL | OPTIONAL | | | | | | | | |
-| anat (defacemask) | REQUIRED | OPTIONAL | | OPTIONAL | OPTIONAL | OPTIONAL | | | OPTIONAL | | | | | |
-| func (bold cbv phase sbref events) | REQUIRED | OPTIONAL | REQUIRED | OPTIONAL | OPTIONAL | OPTIONAL | OPTIONAL | OPTIONAL | | OPTIONAL | | | | |
-| func (physio stim) | REQUIRED | OPTIONAL | REQUIRED | OPTIONAL | | OPTIONAL | | OPTIONAL | | | OPTIONAL | OPTIONAL | | |
-| dwi (dwi bvec bval) | REQUIRED | OPTIONAL | | OPTIONAL | | | OPTIONAL | OPTIONAL | | | | | | |
-| fmap (phasediff phase1 phase2 magnitude1 magnitude2 magnitude fieldmap) | REQUIRED | OPTIONAL | | OPTIONAL | | | | OPTIONAL | | | | | | |
-| fmap (epi) | REQUIRED | OPTIONAL | | OPTIONAL | OPTIONAL | | REQUIRED | OPTIONAL | | | | | | |
-| beh (beh events) | REQUIRED | OPTIONAL | REQUIRED | OPTIONAL | | | | OPTIONAL | | | | | | |
-| beh (stim physio) | REQUIRED | OPTIONAL | REQUIRED | OPTIONAL | | | | OPTIONAL | | | OPTIONAL | | | |
-| meg | REQUIRED | OPTIONAL | REQUIRED | OPTIONAL | | | | OPTIONAL | | | | OPTIONAL | | OPTIONAL |
-| eeg | REQUIRED | OPTIONAL | REQUIRED | OPTIONAL | | | | OPTIONAL | | | | | | |
-| ieeg | REQUIRED | OPTIONAL | REQUIRED | OPTIONAL | | | | OPTIONAL | | | | | | |
-| channels (meg/eeg/ieeg) | REQUIRED | OPTIONAL | REQUIRED | | | | | OPTIONAL | | | | | | |
-| headshape (meg) | REQUIRED | OPTIONAL | | OPTIONAL | | | | | | | | | OPTIONAL | |
-| markers (meg) | REQUIRED | OPTIONAL | OPTIONAL | OPTIONAL | | | | | | | | | OPTIONAL | |
-| photo (meg/eeg/ieeg) | REQUIRED | OPTIONAL | | OPTIONAL | | | | | | | | | | |
-| electrodes (eeg/ieeg) | REQUIRED | OPTIONAL | | OPTIONAL | | | | | | | | | OPTIONAL | |
-| events (meg/eeg/ieeg) | REQUIRED | OPTIONAL | REQUIRED | | | | | OPTIONAL | | | | | | |
+## Magnetic Resonance Imaging
+
+| Entity | Subject | Session | Task | Acquisition | Contrast Enhancing Agent | Reconstruction | Phase-Encoding Direction | Run | Corresponding Modality | Echo | Recording |
+|------------------------------------------------------------------------------------------------|---------------|---------------|----------------|---------------|----------------------------|------------------|----------------------------|---------------|--------------------------|----------------|---------------------|
+| Format | `sub-<label>` | `ses-<label>` | `task-<label>` | `acq-<label>` | `ce-<label>` | `rec-<label>` | `dir-<label>` | `run-<index>` | `mod-<label>` | `echo-<index>` | `recording-<label>` |
+| anat (T1w T2w T1rho T1map T2map T2star FLAIR FLASH PD PDmap PDT2 inplaneT1 inplaneT2 angio) | REQUIRED | OPTIONAL | | OPTIONAL | OPTIONAL | OPTIONAL | | OPTIONAL | | | |
+| anat (defacemask) | REQUIRED | OPTIONAL | | OPTIONAL | OPTIONAL | OPTIONAL | | OPTIONAL | OPTIONAL | | |
+| dwi (dwi sbref) | REQUIRED | OPTIONAL | | OPTIONAL | | | OPTIONAL | OPTIONAL | | | |
+| fmap (phasediff phase1 phase2 magnitude1 magnitude2 magnitude fieldmap) | REQUIRED | OPTIONAL | | OPTIONAL | | | | OPTIONAL | | | |
+| fmap (epi) | REQUIRED | OPTIONAL | | OPTIONAL | OPTIONAL | | REQUIRED | OPTIONAL | | | |
+| func (bold cbv phase sbref events) | REQUIRED | OPTIONAL | REQUIRED | OPTIONAL | OPTIONAL | OPTIONAL | OPTIONAL | OPTIONAL | | OPTIONAL | |
+| func (physio stim) | REQUIRED | OPTIONAL | REQUIRED | OPTIONAL | | OPTIONAL | | OPTIONAL | | | OPTIONAL |
+
+## Encephalography (EEG, iEEG, and MEG)
+
+| Entity | Subject | Session | Task | Acquisition | Run | Processed (on device) | Space | Split |
+|----------------------------|---------------|---------------|----------------|---------------|---------------|-------------------------|-----------------|-----------------|
+| Format | `sub-<label>` | `ses-<label>` | `task-<label>` | `acq-<label>` | `run-<index>` | `proc-<label>` | `space-<label>` | `split-<index>` |
+| eeg (eeg) | REQUIRED | OPTIONAL | REQUIRED | OPTIONAL | OPTIONAL | | | |
+| ieeg (ieeg) | REQUIRED | OPTIONAL | REQUIRED | OPTIONAL | OPTIONAL | | | |
+| meg (meg) | REQUIRED | OPTIONAL | REQUIRED | OPTIONAL | OPTIONAL | OPTIONAL | | OPTIONAL |
+| meg (headshape) | REQUIRED | OPTIONAL | | OPTIONAL | | | OPTIONAL | |
+| meg (markers) | REQUIRED | OPTIONAL | OPTIONAL | OPTIONAL | | | OPTIONAL | |
+| channels (meg eeg ieeg) | REQUIRED | OPTIONAL | REQUIRED | | OPTIONAL | | | |
+| electrodes (eeg ieeg) | REQUIRED | OPTIONAL | | OPTIONAL | | | OPTIONAL | |
+| events (meg eeg ieeg) | REQUIRED | OPTIONAL | REQUIRED | | OPTIONAL | | | |
+| photo (meg eeg ieeg) | REQUIRED | OPTIONAL | | OPTIONAL | | | | |
+
+## Behavioral Data
+
+| Entity | Subject | Session | Task | Acquisition | Run | Recording |
+|----------------------|---------------|---------------|----------------|---------------|---------------|---------------------|
+| Format | `sub-<label>` | `ses-<label>` | `task-<label>` | `acq-<label>` | `run-<index>` | `recording-<label>` |
+| beh (stim physio) | REQUIRED | OPTIONAL | REQUIRED | OPTIONAL | OPTIONAL | OPTIONAL |
+| beh (events beh) | REQUIRED | OPTIONAL | REQUIRED | OPTIONAL | OPTIONAL | |
diff --git a/src/schema/associated_data.yaml b/src/schema/associated_data.yaml
new file mode 100644
index 0000000000..b54ca34fb2
--- /dev/null
+++ b/src/schema/associated_data.yaml
@@ -0,0 +1,9 @@
+---
+- code/:
+ - required: false
+- derivatives/:
+ - required: false
+- sourcedata/:
+ - required: false
+- stimuli/:
+ - required: false
diff --git a/src/schema/auxdatatypes/channels.yaml b/src/schema/auxdatatypes/channels.yaml
new file mode 100644
index 0000000000..a906208973
--- /dev/null
+++ b/src/schema/auxdatatypes/channels.yaml
@@ -0,0 +1,15 @@
+---
+- datatypes:
+ - meg
+ - eeg
+ - ieeg
+ suffixes:
+ - channels
+ extensions:
+ - .json
+ - .tsv
+ entities:
+ sub: required
+ ses: optional
+ task: required
+ run: optional
diff --git a/src/schema/auxdatatypes/electrodes.yaml b/src/schema/auxdatatypes/electrodes.yaml
new file mode 100644
index 0000000000..b630ad5895
--- /dev/null
+++ b/src/schema/auxdatatypes/electrodes.yaml
@@ -0,0 +1,14 @@
+---
+- datatypes:
+ - eeg
+ - ieeg
+ suffixes:
+ - electrodes
+ extensions:
+ - .json
+ - .tsv
+ entities:
+ sub: required
+ ses: optional
+ acq: optional
+ space: optional
diff --git a/src/schema/auxdatatypes/events.yaml b/src/schema/auxdatatypes/events.yaml
new file mode 100644
index 0000000000..3c0ec521c1
--- /dev/null
+++ b/src/schema/auxdatatypes/events.yaml
@@ -0,0 +1,15 @@
+---
+- datatypes:
+ - meg
+ - eeg
+ - ieeg
+ suffixes:
+ - events
+ extensions:
+ - .json
+ - .tsv
+ entities:
+ sub: required
+ ses: optional
+ task: required
+ run: optional
diff --git a/src/schema/auxdatatypes/photo.yaml b/src/schema/auxdatatypes/photo.yaml
new file mode 100644
index 0000000000..e104d04aa9
--- /dev/null
+++ b/src/schema/auxdatatypes/photo.yaml
@@ -0,0 +1,13 @@
+---
+- datatypes:
+ - meg
+ - eeg
+ - ieeg
+ suffixes:
+ - photo
+ extensions:
+ - .jpg
+ entities:
+ sub: required
+ ses: optional
+ acq: optional
diff --git a/src/schema/datatypes/anat.yaml b/src/schema/datatypes/anat.yaml
new file mode 100644
index 0000000000..bae17648ce
--- /dev/null
+++ b/src/schema/datatypes/anat.yaml
@@ -0,0 +1,43 @@
+---
+# First group
+- suffixes:
+ - T1w
+ - T2w
+ - T1rho
+ - T1map
+ - T2map
+ - T2star
+ - FLAIR
+ - FLASH
+ - PD
+ - PDmap
+ - PDT2
+ - inplaneT1
+ - inplaneT2
+ - angio
+ extensions:
+ - .nii.gz
+ - .nii
+ - .json
+ entities:
+ sub: required
+ ses: optional
+ run: optional
+ acq: optional
+ ce: optional
+ rec: optional
+# Second group
+- suffixes:
+ - defacemask
+ extensions:
+ - .nii.gz
+ - .nii
+ - .json
+ entities:
+ sub: required
+ ses: optional
+ run: optional
+ acq: optional
+ ce: optional
+ rec: optional
+ mod: optional
diff --git a/src/schema/datatypes/beh.yaml b/src/schema/datatypes/beh.yaml
new file mode 100644
index 0000000000..536e883214
--- /dev/null
+++ b/src/schema/datatypes/beh.yaml
@@ -0,0 +1,28 @@
+---
+# First group
+- suffixes:
+ - stim
+ - physio
+ extensions:
+ - .tsv.gz
+ - .json
+ entities:
+ sub: required
+ ses: optional
+ task: required
+ acq: optional
+ run: optional
+ recording: optional
+# Second group
+- suffixes:
+ - events
+ - beh
+ extensions:
+ - .tsv
+ - .json
+ entities:
+ sub: required
+ ses: optional
+ task: required
+ acq: optional
+ run: optional
diff --git a/src/schema/datatypes/dwi.yaml b/src/schema/datatypes/dwi.yaml
new file mode 100644
index 0000000000..26f78761cf
--- /dev/null
+++ b/src/schema/datatypes/dwi.yaml
@@ -0,0 +1,29 @@
+---
+# First group
+- suffixes:
+ - dwi
+ extensions:
+ - .nii.gz
+ - .nii
+ - .json
+ - .bvec
+ - .bval
+ entities:
+ sub: required
+ ses: optional
+ acq: optional
+ dir: optional
+ run: optional
+# Second group
+- suffixes:
+ - sbref
+ extensions:
+ - .nii.gz
+ - .nii
+ - .json
+ entities:
+ sub: required
+ ses: optional
+ acq: optional
+ dir: optional
+ run: optional
diff --git a/src/schema/datatypes/eeg.yaml b/src/schema/datatypes/eeg.yaml
new file mode 100644
index 0000000000..f00ae44c1b
--- /dev/null
+++ b/src/schema/datatypes/eeg.yaml
@@ -0,0 +1,18 @@
+---
+- suffixes:
+ - eeg
+ extensions:
+ - .json
+ - .edf
+ - .vhdr
+ - .vmrk
+ - .eeg
+ - .set
+ - .fdt
+ - .bdf
+ entities:
+ sub: required
+ ses: optional
+ task: required
+ acq: optional
+ run: optional
diff --git a/src/schema/datatypes/fmap.yaml b/src/schema/datatypes/fmap.yaml
new file mode 100644
index 0000000000..98d0f6c45c
--- /dev/null
+++ b/src/schema/datatypes/fmap.yaml
@@ -0,0 +1,33 @@
+---
+# First group
+- suffixes:
+ - phasediff
+ - phase1
+ - phase2
+ - magnitude1
+ - magnitude2
+ - magnitude
+ - fieldmap
+ extensions:
+ - .nii.gz
+ - .nii
+ - .json
+ entities:
+ sub: required
+ ses: optional
+ acq: optional
+ run: optional
+# Second group
+- suffixes:
+ - epi
+ extensions:
+ - .nii.gz
+ - .nii
+ - .json
+ entities:
+ sub: required
+ ses: optional
+ acq: optional
+ ce: optional
+ dir: required
+ run: optional
diff --git a/src/schema/datatypes/func.yaml b/src/schema/datatypes/func.yaml
new file mode 100644
index 0000000000..62a26b60b8
--- /dev/null
+++ b/src/schema/datatypes/func.yaml
@@ -0,0 +1,52 @@
+---
+# First group
+- suffixes:
+ - bold
+ - cbv
+ - phase
+ - sbref
+ extensions:
+ - .nii.gz
+ - .nii
+ - .json
+ entities:
+ sub: required
+ ses: optional
+ task: required
+ acq: optional
+ ce: optional
+ rec: optional
+ dir: optional
+ run: optional
+ echo: optional
+# Second group
+- suffixes:
+ - events
+ extensions:
+ - .tsv
+ - .json
+ entities:
+ sub: required
+ ses: optional
+ task: required
+ acq: optional
+ ce: optional
+ rec: optional
+ dir: optional
+ run: optional
+ echo: optional
+# Third group
+- suffixes:
+ - physio
+ - stim
+ extensions:
+ - .tsv.gz
+ - .json
+ entities:
+ sub: required
+ ses: optional
+ task: required
+ acq: optional
+ rec: optional
+ run: optional
+ recording: optional
diff --git a/src/schema/datatypes/ieeg.yaml b/src/schema/datatypes/ieeg.yaml
new file mode 100644
index 0000000000..ff7e09314d
--- /dev/null
+++ b/src/schema/datatypes/ieeg.yaml
@@ -0,0 +1,19 @@
+---
+- suffixes:
+ - ieeg
+ extensions:
+ - .mefd/
+ - .json
+ - .edf
+ - .vhdr
+ - .eeg
+ - .vmrk
+ - .set
+ - .fdt
+ - .nwb
+ entities:
+ sub: required
+ ses: optional
+ task: required
+ acq: optional
+ run: optional
diff --git a/src/schema/datatypes/meg.yaml b/src/schema/datatypes/meg.yaml
new file mode 100644
index 0000000000..1361781733
--- /dev/null
+++ b/src/schema/datatypes/meg.yaml
@@ -0,0 +1,45 @@
+---
+# First group
+- suffixes:
+ - meg
+ extensions:
+ - / # corresponds to BTi/4D data
+ - .ds/
+ - .json
+ - .fif
+ - .sqd
+ - .con
+ - .raw
+ - .ave
+ - .mrk
+ - .kdf
+ - .mhd
+ entities:
+ sub: required
+ ses: optional
+ task: required
+ acq: optional
+ run: optional
+ proc: optional
+ split: optional
+# Second group
+- suffixes:
+ - headshape
+ extensions:
+ - .pos
+ - .txt
+ entities:
+ sub: required
+ ses: optional
+ acq: optional
+ space: optional
+- suffixes:
+ - markers
+ extensions:
+ - .json
+ entities:
+ sub: required
+ ses: optional
+ task: optional
+ acq: optional
+ space: optional
diff --git a/src/schema/entities.yaml b/src/schema/entities.yaml
new file mode 100644
index 0000000000..bc04a1591f
--- /dev/null
+++ b/src/schema/entities.yaml
@@ -0,0 +1,135 @@
+---
+sub:
+ name: Subject
+ description: |
+ A person or animal participating in the study.
+ format: label
+ses:
+ name: Session
+ description: |
+ A logical grouping of neuroimaging and behavioral data consistent across
+ subjects.
+ Session can (but doesn't have to) be synonymous with a visit in a
+ longitudinal study.
+ In general, subjects will stay in the scanner during one session.
+ However, for example, if a subject has to leave the scanner room and then
+ be re-positioned on the scanner bed, the set of MRI acquisitions will still
+ be considered as a session and match sessions acquired in other subjects.
+ Similarly, in situations where different data types are obtained over
+ several visits (for example fMRI on one day followed by DWI the day after)
+ those can be grouped in one session.
+ Defining multiple sessions is appropriate when several identical or similar
+ data acquisitions are planned and performed on all -or most- subjects,
+ often in the case of some intervention between sessions (e.g., training).
+ format: label
+task:
+ name: Task
+ format: label
+ description: |
+ Each task has a unique label that MUST only consist of letters and/or
+ numbers (other characters, including spaces and underscores, are not
+ allowed).
+ Those labels MUST be consistent across subjects and sessions.
+acq:
+ name: Acquisition
+ description: |
+ The OPTIONAL acq-<label> key/value pair corresponds to a custom label the
+ user MAY use to distinguish a different set of parameters used for
+ acquiring the same modality.
+ For example this should be used when a study includes two T1w images - one
+ full brain low resolution and one restricted field of view but high
+ resolution.
+ In such case two files could have the following names:
+ sub-01_acq-highres_T1w.nii.gz and sub-01_acq-lowres_T1w.nii.gz, however the
+ user is free to choose any other label than highres and lowres as long as
+ they are consistent across subjects and sessions.
+ In case different sequences are used to record the same modality (e.g. RARE
+ and FLASH for T1w) this field can also be used to make that distinction.
+ At what level of detail to make the distinction (e.g. just between RARE and
+ FLASH, or between RARE, FLASH, and FLASHsubsampled) remains at the
+ discretion of the researcher.
+ format: label
+ce:
+ name: Contrast Enhancing Agent
+ description: |
+ Similarly the OPTIONAL ce-<label> key/value can be used to distinguish
+ sequences using different contrast enhanced images.
+ The label is the name of the contrast agent.
+ The key ContrastBolusIngredient MAY also be added in the JSON file, with
+ the same label.
+ format: label
+rec:
+ name: Reconstruction
+ description: |
+ Similarly the OPTIONAL rec-<label> key/value can be used to distinguish
+ different reconstruction algorithms (for example ones using motion
+ correction).
+ format: label
+dir:
+ name: Phase-Encoding Direction
+ description: |
+ Similarly the OPTIONAL dir-<label> key/value can be used to distinguish
+ different phase-encoding directions.
+ format: label
+run:
+ name: Run
+ description: |
+ If several scans of the same modality are acquired they MUST be indexed
+ with a key-value pair: `_run-1`, `_run-2`, `_run-3` etc. (only integers
+ are allowed as run labels).
+ When there is only one scan of a given type the run key MAY be omitted.
+ Please note that diffusion imaging data is stored elsewhere (see below).
+ format: index
+mod:
+ name: Corresponding Modality
+ description: |
+ The OPTIONAL `mod-<label>` key/value pair gives the label of the modality
+ (e.g., T1w, inplaneT1) that a defacemask image refers to.
+ For example: sub-01_mod-T1w_defacemask.nii.gz.
+ format: label
+echo:
+ name: Echo
+ description: |
+ Multi-echo data MUST be split into one file per echo.
+ Each file shares the same name with the exception of the `_echo-<index>`
+ key/value.
+ format: index
+recording:
+ name: Recording
+ description: |
+ More than one continuous recording file can be included (with different
+ sampling frequencies).
+ In such case use different labels.
+ For example: `_recording-contrast`, `_recording-saturation`.
+ format: label
+proc:
+ name: Processed (on device)
+ description: |
+ The proc label is analogous to rec for MR and denotes a variant of a file
+ that was a result of particular processing performed on the device.
+ This is useful for files produced in particular by Elekta’s MaxFilter
+ (e.g. sss, tsss, trans, quat, mc, etc.), which some installations impose to
+ be run on raw data because of active shielding software corrections before
+ the MEG data can actually be exploited.
+ format: label
+space:
+ name: Space
+ description: |
+ The optional space label (`*[_space-<label>]_electrodes.tsv`) can be used
+ to indicate the way in which electrode positions are interpreted.
+ The space label needs to be taken from the list in Appendix VIII.
+ format: label
+split:
+ name: Split
+ description: |
+ In the case of long data recordings that exceed a file size of 2 GB, the
+ .fif files are conventionally split into multiple parts.
+ Each of these files has an internal pointer to the next file.
+ This is important when renaming these split recordings to the BIDS
+ convention.
+ Instead of a simple renaming, files should be read in and saved under their
+ new names with dedicated tools like MNE, which will ensure that not only
+ the file names, but also the internal file pointers will be updated.
+ It is RECOMMENDED that .fif files with multiple parts use the
+ `split-<index>` entity to indicate each part.
+ format: index
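
Review note: `entities.yaml` above doubles as the ordering rule for filenames, so the key order can be checked mechanically. The following is only a minimal sketch under stated assumptions, not part of this patch: the key list is copied by hand from the file above (so the sketch has no YAML dependency), and the helper names `entity_keys` and `is_ordered` are hypothetical.

```python
# Entity keys in the order they appear in entities.yaml above
# (copied by hand; a real tool would read them from the schema).
ENTITY_ORDER = [
    "sub", "ses", "task", "acq", "ce", "rec", "dir",
    "run", "mod", "echo", "recording", "proc", "space", "split",
]


def entity_keys(filename):
    """Return the entity keys found in a BIDS-style filename, in order."""
    stem = filename.split(".", 1)[0]  # drop the extension(s)
    parts = stem.split("_")
    # Parts of the form key-value are entities; the suffix has no hyphen.
    return [part.split("-", 1)[0] for part in parts if "-" in part]


def is_ordered(filename, order=ENTITY_ORDER):
    """True if every entity is known, unique, and in schema order."""
    keys = entity_keys(filename)
    if any(key not in order for key in keys):
        return False
    if len(set(keys)) != len(keys):
        return False
    positions = [order.index(key) for key in keys]
    return positions == sorted(positions)


print(is_ordered("sub-01_acq-highres_T1w.nii.gz"))
print(is_ordered("sub-01_rec-moco_acq-highres_T1w.nii.gz"))
```

The second call places `rec` before `acq`, which contradicts the order in the schema, so the check rejects it. Whether a given entity is allowed for a given suffix is a separate question answered by the `datatypes/*.yaml` files, which this sketch does not consult.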
diff --git a/src/schema/modalities.yaml b/src/schema/modalities.yaml
new file mode 100644
index 0000000000..f49e671b65
--- /dev/null
+++ b/src/schema/modalities.yaml
@@ -0,0 +1,24 @@
+---
+mri:
+ name: Magnetic Resonance Imaging
+ datatypes:
+ - anat
+ - dwi
+ - fmap
+ - func
+eeg:
+ name: Electroencephalography
+ datatypes:
+ - eeg
+ieeg:
+ name: Intracranial Electroencephalography
+ datatypes:
+ - ieeg
+meg:
+ name: Magnetoencephalography
+ datatypes:
+ - meg
+beh:
+ name: Behavioral experiments
+ datatypes:
+ - beh
diff --git a/src/schema/top_level_files.yaml b/src/schema/top_level_files.yaml
new file mode 100644
index 0000000000..42b402a763
--- /dev/null
+++ b/src/schema/top_level_files.yaml
@@ -0,0 +1,26 @@
+---
+README:
+ required: true
+ extensions:
+ - None
+CHANGES:
+ required: true
+ extensions:
+ - None
+LICENSE:
+ required: false
+ extensions:
+ - None
+dataset_description:
+ required: true
+ extensions:
+ - .json
+genetic_info:
+ required: false
+ extensions:
+ - .json
+participants:
+ required: false
+ extensions:
+ - .tsv
+ - .json
diff --git a/tools/bids_schema.py b/tools/bids_schema.py
new file mode 100755
index 0000000000..ce17feb9ed
--- /dev/null
+++ b/tools/bids_schema.py
@@ -0,0 +1,360 @@
+#!/usr/bin/env python3
+import argparse
+from itertools import chain
+import logging
+import os
+from pathlib import Path
+import sys
+from warnings import warn
+
+import numpy as np
+import pandas as pd
+import yaml
+
+
+#
+# Aux utilities
+#
+def is_interactive():
+ """Return True if all in/outs are tty"""
+ # TODO: check on windows if hasattr check would work correctly and add
+ # value:
+ return sys.stdin.isatty() and sys.stdout.isatty() and sys.stderr.isatty()
+
+
+def setup_exceptionhook(ipython=False):
+ """Overloads default sys.excepthook with our exceptionhook handler.
+
+ If interactive, our exceptionhook handler will invoke
+ pdb.post_mortem; if not interactive, then invokes default handler.
+ """
+
+ def _pdb_excepthook(type, value, tb):
+ import traceback
+
+ traceback.print_exception(type, value, tb)
+ print()
+ if is_interactive():
+ import pdb
+
+ pdb.post_mortem(tb)
+
+ if ipython:
+ from IPython.core import ultratb
+
+ sys.excepthook = ultratb.FormattedTB(
+ mode="Verbose",
+ # color_scheme='Linux',
+ call_pdb=is_interactive(),
+ )
+ else:
+ sys.excepthook = _pdb_excepthook
+
+
+def get_logger(name=None):
+ """Return a logger to use
+ """
+ return logging.getLogger("bids-schema" + (".%s" % name if name else ""))
+
+
+def set_logger_level(lgr, level):
+ if isinstance(level, int):
+ pass
+ elif level.isnumeric():
+ level = int(level)
+ elif level.isalpha():
+ level = getattr(logging, level.upper())
+ else:
+ lgr.warning("Do not know how to treat loglevel %s" % level)
+ return
+ lgr.setLevel(level)
+
+
+_DEFAULT_LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
+
+lgr = get_logger()
+# Basic settings for output, for now just basic
+set_logger_level(lgr, os.environ.get("BIDS_SCHEMA_LOG_LEVEL", logging.INFO))
+FORMAT = "%(asctime)-15s [%(levelname)8s] %(message)s"
+logging.basicConfig(format=FORMAT)
+
+BIDS_SCHEMA = Path(__file__).parent.parent / "src" / "schema"
+
+
+def _get_entry_name(path):
+ if path.suffix == '.yaml':
+ return path.name[:-5] # no .yaml
+ else:
+ return path.name
+
+
+def _get_parser():
+ """
+ Build the command line parser for the BIDS schema tool.
+
+ Returns
+ -------
+ parser : argparse.ArgumentParser
+ """
+ parser = argparse.ArgumentParser(prog='bids')
+ subparsers = parser.add_subparsers(help='BIDS workflows')
+ # show()
+ show_parser = subparsers.add_parser(
+ 'show',
+ help=('Print out the schema'),
+ )
+ show_parser.set_defaults(func=show)
+ show_parser.add_argument(
+ 'schema_path',
+ type=Path,
+ help=('Path to schema to show.')
+ )
+ # entity_table()
+ entity_parser = subparsers.add_parser(
+ 'entity',
+ help=('Save entity table to a markdown file')
+ )
+ entity_parser.set_defaults(func=save_entity_table)
+ entity_parser.add_argument(
+ 'schema_path',
+ type=Path,
+ help=('Path to top-level schema directory.')
+ )
+ entity_parser.add_argument(
+ 'out_file',
+ type=str,
+ help=('Output filename.')
+ )
+ return parser
+
+
+def load_schema(schema_path):
+ """The schema loader
+
+ It allows for schema, like BIDS itself, to be specified in
+ a hierarchy of directories and files.
+ File (having .yaml stripped) and directory names become keys
+ in the associative array (dict) of entries composed from content
+ of files and entire directories.
+
+ Parameters
+ ----------
+ schema_path : str
+ Folder containing yaml files or yaml file.
+
+ Returns
+ -------
+ dict
+ Schema in dictionary form.
+ """
+ schema_path = Path(schema_path)
+ if schema_path.is_file() and (schema_path.suffix == '.yaml'):
+ with open(schema_path) as f:
+ return yaml.load(f, Loader=yaml.SafeLoader)
+ elif schema_path.is_dir():
+ # iterate through files and subdirectories
+ res = {
+ _get_entry_name(path): load_schema(path)
+ for path in sorted(schema_path.iterdir())
+ }
+ return {k: v for k, v in res.items() if v is not None}
+ else:
+ warn(f"{schema_path} is neither a .yaml file nor a directory; skipping")
+
+
+def show(schema_path):
+ """Print full schema."""
+ schema = load_schema(schema_path)
+ print(yaml.safe_dump(schema, default_flow_style=False))
+
+
+def drop_unused_entities(df):
+ df = df.replace('', np.nan).dropna(axis=1, how='all').fillna('')
+ return df
+
+
+def flatten_multiindexed_columns(df):
+ # Flatten multi-index
+ vals = df.index.tolist()
+ df.loc['Format'] = df.columns.get_level_values(1)
+ df.columns = df.columns.get_level_values(0)
+ df = df.loc[['Format'] + vals]
+ df.index.name = 'Entity'
+ df = df.drop(columns=['DataType'])
+ return df
+
+
+def make_entity_table(schema_path):
+ """Produce entity table (markdown) based on schema.
+ Returns a DataFrame, not markdown text; the top-level schema *directory* is required.
+
+ Parameters
+ ----------
+ schema_path : str
+ Folder containing schema, which is stored in yaml files.
+
+ Returns
+ -------
+ table : pandas.DataFrame
+ DataFrame of entity table, with two layers of columns.
+ """
+ schema = load_schema(schema_path)
+
+ # prepare the table based on the schema
+ # import pdb; pdb.set_trace()
+ header = ['Entity', 'DataType']
+ formats = ['Format', 'DataType']
+ entity_to_col = {}
+ table = [formats]
+
+ # Compose header and formats first
+ for i, (entity, spec) in enumerate(schema['entities'].items()):
+ header.append(spec["name"])
+ formats.append(f'`{entity}-<{spec["format"]}>`')
+ entity_to_col[entity] = i + 1
+
+ # Go through data types
+ for dtype, specs in chain(schema['datatypes'].items(),
+ schema['auxdatatypes'].items()):
+ dtype_rows = {}
+
+ # each dtype could have multiple specs
+ for spec in specs:
+ # datatypes use suffixes, while
+ # for auxdatatypes we need to use datatypes
+ # TODO: RF to avoid this guesswork
+ suffixes = spec.get('datatypes') or spec.get('suffixes')
+ # TODO: is specific for html form
+ suffixes_str = ' '.join(suffixes) if suffixes else ''
+ dtype_row = [dtype] + ([''] * len(entity_to_col))
+ for ent, req in spec.get('entities', {}).items():
+ dtype_row[entity_to_col[ent]] = req.upper()
+
+ # Merge specs within dtypes if they share all of the same entities
+ if dtype_row in dtype_rows.values():
+ for k, v in dtype_rows.items():
+ if dtype_row == v:
+ dtype_rows.pop(k)
+ new_k = k + ' ' + suffixes_str
+ new_k = new_k.strip()
+ dtype_rows[new_k] = v
+ break
+ else:
+ dtype_rows[suffixes_str] = dtype_row
+
+ # Reformat first column
+ dtype_rows = {dtype+' ({})'.format(k): v for k, v in
+ dtype_rows.items()}
+ dtype_rows = [[k] + v for k, v in dtype_rows.items()]
+ table += dtype_rows
+
+ # Create multi-level index because first two rows are headers
+ cols = list(zip(header, table[0]))
+ cols = pd.MultiIndex.from_tuples(cols)
+ table = pd.DataFrame(data=table[1:], columns=cols)
+ table = table.set_index(('Entity', 'Format'))
+
+ # Now we can split as needed, in the next function
+ return table
+
+
+def make_entity_table_markdown(schema_path, tablefmt='github'):
+ """
+ Create a tabulated entity table from the schema.
+
+ This only works if the top-level schema *directory* is provided.
+
+ Parameters
+ ----------
+ schema_path : str
+ Path to schema.
+ tablefmt : {'github'}, optional
+ Format for tabulated table.
+
+ Returns
+ -------
+ out_tables : dict
+ Dictionary of tabulated entity tables, with table title as key.
+ """
+ from tabulate import tabulate
+ table = make_entity_table(schema_path)
+
+ # Split table
+ EG_DATATYPES = ['eeg', 'ieeg', 'meg', 'channels', 'electrodes', 'events',
+ 'photo']
+ MRI_DATATYPES = ['anat', 'func', 'fmap', 'dwi']
+ mri_table = table.loc[
+ table[('DataType', 'DataType')].isin(MRI_DATATYPES)
+ ]
+ eg_table = table.loc[
+ table[('DataType', 'DataType')].isin(EG_DATATYPES)
+ ]
+ beh_table = table[
+ ~table[('DataType', 'DataType')].isin(MRI_DATATYPES + EG_DATATYPES)
+ ]
+
+ out_tables = {}
+ titles = [
+ '## Magnetic Resonance Imaging',
+ '## Encephalography (EEG, iEEG, and MEG)',
+ '## Behavioral Data'
+ ]
+ tables = [mri_table, eg_table, beh_table]
+ for i, table in enumerate(tables):
+ title = titles[i]
+ table = drop_unused_entities(table)
+ table = flatten_multiindexed_columns(table)
+ # print it as markdown
+ table_str = tabulate(table, headers='keys', tablefmt=tablefmt)
+ out_tables[title] = table_str
+
+ return out_tables
+
+
+def save_entity_table(schema_path, out_file):
+
+ tables = make_entity_table_markdown(schema_path)
+
+ intro_text = """\
+# Appendix IV: Entity table
+
+This section compiles the entities (key-value pairs) described throughout this
+specification, and establishes a common order within a filename.
+For example, if a file has an acquisition and reconstruction label, the
+acquisition entity must precede the reconstruction entity.
+REQUIRED and OPTIONAL entities for a given file type are denoted.
+Entity formats indicate whether the value is alphanumeric
+(`<label>`) or numeric (`<index>`).
+
+A general introduction to entities is given in the section on
+[file name structure](../02-common-principles.md#file-name-structure)
+"""
+ with open(out_file, 'w') as fo:
+ fo.write(intro_text)
+ fo.write('\n')
+ for i, (title, table) in enumerate(tables.items()):
+ fo.write(title)
+ fo.write('\n\n')
+ fo.write(table)
+ if i == len(tables) - 1:
+ fo.write('\n')
+ else:
+ fo.write('\n\n')
+
+
+def _main(argv=None):
+ """BIDS schema CLI entrypoint.
+
+ Examples
+ --------
+ $ python bids_schema.py entity ../src/schema/ \
+ ../src/99-appendices/04-entity-table.md
+ """
+ options = _get_parser().parse_args(argv)
+ args = vars(options).copy()
+ args.pop('func')
+ options.func(**args)
+
+
+if __name__ == '__main__':
+ _main()