diff --git a/core/nwb.misc.yaml b/core/nwb.misc.yaml
index d082c4b1..c5424c51 100644
--- a/core/nwb.misc.yaml
+++ b/core/nwb.misc.yaml
@@ -281,3 +281,49 @@ groups:
       value: volts
       doc: Unit of measurement. This value is fixed to 'volts'.
       required: false
+  - name: waveforms
+    neurodata_type_inc: VectorData
+    dtype: numeric
+    dims:
+    - num_waveforms
+    - num_samples
+    shape:
+    - null
+    - null
+    doc: "Individual waveforms for each spike on each electrode. This is a doubly indexed column. The 'waveforms_index'
+      column indexes which waveforms in this column belong to the same spike event for a given unit, where each waveform
+      was recorded from a different electrode. The 'waveforms_index_index' column indexes the 'waveforms_index' column
+      to indicate which spike events belong to a given unit. For example, if the
+      'waveforms_index_index' column has values [2, 5, 6], then the first 2 elements of the 'waveforms_index' column
+      correspond to the 2 spike events of the first unit, the next 3 elements of the 'waveforms_index' column correspond
+      to the 3 spike events of the second unit, and the next 1 element of the 'waveforms_index' column corresponds to
+      the 1 spike event of the third unit. If the 'waveforms_index' column has values [3, 6, 8, 10, 12, 13], then
+      the first 3 elements of the 'waveforms' column contain the 3 spike waveforms that were recorded from 3 different
+      electrodes for the first spike time of the first unit. See
+      https://nwb-schema.readthedocs.io/en/stable/format_description.html#doubly-ragged-arrays for a graphical
+      representation of this example. When there is only one electrode for each unit (i.e., each spike time is
+      associated with a single waveform), then the 'waveforms_index' column will have values 1, 2, ..., N, where N is
+      the number of spike events. The number of electrodes for each spike event should be the same within a given unit.
+      The 'electrodes' column should be used to indicate which electrodes are associated with each unit, and the order
+      of the waveforms within a given unit x spike event should be in the same order as the electrodes referenced in
+      the 'electrodes' column of this table. The number of samples for each waveform must be the same."
+    quantity: '?'
+    attributes:
+    - name: sampling_rate
+      dtype: float32
+      doc: Sampling rate, in hertz.
+      required: false
+    - name: unit
+      dtype: text
+      value: volts
+      doc: Unit of measurement. This value is fixed to 'volts'.
+      required: false
+  - name: waveforms_index
+    neurodata_type_inc: VectorIndex
+    doc: Index into the waveforms dataset. One value for every spike event. See 'waveforms' for more detail.
+    quantity: '?'
+  - name: waveforms_index_index
+    neurodata_type_inc: VectorIndex
+    doc: Index into the waveforms_index dataset. One value for every unit (row in the table). See 'waveforms' for more
+      detail.
+    quantity: '?'
diff --git a/docs/format/source/figures/units_electrodes.png b/docs/format/source/figures/units_electrodes.png
new file mode 100644
index 00000000..88eabb93
Binary files /dev/null and b/docs/format/source/figures/units_electrodes.png differ
diff --git a/docs/format/source/figures/units_spike_times.png b/docs/format/source/figures/units_spike_times.png
new file mode 100644
index 00000000..241bfa69
Binary files /dev/null and b/docs/format/source/figures/units_spike_times.png differ
diff --git a/docs/format/source/figures/units_table.pptx b/docs/format/source/figures/units_table.pptx
new file mode 100644
index 00000000..a30296f9
Binary files /dev/null and b/docs/format/source/figures/units_table.pptx differ
diff --git a/docs/format/source/figures/units_waveforms.png b/docs/format/source/figures/units_waveforms.png
new file mode 100644
index 00000000..00aa8316
Binary files /dev/null and b/docs/format/source/figures/units_waveforms.png differ
diff --git a/docs/format/source/format_description.rst b/docs/format/source/format_description.rst
index f4ecd30a..22045b06 100644
--- a/docs/format/source/format_description.rst
+++ b/docs/format/source/format_description.rst
@@ -345,3 +345,56 @@ The timestamps\_link and data\_link fields refer to links made between time
 series, such as if timeseries A and timeseries B, each having different data
 (or time) share time (or data). This is much more important information as it
 shows structural associations in the data.
+
+
+Tables and ragged arrays
+------------------------
+
+The NWB schema includes several tables, such as for storing data/metadata
+about trials, epochs, single units and multi-units, electrodes, and ROIs.
+All of the tables in NWB derive from the base data type, DynamicTable.
+DynamicTable is a column-based representation of a table that allows
+users to add custom columns (of type VectorData) that are not
+pre-defined in the specification. This is useful for handling types of
+data where every experiment or lab may want to store information
+unique to that experiment or lab, e.g., metadata
+related to the trials in a session or spike sorting metrics.
+
+DynamicTable objects typically contain columns that are of equal length,
+where the i-th element of a column corresponds to the i-th element of
+all of the other columns. In other words, each row has a single item
+in each column. However, in some situations, users may wish to store and
+associate multiple items in a single column for each row. For example,
+in the Units table, each row represents a single sorted unit and each
+unit has multiple spike times associated with it, where the number of
+spike times differs between units (rows). This is sometimes called a
+ragged array or jagged array.
+
+Ragged array columns can be created by creating a primary VectorData
+column that contains all of the data values (e.g., spike times) and
+creating a secondary VectorIndex column that contains a mapping from rows
+to elements of its target VectorData column. The VectorIndex column has the same
+number of elements (rows) as the rest of the table.
+
+The values of the VectorIndex column follow the mapping such that the data
+associated with the first row is at VectorData[0:VectorIndex[0]], and the data
+associated with the second row is at VectorData[VectorIndex[0]:VectorIndex[1]],
+and so on.
+
+.. image:: figures/units_spike_times.png
+  :width: 800
+  :alt: Demonstration of how spike times are stored in a ragged array column in the Units table.
+
+Doubly ragged arrays
+---------------------
+
+.. image:: figures/units_waveforms.png
+  :width: 800
+  :alt: Demonstration of how waveforms are stored in a double ragged array column in the Units table.
+
+References to rows of a table
+------------------------------
+
+.. image:: figures/units_electrodes.png
+  :width: 800
+  :alt: Demonstration of how references to rows of the electrodes table are stored in the electrodes column of the Units table.
diff --git a/docs/format/source/format_release_notes.rst b/docs/format/source/format_release_notes.rst
index e91a89a9..eedcd2f4 100644
--- a/docs/format/source/format_release_notes.rst
+++ b/docs/format/source/format_release_notes.rst
@@ -4,11 +4,12 @@ Release Notes
 
 2.3.0 (April 23, 2021)
 ----------------
+- Add optional ``waveforms`` column to the ``Units`` table.
 - Add optional ``strain`` field to ``Subject``.
 - Add to ``DecompositionSeries`` an optional ``DynamicTableRegion`` called ``source_channels``.
 - Add to ``ImageSeries`` an optional link to ``Device``.
+- Add optional ``continuity`` field to ``TimeSeries``.
 - Clarify documentation for electrode impedance and filtering.
-- Add optional "continuity" field to ``TimeSeries``.
 - Update hdmf-common-schema from 1.1.3 to version 1.5.0.
 - The HDMF-experimental namespace was added, which includes the ``ExternalResources`` and ``EnumData``
   data types. Schema in the HDMF-experimental namespace are experimental and subject to breaking changes at any time.
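
Note on the index arithmetic introduced in this change: the slicing rule from the "Tables and ragged arrays" text (``VectorData[VectorIndex[i-1]:VectorIndex[i]]``) and the doubly indexed example in the ``waveforms`` doc string can be checked with a small plain-Python sketch. This is only an illustration under stated assumptions, not the PyNWB or HDMF API: the helper names ``ragged_row`` and ``unit_waveforms``, the spike time values, and the ``w0`` ... ``w12`` placeholder labels are hypothetical. The index values ``[2, 5, 6]`` and ``[3, 6, 8, 10, 12, 13]`` are the ones used in the doc string.

```python
from typing import List, Sequence


def ragged_row(data: Sequence, index: Sequence[int], row: int) -> List:
    """Return the elements of `data` that belong to `row`.

    `index` holds one cumulative end offset per row, so row i spans
    data[index[i - 1]:index[i]], with an implicit start of 0 for row 0.
    """
    start = 0 if row == 0 else index[row - 1]
    return list(data[start:index[row]])


# Singly ragged column: made-up spike times for three units.
spike_times = [0.1, 0.4, 0.9, 1.2, 1.5, 2.0]
spike_times_index = [2, 5, 6]

# Doubly ragged column, using the index values from the 'waveforms' doc string.
# The labels stand in for real waveform arrays of shape (num_samples,).
waveforms = [f"w{i}" for i in range(13)]
waveforms_index = [3, 6, 8, 10, 12, 13]
waveforms_index_index = [2, 5, 6]


def unit_waveforms(unit: int) -> List[List[str]]:
    """Return one list of waveforms (one per electrode) per spike event of `unit`."""
    # The outer index selects which entries of waveforms_index belong to the unit.
    start = 0 if unit == 0 else waveforms_index_index[unit - 1]
    events = range(start, waveforms_index_index[unit])
    # The inner index selects which waveforms belong to each spike event.
    return [ragged_row(waveforms, waveforms_index, event) for event in events]


second_unit_spikes = ragged_row(spike_times, spike_times_index, 1)
first_unit_waveforms = unit_waveforms(0)
```

With these values, the second unit gets three spike times, and the first unit gets two spike events with three waveforms each, which matches the description in the doc string.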