Merge branch 'dev' into hdmf_common_1.5.0

NeurodataWithoutBorders · Apr 23, 2021 · 2f3d776
2 parents 5d2cab5 + a051226
commit 2f3d776
Showing 7 changed files with 101 additions and 1 deletion.
diff --git a/core/nwb.misc.yaml b/core/nwb.misc.yaml
@@ -281,3 +281,49 @@ groups:
       value: volts
       doc: Unit of measurement. This value is fixed to 'volts'.
       required: false
+  - name: waveforms
+    neurodata_type_inc: VectorData
+    dtype: numeric
+    dims:
+      - num_waveforms
+      - num_samples
+    shape:
+      - null
+      - null
+    doc: "Individual waveforms for each spike on each electrode. This is a doubly indexed column. The 'waveforms_index'
+      column indexes which waveforms in this column belong to the same spike event for a given unit, where each waveform
+      was recorded from a different electrode. The 'waveforms_index_index' column indexes the 'waveforms_index' column
+      to indicate which spike events belong to a given unit. For example, if the
+      'waveforms_index_index' column has values [2, 5, 6], then the first 2 elements of the 'waveforms_index' column
+      correspond to the 2 spike events of the first unit, the next 3 elements of the 'waveforms_index' column correspond
+      to the 3 spike events of the second unit, and the next 1 element of the 'waveforms_index' column corresponds to
+      the 1 spike event of the third unit. If the 'waveforms_index' column has values [3, 6, 8, 10, 12, 13], then
+      the first 3 elements of the 'waveforms' column contain the 3 spike waveforms that were recorded from 3 different
+      electrodes for the first spike time of the first unit. See
+      https://nwb-schema.readthedocs.io/en/stable/format_description.html#doubly-ragged-arrays for a graphical
+      representation of this example. When there is only one electrode for each unit (i.e., each spike time is
+      associated with a single waveform), then the 'waveforms_index' column will have values 1, 2, ..., N, where N is
+      the number of spike events. The number of electrodes for each spike event should be the same within a given unit.
+      The 'electrodes' column should be used to indicate which electrodes are associated with each unit, and the order
+      of the waveforms within a given unit x spike event should be in the same order as the electrodes referenced in
+      the 'electrodes' column of this table. The number of samples for each waveform must be the same."
+    quantity: '?'
+    attributes:
+      - name: sampling_rate
+        dtype: float32
+        doc: Sampling rate, in hertz.
+        required: false
+      - name: unit
+        dtype: text
+        value: volts
+        doc: Unit of measurement. This value is fixed to 'volts'.
+        required: false
+  - name: waveforms_index
+    neurodata_type_inc: VectorIndex
+    doc: Index into the waveforms dataset. One value for every spike event. See 'waveforms' for more detail.
+    quantity: '?'
+  - name: waveforms_index_index
+    neurodata_type_inc: VectorIndex
+    doc: Index into the waveforms_index dataset. One value for every unit (row in the table). See 'waveforms' for more
+      detail.
+    quantity: '?'
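
Note on the `waveforms` doc string above: the double indexing it describes can be sketched in plain Python, reusing the example values from the doc string ([2, 5, 6] and [3, 6, 8, 10, 12, 13]). This is only an illustration under stated assumptions, not the PyNWB or HDMF API: the helper name `ragged_slices`, the choice of 4 samples per waveform, and the placeholder waveform values are all hypothetical.

```python
import numpy as np


def ragged_slices(index):
    """Turn cumulative end offsets into (start, stop) pairs, one per row."""
    starts = [0] + list(index[:-1])
    return list(zip(starts, index))


# Example values taken from the doc string.
waveforms_index_index = [2, 5, 6]        # one value per unit
waveforms_index = [3, 6, 8, 10, 12, 13]  # one value per spike event

# Hypothetical data: 13 waveforms with 4 samples each (placeholder values).
num_samples = 4
waveforms = np.arange(13 * num_samples).reshape(13, num_samples)

# Waveform rows belonging to each spike event, across all units.
event_slices = ragged_slices(waveforms_index)

# Group the spike events by unit.
units = [
    [waveforms[start:stop] for start, stop in event_slices[first:last]]
    for first, last in ragged_slices(waveforms_index_index)
]

for unit_id, events in enumerate(units):
    print(unit_id, [event.shape for event in events])
```

With these values, the first unit has 2 spike events of 3 waveforms each, the second has 3 events of 2 waveforms each, and the third has 1 event with a single waveform.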
diff --git a/docs/format/source/figures/units_electrodes.png b/docs/format/source/figures/units_electrodes.png
diff --git a/docs/format/source/figures/units_spike_times.png b/docs/format/source/figures/units_spike_times.png
diff --git a/docs/format/source/figures/units_table.pptx b/docs/format/source/figures/units_table.pptx
diff --git a/docs/format/source/figures/units_waveforms.png b/docs/format/source/figures/units_waveforms.png
diff --git a/docs/format/source/format_description.rst b/docs/format/source/format_description.rst
@@ -345,3 +345,56 @@ The timestamps\_link and data\_link fields refer to links made between
 time series, such as if timeseries A and timeseries B, each having
 different data (or time) share time (or data). This is much more
 important information as it shows structural associations in the data.
+
+
+Tables and ragged arrays
+------------------------
+
+The NWB schema includes several tables, such as for storing data/metadata
+about trials, epochs, single units and multi-units, electrodes, and ROIs.
+All of the tables in NWB derive from the base data type, DynamicTable.
+DynamicTable is a column-based representation of a table that allows
+users to add custom columns (of type VectorData) that are not
+pre-defined in the specification. This is useful for handling types of
+data where every experiment or lab may want to store information
+unique to that experiment or lab, e.g., metadata
+related to the trials in a session or spike sorting metrics.
+
+DynamicTable objects typically contain columns that are of equal length,
+where the i-th element of a column corresponds to the i-th element of
+all of the other columns. In other words, each row has a single item
+in each column. However, in some situations, users may wish to store and
+associate multiple items in a single column for each row. For example,
+in the Units table, each row represents a single sorted unit and each
+unit has multiple spike times associated with it, where the number of
+spike times differs between units (rows). Such a column is sometimes called a
+ragged array or jagged array.
+
+Ragged array columns can be created by creating a primary VectorData
+column that contains all of the data values (e.g., spike times) and
+creating a secondary VectorIndex column that contains a mapping from rows
+to elements of its target VectorData column. The VectorIndex column has the same
+number of elements (rows) as the rest of the table.
+
+Each value of the VectorIndex column is the exclusive end offset of a row's
+data in the VectorData column: the data for the first row is at
+VectorData[0:VectorIndex[0]], the data for the second row is at
+VectorData[VectorIndex[0]:VectorIndex[1]], and so on.
+
+.. image:: figures/units_spike_times.png
+  :width: 800
+  :alt: Demonstration of how spike times are stored in a ragged array column in the Units table.
+
+Doubly ragged arrays
+---------------------
+
+.. image:: figures/units_waveforms.png
+  :width: 800
+  :alt: Demonstration of how waveforms are stored in a doubly ragged array column in the Units table.
+
+References to rows of a table
+------------------------------
+
+.. image:: figures/units_electrodes.png
+  :width: 800
+  :alt: Demonstration of how references to rows of the electrodes table are stored in the electrodes column of the Units table.
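
Note on the "Tables and ragged arrays" text added to format_description.rst above: the row-to-data mapping it describes can be sketched in a few lines of Python. This is a minimal illustration, not the PyNWB or HDMF API. The function name `ragged_row` and the spike time values are hypothetical.

```python
def ragged_row(data, index, i):
    """Return the elements of `data` that belong to row `i`.

    `index` holds one cumulative end offset per row, as in a VectorIndex.
    """
    start = 0 if i == 0 else index[i - 1]
    return data[start:index[i]]


# Hypothetical spike times for three units with 3, 2, and 1 spikes.
spike_times = [0.1, 0.5, 0.9, 0.2, 0.4, 1.3]
spike_times_index = [3, 5, 6]

rows = [ragged_row(spike_times, spike_times_index, i)
        for i in range(len(spike_times_index))]
print(rows)
```

The index has one entry per table row, so its length matches the other columns even though the data column is longer.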
diff --git a/docs/format/source/format_release_notes.rst b/docs/format/source/format_release_notes.rst
@@ -4,11 +4,12 @@ Release Notes
 2.3.0 (April 23, 2021)
 ----------------
 
+- Add optional ``waveforms`` column to the ``Units`` table.
 - Add optional ``strain`` field to ``Subject``.
 - Add to ``DecompositionSeries`` an optional ``DynamicTableRegion`` called ``source_channels``.
 - Add to ``ImageSeries`` an optional link to ``Device``.
+- Add optional ``continuity`` field to ``TimeSeries``.
 - Clarify documentation for electrode impedance and filtering.
-- Add optional "continuity" field to ``TimeSeries``.
 - Update hdmf-common-schema from 1.1.3 to version 1.5.0.
   - The HDMF-experimental namespace was added, which includes the ``ExternalResources`` and ``EnumData``
     data types. Schema in the HDMF-experimental namespace are experimental and subject to breaking changes at any time.