Tutorials (#165)
* Rename: sliding_window -> moving_average.
* Doc: typo.
* Refact: better naming and more validation.
* Fix: HSMM state time course should be int.
* Doc: updated tutorials.
cgohil8 authored Jul 7, 2023
1 parent b7d6817 commit 0b0976c
Showing 28 changed files with 691 additions and 1,192 deletions.
7 changes: 4 additions & 3 deletions doc/documentation.rst
@@ -46,8 +46,9 @@ The following tutorials illustrate basic usage and analysis that can be done wit
- :doc:`tutorials_build/dynemo_mixing_coef_analysis`.
- :doc:`tutorials_build/dynemo_plotting_networks`.

**Other**:
More example scripts can be found in the `examples directory <https://github.com/OHBA-analysis/osl-dynamics/tree/main/examples>`_ of the repo.

- :doc:`tutorials_build/statistical_significance_testing`.
Workshops
---------

More example scripts can be found in the `examples directory <https://github.com/OHBA-analysis/osl-dynamics/tree/main/examples>`_ of the repo.
- `2023 OHBA Software Library (OSL) workshop <https://osf.io/zxb6c/>`_.
2 changes: 1 addition & 1 deletion doc/models/dynemo.rst
@@ -60,7 +60,7 @@ Similar to the `HMM <hmm.html>`_, we perform variational Bayes on the latent var
Amortized Variational Inference
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In DyNeMo, we use a new approach for variational Bayes (from variational auto-encoders [3]) known as **amortized variational inference**. Here, we train an 'inference network' (**inference RNN**) to predict the posterior distribution for the model parameters. This network learns a mapping from the observed data to the parameters of the posterior distributions. This allows us to allows us to efficiently scale to large datasets [3].
In DyNeMo, we use a new approach for variational Bayes (from variational auto-encoders [3]) known as **amortized variational inference**. Here, we train an 'inference network' (**inference RNN**) to predict the posterior distribution for the model parameters. This network learns a mapping from the observed data to the parameters of the posterior distributions. This allows us to efficiently scale to large datasets [3].

Cost Function
^^^^^^^^^^^^^
96 changes: 67 additions & 29 deletions doc/tutorials/data_loading.py
@@ -3,49 +3,45 @@
============
In this tutorial we demonstrate the various options for loading data. This tutorial covers:
1. The Data Class
2. Getting Example Data
3. Loading Data in NumPy Format
4. Loading Data in MATLAB Format
5. Loading Data in fif Format
Note, this webpage does not contain the output of running each cell. See `OSF <https://osf.io/9768c>`_ for the expected output.
"""

#%%
# The Data class
# ^^^^^^^^^^^^^^
#
# In osl-dynamics we typically load data using the `osl_dynamics.data.Data class <https://osl-dynamics.readthedocs.io/en/latest/autoapi/osl_dynamics/data/base/index.html#osl_dynamics.data.base.Data>`_. The Data class has a lot of useful methods that can be used to modify the data.
#
#
# Inputs
# ******
#
# There is one mandatory argument that needs to be passed to the Data class: `inputs`. This can be:
#
#
# - A path to a directory containing .npy files. Each .npy file should be a subject or session.
# - A list of paths to .npy, .mat or .fif files. Each file should be a subject or session.
# - A list of paths to .npy, .mat, or .fif files. Each file should be a subject or session.
# - A numpy array. The array will be treated as continuous data from the same subject.
# - A list of numpy arrays. Each numpy array should be the data for a subject or session.
#
#
# Data format
# ***********
#
# The data files or numpy arrays should be in the format `(n_samples, n_channels)`, i.e. time by channels. If your data is in `(n_channels, n_samples)` format, you should also pass `time_axis_first=False` to the Data class.
#
#
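# As a minimal sketch of the two layouts, using plain NumPy only: the array below is made up for illustration, and the transpose shows the relationship between the layouts, not how the Data class handles `time_axis_first=False` internally.

```python
import numpy as np

# Hypothetical recording stored channels-first: 3 channels, 1000 samples.
n_channels, n_samples = 3, 1000
rng = np.random.default_rng(0)
channels_first = rng.standard_normal((n_channels, n_samples))

# Transposing gives the time-by-channels layout, (n_samples, n_channels).
time_first = channels_first.T

print(channels_first.shape)
print(time_first.shape)
```

#%%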
# The temporary store directory
# *****************************
#
# Note, when we load data using the Data class it loads the data as a `memory map <https://numpy.org/doc/stable/reference/generated/numpy.memmap.html>`_. This allows us to access the data without holding it in memory. If you prefer to load the data into memory pass `load_memmaps=False`. The Data class creates a directory called `tmp` which is used for storing temporary data (memory map files and prepared data). This directory can be safely deleted after you run your script. You can specify the name of the temporary directory by passing the `store_dir` argument.
#
#
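# To illustrate what a memory map is, here is a small sketch using NumPy directly. The file name and array are hypothetical, and this is not necessarily how the Data class creates its memory maps; it only shows the general idea of accessing data on disk without loading all of it.

```python
import os
import tempfile

import numpy as np

# Save a small made-up array to a temporary directory.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "subject0.npy")
np.save(path, np.arange(20, dtype=np.float32).reshape(10, 2))

# mmap_mode="r" returns a read-only memory map instead of an in-memory array.
x = np.load(path, mmap_mode="r")
print(type(x).__name__)
print(x.shape)

# Copy just the first three rows into an ordinary in-memory array.
first_rows = np.array(x[:3])
print(first_rows)
```

#%%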
# We will demonstrate how the Data class is used with example data below.
#
#
# Getting Example Data
# ^^^^^^^^^^^^^^^^^^^^
#
#
# Download the dataset
# ********************
#
# We will download example data hosted on `OSF <https://osf.io/by2tc/>`_. Note, `osfclient` must be installed. This can be done in jupyter notebook by running::
#
#     !pip install osfclient
@@ -61,19 +57,24 @@ def get_data(name):
    os.remove(f"{name}.zip")
    return f"Data downloaded to: {name}"

# Download the dataset (approximately 52 MB)
# Download the dataset (approximately 88 MB)
get_data("example_loading_data")

# List the contents of the downloaded directory containing the dataset
print("Contents of example_loading_data:")
os.listdir("example_loading_data")

#%%
# We can see there's two directories in `example_loading_data`: `numpy_format`, which contains `.npy` files, and `matlab_format`, which contains `.mat` files. We'll show how to load data in each of these data types.
#
# We can see there are three directories in `example_loading_data`:
#
# - `numpy_format`, which contains `.npy` files.
# - `matlab_format`, which contains `.mat` files.
# - `fif_format`, which contains directories with `.fif` files.
#
# We'll show how to load data in each of these data types.
#
# Loading Data in NumPy Format
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#
# Let's first list the `example_loading_data/numpy_format` directory.

os.listdir("example_loading_data/numpy_format")
@@ -90,7 +91,6 @@ def get_data(name):
#%%
# Importing a numpy array directly
# ********************************
#
# If we have already loaded a numpy array and just want to create an `osl_dynamics.data.Data` object, we can simply pass it to the class:

from osl_dynamics.data import Data
@@ -115,7 +115,6 @@ def get_data(name):
#%%
# Loading from file
# *****************
#
# Rather than loading the data into memory then creating a Data object, we could load the data directly from the file.

# Just load one of the files
@@ -152,13 +151,12 @@ def get_data(name):
#%%
# Loading Data in MATLAB Format
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#
# We will discuss two methods for loading MATLAB files. First, we will load the MATLAB files using public python packages (`scipy` and `mat73`), then we'll show how to pass MATLAB files to the Data class.
#
# ### Loading MATLAB files in Python
#
#
# Loading MATLAB files in Python
# ******************************
# The popular python package SciPy has a function for loading MATLAB files: `scipy.io.loadmat <https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.loadmat.html>`_. Note, this function can only load MATLAB files saved in formats older than `v7.3`. If you saved your files using the `v7.3` format, you need to use `mat73.loadmat <https://github.com/skjerns/mat7.3>`_ to load the file in python instead. Both of these packages are automatically installed when you install osl-dynamics.
#
#
# Let's first see what files we have in the `example_loading_data/matlab_format` directory.

os.listdir("example_loading_data/matlab_format")
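
#%%
# As a hedged sketch of how one might decide which loader a file needs: MAT-files normally begin with a text header naming the format version. The helper `mat_file_version` below is hypothetical (it is not part of osl-dynamics, SciPy or mat73), and the demo files contain only made-up header text, so they are not loadable MAT-files.

```python
import os
import tempfile


def mat_file_version(path):
    """Guess the MAT-file version from the text header at the start of the file."""
    with open(path, "rb") as f:
        header = f.read(19)
    if header.startswith(b"MATLAB 7.3 MAT-file"):
        return "v7.3"  # HDF5-based, would need mat73.loadmat
    if header.startswith(b"MATLAB 5.0 MAT-file"):
        return "v5-v7"  # readable with scipy.io.loadmat
    return "unknown"


# Write fake files that contain only a header string (assumption: for demo only).
tmp_dir = tempfile.mkdtemp()
examples = {
    "old.mat": b"MATLAB 5.0 MAT-file, Platform: demo",
    "new.mat": b"MATLAB 7.3 MAT-file, Platform: demo",
    "other.bin": b"not a MAT-file",
}
versions = {}
for name, header in examples.items():
    path = os.path.join(tmp_dir, name)
    with open(path, "wb") as f:
        f.write(header)
    versions[name] = mat_file_version(path)

print(versions)
```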
@@ -179,10 +177,9 @@ def get_data(name):

#%%
# The important field is `X`, which contains the 2D time series data for this subject. Note, MATLAB files created using the `HMM-MAR <https://github.com/OHBA-analysis/HMM-MAR>`_ toolbox come in the above format, i.e. with an `X` and a `T` field. For us, only `X` matters.
#
#
# Loading MATLAB data into the Data class
# ***************************************
#
# We can pass the numpy array contained in the `X` field of the dictionary directly to the Data class:

data = Data(mat["X"])
@@ -196,16 +193,57 @@ def get_data(name):

#%%
# Note, the default value for the `data_field` argument is `X`, so the Data class would still be able to load the data without it being passed. The `data_field` argument is useful if the data is stored in a MATLAB file in a field with a different name.
#
#
# If we wanted to load multiple data files in MATLAB format we would need to pass a list of file paths.

files = [f"example_loading_data/matlab_format/subject{i}.mat" for i in [0, 1]]
data = Data(files)
print(data)

#%%
# Loading fif files
# *****************
# Another data format that can be loaded with the Data class is fif files. This format is commonly used in `MNE-Python <https://mne.tools/stable/index.html>`_ and is the data format used in `OSL <https://github.com/OHBA-analysis/osl>`_. Here, we will load source reconstructed (parcellated) data created with OSL. In OSL, we often have a separate directory for each subject. The `fif_format` directory contains two directories, one for each subject.

os.listdir("example_loading_data/fif_format")

#%%
# Let's see what's inside `subj001_run01`.

os.listdir("example_loading_data/fif_format/subj001_run01")

#%%
# We have a fif file which contains the data for this subject. We could load this with MNE.

import mne

raw = mne.io.read_raw_fif("example_loading_data/fif_format/subj001_run01/sflip_parc-raw.fif")
print(raw.info)

#%%
# We can see this particular fif file contains 38 `misc` channels and 3 `stim` channels. We're interested in the `misc` channels. Let's load these into the Data class.

data = Data(
    "example_loading_data/fif_format/subj001_run01/sflip_parc-raw.fif",
    picks="misc",
    reject_by_annotation="omit",
)
print(data)

#%%
# The `reject_by_annotation="omit"` argument is used to make sure we don't include bad segments. This argument is passed to `Raw.get_data <https://mne.tools/stable/generated/mne.io.Raw.html#mne.io.Raw.get_data>`_ in MNE.
#
# To load multiple subjects we can do:

files = [
    f"example_loading_data/fif_format/subj{i:03d}_run01/sflip_parc-raw.fif"
    for i in range(1, 3)
]
data = Data(files, picks="misc", reject_by_annotation="omit")
print(data)
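
#%%
# To make the file list above concrete, here is a small standard-library sketch that builds the same paths and prints them. It only constructs strings and does not touch the disk; `base_dir` is an illustrative variable name.

```python
from pathlib import Path

# Directory layout taken from the example dataset described above.
base_dir = Path("example_loading_data/fif_format")

# range(1, 3) yields 1 and 2; the :03d format spec zero-pads to three digits.
files = [
    (base_dir / f"subj{i:03d}_run01" / "sflip_parc-raw.fif").as_posix()
    for i in range(1, 3)
]

for f in files:
    print(f)
```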

#%%
# Wrap Up
# ^^^^^^^
#
# - We've shown how to load data using the Data class in osl-dynamics.
# - To see how we can prepare data for training a model, see the `Preparing Data tutorial <https://osl-dynamics.readthedocs.io/en/latest/tutorials_build/data_preparation.html>`_.