EEG notes

Demetris Roumis edited this page Aug 15, 2023 · 22 revisions

EEG Research Overview

Terminology

  • "Trace": One line typically corresponding to data from a single sensor

  • "Channel":

  • "Viewport": The currently visible plot, which may be a subset of the data that is available

Experiment Duration, Channels, Subjects

  • EEG recording sessions typically last from 15 minutes to a few hours, depending on the type of study. Some studies, such as sleep studies, can last up to 8-10 hours. Nothing prevents full-day or multi-day recordings, but they are probably not fun for the participants, so they are seldom done.
  • Typical Channel Count:
    • Clinical systems often use about 20 channels following the 10-20 system of electrode placement.
    • Research-grade systems can have 32, 64, 128, or 256 channels.
    • High-density systems can have up to several thousands of channels.
  • Studies typically range from a few subjects (e.g., 10-20 for a pilot study) to several hundred in large-scale studies.

Common Experiment Types

  • Resting State: Participants are asked to relax with their eyes open or closed without performing any task.
  • Event-Related Potentials (ERPs): EEG response to a specific sensory, cognitive, or motor event.
  • Cognitive Tasks: Tasks that require mental processing like memory tasks, attention tasks, etc.
  • Neurofeedback: Participants are given real-time feedback about their brainwave patterns and asked to control them.
  • Brain-Computer Interface (BCI): Interactions between the brain and an external device.

Other Modalities Commonly Used Simultaneously

  • fMRI: Functional Magnetic Resonance Imaging
  • fNIRS: Functional Near-Infrared Spectroscopy
  • MEG: Magnetoencephalography
  • Eye Tracking: To monitor visual attention and detect blinks/artifacts.
  • Behavioral Measures: Reaction times, accuracy, accelerometers, position, etc.
  • Biophysical Measures: electrodermal activity, electrocardiogram, temperature, respiration, etc.

Data

Data Size

  • As a rough estimate, a one-hour recording from a 64-channel system sampled at 500 Hz and saved as 16-bit samples takes about 230 MB (roughly 220 MiB) in raw binary format, not including overhead from metadata, etc.
    • 👉 Therefore, relative to ephys and imaging, the issues with larger-than-memory data visualization for EEG aren't as common.
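
The storage estimate above is just channels × sampling rate × duration × bytes per sample. A minimal sketch of that arithmetic follows; the function name is made up for illustration, and the result ignores any header or metadata overhead.

```python
def raw_recording_bytes(n_channels, sampling_rate_hz, duration_s, bytes_per_sample):
    """Size in bytes of the uncompressed samples only (no headers or metadata)."""
    return n_channels * sampling_rate_hz * duration_s * bytes_per_sample


# Figures from the estimate above: 64 channels, 500 Hz, one hour, 16-bit samples.
size_bytes = raw_recording_bytes(
    n_channels=64, sampling_rate_hz=500, duration_s=3600, bytes_per_sample=2
)

print(f"{size_bytes / 1e6:.1f} MB")     # decimal megabytes -> 230.4 MB
print(f"{size_bytes / 2**20:.1f} MiB")  # binary mebibytes  -> 219.7 MiB
```

Changing a single argument (for example 256 channels or a 2000 Hz sampling rate) shows how quickly the size grows for denser or faster recordings.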

Data Format

  • See MNE notes for info about MNE data format and handling
  • dtype in-memory: typically floats. Given the signal range, single-precision floats are sufficient.
  • dtype on-disk: typically signed integers, often 16-bit.
  • Common formats include European Data Format (EDF), BioSemi Data Format (BDF), and BrainVision EEG (BV).
    • 👉 Let's focus on EDF for the workflows as it's an open standard independent of a particular manufacturer.
  • Some labs use proprietary formats linked to the specific EEG hardware they use.
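
To connect the on-disk and in-memory dtypes: EDF stores each signal as 16-bit integers together with a digital min/max and a physical min/max in the header, and readers rescale linearly to physical units. The sketch below shows that rescaling with the standard library only. The header values are hypothetical, picked so that one digital step equals 0.1 µV, and Python's built-in floats are double precision rather than the single precision mentioned above.

```python
import struct


def digital_to_physical(digital, dig_min, dig_max, phys_min, phys_max):
    """Linearly rescale stored integer samples to physical units (here µV)."""
    gain = (phys_max - phys_min) / (dig_max - dig_min)
    return [(d - dig_min) * gain + phys_min for d in digital]


# Hypothetical header values for one signal (not taken from a real file).
DIG_MIN, DIG_MAX = -32768, 32767
PHYS_MIN, PHYS_MAX = -3276.8, 3276.7  # µV

# Four little-endian signed 16-bit samples, as they would sit on disk.
raw_bytes = struct.pack("<4h", -32768, 0, 1000, 32767)
digital = list(struct.unpack("<4h", raw_bytes))

microvolts = digital_to_physical(digital, DIG_MIN, DIG_MAX, PHYS_MIN, PHYS_MAX)
print([round(v, 1) for v in microvolts])
```

In practice a reader such as MNE or pyedflib performs this step internally, so this is only meant to show where the float values come from.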

Data and Signal Specification

  • Units: EEG data is typically recorded in microvolts (µV).
  • Typical Signal Range: Most EEG signals fall within a range of plus or minus 100 µV, though larger signals may be recorded, particularly in the presence of artifacts such as blinks.
  • Sampling Rate: Commonly between 250 and ~1000 Hz in most modern EEG systems, but some studies use higher sampling rates, especially for specific purposes like studying high-frequency oscillations.
  • Frequency Range: Typically 1-100 Hz is analyzed, although the human brain produces activity in a wide range of frequencies, with significant signals detectable up to 500 Hz or higher. EEG is most commonly associated with lower frequency bands: delta (0-4 Hz), theta (4-8 Hz), alpha (8-12 Hz), beta (12-30 Hz), and gamma (30-100+ Hz).
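
As a small illustration of the band names and of why sampling rate matters, the sketch below maps a frequency to the bands listed above and checks it against the Nyquist limit (half the sampling rate). Treating each band as a half-open interval and leaving gamma open-ended are assumptions made here so that every frequency maps to exactly one band; the helper names are hypothetical.

```python
# Band edges as listed above, as half-open intervals [low, high).
BANDS = [
    ("delta", 0.0, 4.0),
    ("theta", 4.0, 8.0),
    ("alpha", 8.0, 12.0),
    ("beta", 12.0, 30.0),
    ("gamma", 30.0, float("inf")),  # "30-100+ Hz": upper edge left open
]


def band_of(freq_hz):
    """Return the name of the band that contains freq_hz."""
    if freq_hz < 0:
        raise ValueError("frequency must be non-negative")
    for name, low, high in BANDS:
        if low <= freq_hz < high:
            return name


def can_represent(freq_hz, sampling_rate_hz):
    """True if freq_hz lies below the Nyquist frequency of the sampling rate."""
    return freq_hz < sampling_rate_hz / 2.0


print(band_of(10))               # alpha
print(can_represent(100, 250))   # True: 100 Hz is below the 125 Hz Nyquist limit
print(can_represent(500, 1000))  # False: 500 Hz sits exactly at the Nyquist limit
```

This is why studies of activity around 500 Hz need sampling rates well above 1000 Hz.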

Data Generation/Simulation

  • Artificial Neural Networks: These can be used to simulate the underlying neuronal activity and then a model of EEG signal generation can be used to convert this activity into EEG-like data.
  • Dipole Models: These simulate the electrical activity of the cerebral cortex as a distribution of current dipoles. These dipoles generate electric fields that propagate to the scalp, where they can be summed to simulate EEG signals.
  • Stochastic Processes: Random processes, possibly with specific statistical properties, can be used to simulate EEG data. This might include autoregressive models, or simple Gaussian noise.
  • Physiologically-Based Models: These use equations derived from the biophysics of neurons and brain tissue to simulate the generation of EEG signals. An example of such a model is the neural mass model.
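
Of the options above, the stochastic-process route is the easiest to sketch. The example below generates a first-order autoregressive (AR(1)) signal, where each sample is a scaled copy of the previous one plus Gaussian noise. The coefficient, the noise scale, and the function names are arbitrary illustrative choices, not values fitted to real EEG.

```python
import random


def simulate_ar1(n_samples, phi=0.95, noise_sd=1.0, seed=0):
    """AR(1) process: x[t] = phi * x[t-1] + Gaussian noise, starting from 0."""
    rng = random.Random(seed)
    x = [0.0] * n_samples
    for t in range(1, n_samples):
        x[t] = phi * x[t - 1] + rng.gauss(0.0, noise_sd)
    return x


def lag1_autocorrelation(x):
    """Sample autocorrelation at lag 1, which should be close to phi for AR(1)."""
    mean = sum(x) / len(x)
    num = sum((a - mean) * (b - mean) for a, b in zip(x[:-1], x[1:]))
    den = sum((a - mean) ** 2 for a in x)
    return num / den


smooth = simulate_ar1(20000, phi=0.95)  # slowly drifting, EEG-like background
white = simulate_ar1(20000, phi=0.0)    # phi = 0 reduces to plain Gaussian noise

print(round(lag1_autocorrelation(smooth), 2))
print(round(lag1_autocorrelation(white), 2))
```

With phi = 0 this reduces to the "simple Gaussian noise" case, and values of phi closer to 1 concentrate more power at low frequencies.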

Data Generation plans (from simplest to most complex):

  • Approach 1. Use simple noisy sine waves
  • Approach 2. Use a power law noise process to simulate slightly more realistic EEG data
    • In the context of EEG data generation, the power law (brown or pink) noise process is often used as a model for the background activity of the brain. It captures the fractal-like properties observed in the brain's electrical activity, where fluctuations at different timescales exhibit similar statistical properties. Currently using neurodsp.sim.sim_powerlaw but I could have also used something like this code from colorednoise.
  • Approach 2.5 👉 (current) Add blink artifacts and channel correlations to power law process
  • Approach 3. (probably won't do as it requires computed forward solution) Use MNE tools to create a complex model with ERPs, oscillatory activity, artifacts, and spatial correlations
    • Simulate ERPs by creating a waveform and adding it at specific time points. Introduce oscillatory activity at specific frequencies (e.g., theta, alpha, beta, gamma bands) by adding band-limited sinusoids, and check the result with mne.time_frequency.tfr_morlet (which computes a time-frequency representation rather than generating signals). Add common EEG artifacts such as EOG and ECG with mne.simulation.add_eog and mne.simulation.add_ecg, and simulate EMG as high-frequency noise. Incorporate spatial correlations by creating a covariance matrix that models the correlations between channels, and use this matrix to generate multivariate Gaussian noise.
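
A rough, standard-library-only sketch in the spirit of Approach 2.5 follows. It is not the neurodsp-based code referred to above: the power-law background is swapped for brown noise built as a cumulative sum of white noise (a 1/f^2 process), channel correlation comes from mixing one shared component into every channel, and blinks are Gaussian-shaped bumps added to the first few ("frontal") channels. All names, amplitudes, and weights are hypothetical.

```python
import math
import random


def brown_noise(n_samples, rng):
    """Random walk (cumulative sum of white noise); power falls off as 1/f^2."""
    out, level = [], 0.0
    for _ in range(n_samples):
        level += rng.gauss(0.0, 1.0)
        out.append(level)
    return out


def standardize(x):
    """Shift to zero mean and scale to unit standard deviation."""
    mean = sum(x) / len(x)
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    return [(v - mean) / sd for v in x]


def blink(n_samples, fs, onset_s, amplitude_uv=150.0, width_s=0.1):
    """Gaussian-shaped deflection centred on onset_s; width_s is its std dev."""
    center, width = onset_s * fs, width_s * fs
    return [amplitude_uv * math.exp(-0.5 * ((i - center) / width) ** 2)
            for i in range(n_samples)]


def simulate_eeg(n_channels=4, fs=250, duration_s=10, shared_weight=0.6,
                 blink_onsets_s=(3.0, 7.0), n_frontal=2, scale_uv=20.0, seed=0):
    """Return a list of channels, each a list of samples in µV."""
    rng = random.Random(seed)
    n = int(fs * duration_s)
    shared = standardize(brown_noise(n, rng))  # component common to all channels
    blinks = [0.0] * n
    for onset in blink_onsets_s:
        blinks = [b + v for b, v in zip(blinks, blink(n, fs, onset))]
    channels = []
    for ch in range(n_channels):
        own = standardize(brown_noise(n, rng))  # channel-specific component
        trace = [scale_uv * (shared_weight * s + (1 - shared_weight) * o)
                 for s, o in zip(shared, own)]
        if ch < n_frontal:  # only "frontal" channels pick up the blink
            trace = [t + b for t, b in zip(trace, blinks)]
        channels.append(trace)
    return channels


data = simulate_eeg()
print(len(data), len(data[0]))  # 4 2500
```

Raising shared_weight makes the channels more alike, and passing an empty blink_onsets_s gives the artifact-free background for comparison.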

Lists/Sources of real data

Specific real datasets of interest

  • OpenMIIR: EEG data for music information retrieval.
  • Seizure Data: EEG data from subjects with seizures.
    • Found through this paper. The data has a high (2 kHz) sampling rate, 52 channels, 30 participants, 3-hour sessions, and sleep stage and candidate seizure annotations. A sampling rate of at least 2 kHz is important for detecting high-frequency oscillations (HFOs). There is also code for HFO detection.
  • CHB-MIT Seizure Data: These files have only 23 EEG channels and are split into very short recordings. They seem suitable for testing small and medium (when combining runs) data-size workflows.

Software

Simulation and related utilities

  • neurodsp
  • Brainstorm
  • MNE includes sophisticated functionality for simulating biophysically realistic EEG data
  • pyedflib: Write a NumPy array to an EDF file

Common Analysis Packages (Especially Python)

  • MNE: Python-based software for M/EEG data processing.
  • EEGLAB: An interactive Matlab toolbox for processing continuous and event-related EEG, MEG and other electrophysiological data.
  • Brainstorm: An open-source application dedicated to the analysis of brain recordings.
  • FieldTrip: Matlab software toolbox for MEG and EEG analysis.
  • PyEEG: A Python module for EEG feature extraction.
  • Intheon NeuroPype: Commercial (but apparently free for academics?) EEG viz and analysis software.

Common Visualization Solutions (Especially Python)

  • MNE: Includes capabilities for visualizing EEG data, topographic maps, etc.
    • Clemens has a nice walkthrough of EEG viz here
    • Plenty of other tutorials on the mne.tools website
  • EEGLAB: Provides extensive graphical capabilities.
  • Matplotlib and Seaborn
  • Plotly
  • Bokeh

EEG browsers 🖥️🧠🔍

  • Clemens' overview about EEG browsers, required features, data formats, etc
    • SigViewer (Windows/macOS/Linux): Written in C++/Qt. Supports multiple file formats. Smooth scrolling. Currently requires that all of the data is loaded into memory.
    • MNE (Windows/macOS/Linux): Written in Python. Based on Matplotlib. Supports multiple file formats. Chunky scrolling (page-based). Should also work when data has not been loaded completely into memory.
    • MNE Qt Browser (Windows/macOS/Linux): Written in Python, based on PyQtGraph. Works as an alternative browser backend in MNE, which means it supports the same file formats. Relatively smooth scrolling and zooming (but this depends on the platform and settings such as OpenGL). Should also work when data has not been loaded completely into memory.
    • EDFbrowser (Windows/macOS/Linux): Written in C++/Qt. Supports only EDF files. Relatively smooth scrolling. Not officially supported on macOS.

Processing and Analysis

  • Data Cleaning: EEG data is often noisy and needs to be cleaned. Techniques include filtering to isolate the frequency bands of interest and remove noise, artifact rejection to remove non-neural signals, and Independent Component Analysis (ICA) to statistically separate sources of signal.

  • Time-Frequency Analysis: To explore changes in power across different frequencies over time

  • Source Localization: Analysis techniques that attempt to infer the original neural sources that produced the EEG signal observed at the scalp.

  • Statistical Analysis: differences between conditions or groups, correlations, predictive modeling, etc.

  • Machine Learning (extra): Increasingly used for tasks such as artifact rejection, feature extraction, and pattern classification.
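
As one concrete instance of the artifact-rejection step under Data Cleaning, the sketch below cuts a trace into fixed-length epochs and drops any epoch whose peak-to-peak amplitude exceeds a threshold. The 200 µV threshold (matching the typical ±100 µV signal range), the function names, and the toy trace are illustrative assumptions; real pipelines usually combine this with filtering and ICA.

```python
def split_into_epochs(trace, epoch_len):
    """Cut a trace into consecutive, non-overlapping epochs, dropping any remainder."""
    n_epochs = len(trace) // epoch_len
    return [trace[i * epoch_len:(i + 1) * epoch_len] for i in range(n_epochs)]


def peak_to_peak(epoch):
    """Difference between the largest and smallest sample in the epoch."""
    return max(epoch) - min(epoch)


def reject_artifacts(epochs, threshold_uv=200.0):
    """Return (kept, rejected) epoch indices based on peak-to-peak amplitude."""
    kept, rejected = [], []
    for index, epoch in enumerate(epochs):
        if peak_to_peak(epoch) > threshold_uv:
            rejected.append(index)
        else:
            kept.append(index)
    return kept, rejected


# Toy trace in µV: a small alternating signal with one blink-like spike.
trace = [10.0 if i % 2 else -10.0 for i in range(40)]
trace[25] = 300.0

epochs = split_into_epochs(trace, epoch_len=10)
kept, rejected = reject_artifacts(epochs)
print(kept, rejected)  # [0, 1, 3] [2]
```

Only the epoch containing the spike is rejected; the others have a peak-to-peak amplitude of 20 µV and pass.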