Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bad segment treatment for TDE-PCA #298

Open
Shrecki opened this issue Nov 5, 2024 · 2 comments
Open

Bad segment treatment for TDE-PCA #298

Shrecki opened this issue Nov 5, 2024 · 2 comments

Comments

@Shrecki
Copy link

Shrecki commented Nov 5, 2024

Description of the problem

When loading a .fif file with BAD segments annotated, it is possible to discard bad segments, using reject_by_annotations='omit'.

Under the hood, it will call mne.Raw.get_data(reject_by_annotations='omit'), yielding a numpy array where all data points annotated with "bad" are dropped.

It seems as though this behaviour might be wrong for TDE. Conceptually, timepoints [s_i, s_i+1, BAD,BAD,BAD, s_i+5,s_i+6] will become [s_i, s_i+1, s_i+5, s_i+6]. The time embedding will consider s_i+1 and s_i+5 to be directly adjacent time-wise.

Steps to reproduce

python
import osl_dynamics
from osl_dynamics.data import Data
import numpy as np
import mne

X = np.arange(0,100)*1.0
X[10:15] = np.nan

info = mne.create_info(['ch_1'],1, ch_types='bio')
raw = mne.io.RawArray(X.reshape((1,100)), info)

annots = mne.preprocessing.annotate_nan(raw)
raw.set_annotations(annots)
raw.save('example_raw.fif')

data = Data(['example_raw.fif'], reject_by_annotation='omit')

data.tde(n_embeddings=3)

print(data.arrays[0])

Expected result

We should obtain one TDE for indices [0, 9] and one for [15, 99], since they no longer have any temporal dependency.

Actual result

Segments are considered contiguous:

array([[ 2.,  1.,  0.],
       [ 3.,  2.,  1.],
       [ 4.,  3.,  2.],
       [ 5.,  4.,  3.],
       [ 6.,  5.,  4.],
       [ 7.,  6.,  5.],
       [ 8.,  7.,  6.],
       [ 9.,  8.,  7.],
       [15.,  9.,  8.],
       [16., 15.,  9.],
       [17., 16., 15.],
       [18., 17., 16.],
       [19., 18., 17.],
       [20., 19., 18.],
       [21., 20., 19.],
       [22., 21., 20.],
       [23., 22., 21.],
       [24., 23., 22.],
       [25., 24., 23.],
       [26., 25., 24.],
       [27., 26., 25.],
       [28., 27., 26.],
       [29., 28., 27.],
       [30., 29., 28.],
       [31., 30., 29.],
       [32., 31., 30.],
       [33., 32., 31.],
       [34., 33., 32.],
       [35., 34., 33.],
       [36., 35., 34.],
       [37., 36., 35.],
       [38., 37., 36.],
       [39., 38., 37.],
       [40., 39., 38.],
       [41., 40., 39.],
       [42., 41., 40.],
       [43., 42., 41.],
       [44., 43., 42.],
       [45., 44., 43.],
       [46., 45., 44.],
       [47., 46., 45.],
       [48., 47., 46.],
       [49., 48., 47.],
       [50., 49., 48.],
       [51., 50., 49.],
       [52., 51., 50.],
       [53., 52., 51.],
       [54., 53., 52.],
       [55., 54., 53.],
       [56., 55., 54.],
       [57., 56., 55.],
       [58., 57., 56.],
       [59., 58., 57.],
       [60., 59., 58.],
       [61., 60., 59.],
       [62., 61., 60.],
       [63., 62., 61.],
       [64., 63., 62.],
       [65., 64., 63.],
       [66., 65., 64.],
       [67., 66., 65.],
       [68., 67., 66.],
       [69., 68., 67.],
       [70., 69., 68.],
       [71., 70., 69.],
       [72., 71., 70.],
       [73., 72., 71.],
       [74., 73., 72.],
       [75., 74., 73.],
       [76., 75., 74.],
       [77., 76., 75.],
       [78., 77., 76.],
       [79., 78., 77.],
       [80., 79., 78.],
       [81., 80., 79.],
       [82., 81., 80.],
       [83., 82., 81.],
       [84., 83., 82.],
       [85., 84., 83.],
       [86., 85., 84.],
       [87., 86., 85.],
       [88., 87., 86.],
       [89., 88., 87.],
       [90., 89., 88.],
       [91., 90., 89.],
       [92., 91., 90.],
       [93., 92., 91.],
       [94., 93., 92.],
       [95., 94., 93.],
       [96., 95., 94.],
       [97., 96., 95.],
       [98., 97., 96.],
       [99., 98., 97.]], dtype=float32)

I was wondering if it was by design or is, indeed, a bug?

@cgohil8
Copy link
Collaborator

cgohil8 commented Nov 10, 2024

Your understanding is correct. In the previous Matlab implementation with HMM-MAR you would pass the indices where the session is discontinuous and it would take this into account in the time-delay embedding.

In the new python package, we found this made no difference in practice so to simplify the implementation, we just assume it's all continuous. Note, it would be straightforward to write a separate script loading the fif files with MNE and saving the continuous segments as numpy files which you can then pass into the Data object if you are concerned about the discontinuities.

Please close the issue if this answers your question.

@scho97
Copy link
Contributor

scho97 commented Dec 22, 2024

@Shrecki Could you close this issue if the comment above answered your question? Thanks :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants