Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GSOC] Add EpochsTFR support to spectral connectivity functions #232

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

tsbinns
Copy link
Collaborator

@tsbinns tsbinns commented Sep 6, 2024

WIP follow up of #225 to finalise GSoC work.

So far adds support for coefficients from Epochs.compute_tfr(method="morlet", output="complex") to spectral_connectivity_epochs(), equivalent to the cwt_morlet mode. Still to be done is adding support for EpochsTFR objects in spectral_connectivity_time().

In addition to the Morlet approach, spectral_connectivity_time() supports the time-freq. multitaper mode. #126 could also be addressed by adding support for this to spectral_connectivity_epochs(), in the form of EpochsTFR objects.

However, when trying this I discovered a bug that prevents the time-freq. multitaper mode being computed from epoched data (mne-tools/mne-python#12831), but that seems like an easy fix.

Will continue to work on this next week.

@tsbinns
Copy link
Collaborator Author

tsbinns commented Sep 10, 2024

Have now added support for EpochsTFR to spectral_connectivity_time (using a custom MNE branch with the bug in #12831 fixed).

Comment on lines +444 to +455
# Warn if mt_bandwidth & n_cycles default-valued (can't be read from TFR)
if mode == "multitaper": # doesn't matter for cwt_morlet
if mt_bandwidth is None:
warn(
"`mt_bandwidth` is not specified; assuming 4.0 Hz was used to "
"compute the TFR"
)
if n_cycles == 7.0:
warn(
"`n_cycles` is the default value; assuming 7.0 was used to "
"compute the TFR"
)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EpochsTFR objects computed with the multitaper method & complex output have a taper dimension which needs to be aggregated across, however the weights for these tapers are not stored in the EpochsTFR object and need to be re-computed within spectral_connectivity_time().

For these weights to be accurate, we need to know the bandwidth and number of cycles used to compute the original TFR. However, these are also not stored in the EpochsTFR object, so the user would need to pass this information in to the mt_bandwidth and n_cycles params.

I have noted this in the docstring, but I thought it would also be useful that if an EpochsTFR object with multitaper method is passed in and the default parameters are used (mt_bandwidth=None and n_cycles=7.0), then a warning is raised. Only annoying thing is that the n_cycles warnign is unavoidable if 7 cycles were actually used to compute the TFR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should add the mt_bandwidth and n_cycles attributes to the *TFR objects?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have. I'm happy to open a PR if people think it's justified.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Provenance is usually nice to track and if it's easy enough to add to the class then we might as well. Might be worth a quick MNE issue as we might want them as attributes like tfr.n_cycles but maybe better would be something like a dict of params like tfr.estimation_parameters = {'mt_bandwidth': 4, 'n_cycles': 5} so that the tfr namespace itself doesn't get too busy.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will open an issue first then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe a naive question, but if what you need is the taper weights, why not store those directly rather than storing the other estimation params necessary to recompute them? It's not a perfect solution because TFRs computed from earlier versions of MNE and then saved to disk may still lack the necessary information... but the same is true of storing the estimation parameters as attributes (instead of the weights).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copying from the discussion in mne-tools/mne-python#12851 (comment) for reference:

drammock: Since we already store weights for spectrum objects, I'm inclined to store weights (not estimation params) for TFRs too. That saves having to recompute them when users pass MNE objects (not arrays) to the connectivity functions, and is more internally consistent.

tsbinns: Sure, I see your reasoning, especially to keep it consistent with *Spectrum objects. I'll open a PR for this.

@tsbinns
Copy link
Collaborator Author

tsbinns commented Sep 17, 2024

Resetting tests to main mne branch now that mne-tools/mne-python#12842 is merged

mne_connectivity/spectral/epochs.py Outdated Show resolved Hide resolved
mne_connectivity/spectral/epochs.py Outdated Show resolved Hide resolved
mne_connectivity/spectral/epochs.py Outdated Show resolved Hide resolved
mne_connectivity/spectral/epochs.py Outdated Show resolved Hide resolved
mne_connectivity/spectral/epochs.py Outdated Show resolved Hide resolved
mne_connectivity/spectral/time.py Outdated Show resolved Hide resolved
Comment on lines +170 to +172
:func:`mne.time_frequency.tfr_array_morlet` documentation. If ``data`` is an
:class:`~mne.time_frequency.EpochsTFR` object computed with the ``multitaper``
method, this should match the value used to compute the TFR.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

statements like this worry me, because they mean that it's easy for people to make mistakes.

For things that are ignored if you pass an EpochsTFR object, then (assuming the default param value is None or some other null/auto value) we at least have the option of checking whether non-default input to this function matches what we extract/infer from the object, and raising warnings/errors if they don't match. But for params that have int/float default values, or for quantities/settings that were used to create the TFR object but aren't stored with it, we can't even do that sanity check.

I don't want to throw a big monkey wrench into this PR, but an API like this makes me think that there should be separate functions for operating on MNE Objects and on arrays, and the shared code path should be a helper func rather than the user-facing function(s). Another possibility is 2 public-facing funcs and no shared helper; one of the public funcs would then call the other after suitable transformation of its inputs. This could mean internally extracting arrays from objects and always operating on arrays, or it could mean internally promoting arrays to objects and always operating on objects.

Without looking closely I'm not sure how much work that would be / how (in)consistent it is with the rest of the mne_connectivity API. I also apologize if this was discussed in advance and I missed/forgot about that discussion... I've been rather underwater for the last several months with other responsibilities. But I'd like to at least check with you and @larsoner (and maybe @adam2392 if he has time) before this becomes something that needs a deprecation cycle to change.

(side note re: the discussion elsewhere in this PR between @tsbinns and @larsoner about storing estimation params like n_cycles as part of the TFR object --- that helps prevent some mistakes, but I still have the feeling like this function is trying to do too much.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

statements like this worry me, because they mean that it's easy for people to make mistakes.

I 100% agree. With storing the params (or the weights directly like discussed in mne-tools/mne-python#12851), then this statement could be removed and prevent possible mistakes.

but I still have the feeling like this function is trying to do too much.

But yeah, I also understand.

an API like this makes me think that there should be separate functions for operating on MNE Objects and on arrays

Just to clarify, this would mean something like:

  • spec_conn_epochs/time handling Epochs and EpochsSpectrum/TFR objects
  • and a spec_conn_epochs/time_array function handling array of timeseries data or coefficients

If there were separate functions, would it not be simpler to maintain spec_conn_epochs/time as handling timeseries data (either as Epochs or arrays), and introduce equivalent functions for handling spectral coefficients (either as EpochsSpectrum, EpochsTFR, or arrays)? I feel like this would be more compatible with the suggestion below:

Another possibility is 2 public-facing funcs and no shared helper; one of the public funcs would then call the other after suitable transformation of its inputs. This could mean internally extracting arrays from objects and always operating on arrays, or it could mean internally promoting arrays to objects and always operating on objects.

However for the second sentence, I think again it's less a case of distinguishing data in arrays vs. MNE objects, and more distinguishing timeseries vs. coefficients.

Without looking closely I'm not sure how much work that would be / how (in)consistent it is with the rest of the mne_connectivity API.

I imagine the most important thing is for whatever is changed in spec_conn_time, that should be mirrored in spec_conn_epochs and vice versa (so things like #220 would need to be looked at again). I am perfectly happy to work on this (but I also understand that means more work for reviewers).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think again it's less a case of distinguishing data in arrays vs. MNE objects, and more distinguishing timeseries vs. coefficients.

yeah you're right I was confusing the issue. Just to make sure we're on the same page: am I right that the "flavors" of spectral connectivity you want to support are:

  • input: non-spectral (ordinary time series)
    • across: epochs (spec_conn_epochs; objects or arrays)
    • across: times (spec_conn_time; objects or arrays)
  • input: spectral
    • across: epochs (currently spec_conn_epochs; maybe a new func?)
      domain)
    • across: time (not supported; no time dimension)
  • input: spectrotemporal
    • across: epochs (not supported?)
    • across: times (this PR: spec_conn_time; maybe a new func? spec_conn_tfr_time?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

input: non-spectral (ordinary time series)

  • across: epochs (spec_conn_epochs; objects or arrays)
  • across: times (spec_conn_time; objects or arrays)

Yes.

input: spectral

  • across: epochs (currently spec_conn_epochs; maybe a new func?)
    domain)
  • across: time (not supported; no time dimension)

Yes.

input: spectrotemporal

  • across: epochs (not supported?)
  • across: times (this PR: spec_conn_time; maybe a new func? spec_conn_tfr_time?)

This PR was also adding support for spectrotemporal input for spec_conn_epochs to match the cwt_morlet mode.

mne_connectivity/spectral/time.py Outdated Show resolved Hide resolved
mne_connectivity/spectral/time.py Outdated Show resolved Hide resolved
Comment on lines +444 to +455
# Warn if mt_bandwidth & n_cycles default-valued (can't be read from TFR)
if mode == "multitaper": # doesn't matter for cwt_morlet
if mt_bandwidth is None:
warn(
"`mt_bandwidth` is not specified; assuming 4.0 Hz was used to "
"compute the TFR"
)
if n_cycles == 7.0:
warn(
"`n_cycles` is the default value; assuming 7.0 was used to "
"compute the TFR"
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe a naive question, but if what you need is the taper weights, why not store those directly rather than storing the other estimation params necessary to recompute them? It's not a perfect solution because TFRs computed from earlier versions of MNE and then saved to disk may still lack the necessary information... but the same is true of storing the estimation parameters as attributes (instead of the weights).

@tsbinns
Copy link
Collaborator Author

tsbinns commented Sep 19, 2024

Thanks for the detailed review @drammock! Until those wider API questions are addressed I'll double check the unit tests and try to find where the deviation in Fourier/Welch pipelines.

@tsbinns
Copy link
Collaborator Author

tsbinns commented Sep 30, 2024

Conda test keeps freezing on the micromamba setup step, seems similar to this: mamba-org/setup-micromamba#225
Is it worth trying to fix now?

@larsoner
Copy link
Member

Yeah feel free! Sounds like there might be a workaround via pinning in some of those issue comments

@tsbinns
Copy link
Collaborator Author

tsbinns commented Dec 9, 2024

Haven't forgotten about this, just need to sort mne-tools/mne-python#12910 before I can finalise the changes and tests here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants