Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filtering #329

Merged
merged 48 commits into from
Sep 12, 2024
Merged

Filtering #329

merged 48 commits into from
Sep 12, 2024

Conversation

BalzaniEdoardo
Copy link
Collaborator

I am starting off this as a draft PR since @gviejo and @BalzaniEdoardo are working together on it.

The PR implements a wrapper scipy Butterworth filter, applicable to any BaseTsd object, correctly handling epoch boundaries.

Copy link

codecov bot commented Aug 15, 2024

Codecov Report

Attention: Patch coverage is 93.33333% with 8 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
pynapple/process/filtering.py 94.59% 2 Missing and 4 partials ⚠️
pynapple/core/time_series.py 50.00% 0 Missing and 1 partial ⚠️
pynapple/process/signal_processing.py 80.00% 0 Missing and 1 partial ⚠️
Files with missing lines Coverage Δ
pynapple/core/base_class.py 99.04% <100.00%> (ø)
pynapple/core/interval_set.py 95.68% <ø> (-0.72%) ⬇️
pynapple/process/__init__.py 100.00% <100.00%> (ø)
pynapple/core/time_series.py 90.74% <50.00%> (+0.03%) ⬆️
pynapple/process/signal_processing.py 98.63% <80.00%> (ø)
pynapple/process/filtering.py 94.59% <94.59%> (ø)

... and 2 files with indirect coverage changes

.github/workflows/main.yml Show resolved Hide resolved
pynapple/process/filtering.py Outdated Show resolved Hide resolved
pynapple/process/filtering.py Outdated Show resolved Hide resolved
@BalzaniEdoardo BalzaniEdoardo marked this pull request as ready for review August 16, 2024 07:38
Copy link
Collaborator Author

@BalzaniEdoardo BalzaniEdoardo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. I think this module would be super useful

docs/api_guide/tutorial_pynapple_filtering.py Outdated Show resolved Hide resolved
docs/api_guide/tutorial_pynapple_filtering.py Outdated Show resolved Hide resolved
docs/api_guide/tutorial_pynapple_filtering.py Outdated Show resolved Hide resolved
docs/api_guide/tutorial_pynapple_filtering.py Outdated Show resolved Hide resolved

# %%
# ***
# Performances
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Performances
# Performance

pynapple/process/filtering.py Outdated Show resolved Hide resolved
pynapple/process/filtering.py Show resolved Hide resolved
If `fs` is not float or None.
If `mode` is not "butter" or "sinc".
If `order` is not an int.
If "transition_bandwidth" is not a float.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as before, add "if any nan are present.."

If `fs` is not float or None.
If `mode` is not "butter" or "sinc".
If `order` is not an int.
If "transition_bandwidth" is not a float.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as before, add "if any nan are present.."

If `fs` is not float or None.
If `mode` is not "butter" or "sinc".
If `order` is not an int.
If "transition_bandwidth" is not a float.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as the other funcs

Copy link

@billbrod billbrod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm providing feedback on the new filtering tutorial, as well as the user interface.

User interface:

  • it's a bit confusing to me that the functions are called compute_*_filter. I'm assuming it's because you have a lot of compute_* functions and want to be consistent, but you're not computing the filter, you're applying it to the signal. So calling it apply_*_filter or similar would make more sense to me.
  • what's the use case for user setting fs themselves? it makes more sense to just grab that from the signal, since the user setting it themselves seems likely to lead to mistakes.
  • As we've discussed for nemos, I don't love having a mode kwarg and then additional kwargs that are only used for one of the modes (transition_bandwidth and order here). I think an exception should be raised when they're set and shouldn't be, since that probably means a user has misunderstood something.
  • are there allowable ranges for order and transition_bandwidth? Values that are so large and so small that they don't make sense? my guess is yes, and that at least a warning (if not an error) should be raised in those cases. (with a really large order, for example, you get overflow issues)
  • transition_bandwidth has to be a float, but isn't parsing an int to a float reasonable here as well?
  • The errors raised when cutoff is the wrong length is confusing. I think in the high-level function (e.g., compute_highpass_filter, not _compute_filter), the number of values of cutoff (1 or 2) should be explicitly checked and an informative ValueError raised. E.g., highpass filters require a single cutoff value, but {len(cutoff)} values provided instead!. (Among other things, the error message refers to Wn, and the user has no idea what that variable is)
  • the docstrings should provide a basic description of what highpass/bandpass/lowpass/bandstop means. A one sentence description like, "preserves frequencies above the cutoff, discards those below".
  • docstrings are not rendered correctly by mkdocs. there needs to be an extra newline between the first line and the one that starts Mode can be :

Tutorial:

  • Could be nice to have some high-level description here: why is filtering used in neuroscience, with link to examples (including another of pynapple's tutorials!). And when would one use bandpass/lowpass/etc
  • The question of how should one choose sinc vs. butter isn't addressed. they do give different outputs: is that important? or is the choice largely about efficiency?
  • Same with how to choose the cutoff, order, and transition_bandwidth parameters. cutoff depends on the user's scientific question (but that should be said explicitly), and guidance on the others would be nice.
  • when plotting the results of the bandpass filter, both filters reduce the amplitude of the signal in the beginning. why is that? and why not show the full 2-second signal?
  • also the x-axis on that plot is labelled Time (Hz), it should be Time (s), right?
  • The frequency response plots are confusing -- took me a while to realize the top and bottom rows are the same plots, just with linear and log scales on the y-axis. it's also not clear what users should be taking from this plot.
    • To make it more useful: could show these filters applied to the signal (both the time and frequency domain representations) and talk through why you might want sharper / less sharp of a transition.
    • Is there a principled way to choose values for order / transition bandwidth, or is it empirical? you give some defaults, do those work in most cases or should people always experiment with them?
  • The reference to "kernel is too short" is confusing, since the user never sees the actual kernel. I would mention that in an aside (footnote or admonition) and point out what effect that has in the data (reduction of signal amplitude), and how the user should fix it (reduce transition bandwidth).
    • This should be its own section, it's kind of "troubleshooting"
  • The performance code block is really long and not actually interesting to the user. it would be nice to hide this. what we ended up doing in nemos was moving the utility code to a hidden module. could do that or, since users probably won't run this themselves, could just move it to a utility script in the docs folder.
  • that performance section should also interpret the output. why are there two plots here? what are we supposed to take from this?
  • is it worth removing the noise from the example signal? or at least, do it with and without? real data always has noise, but it would make interpreting the bandstop filter easier if the noise wasn't present. the noise also means the true 10 Hz component and the outputs of the filters are different, which might confuse people. should at least explain this.

@gviejo gviejo merged commit 526fb96 into dev Sep 12, 2024
12 checks passed
@gviejo gviejo deleted the filtering branch September 12, 2024 21:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants