
ENH: Improve convolution performance for Sparse variables #411

Merged (18 commits) Mar 7, 2019

Conversation

adelavega (Collaborator)

Fixes #354; related to #356.

Profiling indicates that the slow function is actually np.convolve itself. It calls np.correlate, whose run time grows super-linearly with the length of the variable.
[profiling call graph: output.png]

(see: numpy/numpy#1858)

The good news is that by not densifying prior to convolution, things go much faster.
For example, a predictor with ~850 events takes about 15 ms to convolve as sparse, but 1.03 s when upsampled to 10 Hz; 50 Hz takes that up to about a minute (I did the profiling on that).
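
As a rough illustration of that scaling (a minimal sketch, not from the original profiling; numbers will vary by machine):

    import timeit

    import numpy as np

    kernel = np.ones(160)  # stand-in for an upsampled HRF kernel
    for n in (1_000, 10_000, 100_000):
        sig = np.random.rand(n)
        t = timeit.timeit(lambda: np.convolve(sig, kernel), number=5) / 5
        print(f"n={n:>7}: {t * 1e3:8.1f} ms")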

The question that remains is how to downsample at the end. As it is, it will use the original onsets as the frame_times; that is, it will resample only at those onsets. Does that make sense? Or would uniform resampling at the TR (or some factor above that) be better? Maybe we could even do 10 Hz resampling, although presumably this should be the final step in Transformations, and the TR should be OK.

@adelavega changed the title from "Do not convert to Dense if variable is Sparse" to "[WIP] HRF: Do not convert to Dense if variable is Sparse" on Mar 2, 2019
@yarikoptic (Collaborator) left a comment


Just had an urge to whine, I guess ;-)
Will check in more detail when home with a laptop.

            source=var.source, sampling_rate=var.sampling_rate)
    if isinstance(var, SparseRunVariable):
        return SparseRunVariable(
            name=var.name, values=convolved[0], onset=onsets,
Collaborator

Unrelated to the PR, but unchecked assumptions such as [0] are a recipe for trouble or a cryptic error. I believe we addressed some of those already, but it would be nice if new code came with checks and proper errors, or at least assert statements.

@adelavega (Collaborator, Author) Mar 2, 2019

You mean convolved[0]? I blame someone else! Maybe it could return a dictionary instead?

@yarikoptic (Collaborator)

Ah, and how do you get such a neat profile figure? Please teach me.


adelavega commented Mar 2, 2019

> Ah, and how do you get such a neat profile figure? Please teach me.

cProfile.run('hrf.compute_regressor(vals, model, onsets)', 'restats')

gprof2dot -f pstats restats | dot -Tpng -o output.png
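
(This assumes gprof2dot is installed, e.g. via pip install gprof2dot, along with a system Graphviz install to provide the dot command.)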


codecov bot commented Mar 2, 2019

Codecov Report

Merging #411 into master will decrease coverage by 0.05%.
The diff coverage is 33.33%.


@@            Coverage Diff             @@
##           master     #411      +/-   ##
==========================================
- Coverage    62.3%   62.25%   -0.06%     
==========================================
  Files          27       27              
  Lines        4555     4554       -1     
  Branches     1173     1173              
==========================================
- Hits         2838     2835       -3     
- Misses       1433     1434       +1     
- Partials      284      285       +1
Flag         Coverage Δ
#unittests   62.25% <33.33%> (-0.06%) ⬇️

Impacted Files                              Coverage Δ
bids/analysis/transformations/compute.py    85.57% <33.33%> (-1.09%) ⬇️
bids/analysis/analysis.py                   88.29% <0%> (-0.49%) ⬇️

Last update b4bb3cd...ffdaa01.


codecov bot commented Mar 2, 2019

Codecov Report

Merging #411 into master will increase coverage by 0.08%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master     #411      +/-   ##
==========================================
+ Coverage    62.3%   62.39%   +0.08%     
==========================================
  Files          27       27              
  Lines        4555     4560       +5     
  Branches     1173     1173              
==========================================
+ Hits         2838     2845       +7     
+ Misses       1433     1432       -1     
+ Partials      284      283       -1
Flag         Coverage Δ
#unittests   62.39% <100%> (+0.08%) ⬆️

Impacted Files                              Coverage Δ
bids/analysis/transformations/compute.py    88.18% <100%> (+1.51%) ⬆️
bids/analysis/transformations/munge.py      91.81% <0%> (+0.58%) ⬆️

Last update b4bb3cd...46f638e.


adelavega commented Mar 2, 2019

A few more profiling/timing results. This is passing Sparse variables as Sparse to compute_regressor.

I timed how long it takes to run .setup() on a model with 16 regressors that need to be convolved, while varying the sampling_rate of the frame_times. This still returns a Dense variable for a Sparse input, but only densifies after convolution.

Since the frame_times are used (in conjunction with the oversampling factor) to upsample the HRF model and the events, their sampling rate is the main determinant of how long convolve will take. It scales linearly:

1 Hz: 1.2 s
2 Hz: 2.14 s
5 Hz: 10 s
10 Hz: 20 s

Now the question is whether it's reasonable to downsample to the TR (or a small factor above that), assuming Convolve is the final transformation. I'll post graphs of the actual convolved variables at different sampling rates soon.


adelavega commented Mar 2, 2019

Speech at 0.5, 1, 5, and 10 Hz:

[four plots of the convolved Speech regressor at increasing sampling rates]

and Brightness at 0.5, 1, 5, and 10 Hz:

[four plots of the convolved Brightness regressor at increasing sampling rates]

It doesn't seem to make much of a difference, so I'd say we downsample to the TR or 1-5 Hz by default. You do see a slight degradation at 0.5 Hz, and below that it looks very poor, e.g. 0.1 Hz for Speech:

[plot of the Speech regressor convolved at 0.1 Hz]

@adelavega (Collaborator, Author)

I think if we want to be smart about it, then for no information loss to occur, you'd want to set the combination of oversampling and frame_times frequency to match the minimum distance between events.

For example, in the case of Speech, the minimum distance between events is 0.028 s, whereas other events are sampled at 1 Hz.

I tested out an adaptive oversampling, setting it to 1 / (minimum distance between events * sampling_rate of the frame_times), rounded up (sketched below). In my testing this seemed to work quite well, even when returning events at 10 Hz (which is what the previous behavior was).

This also took only about 1 second to run analysis.setup() on a run with 17 convolved variables.
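
A minimal sketch of that rule (illustrative helper name; it mirrors the diff excerpted later in this thread):

    import numpy as np

    def adaptive_oversampling(onsets, durations, sampling_rate):
        """Oversample just enough that the effective grid resolves the
        shortest gap between events (or the shortest event duration)."""
        min_interval = min(np.ediff1d(np.sort(onsets)).min(), durations.min())
        return int(np.ceil(1 / (min_interval * sampling_rate)))

    # E.g. speech-like events 0.028 s apart on a 1 Hz frame grid:
    adaptive_oversampling(np.array([0.0, 1.0, 1.028]),
                          np.array([0.028, 1.0, 1.0]), 1.0)  # -> 36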


tyarkoni commented Mar 4, 2019

Cool, glad that works. But we should probably use the minimum event duration rather than the distance between events (or the minimum of both). In the naturalistic context these will generally coincide (e.g., for uniformly sampled measurements, duration will generally match the distance to the next sample), but in many other contexts you can have widely spaced but very short events.

@effigies (Collaborator) left a comment


I think this makes sense... I definitely think scaling the oversampling inversely with the actual sampling rate is sensible.

This is going to make refactoring #376 pretty awful, but that's my fault for not getting that merged first.

(Two review comments on bids/analysis/transformations/compute.py: outdated, resolved)
@effigies (Collaborator) left a comment


This looks almost ready. Is there a hold-up I'm not aware of being signaled by [WIP]?

(Three review comments on bids/analysis/transformations/compute.py: outdated, resolved)
@adelavega (Collaborator, Author)

Ready to merge as soon as tests pass. Oddly, I can't edit the title...

@effigies changed the title from "[WIP] HRF: Do not convert to Dense if variable is Sparse" to "ENH: Improve convolution performance for Sparse variables" on Mar 5, 2019

effigies commented Mar 5, 2019

Yeah, GitHub's UI seems to be glitchy. I was able to edit it in a tab I had open from earlier.


tyarkoni commented Mar 5, 2019

Reviewing this now.

@tyarkoni (Collaborator) left a comment


Two minor comments, otherwise LGTM.

@@ -49,11 +54,18 @@ def _transform(self, var, model='spm', derivative=False, dispersion=False,
elif model != 'fir':
raise ValueError("Model must be one of 'spm', 'glover', or 'fir'.")

    convolved = hrf.compute_regressor(vals, model, onsets,
                                      fir_delays=fir_delays, min_onset=0)
    min_interval = min(np.ediff1d(np.sort(var.onset)).min(),
Collaborator

Maybe add a comment here for posterity explaining min_interval; it will reduce the maintenance burden a year or two down the line.

                                      fir_delays=fir_delays, min_onset=0)
    min_interval = min(np.ediff1d(np.sort(var.onset)).min(),
                       var.duration.min())
    oversampling = np.ceil(1 / (min_interval * sampling_rate))
Collaborator

We should probably add a test to make sure that oversampling is computed properly from the input parameters (which might require mocking compute_regressor).
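
A minimal sketch of such a check, exercising the illustrative adaptive_oversampling helper sketched earlier (a real test would mock compute_regressor, as suggested, and assert on the oversampling argument it receives):

    def test_adaptive_oversampling():
        # 1 / (0.028 * 1.0) = 35.7..., so we expect ceil -> 36.
        assert adaptive_oversampling(np.array([0.0, 1.0, 1.028]),
                                     np.array([0.028, 1.0, 1.0]), 1.0) == 36
        # Widely spaced, long events on a 10 Hz grid need no oversampling.
        assert adaptive_oversampling(np.array([0.0, 5.0]),
                                     np.array([4.0, 4.0]), 10.0) == 1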


adelavega commented Mar 5, 2019

Okay @tyarkoni, I added tests with dense and sparse variables, and mocked compute_regressor to check the oversampling.

Two notes:

  • I compute min_interval from unique onset values only, since more than one event can share the same onset (at least in the test data), which would result in min_interval=0 (see the sketch below).
  • When densifying, rounding errors can lead to an oversampling of 2.0 when it was computed to be 1.0000001. I think that's fine, but it is a slight performance penalty. Also, for many low-frequency events (e.g. 'RT' in the test data), the collection's sampling_rate would have to be very low before oversampling goes above 1. I'm not 100% confident this is acceptable, but it seems to make sense.
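
A sketch of the unique-onset guard from the first note (illustrative; var stands for the variable being convolved, as in the diff above):

    # Duplicate onsets would make the minimum difference 0, so reduce to
    # unique values first (np.unique also sorts, so no extra sort is needed).
    onsets = np.unique(var.onset)
    min_interval = min(np.ediff1d(onsets).min(), var.duration.min())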

@adelavega (Collaborator, Author)

Ugh, go away, Python 2. I guess we need to add the mock library.


codecov-io commented Mar 5, 2019

Codecov Report

Merging #411 into master will increase coverage by 0.08%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master     #411      +/-   ##
==========================================
+ Coverage    62.3%   62.39%   +0.08%     
==========================================
  Files          27       27              
  Lines        4555     4560       +5     
  Branches     1173     1173              
==========================================
+ Hits         2838     2845       +7     
+ Misses       1433     1432       -1     
+ Partials      284      283       -1
Flag         Coverage Δ
#unittests   62.39% <100%> (+0.08%) ⬆️

Impacted Files                              Coverage Δ
bids/analysis/transformations/compute.py    88.18% <100%> (+1.51%) ⬆️
bids/analysis/transformations/munge.py      91.81% <0%> (+0.58%) ⬆️

Last update b4bb3cd...0e18ff7.


codecov-io commented Mar 5, 2019

Codecov Report

Merging #411 into master will increase coverage by 0.07%.
The diff coverage is 86.66%.


@@            Coverage Diff             @@
##           master     #411      +/-   ##
==========================================
+ Coverage    62.3%   62.37%   +0.07%     
==========================================
  Files          27       27              
  Lines        4555     4564       +9     
  Branches     1173     1174       +1     
==========================================
+ Hits         2838     2847       +9     
  Misses       1433     1433              
  Partials      284      284
Flag         Coverage Δ
#unittests   62.37% <86.66%> (+0.07%) ⬆️

Impacted Files                              Coverage Δ
bids/analysis/transformations/compute.py    86.84% <86.66%> (+0.17%) ⬆️
bids/analysis/transformations/munge.py      91.81% <0%> (+0.58%) ⬆️

Last update b4bb3cd...916c8a3.


tyarkoni commented Mar 5, 2019

> Also, for many low-frequency events (e.g. 'RT' in the test data), the collection's sampling_rate would have to be very low before oversampling goes above 1. I'm not 100% confident this is acceptable, but it seems to make sense.

It's worth experimenting with the convolution code to make sure small oversampling values don't do wonky things.

That issue aside, we should probably also always double the value you're currently using. The minimum of event durations and onset deltas seems like a reasonable approximation of the highest-frequency signal in the timeseries, but we want to make sure we're above the Nyquist rate (i.e., 2 * the highest frequency). This should make a big difference in many cases.
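
In terms of the rule above, that doubling would read (illustrative, reusing the names from the diff):

    # Double the factor so the effective rate sits at or above the Nyquist
    # rate, i.e. twice the highest frequency implied by min_interval.
    oversampling = int(np.ceil(2 / (min_interval * sampling_rate)))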

@adelavega (Collaborator, Author)

In my short look at it, it didn't seem to make a difference, since oversampling is essentially already achieved by requesting high-frequency frame_times, but it's worth testing in more detail and keeping an eye on it.

I'm fine with doubling the oversampling rate.

@adelavega (Collaborator, Author)

Turns out this line is (potentially) inaccurate given a float duration:

    resample_frames = np.linspace(0, dur, dur * sampling_rate, endpoint=False)

/usr/local/bin/ipython:2: DeprecationWarning: object of type <class 'float'> cannot be safely interpreted as an integer.

Or at least it's potentially inconsistent with how DenseRunVariable is created.

I think #361 deserves its own fix (throw an error if index and values don't match at __init__), but here I'll aim to have resample_frames produce the same n as the created index.
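
A minimal sketch of that fix, with illustrative values for dur and sampling_rate (the same form appears in the review excerpt below):

    import math

    import numpy as np

    dur, sampling_rate = 2.5, 10.0  # illustrative values

    # Cast the sample count to an explicit int so np.linspace is not handed
    # a float count (the source of the DeprecationWarning above).
    n_frames = int(math.ceil(dur * sampling_rate))
    resample_frames = np.linspace(0, dur, n_frames, endpoint=False)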

@adelavega (Collaborator, Author)

Anybody want to give this a final review? If not, I will merge soon, as it's already been reviewed and seems to be working well for me.

@tyarkoni (Collaborator) left a comment


LGTM

    sampling_rate = self.collection.sampling_rate
    dur = var.get_duration()
    resample_frames = np.linspace(
        0, dur, int(math.ceil(dur * sampling_rate)), endpoint=False)
Collaborator

I'm wondering if we should center the sampled frames within the timeseries... e.g., suppose we have a 2-second dense timeseries, and we want to downsample to 2 Hz. Currently we would sample at 0, 0.5, 1, and 1.5. Probably we should do 0.25, 0.75, 1.25, and 1.75. Let's not change anything here, because if we were going to do this, we'd need to do it throughout the codebase for consistency. Mostly just making a mental note.
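
A tiny sketch of that centered alternative (illustrative only; per the comment above, the code was left unchanged):

    import math

    import numpy as np

    dur, sampling_rate = 2.0, 2.0  # the 2-second, 2 Hz example above
    n = int(math.ceil(dur * sampling_rate))

    current = np.linspace(0, dur, n, endpoint=False)   # [0.   0.5  1.   1.5 ]
    centered = current + 1 / (2 * sampling_rate)       # [0.25 0.75 1.25 1.75]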
