
Conversation

j-friedrich

Decimates images to reduce data size (and imaging rate).
Sequentially averages N frames together, instead of merely taking every N-th frame, thus better preserving SNR. This corresponds to running-mean filtering with window length N followed by subsampling by N.
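
As a minimal NumPy sketch of the difference between averaging and plain subsampling (not the PR's implementation; frames is a hypothetical array with time as the first axis):

import numpy as np

def decimate_frames(frames, n):
    # average each consecutive group of n frames along the time axis;
    # a trailing partial group is simply averaged over fewer frames
    starts = np.arange(0, frames.shape[0], n)
    return np.stack([frames[i:i + n].mean(axis=0) for i in starts])

# plain subsampling, for comparison, would just be frames[::n]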

@jwittenbach
Contributor

Another way to implement this would be to use Images.map_as_series. It might look something like this (I haven't actually tested it, but the general idea should be right):

import numpy as np

def decimate(self, n):

    def decimate_block(block):
        # average each consecutive run of n samples along the first (time) axis
        return np.array([block[i:i + n].mean(axis=0)
                         for i in np.arange(0, block.shape[0], n)])

    new_length = int(np.ceil(self.shape[0] / float(n)))
    return self.map_as_series(decimate_block, value_shape=new_length, dtype=np.float64)
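
Hypothetical usage, assuming decimate is attached as a method on an Images object named imgs (names are illustrative only):

small = imgs.decimate(5)  # average every 5 consecutive frames
# the time dimension should shrink from T to ceil(T / 5)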

I would be interested in knowing if there's a performance difference between these. I know that we've had trouble with reduce operations being slow in the past without lots of optimization.

@j-friedrich
Author

Images.map_as_series transforms to blocks, which is slow. The main idea behind decimation is to quickly reduce the size of the data first and only then transform to blocks for source extraction, for which temporally decimated data is the sweet spot between a single summary image and the whole data.

I expect a clear performance difference between these implementations, most dramatically if images 1 to n are on the first node, n+1 to 2n on the second, and so on, thus requiring no shuffling between nodes at all. Of course it would be terrible if image i were on node i mod n. How do the images get distributed in the first place?
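
For reference, here is a minimal PySpark sketch of the reduce-style decimation being discussed, assuming a hypothetical RDD of (frame_index, ndarray) records; this is not the PR's actual code, and Thunder's internal record layout may differ:

import numpy as np

def decimate_rdd(rdd, n):
    # group frames by index // n and average within each group; if frames
    # 0..n-1 already sit on one node, little data moves during the shuffle
    return (rdd
            .map(lambda kv: (kv[0] // n, (kv[1].astype(np.float64), 1)))
            .reduceByKey(lambda a, b: (a[0] + b[0], a[1] + b[1]))
            .mapValues(lambda pair: pair[0] / pair[1])
            .sortByKey())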

@freeman-lab
Member

@j-friedrich @jwittenbach it'd be really awesome to see performance numbers on this for a large test dataset; I'd say we just measure it and go with whichever implementation is faster.
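
A rough timing harness for that comparison, assuming imgs is a large Images object and decimate_reduce / decimate_series are the two candidate implementations (both names, and using toarray to force evaluation, are illustrative assumptions):

import time

def time_decimation(method, n=5):
    start = time.time()
    method(n).toarray()  # assumed way to force evaluation of the lazy result
    return time.time() - start

print('reduce-style:  %.1f s' % time_decimation(imgs.decimate_reduce))
print('map_as_series: %.1f s' % time_decimation(imgs.decimate_series))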
