PCmetrics values are not constant for multiple runs #75

JRicardo24 · 2022-09-07T20:04:54Z

Hello guys, is it normal that when we run the metrics module several times on exactly the same dataset, the values for isolation_distance, l_ratio, d_prime and the two nearest-neighbors metrics change between runs?

For some clusters the values are indeed pretty similar, but for others, like a cluster I have with 35k spikes, the isolation_distance varied from 361 to 556...

The biggest changes come from clusters with more spikes. Any thoughts about that? Is it normal?

Thank you
@jsiegle

jsiegle · 2022-09-07T20:26:37Z

All of the PC metrics involve random subsampling of spikes to speed up the calculation.

The np.random module is initialized with the same seed value on each run, which should ensure that the results are the same each time. But it's possible the seeding is not working as expected.
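To make the seeding point concrete, here is a minimal sketch of seeded subsampling. It is not the module's actual code: `subsample_spikes` and its arguments are hypothetical names, and the numbers (35,000 spikes, 500 kept) are taken from this thread only for illustration. With the same seed the chosen spike indices are identical, so any metric computed from them would be identical too.

```python
import numpy as np

def subsample_spikes(n_spikes, max_spikes, seed=None):
    """Hypothetical helper: pick at most max_spikes spike indices without replacement."""
    rng = np.random.RandomState(seed)  # seed=None draws fresh entropy on every call
    if n_spikes <= max_spikes:
        return np.arange(n_spikes)
    return np.sort(rng.choice(n_spikes, size=max_spikes, replace=False))

# Same seed on both calls: the two index sets match exactly.
a = subsample_spikes(35000, 500, seed=0)
b = subsample_spikes(35000, 500, seed=0)
same_seed_matches = np.array_equal(a, b)

# No seed: the indices (and any metric built on them) will almost surely differ.
c = subsample_spikes(35000, 500, seed=None)
```

If values differ between runs despite a fixed seed, one thing to check is whether something else consumes random numbers before the subsampling step, which would shift the sequence.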

isolation_distance in particular can be quite sensitive to the subsampled spikes, which is why we don't use it for any of our unit-level quality control. In fact, the only PC metric we've found to be generally useful is nearest_neighbors_hit_rate. Have you found that one to vary significantly between runs?

JRicardo24 · 2022-10-17T13:42:35Z

I understand. Yes, you're right: of all the PC metrics, nearest_neighbors_hit_rate is the one whose values are most consistent between runs.
The default number of spikes subsampled for computing PC metrics (max_spikes_for_unit) is 500. Could that be the reason for the larger variations in isolation_distance and other metrics on units with significantly more spikes? If so, what value would you recommend for a dataset with units ranging from a few dozen spikes all the way up to 35k? @jsiegle

jsiegle · 2022-10-20T18:19:18Z

You can try increasing max_spikes_for_unit to 2000 or higher. That will increase the computation time, but should make the values more stable.
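As a rough illustration of why a larger subsample should help, the sketch below measures the run-to-run spread of a Mahalanobis-based statistic on synthetic data for two subsample sizes. Everything here is an assumption made for the example: the data are two Gaussian clusters, and the statistic (median squared Mahalanobis distance from other spikes to the unit) is a simplified stand-in, not the module's isolation_distance.

```python
import numpy as np

rng = np.random.RandomState(0)

# Synthetic "PC features": one unit and its neighbors, 35k spikes each, 3 dimensions.
unit = rng.normal(0.0, 1.0, size=(35000, 3))
other = rng.normal(0.0, 1.0, size=(35000, 3)) + np.array([4.0, 0.0, 0.0])

def median_sq_mahalanobis(unit_pcs, other_pcs):
    """Median squared Mahalanobis distance from other_pcs to the unit_pcs cluster."""
    mean = unit_pcs.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(unit_pcs, rowvar=False))
    diff = other_pcs - mean
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return float(np.median(d2))

def spread_across_runs(max_spikes, n_runs=30):
    """Standard deviation of the statistic over repeated random subsamples."""
    values = []
    for _ in range(n_runs):
        u = unit[rng.choice(len(unit), size=max_spikes, replace=False)]
        o = other[rng.choice(len(other), size=max_spikes, replace=False)]
        values.append(median_sq_mahalanobis(u, o))
    return float(np.std(values))

spread_500 = spread_across_runs(500)    # the default mentioned in this thread
spread_2000 = spread_across_runs(2000)  # the suggested larger value
print(f"spread at 500: {spread_500:.3f}, spread at 2000: {spread_2000:.3f}")
```

The same kind of check can be run on real data by repeating the metrics calculation with different seeds and comparing the spread per unit before settling on a value.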
