Fix sampling rate issue when aggregating #4024

DradeAW · 2025-07-02T14:13:28Z

Different versions of Kilosort output different sampling rate (sometimes rounding it to 2 decimal places, sometimes not)

This makes a crash when trying to aggregate both sortings together This PR fixes this

Different versions of Kilosort output different sampling rate (sometimes rounding it to 2 decimal places, sometimes not) This makes a crash when trying to aggregate both sortings together This PR fixes this

for more information, see https://pre-commit.ci

DradeAW · 2025-07-02T14:54:06Z

I don't know why the tests fail, it has nothing to do with my changes

zm711 · 2025-07-02T18:25:06Z

updates in probeinterface that I think should be fixed now.

zm711

For me this seems fine. Recordings these days are between 20kHz and 30Khz so we are looking for a tolerance of 0.01 or 30_000.01 which is a fraction of a percent off from each other. @DradeAW you think this is just a float storage issue and why choose 0.01 vs 1e-1 or 1e-3?

DradeAW · 2025-07-02T18:51:18Z

Floating point errors can happen, but I'm not talking about that

In my case, I have multiple sortings for a single recording, and because of how some spike sorting algorithms work, the sampling frequency is sometimes rounded and sometimes not, creating a difference

h-mayorquin · 2025-07-02T18:58:58Z

Can we fix these in the output of the sorting algorithms at least here? This would be safer that the proposed solution.

For neuropixels I have seen variations of 0.01 that do add up with long recordings so I don't think that we should omit those differences by default.

zm711 · 2025-07-02T19:00:22Z

Can we fix these in the output of the sorting algorithms at least here? This would be safer that the proposed solution.

But we don't control the algorithms for getting the sorting sampling rate. So you want our wrappers to change them?

h-mayorquin · 2025-07-02T19:08:13Z

Can we fix these in the output of the sorting algorithms at least here? This would be safer that the proposed solution.

But we don't control the algorithms for getting the sorting sampling rate. So you want our wrappers to change them?

Yes, that's what I am proposing. if we are so sure that they are equal that we want to relax the criteria here then why not just change them there? This seems safer than increasing the false positives here by relaxing the criteria.

zm711 · 2025-07-02T20:27:25Z

For me I think the problem is that we allow fractional sampling rates (ie floats) in general. If sampling rates were always ints then rounding wouldn't be an issue.

Could you explain this more:

For neuropixels I have seen variations of 0.01 that do add up with long recordings so I don't think that we should omit those differences by default.

I am actually curious how sampling rate issues across long recordings get condensed into the singular sampling rate post a sorting. For example if you sampling rate varies in a long recording between 30_000.01, 30_000.02, 30_000.03, the sorting has to pick one value to use for its sampling rate so how do we dealing with that at the recording level.

h-mayorquin · 2025-07-02T20:37:52Z

[EDIT] Ok, we discussed in slack

Zach meant that all known acquisition systems have a nominal sampling rate that is always an integer.
I then mentioned that sometimes the rate needs to be adjusted by really small numbers. For example, a sampling rate of 30_000 vs 30_000.001 means that in an hour those two difference process will be off by three samples. You can set precise timestamps or correct the sampling rate but the point is that sometimes float sampling rates do matter for fine adjustment.

zm711 · 2025-07-02T21:51:39Z

One minor edit. I assume most are nominally an integer. I'm sure there are always exceptions to this rule.

h-mayorquin · 2025-07-03T22:42:35Z

The sampling rate of some Axon files that I got today:

alejoe91 · 2025-07-04T14:02:58Z

The sampling rate of some Axon files that I got today:

This is just to show that some systems don't have nominal sampling rate, right @h-mayorquin ?

I think we can merge this one

zm711 · 2025-07-04T14:30:18Z

Actually it is two fold. One it shows that some do have truly float. But the actually convo (partially on slack) was Heberto is against this PR because it could make it easier to aggregate incorrectly. I was in favor of this because I think float issues with rounding is so common. Heberto pulled me a little closer to his side but I’ve moved more to neutral. So I think it would be helpful for the debate for @alejoe91 to comment why you are in favor.

samuelgarcia · 2025-07-09T10:42:38Z

Zach meant that all known acquisition systems have a nominal sampling rate that is always an integer. I then mentioned that sometimes the rate needs to be adjusted by really small numbers. For example, a sampling rate of 30_000 vs 30_000.001 means that in an hour those two difference process will be off by three samples. You can set precise timestamps or correct the sampling rate but the point is that sometimes float sampling rates do matter for fine adjustment.

In device this is not the sample rate which is an integer but the sampling period (and more often multiple of a shorter sampling period). So the sampling rate should in fact be very rarely a interger

zm711 · 2025-07-09T11:59:41Z

That's a great point @samuelgarcia. My point was machines often have you select a nominal int (which more often for the various machines I use these days are integer sampling rates--> float periods), but the other way would be to in general have int sample periods which would often convert to float sampling rate.

I guess where are you getting the idea that machines are period based rather than sample rate based? You could do either right? I'm definitely willing to be convinced that my sample rate is actually a bias from what I have worked on myself.

But if this is true wouldn't it be better for us to store info as a sample period int rather than a rate. Because the comparison of the int period won't have float issues?

h-mayorquin · 2025-07-15T23:32:45Z

This is just to show that some systems don't have nominal sampling rate, right @h-mayorquin ?

I think we can merge this one

Yes, to show that the sampling rate is not always an integer which I think is a requirement for this PR to be sound. Unless I am misunderstanding.

alejoe91 · 2025-07-16T08:54:47Z

This is just to show that some systems don't have nominal sampling rate, right @h-mayorquin ?
I think we can merge this one

Yes, to show that the sampling rate is not always an integer which I think is a requirement for this PR to be sound. Unless I am misunderstanding.

I don't think sampling frequency being ints is a requirement for the PR, since the math.isclose would work for floats too.

You can set precise timestamps or correct the sampling rate but the point is that sometimes float sampling rates do matter for fine adjustment.

This holds with float sampling frequencies too, no?

DradeAW and others added 2 commits July 2, 2025 16:13

Fix sampling rate issue when aggregating

4c4d8a7

Different versions of Kilosort output different sampling rate (sometimes rounding it to 2 decimal places, sometimes not) This makes a crash when trying to aggregate both sortings together This PR fixes this

[pre-commit.ci] auto fixes from pre-commit.com hooks

8167f65

for more information, see https://pre-commit.ci

Merge branch 'main' into patch-2

67f0c43

zm711 reviewed Jul 2, 2025

View reviewed changes

alejoe91 approved these changes Jul 3, 2025

View reviewed changes

alejoe91 added the core Changes to core module label Jul 3, 2025

alejoe91 added this to the 0.103.0 milestone Jul 3, 2025

Fix sampling rate issue when aggregating #4024

Are you sure you want to change the base?

Fix sampling rate issue when aggregating #4024

Conversation

DradeAW commented Jul 2, 2025

Uh oh!

DradeAW commented Jul 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

zm711 commented Jul 2, 2025

Uh oh!

zm711 left a comment

Choose a reason for hiding this comment

Uh oh!

DradeAW commented Jul 2, 2025

Uh oh!

h-mayorquin commented Jul 2, 2025

Uh oh!

zm711 commented Jul 2, 2025

Uh oh!

h-mayorquin commented Jul 2, 2025

Uh oh!

zm711 commented Jul 2, 2025

Uh oh!

h-mayorquin commented Jul 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

zm711 commented Jul 2, 2025

Uh oh!

h-mayorquin commented Jul 3, 2025

Uh oh!

alejoe91 commented Jul 4, 2025

Uh oh!

zm711 commented Jul 4, 2025

Uh oh!

samuelgarcia commented Jul 9, 2025

Uh oh!

zm711 commented Jul 9, 2025

Uh oh!

h-mayorquin commented Jul 15, 2025

Uh oh!

alejoe91 commented Jul 16, 2025

Uh oh!

Uh oh!

DradeAW commented Jul 2, 2025 •

edited

Loading

h-mayorquin commented Jul 2, 2025 •

edited

Loading