Mismatch between data and descriptor on timeout #492

martin-gustafsson opened this issue Apr 9, 2021 · 4 comments


@martin-gustafsson

If the digitizer times out on one iteration in a sweep, the data on file are absent rather than e.g. zeros or NaNs. Then the data descriptor and the data no longer match, and there is no indication of which data are missing. The shape of the dataset suggests that it's complete, but the last row is all zeros.

Error message during experiment, on iteration 1 of 220 of a parameter sweep:
auspex-ERROR: 2021-04-08 19:38:44,651 ----> Digitizer myX6 timed out.

Resulting data and descriptor:

data, desc, _ = open_data(16, '.', "q1-raw_int", date="210408")
print(desc.axes[1].points.shape)
print(data.shape)
print(data[-2,:])
print(data[-1,:])

(220,)
(220, 1523)
[29.7535543 +26.75779667j 21.34156161+34.60425376j
16.25427695+36.56289939j ... 29.83350915+26.67109178j
29.79431326+26.64494398j 29.68503646+26.96899154j]
[0.+0.j 0.+0.j 0.+0.j ... 0.+0.j 0.+0.j 0.+0.j]
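
To check an existing file for rows like the last one above, here is a minimal sketch. It assumes the loaded array behaves like a NumPy array and that a genuinely measured row is never exactly zero everywhere. The helper name `unwritten_rows` and the demo array are hypothetical and not part of auspex.

```python
import numpy as np

def unwritten_rows(data):
    """Return indices of rows whose entries are all exactly zero."""
    data = np.asarray(data)
    # any(axis=1) is True for rows with at least one nonzero entry
    return np.flatnonzero(~data.any(axis=1))

# Stand-in for a loaded dataset (made-up values, not real measurements)
demo = np.ones((4, 3), dtype=complex)
demo[-1, :] = 0  # mimic a preallocated row that was never written
print(unwritten_rows(demo))
```

Note that this only reports which rows were left empty. It cannot tell which sweep step actually timed out.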

@matthewware
Collaborator

This is a symptom of the deeper issues we're chatting about. The right solution is to fix the data pipeline so the card doesn't time out. But maybe we could explore pre-filling for small datasets.

@grahamrow
Member

Both arrays are preallocated, and the descriptor information is actually written before the filters run. I guess the question is what should the default behavior be?

I guess I'm confused by your first statement 'the data on file are absent rather than e.g. zeros or NaNs' — it looks like the data that didn't get recorded are indeed zeros. I'd agree that NaNs might be better.

@martin-gustafsson
Author

Graham, the issue is that the failed measurement was in the first step of the sweep, but it's the last row of data that's missing. The first row of the data array contains the data for the second step of the sweep.

@martin-gustafsson
Author

I'll try to be clearer: Let's say I have an experiment with one parameter sweep. Then there is effectively one counter for which element of the parameter array to use next and another counter for which row of data to write next. If everything works, those two counters remain in sync and the data array gets completely filled before the experiment terminates.

However, if one iteration in the middle of the sweep results in a digitizer timeout, the parameter step increments but the data counter does not. From that point onward, the data and the descriptor do not match, and when the sweep is finished, there is still one unwritten row at the end of the data array. If there are two timeouts at different points in the sweep, there will be two empty rows at the end of the data array.
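
The two-counter behavior described above can be illustrated with a toy model. This is only a sketch of the symptom, not the actual auspex code. The function `run_sweep` and the `timed_out` set are hypothetical, and the parameter value itself is stored as a stand-in for measured data so that a mismatch is easy to see.

```python
def run_sweep(params, timed_out):
    """Toy model: the parameter counter always advances,
    the data counter advances only when a row is written."""
    rows = [None] * len(params)  # preallocated data rows
    write_index = 0              # data counter
    for step, p in enumerate(params):  # step is the parameter counter
        if step in timed_out:
            continue             # timeout: nothing written, data counter stays put
        rows[write_index] = p    # stand-in for the data acquired at parameter p
        write_index += 1
    return rows

params = [10, 20, 30, 40, 50]
rows = run_sweep(params, timed_out={1})
for p, r in zip(params, rows):
    print(p, r)
```

With a timeout at step 1, every row from that step onward is paired with the wrong parameter, and the empty row ends up at the end rather than at the step that failed.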

You don't see this so easily in sweeps with small linear increments of the parameter, but I was using a random index as a parameter to a sweep, and then it becomes apparent when the data row does not match the parameter value for which it was acquired.

Of course, we want a situation where timeouts never happen, but that seems hard to guarantee. Given that, it would be good if the writer dumped a row's worth of NaNs into the data array when the error happens. If that's not possible, I think the severity of a digitizer timeout has to be increased from a Warning to an Error, since it corrupts the measured data.
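
A rough sketch of the NaN-row idea, under the assumption that a timeout surfaces as an exception at the point where a row would be written. That may not match how the auspex pipeline is actually structured. The names `run_sweep_nan_fill` and `fake_acquire` are hypothetical.

```python
import numpy as np

def fake_acquire(p, n_points=3):
    """Hypothetical acquisition that times out for one parameter value."""
    if p == 20:
        raise TimeoutError("digitizer timed out")
    return np.full(n_points, p, dtype=complex)

def run_sweep_nan_fill(params, acquire, n_points):
    data = np.zeros((len(params), n_points), dtype=complex)
    write_index = 0
    for p in params:
        try:
            row = acquire(p)
        except TimeoutError:
            # Write a placeholder row so the data counter still advances
            row = np.full(n_points, np.nan, dtype=complex)
        data[write_index, :] = row
        write_index += 1
    return data

data = run_sweep_nan_fill([10, 20, 30], fake_acquire, n_points=3)
print(np.isnan(data).all(axis=1))
```

Here the row for the failed step is marked with NaNs in place, so the remaining rows still line up with the descriptor.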
