Make a wrapped lookup table to time-of-flight #180

Merged · 47 commits into main · Feb 11, 2025
Conversation

@nvaytet (Member) commented Feb 6, 2025

For computing time-of-flight, instead of having a complicated graph where we first unwrap the event time of arrival, then fold it with the pulse period, etc.:
[screenshot]
we make a lookup table from which the time-of-flight can be looked up directly, using the raw event_time_offset.
The graph now looks like:
[screenshot]

The lookup table before:
[screenshot]

After (with wrap-around):
[screenshot]

To handle pulse skipping, we now add an extra pulse dimension to the lookup table.

Before:
[screenshot]

After:
[screenshot]

This makes tof cheaper to compute and makes it easier to incorporate the computation into existing reduction workflows, including workflow control via widgets.
Another bonus is that the resolution in the event_time_offset dimension is now much more predictable and controllable, as the range is always [0, 71 ms] (before, it depended on the choppers and the detector distances).
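To illustrate the idea, here is a minimal numpy sketch of the direct lookup (all names, sizes, and table contents are hypothetical stand-ins; the real table is built from a tof simulation and stored as a scipp DataArray): the time-of-flight is read straight out of a 2D table indexed by detector distance and the raw event_time_offset, wrapped into [0, pulse_period).

```python
import numpy as np

pulse_period = 71.0e-3  # s, the ESS pulse period
n_dist, n_eto = 50, 1000
dist_edges = np.linspace(10.0, 40.0, n_dist + 1)       # m (hypothetical range)
eto_edges = np.linspace(0.0, pulse_period, n_eto + 1)  # s, always [0, pulse_period]

# Placeholder contents; in the workflow each bin holds a mean tof from the simulation.
rng = np.random.default_rng(0)
table = rng.uniform(1.0e-3, 70.0e-3, size=(n_dist, n_eto))

def lookup_tof(distance, event_time_offset):
    """Return tof for one event by direct table lookup, with wrap-around."""
    i = np.clip(np.searchsorted(dist_edges, distance) - 1, 0, n_dist - 1)
    j = np.clip(
        np.searchsorted(eto_edges, event_time_offset % pulse_period) - 1,
        0,
        n_eto - 1,
    )
    return table[i, j]
```

Because the event_time_offset axis is wrapped, an offset shifted by a whole pulse period lands in the same bin, which is what makes the raw coordinate usable without unwrapping first.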

Finally, I also changed the WFM tests to use a tof simulation instead of a list of 6 manually chosen neutrons, as this is more consistent with the other unwrapping tests and catches more potential errors (such as neutrons randomly being lost at the edges of the frames).

@nvaytet marked this pull request as ready for review, February 7, 2025 10:28
Comment on lines 168 to 184
wavs = sc.broadcast(
    simulation.wavelength.to(unit="m"), sizes=toas.sizes
).flatten(to="event")
dist = sc.broadcast(distances + simulation_distance, sizes=toas.sizes).flatten(
    to="event"
)
tofs = dist * (sc.constants.m_n / sc.constants.h)
tofs *= wavs

data = sc.DataArray(
    data=sc.broadcast(simulation.weight, sizes=toas.sizes).flatten(to="event"),
    coords={
        "toa": toas.flatten(to="event"),
        "tof": tofs.to(unit=time_unit, copy=False),
        "distance": dist,
    },
)
Member:
Can we just flatten data, instead of all the pieces?

Comment on lines 225 to 235
# First, extend the table to the right by 1, and set the coordinate to pulse_period.
slab = sc.empty_like(table['event_time_offset', 0])
slab.coords['event_time_offset'] = pulse_period
table = sc.concat([table, slab], dim='event_time_offset')
# Then, copy over the values. Instead of using pulse_stride, we use the number of
# pulses in the table, as it could be that there were no events in the first pulse.
npulses = table.sizes['pulse']
for i in range(npulses):
pulse = (i + 1) % npulses
left_edge = table.data['pulse', pulse]['event_time_offset', 0]
table.data['pulse', i]['event_time_offset', -1] = left_edge
Member:

While this does not look wrong, I am a bit confused as to why we need to handle pulse skipping as done above. Why group by pulse early and then deal with the mess here? Can't we just process based on the actual frame length (say 2 × 71 ms), and then fold to obtain a pulse dim at the end?

Not saying it has to change, just wondering if it is simpler, without having thought through every detail.

Member Author:

After looking at this for too long, I get lost in the details and don't always manage to take a step back and think about it more on the higher conceptual level.

This is a great suggestion, I implemented it and it works 👍
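Conceptually, the suggested approach can be sketched in plain numpy (hypothetical sizes, not the actual scipp code): bin the table over one full frame period (pulse_stride * pulse_period), then fold the event_time_offset axis at the end to obtain a pulse dimension.

```python
import numpy as np

pulse_stride = 2
n_dist, n_eto = 3, 8  # distance bins, event_time_offset bins per pulse (tiny, for illustration)

# Table covering the whole frame: shape (distance, pulse_stride * n_eto).
frame_table = np.arange(n_dist * pulse_stride * n_eto, dtype=float).reshape(
    n_dist, pulse_stride * n_eto
)

# Fold: split the long event_time_offset axis into (pulse, event_time_offset),
# analogous to scipp's fold with sizes={'pulse': pulse_stride, 'event_time_offset': -1}.
folded = frame_table.reshape(n_dist, pulse_stride, n_eto)
```

The point of the suggestion is that all the wrap-around bookkeeping happens once on the long axis, and the pulse dimension falls out of a single reshape at the end.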

Comment on lines +354 to +360
pulse_index = (
    (
        (da.bins.coords['event_time_zero'] - tmin).to(unit=eto_unit)
        + 0.5 * pulse_period
    )
    % frame_period
) // pulse_period
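A numeric sketch of the formula above, with plain numpy floats in place of scipp variables (values hypothetical):

```python
import numpy as np

pulse_period = 71.0e-3  # s
pulse_stride = 2
frame_period = pulse_stride * pulse_period

tmin = 0.0
# event_time_zero for events from four consecutive pulses.
event_time_zero = tmin + np.arange(4) * pulse_period

# The 0.5 * pulse_period shift guards against jitter of event_time_zero
# around the exact pulse boundaries.
pulse_index = (
    ((event_time_zero - tmin) + 0.5 * pulse_period) % frame_period
) // pulse_period
```

With pulse_stride = 2, consecutive pulses alternate between index 0 and 1, which is what indexes the extra pulse dimension of the lookup table.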
Member:

Hmm, if we process data in chunks, without being pulse-skipping aware, won't we get inconsistent pulse_index for various chunks?

Member Author:

I'm not sure I understood which use case you were referring to. Do you mean if we are e.g. processing the data with the StreamProcessor?
Is it because tmin would be different every time new data comes in?
Does that mean we would need a reference time, like the run start time, and read it from NeXus?

Member:

Yes I was referring to something like StreamProcessor. I don't know if it has to come from NeXus, maybe it can be from the first chunk, or something, but simply looking at every chunk seems wrong?

Member Author:

I can add a comment and defer this to another PR?

-Resolution of the time of arrival axis in the lookup table.
-Can be an integer (number of bins) or a sc.Variable (bin width).
+Number of bins to use for the time of arrival (event_time_offset) axis in the lookup
+table. Should be around 1000.
Member:

I presume this should be related to the instrument resolution? So instruments with a 1% resolution may need different values than ones with 5%?

Member Author:

Do you think it's better to ask for a resolution in e.g. microseconds instead of a number of bins?

The reason I did not do that is that the range needs to be exactly [0, 71 ms], and if 71 ms is not a multiple of the bin width given by the user, we have to (silently?) modify the resolution they actually requested.
If they give a number of bins, we can guarantee that what they put in is reflected in the output.

That said, we could always go with the physical resolution if we properly document that we will guarantee at least what they asked for, but that the resolution in the table could actually be a little finer?
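That trade-off can be sketched with a hypothetical helper (not the PR's actual code): snap a requested bin width so that a whole number of bins covers exactly [0, pulse_period], rounding the bin count up so the table is at least as fine as requested, never coarser.

```python
import numpy as np

pulse_period = 71.0e-3  # s

def snapped_bin_width(requested_width):
    # Round the bin count up: the actual width is <= the requested width.
    nbins = int(np.ceil(pulse_period / requested_width))
    return pulse_period / nbins

width = snapped_bin_width(0.3e-3)  # e.g. the user asks for 0.3 ms bins
```

This is the "guarantee at least what they asked for" option: the adjustment is deterministic and always errs on the side of a finer table.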

Member Author:

Update: I changed it to be a bin width with a unit, instead of an integer.

Comment on lines 50 to 58
    simulation: SimulationResults,
    ltotal_range: LtotalRange,
    distance_resolution: DistanceResolution,
    toa_resolution: TimeOfArrivalResolution,
    time_resolution: TimeResolution,
    pulse_period: PulsePeriod,
    pulse_stride: PulseStride,
    pulse_stride_offset: PulseStrideOffset,
    error_threshold: LookupTableRelativeErrorThreshold,
) -> TimeOfFlightLookupTable:
Member:

This function is so long that the probability that all bugs are caught in code review approaches zero. Can you split it up so that components can be reasoned about and tested individually? Maybe this should turn into a class, but free functions are also an option.

Member Author:

Done.

@@ -6,7 +6,7 @@ pooch
 pytest
 scipy>=1.7.0
 scippnexus @ git+https://github.com/scipp/scippnexus@main
-scipp @ https://github.com/scipp/scipp/releases/download/nightly/scipp-nightly-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+scipp @ https://github.com/scipp/scipp/releases/download/nightly/scipp-nightly-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Member:

Why cp312? Did you lock dependencies in a 3.12 env?

Member Author:

I did :-(

Member (@SimonHeybrock) left a review:

I am generally happy with the changes. It is still hard to truly verify correctness by looking at this, I presume bugs can be weeded out by more extensive testing in the wild?

Comment on lines 163 to 188
# Now fold the pulses
table = table.fold(
    dim='event_time_offset', sizes={'pulse': pulse_stride, 'event_time_offset': -1}
)
# The event_time_offset does not need to be 2d, it's the same for all pulses.
table.coords['event_time_offset'] = table.coords['event_time_offset']['pulse', 0]

# We are still missing the upper edge of the table in the event_time_offset axis
# (at pulse_period). Because the event_time_offset is periodic, we can simply copy
# the left edge over to the right edge.
# Note that this needs to be done pulse by pulse, as the left edge of the second
# pulse is the same as the right edge of the first pulse, and so on (in the case
# of pulse_stride > 1).

# First, extend the table to the right by 1, and set the coordinate to pulse_period.
left = table['event_time_offset', 0]
slab = sc.empty_like(left)
slab.coords['event_time_offset'] = pulse_period
table = sc.concat([table, slab], dim='event_time_offset')
# Copy the values. We roll the values along the pulse dimension so that the left
# edge of the second pulse is the same as the right edge of the first pulse, and so
# on (in the case of pulse_stride > 1).
right = table['event_time_offset', -1]
right.values = np.roll(left.values, -1, axis=1)
right.variances = np.roll(left.variances, -1, axis=1)
return table
Member:

Is this something like:

table = sc.concat([table, table['event_time_offset', 0]], dim='event_time_offset')
table.coords['event_time_offset'][-1] = pulse_period
return sc.concat(
    [table['event_time_offset', i * size:(i + 1) * size + 1] for i in range(pulse_stride)],
    dim='pulse',
)

# Compute a pulse index for every event: it is the index of the pulse within a
# frame period. When there is no pulse skipping, those are all zero. When there is
# pulse skipping, the index ranges from zero to pulse_stride - 1.
tmin = da.bins.coords['event_time_zero'].min()
Member:

Maybe one could simply use epoch? Or some other fixed datetime?

Member Author (@nvaytet) commented Feb 10, 2025:

The more I think about this, the more it seems there's something wrong with the approach.

I tried with just using epoch, and for the examples we have, it also works.
But if I offset my event time zeros by a pulse period, the results are all wrong until I correct it using PulseStrideOffset=1.

This is reproducing the situation where one would start recording the file one pulse period later.
In practice, we can never really know when this was.
This would make auto-reduction impossible, because right now one needs to look at the data, and if it looks wrong, change the value of PulseStrideOffset.

I'm thinking we can only know by looking at the open and close times of the pulse skipping chopper?

Member Author:

I have an ugly hack in https://github.com/scipp/essreduce/tree/pulse-index which basically tries all possible PulseStrideOffsets and then picks the result with the least number of NaNs in it.

It works for the examples I've tried, but it doesn't feel safe, and it also doubles or triples the cost of computing tof.
We could optimize by using just a small number of events at the start of the event buffer to decide which offset to go with, but that doesn't change the fact that the approach doesn't feel very robust (it might actually make robustness worse: how do you decide when you have enough neutrons to perform the test?).
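As a toy sketch of the heuristic described above (everything here is hypothetical, including the stand-in lookup): try each candidate PulseStrideOffset on a small sample of events and keep the one producing the fewest NaNs in the computed tof.

```python
import numpy as np

pulse_stride = 2

def tof_with_offset(pulse_parity, offset):
    # Stand-in for the real table lookup: events assigned to the "closed"
    # pulse of the frame fall outside the table and come out as NaN.
    tof = np.full(pulse_parity.size, 0.05)
    tof[(pulse_parity + offset) % pulse_stride != 0] = np.nan
    return tof

def best_offset(sample_parity):
    # Count NaNs for each candidate offset and pick the smallest.
    nan_counts = [
        np.isnan(tof_with_offset(sample_parity, off)).sum()
        for off in range(pulse_stride)
    ]
    return int(np.argmin(nan_counts))
```

The robustness concern raised above applies directly: with a small or unlucky sample, the NaN counts for different offsets may not differ enough to discriminate reliably.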

@nvaytet (Member Author) commented Feb 11, 2025:

There is still the issue with the reference time for the pulse_index, but this is blocking other work.
I opened #184 .
I will merge so we can move forward.

@nvaytet merged commit 7c2bcb1 into main, Feb 11, 2025 (4 checks passed).
@nvaytet deleted the wrapped-tof-lookup branch, February 11, 2025 13:29.