Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cep 2: event structure #2304

Merged
merged 8 commits into from
Sep 8, 2023
Merged

cep 2: event structure #2304

merged 8 commits into from
Sep 8, 2023

Conversation

maxnoe
Copy link
Member

@maxnoe maxnoe commented Apr 5, 2023

No description provided.

Copy link
Contributor

@kosack kosack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be useful to mention what this change would affect: namely all code that uses ctapipe's event-level API would need to be updated. Perhaps giving an example (or a few common examples) of what needs to be done to upgrade that software, e.g.:

# The orignal:
event.dl0.tel[15].waveform

# becomes
event.tel[15].dl0.waveform

Maybe also an example of how loops are simplified.

In other words, maybe add a "Migration Guide" section to help e.g. NectarChain and LSTChain

@kosack
Copy link
Contributor

kosack commented Apr 5, 2023

Related note: I wonder if we could provide a rope or bowler rule to automatically refactor code (as a first pass?)

Rope in particular mentions this as a feature:

Restructuring (like converting ${a}.f(${b}) to ${b}.g(${a}) where a: type=mymod.A)

@RuneDominik
Copy link
Member

Looking at the proposed structure the muon information is missing. In the current stucture it is listed on the same structure-level as the data-levels. What would be the new structure for it?

@ccossou
Copy link
Contributor

ccossou commented Apr 14, 2023

I have a few comment/questions:

  • dl3 combine multiple telescopes, so that datalevel probably can't be at the same level in the data structure than others. If that's true, it would be weird to have dlX not at the same place in the structure.
  • in the event of an observation block being too big, and if we have to implement a segmented datastructure to store all DL0 information on disk in multiple sub-files, would the old or new structure be better shaped to handle such an evolution?

My uneducated guess would tend to prefer the following structure:
dl1.telX.event.waveform

Meaning that the array of event would be directly packed as data level then telescope when it make sense, or for dl3, only have dl3.event.

I don't know if from a code point of view this make sense, but that's how I'd do it.

@maxnoe
Copy link
Member Author

maxnoe commented Apr 14, 2023

dl3 combine multiple telescopes, so that datalevel probably can't be at the same level in the data structure than others. If that's true, it would be weird to have dlX not at the same place in the structure.

Yes, we probably won't have TelescopeEventContainer.dl3 , but only ArrayEventContainer.dl3

in the event of an observation block being too big, and if we have to implement a segmented datastructure to store all DL0 information on disk in multiple sub-files, would the old or new structure be better shaped to handle such an evolution?

My uneducated guess would tend to prefer the following structure: dl1.telX.event.waveform

That's what we have now and this CEP is proposing to change that.

I will expand the text to better explain, but I think it's the opposite: we will have DL0 event files per telescope, so we might want to process DL0 to DL1 for telescopes individually, which will be much more natural if we have multiple data levels inside one TelescopeEvent (to perform the dl0 to dl1 transformation on one telescope) than having to create ArrayEvents for each Telescope with only one telescope in each of the data levels.

It is also much more natural then to merge multiple TelescopeEvent s into one ArrayEvent.

By the way: this is also how sim_telarray output is organized:

ArrayEvent[2010](event_id=9610)
     TriggerInformation[2009](event_id=9610)
     TelescopeEvent[2225](telescope_id=25)
         TelescopeEventHeader[2011](telescope_id=25)
         ADCSamples[2013](telescope_id=25)
         PixelTiming[2016](telescope_id=25)
         ImageParameters[2014](telescope_id=25)
         PixelList[2027](telescope_id=25, code=1, kind=selected)
     TelescopeEvent[3225](telescope_id=125)
         TelescopeEventHeader[2011](telescope_id=125)
         ADCSamples[2013](telescope_id=125)
         PixelTiming[2016](telescope_id=125)
         ImageParameters[2014](telescope_id=125)
         PixelList[2027](telescope_id=125, code=1, kind=selected)
     TelescopeEvent[3230](telescope_id=130)
         TelescopeEventHeader[2011](telescope_id=130)
         ADCSamples[2013](telescope_id=130)
         PixelTiming[2016](telescope_id=130)
         ImageParameters[2014](telescope_id=130)
         PixelList[2027](telescope_id=130, code=0, kind=triggered)
         PixelList[2027](telescope_id=130, code=1, kind=selected)

@kosack
Copy link
Contributor

kosack commented Apr 17, 2023

Looking at the proposed structure the muon information is missing. In the current stucture it is listed on the same structure-level as the data-levels. What would be the new structure for it?

Muon info are just DL1 parameters, so would appear under event.tel[tel_id].dl1 However, we might need to have DL1 split by a Map of strings similar to DL2, so that we can support multiple images/cleanings

@maxnoe
Copy link
Member Author

maxnoe commented May 16, 2023

I updated the cep with a couple of before / after code examples and some more text

docs/development/ceps/proposed/cep-002.rst Outdated Show resolved Hide resolved
docs/development/ceps/proposed/cep-002.rst Outdated Show resolved Hide resolved
docs/development/ceps/proposed/cep-002.rst Outdated Show resolved Hide resolved
kosack
kosack previously approved these changes Jun 1, 2023
Copy link
Contributor

@kosack kosack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good with the new examples added.

@kosack
Copy link
Contributor

kosack commented Jun 1, 2023

"Accepted" and start working on the change. Then we record in this CEP some typical changes that are needed to refactor the code to fit the new structure, and every gnarly special case along with its solution.

One question is when to involve other stakeholders, i.e. the LST, NectarCam, CHEC/SSTCam, (and maybe MAGIC?) teams that are using ctapipe already. Should we ask for comments or just go ahead? At the very least, we might want to demo it at a weekly meeting and invite them to join

@kosack
Copy link
Contributor

kosack commented Jun 2, 2023

Oh, one thing I just realized is not mentioned here that could be useful is the canonical way to determine if a telescope has information at a particular data level. Now that there is only one tel_id map, just looking at tel.keys() is not sufficient, so it would be good to show how it should be done.

In the example:

hillas_dicts = {
    tel_id: dl1.parameters.hillas
    for tel_id, dl1 in event.dl1.items()
}

#After:

hillas_dicts = {
    tel_id: tel_event.dl1.parameters.hillas
    for tel_id, tel_event in event.tel.items()
}

the case where there is no dl1 object or no tel_event.dl1.parameters for a given tel isn't explained.

@maxnoe
Copy link
Member Author

maxnoe commented Jun 2, 2023

How we work at the moment is that all telescopes get all data levels at the same time and we fill invalid values for telescopes that do not fulfill some criteria (e.g. we will have the image parameters, but they might be nan if not enough pixels survived the cleaning).

So the situation you mentioned, that there are dl1 containers for some telescopes but not others cannot really happen. The DL1 container will be there for all telescope events, but some might contain invalid value markers.

The full loop in the hillas reconstructors looks like this now:

        hillas_dict = {
            tel_id: dl1.parameters.hillas
            for tel_id, dl1 in event.dl1.tel.items()
            if all(self.quality_query(parameters=dl1.parameters))
        }

and will look like this after the change:

        hillas_dict = {
            tel_id: tel_event.dl1.parameters.hillas
            for tel_id, tel_event in event.tel.items()
            if all(self.quality_query(parameters=tel_event.dl1.parameters))
        }

kosack
kosack previously approved these changes Jun 2, 2023
@Tobychev
Copy link
Contributor

To restate what I said in inline comment (in violation of CEP 1!), I think this CEP looks pretty clear, and the reasons for carrying out these changes are sound.

One question is when to involve other stakeholders, i.e. the LST, NectarCam, CHEC/SSTCam, (and maybe MAGIC?) teams that are using ctapipe already. Should we ask for comments or just go ahead? At the very least, we might want to demo it at a weekly meeting and invite them to join

I just re-read CEP 1, and one thing I think we've forgotten is announcing this CEP according to the workflow we defined, in particular I think this haven't happened:

an announcement with a link to the pull request should be sent both to the ctapipe mailing list and the ctapipe coordinator list.

so some sort of announcement that will reach more people is already mandated. Separately, I think prodding those "external" stakeholders to leave a review (or at least proof -of-skimming) is valuable, but I'll let @kosack judge if a mailing list announcement is a firm enough poke to get some response.

@maxnoe
Copy link
Member Author

maxnoe commented Jul 6, 2023

I made one last change (removing trigger from the top-level of both SubarrayEvent and TelescopeEvent.

This is related to #2374, trigger info should be part of DL0.

@maxnoe maxnoe requested a review from kosack July 6, 2023 07:56
Tobychev
Tobychev previously approved these changes Jul 7, 2023
kosack
kosack previously approved these changes Jul 21, 2023
@maxnoe
Copy link
Member Author

maxnoe commented Aug 31, 2023

I started implementing a zfits event source following the current draft of the ACADA DPPS ICD, in part as preparation for the ACADA LST integration tests on La Palma.

On thing I realized is that with this proposal here, it would be possible to implement subarray event sources and telescope event sources.

The latter would be useful for reading DL0 telescope event data and processing it up to dl1 in parallel, where we don't need stereo information and to read the anticipated separate streams for shower, calibration and muon events.

E.g. calib pipe would probably only ever use a DL0TelescopeEventSource on calibration and muon stream files and dl1 could processing could also run using that on the shower stream.

A possible SubarrayEventSource would then read the single files and merge the separate telescope data into the SubarrayEvent structure.

@kosack
Copy link
Contributor

kosack commented Sep 8, 2023

Will merge into "proposed" and we will open a new PR moving it to "Accepted" for external review

Tobychev
Tobychev previously approved these changes Sep 8, 2023
Tobychev
Tobychev previously approved these changes Sep 8, 2023
@maxnoe
Copy link
Member Author

maxnoe commented Sep 8, 2023

Ok, merging now to have it under proposed rendered in the docs so we can collect comments.

@maxnoe maxnoe merged commit 6919aa9 into main Sep 8, 2023
12 checks passed
@maxnoe maxnoe deleted the cep02_event_structure branch September 8, 2023 14:31
@kosack kosack mentioned this pull request Sep 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants