Schema and example drafts for the KM3NeT alert program #194

Open
wants to merge 7 commits into
base: main

VincentLuminy

Description

First draft of KM3NeT alert schema and example. Not yet in production (no publicly available alerts) and subject to evolution.

Testing

import json
from jsonschema import validate, ValidationError

# Load JSON schema from file
with open('./gcn/notices/km3net/test/medal_ranking_alert.schema.json', 'r') as file:
    schema = json.load(file)

# Load JSON example from file
with open('./gcn/notices/km3net/test/medal_ranking_alert.example.json', 'r') as file:
    json_data = json.load(file)

# Validate that the example matches the schema
try:
    validate(instance=json_data, schema=schema)
    print("JSON is valid")
except ValidationError as e:
    print(f"JSON is invalid: {e.message}")

Co-authored-by: Leo Singer <[email protected]>
"description": "No search / No match / Array of known astrophysical sources present in the KM3NeT event RoI (not exhaustive) OR reference and position for a coincident time-variable source."
}
},
"required": [
Member

@Vidushi-GitHub Vidushi-GitHub Sep 9, 2024

If you wish, you can skip the "required" option; it enforces that all the listed fields must be present.

Author

I added it to help producers and receivers understand what they can expect. If you don't see any disadvantage in having it, I would like to keep it.
However, if you strongly advise removing it, I will.

Member

If your pipeline will produce all the fields for each notice, then fine.
You can read the schema browser (https://gcn.nasa.gov/docs/schema/v4.1.0/gcn/notices) to see what to expect, and we will create and announce a mission page soon after your feedback,
e.g. https://gcn.nasa.gov/missions/einstein-probe
It will put a constraint on your pipeline; it's up to you.
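
To make the constraint concrete, here is a minimal plain-Python sketch of what the "required" keyword checks. The schema below is a hypothetical, trimmed-down stand-in for the draft (only two of its fields are shown), and `missing_required` is an illustrative helper, not part of any library. With the `jsonschema` package used in the Testing section, `validate` would raise a `ValidationError` for the same partial notice.

```python
def missing_required(schema: dict, instance: dict) -> list[str]:
    """Return the names listed under "required" that are absent from the instance."""
    return [name for name in schema.get("required", []) if name not in instance]


# Hypothetical, trimmed-down schema: only two of the draft's fields are shown.
schema = {
    "type": "object",
    "properties": {"ra": {"type": "number"}, "far": {"type": "number"}},
    "required": ["ra", "far"],
}

complete_notice = {"ra": 13.82, "far": 8.029e-8}
partial_notice = {"ra": 13.82}  # the pipeline did not produce "far"

print(missing_required(schema, complete_notice))  # []
print(missing_required(schema, partial_notice))   # ['far']
```

If "required" is dropped from the schema, both notices pass, and receivers can no longer rely on "far" being present.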

"healpix_url": "https://www.km3net.org/about-km3net/open-access/",
"far": 8.029e-8,
"additional_info": "Track only / Track+Shower analysis. Up-going / All-sky selection. Analysis pipeline event selection tuned to select X event per month in average.",
"triggering_evts": [
Member

How is this ra, dec different from the event above?

Author

In this case (where the alert is produced by only one event) there is no difference.
But we have analysis pipelines where several events (coming from the same direction in a short time window, but each individually below our threshold for sending an alert) would be identified as a signal of interest and lead to the creation of an alert.
In that case, the coordinates in "triggering_evts" are the coordinates of our individual events, while the "ra", "dec" and "ra_dec_error" of the alert body correspond to the most probable direction.

We wanted the alert message to be as generic as possible, which is why we decided to duplicate the information, but I am open to suggestions if you have any.

Member

OK, thanks.
The localization (ra, dec) is used in two places:

(---)"ra": 13.82,
"dec": 19.01,
"ra_dec_error": 0.9,
"healpix_url": "https://www.km3net.org/about-km3net/open-access/",
"far": 8.029e-8,
"additional_info": "Track only / Track+Shower analysis. Up-going / All-sky selection. Analysis pipeline event selection tuned to select X event per month in average.",
"triggering_evts": [
{
"trigger_time": "2024-09-01T12:00:00.00Z",
(---) "ra": 13.82,
"dec": 19.01,

I would suggest either putting all localizations in the triggering_evts (ra, dec) list, or creating two example files for the one schema: one for a single-event alert, another for the multi-event analysis pipeline.
As it stands it is confusing, for example as to whether the entry above is a follow-up event.

Author

Indeed, it could be confusing. I will create a second example file with several events in the alert. We are also discussing renaming "triggering_evts" to something more explicit ("hits_information", "internal_triggers_information", or something else). I will commit the change as soon as we reach an internal agreement.
The descriptions in the schema file will be updated as well (both for the alert coordinates and for the triggering_evts variable).
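
As a rough sketch of what such a second example could look like, the snippet below builds a multi-event alert in which the alert-level "ra", "dec" and "ra_dec_error" differ from the per-event coordinates. The field names are taken from the draft example; every numerical value for the second event and for the combined direction is a made-up placeholder, and `is_multi_event` is a hypothetical helper added only for illustration.

```python
import json

# Hypothetical multi-event alert. All values are illustrative placeholders,
# not real KM3NeT data; field names follow the draft example.
multi_event_alert = {
    "ra": 13.90,          # most probable direction of the combined signal
    "dec": 19.10,
    "ra_dec_error": 0.7,
    "triggering_evts": [  # coordinates of the individual events
        {"trigger_time": "2024-09-01T12:00:00.00Z", "ra": 13.82, "dec": 19.01},
        {"trigger_time": "2024-09-01T12:00:05.00Z", "ra": 14.05, "dec": 19.20},
    ],
}


def is_multi_event(alert: dict) -> bool:
    """True when the alert was built from more than one triggering event."""
    return len(alert.get("triggering_evts", [])) > 1


print(json.dumps(multi_event_alert, indent=2))
print(is_multi_event(multi_event_alert))
```

With a single entry in "triggering_evts", the two sets of coordinates coincide, as in the current example file.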

@Vidushi-GitHub
Member

Hi Vincent, following up on your notice types: your plan (from our email exchange) is "Regarding the topic name, we are planning to send three types of notices: Gold, Silver, and Bronze. However, we are using a single schema called "medal_ranking_alert." I hope it won't be too much of a problem."
Keeping one topic sounds good. However, would you like to add a medal_rank property to the schema, with enum [bronze, gold, silver]?

"alert_type": "initial",
"alert_datetime": "2024-09-01T12:01:00.00Z",
"analysis_pipeline": "exceptional_evt_arca",
"description": "KM3NeT online analysis, bronze candidate neutrino observation.",
Member

Make it a specific field, like medal_rank, instead of putting it in the description. It is a useful property and should be machine-readable.
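
A possible shape for such a property, as a sketch only: the fragment below is a hypothetical schema entry using the standard JSON Schema "enum" keyword, and `is_valid_medal_rank` is an illustrative plain-Python check of the same rule. The property name, the description text, and the lowercase spelling of the values are assumptions.

```python
# Hypothetical schema fragment for the suggested property.
medal_rank_schema = {
    "type": "string",
    "enum": ["bronze", "silver", "gold"],
    "description": "Ranking of the alert (assumed wording).",
}


def is_valid_medal_rank(value: object) -> bool:
    """Mimic the enum check: the value must be one of the listed strings."""
    return isinstance(value, str) and value in medal_rank_schema["enum"]


print(is_valid_medal_rank("bronze"))    # True
print(is_valid_medal_rank("platinum"))  # False
```

A receiver can then filter on the field directly rather than parsing the free-text "description".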

"$schema": true,
"packet_type": {
"type": "number",
"description": "packet_type provided by GCN as for the old GCN format, associatied to notice_type/topic Gold=, Silver=, Bronze=."
Member

@Vidushi-GitHub Vidushi-GitHub Sep 14, 2024

Is this associated with your pipeline?
Or with GCN Classic? We will not rely on the old system in the future, and I am not sure that we have packet_type with the new Notices.

@jracusin, do we have packet_type in the new system?
KM3NeT raised the following concern by email: "Additionally, my collaboration has requested that we include a packet_type in our alerts. Would it be possible to provide us with three distinct packet_type values (one for each of the Gold, Silver, and Bronze alerts), similar to how it was done with the classical VOEvent messages?"

Author

As you stated, we would like to be able to give a packet_type in the notice, as was done in GCN Classic. The numbers here are arbitrary example values, and will be updated with the accurate values if we are given some.

Author

Hi @Vidushi-GitHub @jracusin, would it be possible to have a packet_type even if our notices are not part of GCN Classic?
I know it is not needed from a technical point of view with Kafka, but we want to provide this information for compatibility with the existing pipelines of experiments likely to receive our alerts.

Contributor

@VincentLuminy We'd rather not reference back to GCN Classic notice formats. We can definitely find a way to list your classification of "Gold, Silver, Bronze", but "packet_type" isn't the most descriptive or intuitive keyword.

You could use the classification in the statistics core schema, though that is designed to have a probabilistic classification for each choice as a dictionary.

You could use "additionalInfo" like IceCube.

Or you could create a custom schema keyword like you have done, but maybe name it something more intuitive like "alert_criteria" or "alert_credibility".
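
To compare two of these options side by side, here is a small sketch. It assumes, as described above, that the core "classification" property is a dictionary mapping each class to a probability; the notice contents and the `rank_of` helper are hypothetical and only illustrate how a receiver might read either form.

```python
# Option A: a custom keyword (the name is one of the suggestions above).
notice_custom = {"alert_criteria": "gold"}

# Option B: a probabilistic classification, one probability per class.
# A certain ranking becomes a degenerate distribution.
notice_classification = {
    "classification": {"gold": 1.0, "silver": 0.0, "bronze": 0.0}
}


def rank_of(notice: dict):
    """Return the medal rank from either representation, or None if absent."""
    if "alert_criteria" in notice:
        return notice["alert_criteria"]
    classes = notice.get("classification")
    if classes:
        # Pick the class with the highest probability.
        return max(classes, key=classes.get)
    return None


print(rank_of(notice_custom))          # gold
print(rank_of(notice_classification))  # gold
```

The custom keyword is simpler to read, while the dictionary form reuses an existing core property at the cost of expressing a fixed label as probabilities.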

4 participants