77 split the first intervention into a casual observation and the intervention itself #86

Conversation

PietrH
Member

@PietrH PietrH commented Oct 27, 2023

What changed?

Splitting the first intervention of an event into an observation, and the original intervention

The mapping now looks for the first record of every event that meets the following criteria:

  • dossier_status == "Opvolging"
  • Has any value for samplingProtocol

These records are duplicated. Each duplicate gets a new occurrenceID, which is the occurrenceID of the original source record with the suffix -cas, and has its samplingProtocol set to casual observation.
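The split described above can be sketched as follows. This is a minimal Python illustration, not the code in this PR. It assumes that records arrive as dictionaries already sorted in the desired order within each event, that an event is identified by an `eventID` key, and that "first record" means the first record per event among those meeting both criteria. The function name `split_first_intervention` is hypothetical.

```python
def split_first_intervention(records):
    """Insert a casual-observation copy before each event's first qualifying record.

    Assumption: `records` is a list of dicts, already ordered within each event.
    """
    seen_events = set()
    out = []
    for rec in records:
        qualifies = (
            rec.get("dossier_status") == "Opvolging"
            and bool(rec.get("samplingProtocol"))  # any non-empty value
        )
        if qualifies and rec["eventID"] not in seen_events:
            seen_events.add(rec["eventID"])
            observation = dict(rec)  # duplicate the source record
            observation["occurrenceID"] = f"{rec['occurrenceID']}-cas"
            observation["samplingProtocol"] = "casual observation"
            out.append(observation)
        out.append(rec)  # the original record is kept unchanged
    return out
```

Only one duplicate is created per event, so later qualifying records in the same event pass through untouched.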

Setting a samplingProtocol for observations

When the following criteria are met:

  • dossier_status != "Verwerkt en afgesloten"
  • there is no value for samplingProtocol

These records are assigned the value casual observation for samplingProtocol.
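As a rough illustration of the rule above, here is a Python sketch. It is not the code in this PR. It assumes records are dictionaries and that "no value" means the key is missing, `None`, or an empty string. The function name `fill_sampling_protocol` is hypothetical.

```python
def fill_sampling_protocol(records):
    """Set samplingProtocol to 'casual observation' where it is missing.

    Assumption: a missing key, None, or "" all count as "no value".
    """
    out = []
    for rec in records:
        rec = dict(rec)  # copy, so the input is not modified
        not_closed = rec.get("dossier_status") != "Verwerkt en afgesloten"
        if not_closed and not rec.get("samplingProtocol"):
            rec["samplingProtocol"] = "casual observation"
        out.append(rec)
    return out
```

Records that already have a samplingProtocol, and records with dossier_status "Verwerkt en afgesloten", are returned as they were.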

Other

Some small improvements to code documentation and clarity, added a failure mode for duplicate occurrenceIDs, and added a simple test for the changes above.
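A failure mode for duplicate occurrenceIDs could look something like the Python sketch below. This only illustrates the idea and is not the check added in this PR. The function name `assert_unique_occurrence_ids` and the choice of `ValueError` are assumptions.

```python
from collections import Counter


def assert_unique_occurrence_ids(records):
    """Raise ValueError if any occurrenceID appears more than once."""
    counts = Counter(rec["occurrenceID"] for rec in records)
    duplicates = sorted(oid for oid, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"Duplicate occurrenceID values: {duplicates}")
```

Running such a check after the split step would catch the case where a source record already carries an occurrenceID ending in -cas.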

I'm open to any ideas for a more thorough test. What could we check to make sure the new changes to the mapping keep having the desired effect? Any examples of what could go wrong?

Before Review:

  • Create occurrence outputs for both the old and the new mapping
  • Create file with only the differences between the old and the new mapping
  • Lien will review
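One way to produce the differences listed above is to compare both outputs by occurrenceID. The Python sketch below is an illustration under assumptions: both outputs are CSV files with an `occurrenceID` column, and the helper names `read_occurrences` and `diff_occurrences` are hypothetical. It is not how the diff file attached to this PR was generated.

```python
import csv


def read_occurrences(path):
    """Read an occurrence CSV into a list of dicts (one per row)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def diff_occurrences(old_rows, new_rows, key="occurrenceID"):
    """Return occurrenceIDs that were added, removed, or changed."""
    old = {row[key]: row for row in old_rows}
    new = {row[key]: row for row in new_rows}
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }
```

With the split in place, every ID under "added" would be expected to end in -cas, which is itself a cheap check.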

Questions for review

  • Not every eventID will have a casual observation, because of the requirement dossier_status == "Opvolging". Records with dossier_status == "Verwerkt en afgesloten" still make it into occurrence.csv. This is supposed to happen, right?

Difference between old and new mapping outputs

For the review, from the same raw data:

Only the differences between the two: 20231130_diff_rato_mapping.html (Download, and open with browser)

@PietrH PietrH added enhancement New feature or request mapping labels Oct 27, 2023
@PietrH PietrH self-assigned this Oct 27, 2023
@PietrH
Member Author

PietrH commented Nov 8, 2023

Blocked by review and #94

@LienReyserhove
Contributor

As discussed in our meeting yesterday, we decided to put this PR on hold. We considered that it might not be convenient to split occurrences after all.

The idea behind the split is the following. For the early alert tool, we want to know when a species was first observed at a certain location. This is the first record within an event for which samplingProtocol = casual observation. There's no clear way to identify these observations in the RATO dataset at this point. Most events in the dataset do not seem to hold any observations at all; most records are management actions.

The "hack" described in this PR would be a first step in resolving this issue. For each event (all records with the same Dossier_ID), this hack creates a first observation if it is not provided in the raw dataset. The first management action is split up into one observation and one action, based on these criteria. Only the observations (samplingProtocol= casual observation) would be displayed by the early warning tool, the management actions would be filtered out.

After a discussion with @damianooldoni and @peterdesmet, we suggest another approach: all records that meet at least one of the following conditions are considered to be management actions:

  • Actie is not empty and/or
  • Materiaal_Vast is not empty

For these records, samplingProtocol is not equal to casual observation (something like: fike or action or ...). If an event has no casual observation, we will not generate one ourselves. All records will eventually flow to the early alert tool, including the management actions. Users of this platform can decide for themselves whether to filter out management actions.
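The proposed rule can be sketched in Python as below. This is only an illustration of the two conditions. It assumes records are dictionaries keyed by the raw column names `Actie` and `Materiaal_Vast`, and that whitespace-only values count as empty. The function name `is_management_action` is hypothetical.

```python
def _has_value(value):
    """True if the value is present and not blank (assumption: blanks are empty)."""
    return value is not None and str(value).strip() != ""


def is_management_action(record):
    """A record is a management action if Actie or Materiaal_Vast is filled in."""
    return _has_value(record.get("Actie")) or _has_value(record.get("Materiaal_Vast"))
```

Records for which this returns False are not automatically casual observations; the rule only says which records are management actions.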

However, as there are a lot of uncertainties at this point, we decided to put this PR on hold.

@PietrH PietrH marked this pull request as draft November 14, 2023 10:08
@PietrH
Member Author

PietrH commented Nov 14, 2023

I'm still expecting some information from RATO on this topic. They are going to investigate on the data model side whether they can come up with some sort of ruling on what is an intervention and what isn't.

I understand that, for now, we will not be populating samplingProtocol with casual observation for any records at all, and will await input or further changes on the side of the early alert tool.

I've converted this PR to draft, as it's no longer in the queue for merging to main.

@PietrH
Member Author

PietrH commented Dec 6, 2023

As per #134, we are no longer planning to split records. Instead, we will use a decision tree to decide whether a record should be considered a casual observation or an intervention.

@PietrH PietrH closed this Dec 6, 2023
@PietrH PietrH deleted the 77-split-the-first-intervention-into-a-casual-observation-and-the-intervention-itself branch December 9, 2024 09:31
Labels
enhancement New feature or request mapping
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Split the first intervention into a casual observation and the intervention itself
2 participants