77 split the first intervention into a casual observation and the intervention itself #86

PietrH · 2023-10-27T09:22:58Z

What changed?

Splitting the first intervention of an event into an observation, and the original intervention

The mapping now looks for the first record of every event that meets the following criteria:

dossier_status == "Opvolging"
Has any value for samplingProtocol

These records are duplicated. Each duplicate is assigned a new occurrenceID, which is the occurrenceID of the original source record with the suffix -cas, and its samplingProtocol is set to casual observation.
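A minimal sketch of one reading of this rule, in plain Python rather than the mapping code of this PR. It assumes that records are dictionaries kept in their original order, that an event is all records sharing a Dossier_ID, and that "first" means the first qualifying record per event. The function name `split_first_interventions` is hypothetical.

```python
from typing import Dict, List

Record = Dict[str, str]


def split_first_interventions(records: List[Record]) -> List[Record]:
    """Return the input records followed by one '-cas' copy per event.

    Assumption: the first record of an event (same Dossier_ID) that has
    dossier_status == "Opvolging" and a non-empty samplingProtocol is the
    one that gets duplicated.
    """
    seen_events = set()
    observations = []
    for rec in records:
        qualifies = (
            rec.get("dossier_status") == "Opvolging"
            and bool(rec.get("samplingProtocol"))
        )
        if not qualifies or rec["Dossier_ID"] in seen_events:
            continue
        seen_events.add(rec["Dossier_ID"])
        duplicate = dict(rec)  # copy, so the original intervention is untouched
        duplicate["occurrenceID"] = f"{rec['occurrenceID']}-cas"
        duplicate["samplingProtocol"] = "casual observation"
        observations.append(duplicate)
    return list(records) + observations
```

The original intervention keeps its own occurrenceID and samplingProtocol; only the copy is relabelled.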

Setting a samplingProtocol for observations

When the following criteria are met:

dossier_status != "Verwerkt en afgesloten"
there is no value for samplingProtocol

These records are assigned the value casual observation for samplingProtocol.
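The same rule as a small sketch, again in plain Python and not the actual mapping code. It assumes records are dictionaries and treats a missing key or an empty string as "no value"; the function name `fill_sampling_protocol` is hypothetical.

```python
def fill_sampling_protocol(records):
    """Return copies of the records with empty samplingProtocol filled in.

    Records with dossier_status == "Verwerkt en afgesloten" are left as-is.
    """
    filled = []
    for rec in records:
        rec = dict(rec)  # work on a copy, leave the input untouched
        is_closed = rec.get("dossier_status") == "Verwerkt en afgesloten"
        if not is_closed and not rec.get("samplingProtocol"):
            rec["samplingProtocol"] = "casual observation"
        filled.append(rec)
    return filled
```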

Other

Some small improvements to code documentation and clarity, a new failure mode for duplicate occurrenceIDs, and a simple test for the changes above.
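As an illustration of the kind of failure mode meant here, a check like the following stops with an error when occurrenceIDs are not unique. This is a sketch in plain Python under the assumption that records are dictionaries; the helper name `assert_unique_occurrence_ids` is hypothetical and not taken from the mapping.

```python
from collections import Counter


def assert_unique_occurrence_ids(records):
    """Raise ValueError listing every occurrenceID that occurs more than once."""
    counts = Counter(rec["occurrenceID"] for rec in records)
    duplicates = sorted(oid for oid, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"Duplicate occurrenceID values: {duplicates}")
```

Running such a check after the -cas duplicates are added would also catch a suffix that collides with an existing identifier.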

I'm open to any ideas for a more thorough test. What could we check to make sure the new changes to the mapping keep having the desired effect? Any examples of what could go wrong?

Before Review:

Create occurrence outputs for both the old and the new mapping
Create file with only the differences between the old and the new mapping
Lien will review
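One possible way to produce such a differences-only comparison, sketched in plain Python. It assumes both outputs have been read into lists of dictionaries (for example with `csv.DictReader`) and that occurrenceID identifies a record in both; the function name `diff_occurrences` is hypothetical and this is not necessarily how the diff file for this PR was produced.

```python
def diff_occurrences(old, new, key="occurrenceID"):
    """Compare two occurrence outputs record by record.

    Returns added records, removed records, and for records present in both
    a mapping of field name to (old value, new value) for changed fields.
    """
    old_by_id = {rec[key]: rec for rec in old}
    new_by_id = {rec[key]: rec for rec in new}
    added = [rec for oid, rec in new_by_id.items() if oid not in old_by_id]
    removed = [rec for oid, rec in old_by_id.items() if oid not in new_by_id]
    changed = {}
    for oid in old_by_id.keys() & new_by_id.keys():
        before, after = old_by_id[oid], new_by_id[oid]
        fields = {
            name: (before.get(name), after.get(name))
            for name in sorted(set(before) | set(after))
            if before.get(name) != after.get(name)
        }
        if fields:
            changed[oid] = fields
    return {"added": added, "removed": removed, "changed": changed}
```

With the changes in this PR, the expectation would be that "added" holds only -cas records and "changed" touches only samplingProtocol.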

Questions for review

Not every eventID will have a casual observation, because of the requirement that dossier_status == "Opvolging". Records with dossier_status == "Verwerkt en afgesloten" still make it into occurrence.csv. This is supposed to happen, right?

Difference between old and new mapping outputs

For the review, from the same raw data:

Only the differences between the two: 20231130_diff_rato_mapping.html (Download, and open with browser)


PietrH · 2023-11-08T13:46:34Z

Blocked by review and #94

LienReyserhove · 2023-11-10T09:27:11Z

As discussed in our meeting yesterday, we decided to put this PR on hold. We considered that it might not be convenient to split occurrences after all.

The idea behind the split is the following. For the early alert tool, we want to know when a species was first observed at a certain location. This is the first record within an event for which samplingProtocol = casual observation. There's no clear way to identify these observations in the RATO dataset at this point. Most events in the dataset do not seem to hold any observations at all; most records are management actions.

The "hack" described in this PR would be a first step in resolving this issue. For each event (all records with the same Dossier_ID), this hack creates a first observation if it is not provided in the raw dataset. The first management action is split up into one observation and one action, based on these criteria. Only the observations (samplingProtocol = casual observation) would be displayed by the early warning tool; the management actions would be filtered out.

After a discussion with @damianooldoni and @peterdesmet, we suggest another approach: all records that meet the following conditions are considered to be management actions:

Actie is not empty and/or
Materiaal_Vast is not empty

For these records, samplingProtocol is not equal to casual observation (something like: fike or action or ...). If an event has no casual observation, we will not generate one ourselves. All records will eventually flow to the early alert tool, including the management actions. Users of this platform can decide whether or not to filter out management actions.
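The proposed rule can be sketched as a small predicate. This is a plain Python illustration, assuming records are dictionaries, that "and/or" means at least one of the two conditions holds, and that empty means missing, `None`, or whitespace only. The function names are hypothetical.

```python
def _has_value(value):
    """True when a field holds something other than None or blank text."""
    return value is not None and str(value).strip() != ""


def is_management_action(record):
    """A record is a management action when Actie or Materiaal_Vast is filled in."""
    return _has_value(record.get("Actie")) or _has_value(record.get("Materiaal_Vast"))
```

Records for which this returns `False` are not automatically casual observations; the comment above only states what counts as a management action.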

However, as there are a lot of uncertainties at this point, we decided to put this PR on hold.

PietrH · 2023-11-14T10:11:30Z

I'm still expecting some information from RATO on this topic. They are going to investigate, on the data model side, whether they can come up with some sort of rule for what is an intervention and what isn't.

I understand that, for now, we will not be populating samplingProtocol with casual observation for any records, and will await input/further changes on the side of the early alert tool.

I've converted this PR to draft, as it's no longer in the queue for merging to main.

PietrH · 2023-12-06T10:35:03Z

As per #134, we are no longer planning to split records. Instead, we will use a decision tree to decide whether a record should be considered a casual observation or an intervention.

PietrH added 6 commits September 26, 2023 15:11

add chunk titles

936b593

use explicit assignment

6df7287

add chunk title

340e7f4

update authors

dcae0b3

clarify that input_data is not a stable object

d483396

I want a message if empty rows are removed

4529557

PietrH linked an issue Oct 27, 2023 that may be closed by this pull request

Split the first intervention into a casual observation and the intervention itself #77

Closed

PietrH added the enhancement and mapping labels Oct 27, 2023

PietrH added 7 commits October 27, 2023 11:45

fix typo

28987a2

Stop on error if objectid=occurrenceID's are not unique

c570b37

use correct term for pipes: |

48896b5

Extract first interventions per event to be duplicated to observations

4a583e7

create new observations from first interventions

b869ae8

add new observations as new rows to input_data

16596f4

Set samplingProtocol to casual observation for most records where `samplingProtocol` is empty

2d49ed3

PietrH self-assigned this Oct 27, 2023

PietrH added 2 commits October 27, 2023 14:35

add test to check for new samplingProtocol value

4689de8

add quotes

4051673

PietrH marked this pull request as ready for review October 30, 2023 15:14

PietrH requested a review from LienReyserhove October 30, 2023 15:15

PietrH mentioned this pull request Nov 8, 2023

Distinguish observations from management actions riparias/early-alert-webapp#6

Open

PietrH marked this pull request as draft November 14, 2023 10:08

PietrH mentioned this pull request Nov 16, 2023

Test for new values of samplingProtocol #110

Open

PietrH closed this Dec 6, 2023

PietrH deleted the 77-split-the-first-intervention-into-a-casual-observation-and-the-intervention-itself branch December 9, 2024 09:31
