What we really want is for the dataset itself to contain a stable identifier, preferably a GUID or UUID. This is the preferred option.
All the other points are what we'd need to do if this is impossible:
If we introduce a new occurrenceID, we need to archive the data as it is on GBIF and start a new dataset for the new data.
The cutoff between the two datasets is the dataset as it stood before the 7000 records were removed and recovered with a different ID.
If we split, changes to older records (before the cutoff) will not be reflected on GBIF.
If we split, there will be one repo per dataset. This repo will transition to the new dataset, which will remain updated. A second repository will be created for the old dataset as it stands, as a fork of this one. The current GBIF dataset (on the IPT) will need to be updated so it refers to the URL of this new repo (with the old data in it).
Lien will schedule a meeting with RATO to discuss the implications of this change. We'd really prefer that a stable identifier be provided instead.
If it turns out that Dossier_ID and/or Laatst_Bewerkt_Datum are not stable, all this effort will be for naught. I've been assured that if they change a record, they make a new one instead, but they have changed Dossier_ID in the past.
The occurrenceID generated from these two fields should really be a hash of the two fields rather than a plain concatenation.
Dates are notorious for causing trouble, so this will require extra care when implementing.
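To make the last two points concrete, here is a minimal sketch of how a hashed occurrenceID could be derived from Dossier_ID and Laatst_Bewerkt_Datum. Nothing here is agreed on: the function names, the `|` separator, the choice of SHA-1, and above all the date handling are assumptions for illustration. In particular, the sketch assumes Laatst_Bewerkt_Datum arrives as a datetime or an ISO 8601 string and that timestamps without a timezone are in UTC, which would need to be checked against the actual RATO export.

```python
import hashlib
from datetime import datetime, timezone


def normalise_timestamp(value):
    """Return a canonical UTC string (second precision) for a datetime or ISO 8601 string."""
    if isinstance(value, str):
        value = datetime.fromisoformat(value)
    if value.tzinfo is None:
        # Assumption: timestamps without a timezone are already in UTC.
        value = value.replace(tzinfo=timezone.utc)
    # Fixed output format, so the same moment always yields the same string.
    # Sub-second precision is dropped.
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def make_occurrence_id(dossier_id, laatst_bewerkt_datum):
    """Hash Dossier_ID and the normalised Laatst_Bewerkt_Datum into a hex string."""
    # The separator avoids ambiguity between the end of one field and the start of the other.
    key = f"{str(dossier_id).strip()}|{normalise_timestamp(laatst_bewerkt_datum)}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


# The same record gives the same ID regardless of how the date was formatted.
print(make_occurrence_id(12345, "2020-03-01T12:30:00"))
print(make_occurrence_id(12345, "2020-03-01T13:30:00+01:00"))
```

Normalising the date before hashing is the step that matters most: if the export ever changes its date format or timezone, an unnormalised concatenation or hash would silently produce new IDs for unchanged records.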
I made a mistake in the first post: we need to split off the older records anyway to maintain the old occurrenceIDs. This means we'll need to archive the current dataset in any case.
Had a call with Emiel. He agrees that a stable identifier on the database side is the preferred option. That way we would not need to mint our own as a data processor, a formula that is bound to go wrong sooner or later.
He'll discuss it with the GIS consultant (Sander), and get back to us.
Discussed with Sander and Anke, propose using Dossier_ID concatenated with Laatst_Bewerkt_Datum as occurrenceID
Discussed with @damianooldoni and @LienReyserhove.