Resolution of taxonID/scientificNameID in the matching index (xcol) #1321

Open
djtfmartin opened this issue May 17, 2024 · 0 comments
djtfmartin commented May 17, 2024

It would be useful if the matching service for extended COL (XCOL) supported the resolution of persistent identifiers for taxa, such as WoRMS IDs, IPNI LSIDs, ZooKeys and UK Species Inventory IDs.

This would give the pipelines a single service that resolves both taxonID and the more typical scientificName (and classification) to a taxon in the backbone taxonomy.
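To illustrate what "a single service" could mean for a caller, here is a minimal sketch. It assumes an in-memory identifier index (a plain dict) and a name-matching callable; `resolve_taxon`, `identifier_index` and `match_by_name` are hypothetical names, not part of the matching service. Trying identifiers before the name is also an assumption made for the example, not a decided behaviour.

```python
from typing import Callable, Dict, Optional


def resolve_taxon(
    record: Dict[str, str],
    identifier_index: Dict[str, str],
    match_by_name: Callable[[Dict[str, str]], Optional[str]],
) -> Optional[str]:
    """Resolve a record to an XCOL usage ID.

    Assumption: identifiers are tried first, then the name-based match.
    """
    for field in ("taxonID", "scientificNameID"):
        identifier = record.get(field)
        if identifier and identifier in identifier_index:
            return identifier_index[identifier]
    # No known identifier on the record: fall back to scientificName matching.
    return match_by_name(record)


# Hypothetical stand-ins for the identifier index and the name matcher.
index = {"urn:lsid:marinespecies.org:taxname:2131232": "4NPTV"}


def name_matcher(record: Dict[str, str]) -> Optional[str]:
    # Placeholder result; a real matcher would use name and classification.
    return "NAME-MATCH" if record.get("scientificName") else None


by_id = resolve_taxon(
    {"taxonID": "urn:lsid:marinespecies.org:taxname:2131232"}, index, name_matcher
)
by_name = resolve_taxon({"scientificName": "Aus bus"}, index, name_matcher)
```

Here `by_id` comes from the identifier index and `by_name` from the fallback matcher, so a pipeline would only need to call one entry point.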

Currently, the XCOL dataset in CLB only contains a subset of these identifiers linked to name usages, as it only stores the primary source for the name usage (via the verbatim_source table). To support this properly we'd need another mechanism to associate all persistent IDs with entries in XCOL.

After some discussion, a few potential approaches were identified to support the generation of an additional index (alongside the current matching-ws index) that contains all IDs and maps each one to XCOL where possible:

  • match a selected number of checklists (which are known to have persistent IDs) to COL using the CLB web services, as an additional step of the index generation, to produce a CSV map of identifiers, e.g. urn:lsid:marinespecies.org:taxname:2131232 -> 4NPTV
  • match checklists to COL during import and persist the resulting links in the CLB DB. These links can then be exported alongside name_usages as a separate file.
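The first approach could be sketched roughly as follows. This is only an illustration: `build_identifier_map`, `write_identifier_map`, the `identifier,colID` column names and the stub matcher are all hypothetical, and the stub stands in for a call to the CLB web services, whose actual API is not shown here. The names "Aus bus" and "Cus dus" and the example.org identifier are placeholders. Identifiers with no match are kept with an empty COL ID, so that the index still contains all IDs.

```python
import csv
import io
from typing import Callable, Dict, Iterable, List, Optional, Tuple


def build_identifier_map(
    records: Iterable[Dict[str, str]],
    match_to_col: Callable[[Dict[str, str]], Optional[str]],
) -> List[Tuple[str, str]]:
    """Pair each persistent identifier with a COL usage ID ("" if unmatched)."""
    rows = []
    for record in records:
        identifier = record.get("taxonID")
        if not identifier:
            continue  # nothing to index for this record
        col_id = match_to_col(record)
        rows.append((identifier, col_id or ""))
    return rows


def write_identifier_map(rows: List[Tuple[str, str]], out) -> None:
    """Write the identifier map as CSV with a header row."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["identifier", "colID"])
    writer.writerows(rows)


def stub_matcher(record: Dict[str, str]) -> Optional[str]:
    # Hypothetical stand-in for matching a name against COL.
    known = {"Aus bus": "4NPTV"}
    return known.get(record.get("scientificName", ""))


checklist = [
    {"taxonID": "urn:lsid:marinespecies.org:taxname:2131232", "scientificName": "Aus bus"},
    {"taxonID": "urn:lsid:example.org:taxname:1", "scientificName": "Cus dus"},
    {"scientificName": "Eus fus"},  # no identifier, so skipped
]

buffer = io.StringIO()
write_identifier_map(build_identifier_map(checklist, stub_matcher), buffer)
print(buffer.getvalue())
```

The second approach would produce the same kind of mapping, but from links persisted in the CLB DB at import time rather than from a separate matching pass.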

Related to: gbif/pipelines#217
In particular, the implemented flags are described in gbif/pipelines#217 (comment).

djtfmartin self-assigned this May 17, 2024
djtfmartin added 27 commits that referenced this issue between May 30, 2024 and Jun 29, 2024