ResearchStudy.condition coding inconsistencies between dbGap, AnVIL and KidsFirst #44

Open
bwalsh opened this issue Dec 1, 2021 · 7 comments

@bwalsh
Contributor

bwalsh commented Dec 1, 2021

We should use the same ontologies (or provide mappings).
All three systems use ResearchStudy.condition to indicate the condition that is the focus of the study, but all three have a significant number of studies without a populated condition.
In addition, the three systems use different ontologies:

system:
ontology system: study_count

* 'kidsfirst': 
    * None: 21,
    * 'http://snomed.info/sct': 8
* 'dbgap':
    * None: 234,
    * 'https://dbgap-api.ncbi.nlm.nih.gov/fhir/x1/NamingSystem/MeshEntryTerm': 1597,
    * 'https://uts.nlm.nih.gov/metathesaurus.html': 1459,
    * 'urn:oid:2.16.840.1.113883.6.177': 1591    
* 'anvil':
    * None: 358,
    * 'http://purl.obolibrary.org/obo/doid.owl': 35
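
For anyone who wants to reproduce a tally like the one above, here is a minimal sketch. It assumes the ResearchStudy resources from a server have already been fetched as plain JSON dicts; the function name `count_condition_systems` and the sample data are hypothetical, and the counting rule (each system counted at most once per study) is an assumption, not necessarily how the numbers above were produced.

```python
from collections import Counter


def count_condition_systems(studies):
    """Tally the coding systems used in ResearchStudy.condition.

    A study with no condition codings is counted once under None.
    A coding that lacks a 'system' is also tallied under None.
    Each distinct system is counted at most once per study.
    """
    counts = Counter()
    for study in studies:
        systems = {
            coding.get("system")
            for concept in study.get("condition", [])
            for coding in concept.get("coding", [])
        }
        if not systems:
            counts[None] += 1
        else:
            counts.update(systems)
    return counts


# Hypothetical sample resources; codes are placeholders, not real terms.
MESH = "urn:oid:2.16.840.1.113883.6.177"
UMLS = "https://uts.nlm.nih.gov/metathesaurus.html"
SNOMED = "http://snomed.info/sct"

sample = [
    {"resourceType": "ResearchStudy", "id": "a"},
    {"resourceType": "ResearchStudy", "id": "b",
     "condition": [{"coding": [{"system": SNOMED, "code": "placeholder"}]}]},
    {"resourceType": "ResearchStudy", "id": "c",
     "condition": [{"coding": [{"system": MESH, "code": "placeholder"},
                               {"system": UMLS, "code": "placeholder"}]}]},
    {"resourceType": "ResearchStudy", "id": "d",
     "condition": [{"coding": [{"system": MESH, "code": "placeholder-1"}]},
                   {"coding": [{"system": MESH, "code": "placeholder-2"}]}]},
]

counts = count_condition_systems(sample)
for system, n in counts.items():
    print(f"{system!r}: {n}")
```

Because a single study can carry codings from several systems, the per-system counts can add up to more than the number of studies, which may be what is happening in the dbgap numbers.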

@linikujp
Member

linikujp commented Dec 2, 2021

Hi Brian, the FHIR ResearchStudy condition and focus are two different things.
Some studies do not have a condition because they do not focus on a condition; instead they use normal samples as a control or reference. We explored this when looking at the NCPI data catalog.

Thanks,
Asiyah

@bwalsh
Contributor Author

bwalsh commented Dec 2, 2021

> Hi Brian, the FHIR ResearchStudy condition and focus are two different things. Some studies do not have a condition because they do not focus on a condition; instead they use normal samples as a control or reference. We explored this when looking at the NCPI data catalog.
>
> Thanks, Asiyah

@linikujp

Hi Asiyah,

Agreed that condition and focus are two different things - that may explain at least some of the studies without a Condition.

However, where there is a condition, I'm concerned that we have 5 different ontology systems.

  • What guidance would we provide to researchers searching by condition?
  • What might we do to reduce the number of ontologies used or simplify mapping between them?

@linikujp
Member

linikujp commented Dec 2, 2021

Hi Brian,
Can you list the 5 different ontology systems? I have done some work mapping DO-MONDO-MeSH for the NCPI data catalog, which is based on the dbGaP FHIR server. I also have some experience with ontology harmonization and alignment, and I am working with DO and MONDO to resolve the inconsistencies I see in the NCPI use case. If you list the 5 ontologies, we may be able to make a case for aligning all 5 for the NCPI community.
Thanks,
Asiyah

@bwalsh
Contributor Author

bwalsh commented Dec 2, 2021

> If you list the 5 ontologies out, we may make a case for aligning all these 5 ontologies for the NCPI community.

From above ...
Thanks!
-b

@linikujp
Member

linikujp commented Dec 3, 2021 via email

@bwalsh
Contributor Author

bwalsh commented Dec 3, 2021 via email

@RobertJCarroll
Contributor

It looks like that URN is MeSH again.

Based on our prior conversations, it sounded like "Given we are using these ResearchStudy objects as a way to do information retrieval, MeSH is a good target". The field is a CodeableConcept, so we can include custom free text and multiple ancillary concepts. I.e., if I'd like to flag a study with a SNOMED concept, great, but I should also include the most relevant MeSH term.
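
To make that concrete, here is a rough sketch of what such a condition entry might look like. The helper `make_condition` is hypothetical, the codes are placeholders rather than real terms, and the MeSH system URI is simply the one that appears in the dbgap tally earlier in this thread; treat it as an illustration of the idea, not an agreed convention.

```python
import json

SNOMED = "http://snomed.info/sct"
# MeSH system URI as it appears in the dbgap tally in this thread (assumption).
MESH = "urn:oid:2.16.840.1.113883.6.177"


def make_condition(text, mesh_code, extra_codings=()):
    """Build a CodeableConcept dict that always carries a MeSH coding.

    extra_codings is an iterable of (system, code) pairs for optional
    ancillary concepts, e.g. a SNOMED code.
    """
    codings = [{"system": MESH, "code": mesh_code}]
    codings.extend(
        {"system": system, "code": code} for system, code in extra_codings
    )
    return {"coding": codings, "text": text}


# Placeholder values only; real entries would use actual MeSH/SNOMED codes.
condition = make_condition(
    text="example condition (free text)",
    mesh_code="MESH-CODE-PLACEHOLDER",
    extra_codings=[(SNOMED, "SNOMED-CODE-PLACEHOLDER")],
)

research_study_fragment = {
    "resourceType": "ResearchStudy",
    "condition": [condition],
}
print(json.dumps(research_study_fragment, indent=2))
```

Putting the MeSH coding first is only a readability choice here; a client searching by condition would match on the system and code regardless of order.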
