HPO vs DOID #1

Open
Shicheng-Guo opened this issue Mar 17, 2022 · 3 comments

Comments

@Shicheng-Guo

Hi Tiffany,

I am wondering, in your experience, is either HPO or DOID much better than the other?

Thanks.

Shicheng

@callahantiff
Owner

Hi @Shicheng-Guo - This is a great question. I have used both in my work, but if I had to choose only one, it would depend on my use case. If I were trying to represent data or do some task involving diseases, I would probably use DOID, because HPO doesn't explicitly represent all diseases (it does have some disease-level concepts). If you wanted to use HPO to represent a disease, you would have to identify the set of phenotypes that are associated with it. For example, if you wanted a concept for Cystic Fibrosis, in DOID this would be DOID:1485. For HPO, Cystic Fibrosis would involve all of the concepts shown below (citation: MedGen):
[Screenshot: HPO concepts that MedGen lists for Cystic Fibrosis]
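
A minimal sketch of that difference in Python, assuming a small in-memory annotation table: in the disease-centric view a single DOID identifier stands for Cystic Fibrosis, while in the phenotype-centric view the disease is a set of HPO terms. Only `DOID:1485` comes from this thread. The `HP:PLACEHOLDER_*` identifiers are placeholders (not real HPO IDs), and `HPO_PROFILES` and `profile_coverage` are hypothetical names used for illustration.

```python
# Disease-centric view: one DOID identifier stands for the disease itself.
CYSTIC_FIBROSIS_DOID = "DOID:1485"

# Phenotype-centric view: the disease is described by a set of HPO terms.
# The identifiers below are placeholders, NOT real HPO IDs; a real profile
# would come from a disease-to-phenotype annotation source such as MedGen.
HPO_PROFILES = {
    CYSTIC_FIBROSIS_DOID: frozenset(
        {"HP:PLACEHOLDER_A", "HP:PLACEHOLDER_B", "HP:PLACEHOLDER_C"}
    ),
}


def profile_coverage(observed, disease_id, profiles=HPO_PROFILES):
    """Return the fraction of a disease's phenotype profile present in `observed`."""
    profile = profiles.get(disease_id)
    if not profile:
        raise KeyError(f"no phenotype profile for {disease_id}")
    return len(profile & set(observed)) / len(profile)
```

With DOID, a record is either tagged with the disease ID or it is not. With HPO, a record carries phenotypes, and matching it to a disease means comparing sets, for example `profile_coverage({"HP:PLACEHOLDER_A"}, "DOID:1485")`.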

If you are open to sharing some details about your use case, I would be happy to discuss more specific pros and cons. Let me know if that would be helpful!

Although you did not ask... I suggest you also take a look at Mondo (http://mondo.monarchinitiative.org/). Depending on your use case, it might be the best option. Its scope is similar to DOID (I believe that it is more comprehensive) and it contains many references to the HPO and DOID (as database cross-references). I use this ontology instead of DOID because it's incredibly comprehensive and actively maintained by the most wonderful group of people.

@Shicheng-Guo
Author

Hi Tiffany,

Thank you so much for the detailed explanation. In one project, I need to integrate several data sources (RWE, EMR, GEO, ICD, PHECODE), and they use different types of IDs to represent diseases and phenotypes (the latter are sometimes also called intermediate traits, such as BMI and blood pressure). You are right: they have different aims and therefore different characteristics and patterns. GEO (Gene Expression Omnibus) usually has only a disease name, RWE has both disease and non-disease traits (such as BMI and blood pressure), and ICD and PHECODE only indicate the disease name. I am trying to find the best coding system from the list below for integrating these sources:

PUBLIC_MESH_SC (MeSH)
PUBLIC_MESH (MeSH)
PUBLIC_MEDDRA (MedDRA)
SNOMED (Snomed)
PUBLIC_MONDO
HPO
INDICATIONBOOST (SciBite curated MeSH)
INDICATION (SciBite curated MeSH)
MDRAE (SciBite curated MedDRA)
MDRACUTEAE (SciBite curated MedDRA Acute Adverse Events branch)
DOID (SciBite curated DOID)
PUBLIC_DOID
MPATH (Pathology)
BAO
PUBLIC_CDISC_SEND
CLINPROC (Clinical Procedures)

Shicheng

@callahantiff
Owner

Hi @Shicheng-Guo. Thanks for the additional context. Given the sources that you listed, I might use both MONDO and HPO, leveraging the mappings that MONDO has created to HPO and DOID (if you only want to use one, I would use MONDO). I would then use the UMLS to help align both MeSH terms and MedDRA concepts to MONDO via mappings to HPO and UMLS CUIs. Does that help?
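
A rough sketch of how that cross-reference-based alignment could look in Python, assuming the MONDO cross-references have already been extracted into a plain dictionary. The function names `build_xref_index` and `harmonize` are hypothetical, and every identifier except `DOID:1485` is a placeholder rather than a real MONDO, UMLS, or MeSH ID. Loading the ontology and the UMLS mappings is outside the scope of this sketch.

```python
from collections import defaultdict


def build_xref_index(mondo_xrefs):
    """Invert {mondo_id: [xref, ...]} into {XREF: {mondo_id, ...}}.

    Keys are upper-cased so that lookups ignore case differences
    between sources (e.g. "doid:1485" vs "DOID:1485").
    """
    index = defaultdict(set)
    for mondo_id, xrefs in mondo_xrefs.items():
        for xref in xrefs:
            index[xref.upper()].add(mondo_id)
    return index


def harmonize(source_codes, index):
    """Map source codes to MONDO IDs; also return the codes with no mapping."""
    mapped, unmapped = {}, []
    for code in source_codes:
        targets = index.get(code.upper())
        if targets:
            mapped[code] = sorted(targets)
        else:
            unmapped.append(code)
    return mapped, unmapped


# Illustrative input only: placeholder IDs apart from DOID:1485.
example_xrefs = {
    "MONDO:EXAMPLE_CF": ["DOID:1485", "UMLS:EXAMPLE_CUI", "MESH:EXAMPLE_TERM"],
}
example_index = build_xref_index(example_xrefs)
mapped, unmapped = harmonize(["DOID:1485", "MESH:NOT_IN_TABLE"], example_index)
```

Keeping the unmapped codes as a separate list makes it easy to see which source terms still need a second pass, for instance through UMLS CUIs or HPO for non-disease traits.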
