Pathway sources should use INDRA-standard gene/protein IDs #22

Closed
bgyori opened this issue May 13, 2021 · 0 comments · Fixed by #24

bgyori commented May 13, 2021

The pathway sources use a different ID scheme than the other sources, so most (or maybe all) of their edges are dropped.

WikiPathways genes are represented as, e.g., ncbigene:11303. These should ideally be normalized to HGNC for human genes and UniProt for non-human ones. Note that even if we expect these to be linked up to other nodes via xrefs, the prefix would still have to be EGID, per the standard used in INDRA.

Reactome uses a mixture of other namespaces (e.g., chebi:28494, uniprot:A0A140T894) and has a similar issue. UniProt IDs should be converted to HGNC for human proteins and to, e.g., UP:A0A140T894 for non-human ones. The chebi prefix should probably be capitalized for it to work.
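
To make the proposed normalization concrete, here is a minimal sketch of the mapping described above. It is not the actual fix: the function name `normalize_curie` and the two lookup tables are hypothetical, and a real implementation would presumably draw the Entrez-to-HGNC and UniProt-to-HGNC mappings from an identifier resource such as the one shipped with INDRA. The sketch also assumes that an ID without an HGNC mapping is non-human, and for Entrez IDs without an HGNC mapping it only switches the prefix to EGID rather than converting to UniProt.

```python
from typing import Mapping, Tuple

# Tiny illustrative lookup tables (assumption: real tables would be loaded
# from an identifier mapping resource, not hard-coded like this).
ENTREZ_TO_HGNC = {"7157": "11998"}    # TP53
UNIPROT_TO_HGNC = {"P04637": "11998"}  # TP53


def normalize_curie(
    curie: str,
    entrez_to_hgnc: Mapping[str, str],
    uniprot_to_hgnc: Mapping[str, str],
) -> Tuple[str, str]:
    """Map a pathway-source CURIE to an INDRA-style (namespace, id) pair."""
    raw_prefix, _, identifier = curie.partition(":")
    prefix = raw_prefix.lower()

    if prefix == "ncbigene":
        # Human genes go to HGNC; anything else keeps its Entrez ID
        # but under the EGID prefix.
        hgnc_id = entrez_to_hgnc.get(identifier)
        if hgnc_id is not None:
            return ("HGNC", hgnc_id)
        return ("EGID", identifier)

    if prefix == "uniprot":
        # Human proteins go to HGNC; others keep the UniProt ID under UP.
        hgnc_id = uniprot_to_hgnc.get(identifier)
        if hgnc_id is not None:
            return ("HGNC", hgnc_id)
        return ("UP", identifier)

    if prefix == "chebi":
        # Only the prefix capitalization changes.
        return ("CHEBI", identifier)

    # Unknown namespaces are passed through untouched.
    return (raw_prefix, identifier)


for curie in ["ncbigene:7157", "ncbigene:11303", "uniprot:A0A140T894", "chebi:28494"]:
    print(curie, "->", normalize_curie(curie, ENTREZ_TO_HGNC, UNIPROT_TO_HGNC))
```

Passing the lookup tables in as arguments keeps the sketch independent of any particular mapping source, so the same function could be applied to both the WikiPathways and Reactome processors.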
