Add taxon disjoints to subset #75

Merged
merged 7 commits into from
Jun 16, 2023


anitacaron
Contributor

@anitacaron anitacaron commented Jun 15, 2023

Fixes #72

This includes
(1) (in_taxon some X) DisjointWith (in_taxon some (not X)) for every taxon X
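
To make the axiom pattern in (1) concrete, here is a minimal Python sketch that prints it in OWL functional syntax for a couple of taxa. The helper name `taxon_disjoint_axiom`, the example taxon IDs, and the use of `obo:RO_0002162` as the identifier for in_taxon are assumptions made for illustration; this is not how owltools builds the file.

```python
IN_TAXON = "obo:RO_0002162"  # assumed CURIE for the in_taxon property


def taxon_disjoint_axiom(taxon_curie: str) -> str:
    """Return (in_taxon some X) DisjointWith (in_taxon some (not X)) for one taxon X."""
    # Hypothetical CURIE-to-OBO-name conversion: NCBITaxon:9606 -> obo:NCBITaxon_9606
    taxon = "obo:" + taxon_curie.replace(":", "_")
    in_x = f"ObjectSomeValuesFrom({IN_TAXON} {taxon})"
    in_not_x = f"ObjectSomeValuesFrom({IN_TAXON} ObjectComplementOf({taxon}))"
    return f"DisjointClasses({in_x} {in_not_x})"


if __name__ == "__main__":
    # Example taxon IDs chosen only for illustration.
    for curie in ["NCBITaxon:9606", "NCBITaxon:10090"]:
        print(taxon_disjoint_axiom(curie))
```

With this pattern there is one axiom per taxon, so the output grows linearly with the number of taxa in the input.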

I could not test this because I don't have the ncbitaxon.obo file, and the pipeline's download step fails with: curl: (6) Could not resolve host: ftp.ncbi.nih.gov

@anitacaron anitacaron requested a review from matentzn June 15, 2023 18:38
@matentzn
Contributor

matentzn commented Jun 16, 2023

Awesome initiative! I am too young (imagine that) to know the exact nature of the disjoint subset, so I want to get eyes on this from someone who has all the context.

Did you test with the current release files? https://github.com/obophenotype/ncbitaxon/releases/tag/v2023-02-24

@anitacaron anitacaron requested a review from balhoff June 16, 2023 11:10
@anitacaron
Contributor Author

But we still need to fix the pipeline, right?

@anitacaron
Contributor Author

I can download the file now; maybe it was only a network issue.

@anitacaron anitacaron self-assigned this Jun 16, 2023
@anitacaron
Contributor Author

I'm running into memory issues generating the disjoint file for the complete taxonomy.

I gave owltools 16G of memory:

root@4e411ec9b54f:/work# time make ncbitaxon-disjoint-over-in-taxon.owl
OWLTOOLS_MEMORY=16G owltools ncbitaxon.owl --create-taxon-disjoint-over-in-taxon --root NCBITaxon:1 --output ncbitaxon-disjoint-over-in-taxon.owl.tmp.owl
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at owltools.cli.CommandRunner.runSingleIteration(CommandRunner.java:4779)
        at owltools.cli.CommandRunnerBase.run(CommandRunnerBase.java:76)
        at owltools.cli.CommandRunnerBase.run(CommandRunnerBase.java:68)
        at owltools.cli.CommandLineInterface.main(CommandLineInterface.java:12)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at java.base/java.util.HashMap$KeySet.iterator(HashMap.java:913)
        at java.base/java.util.HashSet.iterator(HashSet.java:173)
        at java.base/java.util.AbstractCollection.toArray(AbstractCollection.java:140)
        at java.base/java.util.ArrayList.<init>(ArrayList.java:179)
        at org.semanticweb.owlapi.util.CollectionFactory.sortOptionally(CollectionFactory.java:131)
        at uk.ac.manchester.cs.owl.owlapi.OWLNaryClassAxiomImpl.<init>(OWLNaryClassAxiomImpl.java:56)
        at uk.ac.manchester.cs.owl.owlapi.OWLDisjointClassesAxiomImpl.<init>(OWLDisjointClassesAxiomImpl.java:42)
        at uk.ac.manchester.cs.owl.owlapi.OWLDataFactoryImpl.getOWLDisjointClassesAxiom(OWLDataFactoryImpl.java:935)
        at uk.ac.manchester.cs.owl.owlapi.OWLDataFactoryImpl.getOWLDisjointClassesAxiom(OWLDataFactoryImpl.java:962)
        at uk.ac.manchester.cs.owl.owlapi.OWLDataFactoryImpl.getOWLDisjointClassesAxiom(OWLDataFactoryImpl.java:972)
        at owltools.cli.TaxonCommandRunner.createDisjoint(TaxonCommandRunner.java:307)
        at owltools.cli.TaxonCommandRunner.createTaxonDisjointOverInTaxon(TaxonCommandRunner.java:257)
        ... 8 more
make: *** [Makefile:81: ncbitaxon-disjoint-over-in-taxon.owl] Error 1

real    74m11.658s
user    538m39.829s
sys     2m19.121s
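
As a quick reading of the `time` output above, the sketch below converts the three figures to seconds and compares CPU time with wall-clock time. The helper name `to_seconds` is hypothetical. The ratio comes out at roughly 7, meaning several cores were busy on average for the whole run. One possible explanation, not confirmed anywhere in this thread, is that much of that time went to parallel garbage collection as the heap filled up.

```python
def to_seconds(stamp: str) -> float:
    """Convert a `time` value such as '74m11.658s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Values copied from the run above.
real = to_seconds("74m11.658s")
user = to_seconds("538m39.829s")
sys_time = to_seconds("2m19.121s")

# CPU-seconds consumed per second of wall-clock time.
cpu_per_wall = (user + sys_time) / real
print(f"{cpu_per_wall:.1f} CPU-seconds per wall-clock second")
```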

@matentzn
Contributor

I can imagine... Let's drop the full one for now and make a comment on the issue.

Member

@balhoff balhoff left a comment

Looks good—I think it's fine to just create for the taxslim for now. I'll make another issue to get rid of owltools; I think we can do this in less memory with Jena SPARQL.
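
To illustrate the general idea of using less memory, here is a minimal Python sketch that streams taxon IDs out of an OBO file and writes one axiom per taxon as it goes, instead of building everything in memory first. This is not the Jena SPARQL approach mentioned above and not what was merged; the function names, the `obo:RO_0002162` identifier for in_taxon, and the sample stanzas are assumptions for illustration. The output is a bare list of axioms without an ontology header, and whether this scales to the full taxonomy is untested.

```python
import io
from typing import Iterable, Iterator, TextIO

IN_TAXON = "obo:RO_0002162"  # assumed CURIE for the in_taxon property


def iter_taxon_ids(obo_lines: Iterable[str]) -> Iterator[str]:
    """Yield the id of each [Term] stanza, reading one line at a time."""
    in_term = False
    for line in obo_lines:
        line = line.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas describe taxa.
            in_term = line == "[Term]"
        elif in_term and line.startswith("id: "):
            yield line[len("id: "):]


def write_disjoints(obo_lines: Iterable[str], out: TextIO) -> int:
    """Write one axiom per taxon as it is read; return the number written."""
    count = 0
    for curie in iter_taxon_ids(obo_lines):
        taxon = "obo:" + curie.replace(":", "_")
        out.write(
            f"DisjointClasses(ObjectSomeValuesFrom({IN_TAXON} {taxon}) "
            f"ObjectSomeValuesFrom({IN_TAXON} ObjectComplementOf({taxon})))\n"
        )
        count += 1
    return count


if __name__ == "__main__":
    # Tiny in-memory stand-in for ncbitaxon.obo.
    sample = io.StringIO(
        "[Term]\nid: NCBITaxon:1\nname: root\n\n"
        "[Term]\nid: NCBITaxon:131567\nname: cellular organisms\n\n"
        "[Typedef]\nid: has_rank\n"
    )
    out = io.StringIO()
    print(write_disjoints(sample, out))
    print(out.getvalue(), end="")
```

Because each axiom is written and then discarded, memory use in this sketch depends on the longest line in the input rather than on the number of taxa.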

@anitacaron anitacaron merged commit 26bbfae into master Jun 16, 2023
@anitacaron anitacaron deleted the anitacaron/issue72 branch June 16, 2023 18:11
Development

Successfully merging this pull request may close these issues.

improve disjoints file
3 participants