Segmentation fault (core dumped). Error: crosstaxonfilterorf step died #24

Open
keishaboateng97 opened this issue Jan 8, 2024 · 0 comments


keishaboateng97 commented Jan 8, 2024

Dear @martin-steinegger,
I am trying to run Conterminator, but every run ends with a segmentation fault followed by "Error: crosstaxonfilterorf step died".
I am using my own mapping file, and I have tried both the plain and the gzipped sequence file with the same result. Do you have any idea how I can fix this?

Best regards, Anna.

Below is the log:

Tmp tmp folder does not exist or is not a directory.
Create dir tmp
dna multifasta.txt db_seqs.mapping db_seqs tmp

MMseqs Version: 570993b
Substitution matrix nucl:nucleotide.out,aa:blosum62.out
Add backtrace true
Alignment mode 3
Allow wrapped scoring false
E-value threshold 0.001
Seq. id. threshold 0.9
Min. alignment length 100
Seq. id. mode 0
Alternative alignments 0
Coverage threshold 0
Coverage mode 0
Max sequence length 1000
Compositional bias 0
Realign hits false
Max reject 2147483647
Max accept 2147483647
Include identical seq. id. false
Preload mode 0
Pseudo count a 1
Pseudo count b 1.5
Score bias 0
Gap open cost 5
Gap extension cost 2
Threads 24
Compressed 0
Verbosity 3
Seed substitution matrix nucl:nucleotide.out,aa:VTML80.out
Sensitivity 5.7
K-mer size 15
K-score 2147483647
Alphabet size 21
Split database 0
Split mode 2
Split memory limit 0
Diagonal scoring false
Exact k-mer matching 1
Mask residues 0
Mask lower case residues 0
Minimum diagonal score 25
Spaced k-mers 1
Spaced k-mer pattern
Local temporary path
Rescore mode 2
Remove hits by seq. id. and coverage false
Sort results 0
Mask profile 1
Profile e-value threshold 0.001
Use global sequence weighting false
Allow deletions false
Filter MSA 1
Maximum seq. id. threshold 0.9
Minimum seq. id. 0
Minimum score per column -20
Minimum coverage 0
Select N most diverse seqs 1000
Omit consensus false
Min codons in orf 30
Max codons in length 32734
Max orf gaps 2147483647
Contig start mode 2
Contig end mode 2
Orf start mode 1
Forward frames 1
Reverse frames 1
Translation table 1
Translate orf 0
Use all table starts false
Offset of numeric ids 0
Create lookup 0
Add orf stop false
Chain overlapping alignments 0
Merge query 1
Search type 0
Number search iterations 1
Start sensitivity 4
Search steps 1
Run a seq-profile search in slice mode false
Strand selection 2
Disk space limit 0
MPI runner
Force restart with latest tmp false
Remove temporary files true
Database type 0
Shuffle input database true
Createdb mode 0
NCBI tax dump directory
Taxonomical mapping file
Blacklisted taxa 10239,12908,28384,81077,11632,340016,61964,48479,48510
Compare across kingdoms (2||2157),4751,33208,33090,(2759&&!4751&&!33208&&!33090)

createdb multifasta.txt tmp/13966145965188563130/sequencedb

Converting sequences
[11111] 0s 847ms
Time for merging to sequencedb_h: 0h 0m 0s 130ms
Time for merging to sequencedb: 0h 0m 4s 36ms
Database type: Nucleotide
Time for merging to sequencedb.lookup: 0h 0m 0s 3ms
Time for processing: 0h 0m 8s 659ms
Tmp tmp/13966145965188563130/createtaxdb folder does not exist or is not a directory.
Create dir tmp/13966145965188563130/createtaxdb
createtaxdb tmp/13966145965188563130/sequencedb tmp/13966145965188563130/createtaxdb --tax-mapping-file db_seqs.mapping -v 3

Download taxdump.tar.gz
2024-01-08 09:28:54 URL:https://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz [64135145/64135145] -> "-" [1]
Database created
Remove temporary files
tmp/13966145965188563130/createtaxdb/createindex.sh: 58: [: Illegal number:
splitsequence tmp/13966145965188563130/sequencedb tmp/13966145965188563130/db_rev_split --max-seq-len 1000 --sequence-overlap 0 --sequence-split-mode 1 --create-lookup 0 --threads 24 --compressed 1 -v 3

Sequence split mode (--sequence-split-mode 0) and compressed (--compressed 1) can not be combined.
[=================================================================] 100.00% 11.19K 0s 51ms eta -
Time for merging to db_rev_split_h: 0h 0m 0s 332ms
Time for merging to db_rev_split: 0h 0m 0s 331ms
Time for processing: 0h 0m 1s 271ms
kmermatcher tmp/13966145965188563130/db_rev_split tmp/13966145965188563130/pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --alph-size 21 --min-seq-id 0.9 --kmer-per-seq 100 --spaced-kmer-mode 1 --kmer-per-seq-scale 0 --adjust-kmer-len 0 --mask 0 --mask-lower-case 0 --cov-mode 0 -k 24 -c 0 --max-seq-len 1000 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 24 --compressed 0 -v 3

kmermatcher tmp/13966145965188563130/db_rev_split tmp/13966145965188563130/pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --alph-size 21 --min-seq-id 0.9 --kmer-per-seq 100 --spaced-kmer-mode 1 --kmer-per-seq-scale 0 --adjust-kmer-len 0 --mask 0 --mask-lower-case 0 --cov-mode 0 -k 24 -c 0 --max-seq-len 1000 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 24 --compressed 0 -v 3

Database size: 265435 type: Nucleotide

Generate k-mers list for 1 split
[=================================================================] 100.00% 265.43K 3s 875ms

Adjusted k-mer length 24
Sort kmer 0h 0m 2s 851ms
Sort by rep. sequence 0h 0m 1s 284ms
Time for fill: 0h 0m 0s 584ms
Time for merging to pref: 0h 0m 0s 209ms
Time for processing: 0h 0m 10s 726ms
tmp/13966145965188563130/pref exists and will be overwritten.
crosstaxonfilterorf tmp/13966145965188563130/sequencedb tmp/13966145965188563130/db_rev_split_h tmp/13966145965188563130/pref tmp/13966145965188563130/pref_cross --blacklist 10239,12908,28384,81077,11632,340016,61964,48479,48510 --kingdoms (2||2157),4751,33208,33090,(2759&&!4751&&!33208&&!33090) --threads 24 -v 3

Loading NCBI taxonomy
Loading nodes file ... Done, got 2550529 nodes
Loading merged file ... Done, added 75736 merged nodes.
Loading names file ... Done
Making matrix ... Done
Init RMQ ...Done
Segmentation fault (core dumped) ] 0.00% 1 eta -
Error: crosstaxonfilterorf step died
s175562@node06:/home/projects2/keisha/data$ ^C
s175562@node06:/home/projects2/keisha/data$ gzip multifasta.txt
s175562@node06:/home/projects2/keisha/data$ ls
CHECK genuslist.txt sci_name_taxid.txt taxontemp3.txt
cleaned_scinamegenus.txt identifier.txt sequence_length_file.txt taxontemp.txt
db_seqs.mapping mapping_file.tsv speciesprofile.txt taxon.txt
downloads_ncbi meta.xml taxdump.tar.gz temp.txt
downloads_ncbi2 multifasta.txt.gz taxidonly.txt tmp
downloads_ncbi_tmp ncbi_taxid.txt tax_ids.txt WoRMS_download_2023-09-01.zip
eml.xml output_lineage.txt taxontemp1.txt
genuslist_tmp.txt sciname_genus.txt taxontemp2.txt
s175562@node06:/home/projects2/keisha/data$ /home/ctools/conterminator/conterminator dna multifasta.txt.gz db_seqs.mapping db_seqs tmp
dna multifasta.txt.gz db_seqs.mapping db_seqs tmp

MMseqs Version: 570993b
Substitution matrix nucl:nucleotide.out,aa:blosum62.out
Add backtrace true
Alignment mode 3
Allow wrapped scoring false
E-value threshold 0.001
Seq. id. threshold 0.9
Min. alignment length 100
Seq. id. mode 0
Alternative alignments 0
Coverage threshold 0
Coverage mode 0
Max sequence length 1000
Compositional bias 0
Realign hits false
Max reject 2147483647
Max accept 2147483647
Include identical seq. id. false
Preload mode 0
Pseudo count a 1
Pseudo count b 1.5
Score bias 0
Gap open cost 5
Gap extension cost 2
Threads 24
Compressed 0
Verbosity 3
Seed substitution matrix nucl:nucleotide.out,aa:VTML80.out
Sensitivity 5.7
K-mer size 15
K-score 2147483647
Alphabet size 21
Split database 0
Split mode 2
Split memory limit 0
Diagonal scoring false
Exact k-mer matching 1
Mask residues 0
Mask lower case residues 0
Minimum diagonal score 25
Spaced k-mers 1
Spaced k-mer pattern
Local temporary path
Rescore mode 2
Remove hits by seq. id. and coverage false
Sort results 0
Mask profile 1
Profile e-value threshold 0.001
Use global sequence weighting false
Allow deletions false
Filter MSA 1
Maximum seq. id. threshold 0.9
Minimum seq. id. 0
Minimum score per column -20
Minimum coverage 0
Select N most diverse seqs 1000
Omit consensus false
Min codons in orf 30
Max codons in length 32734
Max orf gaps 2147483647
Contig start mode 2
Contig end mode 2
Orf start mode 1
Forward frames 1
Reverse frames 1
Translation table 1
Translate orf 0
Use all table starts false
Offset of numeric ids 0
Create lookup 0
Add orf stop false
Chain overlapping alignments 0
Merge query 1
Search type 0
Number search iterations 1
Start sensitivity 4
Search steps 1
Run a seq-profile search in slice mode false
Strand selection 2
Disk space limit 0
MPI runner
Force restart with latest tmp false
Remove temporary files true
Database type 0
Shuffle input database true
Createdb mode 0
NCBI tax dump directory
Taxonomical mapping file
Blacklisted taxa 10239,12908,28384,81077,11632,340016,61964,48479,48510
Compare across kingdoms (2||2157),4751,33208,33090,(2759&&!4751&&!33208&&!33090)

createdb multifasta.txt.gz tmp/13401688708221171541/sequencedb

Converting sequences
[11111] 2s 414ms
Time for merging to sequencedb_h: 0h 0m 0s 138ms
Time for merging to sequencedb: 0h 0m 4s 206ms
Database type: Nucleotide
Time for merging to sequencedb.lookup: 0h 0m 0s 2ms
Time for processing: 0h 0m 10s 426ms
Tmp tmp/13401688708221171541/createtaxdb folder does not exist or is not a directory.
Create dir tmp/13401688708221171541/createtaxdb
createtaxdb tmp/13401688708221171541/sequencedb tmp/13401688708221171541/createtaxdb --tax-mapping-file db_seqs.mapping -v 3

Download taxdump.tar.gz
2024-01-08 09:47:06 URL:https://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz [64135032/64135032] -> "-" [1]
Database created
Remove temporary files
tmp/13401688708221171541/createtaxdb/createindex.sh: 58: [: Illegal number:
splitsequence tmp/13401688708221171541/sequencedb tmp/13401688708221171541/db_rev_split --max-seq-len 1000 --sequence-overlap 0 --sequence-split-mode 1 --create-lookup 0 --threads 24 --compressed 1 -v 3

Sequence split mode (--sequence-split-mode 0) and compressed (--compressed 1) can not be combined.
[=================================================================] 100.00% 11.19K 0s 54ms eta -
Time for merging to db_rev_split_h: 0h 0m 0s 333ms
Time for merging to db_rev_split: 0h 0m 0s 357ms
Time for processing: 0h 0m 1s 304ms
kmermatcher tmp/13401688708221171541/db_rev_split tmp/13401688708221171541/pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --alph-size 21 --min-seq-id 0.9 --kmer-per-seq 100 --spaced-kmer-mode 1 --kmer-per-seq-scale 0 --adjust-kmer-len 0 --mask 0 --mask-lower-case 0 --cov-mode 0 -k 24 -c 0 --max-seq-len 1000 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 24 --compressed 0 -v 3

kmermatcher tmp/13401688708221171541/db_rev_split tmp/13401688708221171541/pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --alph-size 21 --min-seq-id 0.9 --kmer-per-seq 100 --spaced-kmer-mode 1 --kmer-per-seq-scale 0 --adjust-kmer-len 0 --mask 0 --mask-lower-case 0 --cov-mode 0 -k 24 -c 0 --max-seq-len 1000 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 24 --compressed 0 -v 3

Database size: 265435 type: Nucleotide

Generate k-mers list for 1 split
[=================================================================] 100.00% 265.43K 3s 547ms

Adjusted k-mer length 24
Sort kmer 0h 0m 1s 865ms
Sort by rep. sequence 0h 0m 1s 499ms
Time for fill: 0h 0m 0s 589ms
Time for merging to pref: 0h 0m 0s 209ms
Time for processing: 0h 0m 9s 545ms
tmp/13401688708221171541/pref exists and will be overwritten.
crosstaxonfilterorf tmp/13401688708221171541/sequencedb tmp/13401688708221171541/db_rev_split_h tmp/13401688708221171541/pref tmp/13401688708221171541/pref_cross --blacklist 10239,12908,28384,81077,11632,340016,61964,48479,48510 --kingdoms (2||2157),4751,33208,33090,(2759&&!4751&&!33208&&!33090) --threads 24 -v 3

Loading NCBI taxonomy
Loading nodes file ... Done, got 2550529 nodes
Loading merged file ... Done, added 75736 merged nodes.
Loading names file ... Done
Making matrix ... Done
Init RMQ ...Done
Segmentation fault (core dumped) ] 0.00% 1 eta -
Error: crosstaxonfilterorf step died
