classify bins catches on weird taxa #33

Open
karoraw1 opened this issue Jun 26, 2018 · 3 comments

@karoraw1

Apparently there is a bit of hamster in my mapping.tax file.

An unrecoverable error occurred: std::exception

Here is some debugging information to locate the problem:
/home/johdro/projects/taxator-tk_default.git/src/fileparser.hh(52): Throw in function FileParser<FactoryType>::RecordType* FileParser<FactoryType>::next() [with FactoryType = AlignmentRecordFactory<AlignmentRecordTaxonomy>; FileParser<FactoryType>::RecordType = AlignmentRecordTaxonomy]
Dynamic exception type: boost::exception_detail::clone_impl<Exception>
std::exception::what: std::exception
[exception_tag_line*] = 1488404
[exception_tag_taxid*] = 10026
[exception_tag_general*] = bad alignment reference taxon

I fixed this by using

taxknife -f 2 --mode traverse -r species genus family order class phylum superkingdom < mapping.tax > newmapping.tax

as referenced here:
fungs/taxator-tk#51

I am using an older version of metawrap, so I am not sure whether you've already corrected for this, but I thought I would pass it along.

@ursky
Collaborator

ursky commented Jun 26, 2018

Thanks for pointing it out. I am a bit confused about your fix though - metawrap runs the command cat ${out}/binned_predictions.txt | taxknife -f 2 --mode annotate -s path | grep -v "Could not" | cut -f1,2 > ${out}/contig_taxonomy.tab. This returns the taxonomy of each contig. How would you use your solution to fix it?

Looking at the error, I wonder if it's caused by earlier versions of metawrap not installing the correct gcc/boost libraries (I had quite a few errors related to that). If you don't mind, can you install metawrap in a new conda environment (when it comes to library dependencies, updating doesn't always fix it) and see if it still gives you the error?

@karoraw1
Author

Sorry for not explaining the issue well enough. The error is produced earlier in the script by the taxator command (line 147).

The megablast results yielded a mapping.tax file that contained the taxid 10026. According to the nodes.dmp file in the NCBI taxonomy, this is a "subfamily". taxator only accepts the major taxonomic hierarchy levels listed in the taxknife command I showed above. I used taxknife to remove it and then re-ran taxator on the new mapping file, without issue.

I've been meaning to update anyway, so I will go ahead and get that started.
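
For anyone who wants to check a mapping file before running taxator, the lookup described above can be sketched in Python. This is only an illustration, not part of metaWRAP or taxator-tk: the function names `load_ranks` and `find_offending` are made up here, and the mapping file is assumed to be tab-separated with the taxid in the second column (matching the `-f 2` in the taxknife command). The nodes.dmp parsing relies on the standard NCBI dump layout, where fields are separated by `|` and the third field is the rank.

```python
# Ranks accepted by taxator according to this thread (same list as the taxknife -r argument).
MAJOR_RANKS = {"species", "genus", "family", "order", "class", "phylum", "superkingdom"}


def load_ranks(nodes_lines):
    """Build a taxid -> rank map from nodes.dmp-style lines.

    Only the first field (tax_id) and the third field (rank) are used.
    """
    ranks = {}
    for line in nodes_lines:
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 3 or not fields[0]:
            continue  # skip blank or malformed lines
        ranks[fields[0]] = fields[2]
    return ranks


def find_offending(mapping_lines, ranks, taxid_field=2):
    """Return (line number, taxid, rank) for taxids whose rank is not a major rank.

    taxid_field is 1-based, like the -f option of taxknife.
    A taxid that is missing from the rank map is reported with rank None.
    """
    offenders = []
    for lineno, line in enumerate(mapping_lines, start=1):
        cols = line.rstrip("\n").split("\t")
        if len(cols) < taxid_field:
            continue  # assumed layout not met; ignore the line
        taxid = cols[taxid_field - 1]
        rank = ranks.get(taxid)
        if rank not in MAJOR_RANKS:
            offenders.append((lineno, taxid, rank))
    return offenders


if __name__ == "__main__":
    # Toy rows; the parent column of the second row is a placeholder and is not used.
    nodes = ["562\t|\t561\t|\tspecies\t|\n", "10026\t|\t1\t|\tsubfamily\t|\n"]
    mapping = ["contig_1\t562\n", "contig_2\t10026\n"]
    for lineno, taxid, rank in find_offending(mapping, load_ranks(nodes)):
        print(f"line {lineno}: taxid {taxid} has rank {rank!r}")
```

With real data you would pass open file handles for nodes.dmp and mapping.tax instead of the toy lists. The reported line number should correspond to the `exception_tag_line` value in the taxator error, assuming that value counts lines of the mapping file.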

@ursky
Collaborator

ursky commented Jun 26, 2018

Ah, I understand now. I encountered similar issues before. Actually, I already have a dedicated script bin/metawrap-scripts/prune_blast_hits.py to sort out the "no rank", "subspecies", "varietas", "forma" from the blast hits before proceeding. I never encountered a "subfamily" before, but I guess I'll just add it to the exception list. This fix should be in metaWRAP v=0.9.3 when it rolls out!
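
As a side note on the exception-list approach, here is a small hedged sketch comparing it with an allow-list of the seven ranks named in the taxknife command. It does not reproduce prune_blast_hits.py; `DENY_RANKS` is simply the list of ranks mentioned in this comment plus "subfamily", and the function names are hypothetical. The point is only that other intermediate ranks in the NCBI taxonomy (for example "tribe" or "superfamily") would pass an exception list but not an allow-list.

```python
# Ranks named in the taxknife command earlier in this thread.
MAJOR_RANKS = frozenset(
    {"species", "genus", "family", "order", "class", "phylum", "superkingdom"}
)

# Assumption: ranks quoted in the comment above, plus the newly found "subfamily".
DENY_RANKS = frozenset({"no rank", "subspecies", "varietas", "forma", "subfamily"})


def passes_deny_list(rank):
    """Exception-list filter: keep everything that is not explicitly excluded."""
    return rank not in DENY_RANKS


def passes_allow_list(rank):
    """Allow-list filter: keep only the major ranks."""
    return rank in MAJOR_RANKS


def slips_through(ranks):
    """Ranks that the exception list keeps but the allow-list would reject."""
    return sorted(r for r in ranks if passes_deny_list(r) and not passes_allow_list(r))


if __name__ == "__main__":
    seen = ["genus", "subfamily", "tribe", "superfamily", "no rank"]
    print(slips_through(seen))
```

Whether an allow-list is appropriate depends on what taxator actually accepts in a given version, so treat this as a way to audit the ranks present in your blast hits rather than as a recommended change.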

@ursky ursky closed this as completed Jun 26, 2018
@ursky ursky reopened this Sep 14, 2018