cenote-taker2 vs (blastn nt & diamond nr) #42

NailouZhang · 2022-12-20T13:58:08Z

Hi Mike,
Recently, I ran cenote-taker2 on contigs assembled by Megahit, and also searched the same contigs with blastn against the nt database and diamond against the nr database. cenote-taker2 classified about 10000 sequences as viruses, while blast identified only about 1000. I am confused about why blast finds ten times fewer viral sequences than cenote-taker2.

As you pointed out, "Many virus genomes are integrated into host chromosomes" and "viral genes and genomes are often misidentified as host sequences" (Tisza M J, Belford A K, Dominguez-Huerta G, et al. Cenote-Taker 2 democratizes virus discovery and sequence annotation[J]. Virus evolution, 2021, 7(1): veaa100.). Thus, blast may produce some false negatives. So, is there a threshold for classifying sequences as viral or non-viral using both tools (e.g. blast p-value, percent identity, or mapping length)?

Wishing you a merry Christmas in advance!

Nailou Zhang

mtisza1 · 2023-01-11T19:07:03Z

Hi Nailou,

Thanks for your comment. It's a bit complicated to assess this without more information about how Cenote-Taker 2 was run and what settings you used with blast and diamond.

Using blastn against nt could be a great way to look for viruses present in this database and their close relatives. However, the vast majority of the viruses that exist on earth are not catalogued in nt. Recent estimates suggest that there are around 1 billion virus species on earth, while the number of virus species in nt is in the tens of thousands.

Of course, as a general statement, Cenote-Taker 2 will return false positives at some unknown rate. If you are querying contigs assembled from WGS reads and you use -db virion --lin_minimum_hallmark_genes 2 --circ_minimum_hallmark_genes 2, I would estimate that the false positive rate is about 1%, maybe less. In my opinion, it's hard to measure this meaningfully.
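
For anyone who wants to compare the two sets of calls directly, here is a minimal Python sketch. It assumes the blast results were saved in the standard 12-column tabular format (-outfmt 6) and have already been restricted to hits against viral subjects, and that the contig IDs called viral by Cenote-Taker 2 have been collected into a set by some other means. The function names and the cutoff values (e-value, percent identity, alignment length) are hypothetical placeholders for illustration, not thresholds recommended anywhere in this thread.

```python
import csv
import io

# Standard column order of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                 "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def contigs_with_hits(handle, max_evalue=1e-5, min_pident=80.0, min_length=200):
    """Return query IDs with at least one hit passing all three cutoffs.

    The default cutoffs are illustrative assumptions, not recommendations.
    """
    passing = set()
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST_COLUMNS, row))
        if (float(hit["evalue"]) <= max_evalue
                and float(hit["pident"]) >= min_pident
                and int(hit["length"]) >= min_length):
            passing.add(hit["qseqid"])
    return passing


def compare_calls(blast_ids, cenote_ids):
    """Split contig IDs into those found by both tools or by only one."""
    return {
        "both": blast_ids & cenote_ids,
        "blast_only": blast_ids - cenote_ids,
        "cenote_only": cenote_ids - blast_ids,
    }


# Made-up example rows: contig_1 passes the cutoffs, contig_2 does not.
example = (
    "contig_1\tsubjA\t95.0\t1200\t60\t0\t1\t1200\t1\t1200\t1e-50\t900\n"
    "contig_2\tsubjB\t70.0\t150\t45\t0\t1\t150\t1\t150\t1e-3\t60\n"
)
blast_ids = contigs_with_hits(io.StringIO(example))
result = compare_calls(blast_ids, {"contig_1", "contig_3"})
```

Contigs that land in "cenote_only" are not necessarily false positives. As noted above, they may simply be viruses with no close relative in nt, so this comparison shows where the tools disagree rather than which one is right.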
