processing paired reads #891

pedres · 2024-11-27T13:58:36Z

Hi,
I searched here and in the manual, but I am not sure how to proceed. The manual says that the "--paired option to kraken2 will indicate to kraken2 that the input files provided are paired read data, and data will be read from the pairs of files concurrently." So, as I understand it, to classify a sample with paired reads I have to pass the --paired flag. However, in "Metagenome analysis using the Kraken software suite", the kraken2 command in the microbiome protocol does not have the --paired flag, unlike the one in the pathogen protocol, where it does appear. Looking at the code at https://github.com/martin-steinegger/kraken-protocol, none of the kraken2 commands have the --paired flag.
So, what would be the correct approach to process a set of paired reads?
In fact, this is what I get when I run kraken2 with and then without --paired:
kraken2 --db $DATABASE --memory-mapping --threads 20 --report krak_test/VCM180_paired.k2report --paired shotgun_NOVOG/VCM180_R1.fq.gz shotgun_NOVOG/VCM180_R2.fq.gz > krak_test/VCM180.kraken2
Loading database information... done.
68391295 sequences (20505.76 Mbp) processed in 390.736s (10501.9 Kseq/m, 3148.79 Mbp/m).
205168 sequences classified (0.30%)
68186127 sequences unclassified (99.70%)

kraken2 --db $DATABASE --memory-mapping --threads 20 --report krak_test/VCM180_notpaired.k2report shotgun_NOVOG/VCM180_R1.fq.gz shotgun_NOVOG/VCM180_R2.fq.gz > krak_test/VCM180.kraken2
Loading database information... done.
136782590 sequences (20505.76 Mbp) processed in 362.249s (22655.6 Kseq/m, 3396.41 Mbp/m).
197818 sequences classified (0.14%)
136584772 sequences unclassified (99.86%)

bracken -d $DATABASE -i krak_test/VCM180_notpaired.k2report -o krak_test/VCM180_notpaired.bracken -w krak_test/VCM180_notpaired.breport -r 150 -l S
bracken -d $DATABASE -i krak_test/VCM180_paired.k2report -o krak_test/VCM180_paired.bracken -w krak_test/VCM180_paired.breport -r 150 -l S

Without the --paired flag, kraken2 processes twice as many sequences (it classifies the R1 and R2 FASTQ files separately), and this affects the counts in the k2report and the Bracken estimates.
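As a quick sanity check, the sequence counts printed in the two runs above can be compared directly. This is only a sketch using those two logged numbers; the variable names `paired` and `unpaired` are mine, for illustration.

```shell
# Sequence counts copied from the two kraken2 logs above
paired=68391295      # run with --paired
unpaired=136782590   # run without --paired

# Ratio of processed sequences (integer division)
echo $(( unpaired / paired ))        # prints 2

# Difference from exactly twice the paired count
echo $(( unpaired - 2 * paired ))    # prints 0
```

So the run without --paired processes exactly two sequences per read pair, which matches R1 and R2 being treated as independent reads.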

Thank you very much for your help
VCM180_notpaired_bracken.txt
VCM180_notpaired_k2report.txt
VCM180_paired_bracken.txt
VCM180_paired_k2report.txt

Manuel
