diff --git a/CHANGELOG.md b/CHANGELOG.md
index 043a62c7..347d0604 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -12,6 +12,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### `Fixed`
 
 - [#484](https://github.com/nf-core/taxprofiler/pull/484) Improved input validation to immediately fail if run accession IDs within a given sample ID are not unique (❤️ to @sofstam for reporting, fixed by @jfy133)
+- [#491](https://github.com/nf-core/taxprofiler/pull/491) Added flag to publish intermediate bracken files (❤️ to @ewissel for reporting, fixed by @sofstam and @jfy133)
+- [489](https://github.com/nf-core/taxprofiler/pull/489) Fix KrakenUniq classified reads output format mismatch (❤️ to @SannaAb for reporting, fixed by @jfy133)
 
 ### `Dependencies`
 
diff --git a/conf/modules.config b/conf/modules.config
index dce43d1c..2148b69f 100644
--- a/conf/modules.config
+++ b/conf/modules.config
@@ -467,7 +467,8 @@ process {
         publishDir = [
             path: { "${params.outdir}/kraken2/${meta.db_name}/" },
             mode: params.publish_dir_mode,
-            pattern: '*.{txt,fastq.gz}'
+            pattern: '*.{txt,fastq.gz}',
+            saveAs: { !params.bracken_save_intermediatekraken2 && meta.tool == "bracken" ? null : it }
         ]
     }
 
diff --git a/docs/output.md b/docs/output.md
index 8fbb22f9..732d5964 100644
--- a/docs/output.md
+++ b/docs/output.md
@@ -376,11 +376,11 @@ The main taxonomic profiling file from Bracken is the `*.tsv` file. This provide
 
 - `kraken2/`
   - `_combined_reports.txt`: A combined profile of all samples aligned to a given database (as generated by `krakentools`)
-    - If you have also run Bracken, the original Kraken report (i.e., _before_ read re-assignment) will also be included in this directory with `-bracken` suffixed to your Bracken database name. For example: `kraken2--bracken.tsv`. However in most cases you want to use the actual Bracken file (i.e., `bracken_.tsv`).
+    - If you have also run Bracken, the original Kraken report (i.e., _before_ read re-assignment) will also be included in this directory with `-bracken` suffixed to your Bracken database name if you supply `--bracken_save_intermediatekraken2` to the run. For example: `kraken2--bracken.tsv`. However in most cases you want to use the actual Bracken file (i.e., `bracken_.tsv`).
   - `/`
     - `_.classified.fastq.gz`: FASTQ file containing all reads that had a hit against a reference in the database for a given sample
     - `_.unclassified.fastq.gz`: FASTQ file containing all reads that did not have a hit in the database for a given sample
-    - `_.report.txt`: A Kraken2 report that summarises the fraction abundance, taxonomic ID, number of Kmers, taxonomic path of all the hits in the Kraken2 run for a given sample. Will be 6 column rather than 8 if `--save_minimizers` specified.
+    - `_.report.txt`: A Kraken2 report that summarises the fraction abundance, taxonomic ID, number of Kmers, taxonomic path of all the hits in the Kraken2 run for a given sample. Will be 6 column rather than 8 if `--save_minimizers` specified. This report will **only** be included if you supply `--bracken_save_intermediatekraken2` to the run.
     - `_.classifiedreads.txt`: A list of read IDs and the hits each read had against each database for a given sample
 
@@ -389,6 +389,8 @@ The main taxonomic classification file from Kraken2 is the `_combined_reports.tx
 
 You will only receive the `.fastq` and `*classifiedreads.txt` file if you supply `--kraken2_save_reads` and/or `--kraken2_save_readclassifications` parameters to the pipeline.
 
+When running Bracken, you will only get the 'intermediate' Kraken2 report files in this directory if you supply `--bracken_save_intermediatekraken2` to the run.
+
 ### KrakenUniq
 
 [KrakenUniq](https://github.com/fbreitwieser/krakenuniq) (formerly KrakenHLL) is an extension to the fast k-mer-based classification performed by [Kraken](https://github.com/DerrickWood/kraken) with an efficient algorithm for additionally assessing the coverage of unique k-mers found in each species in a dataset.
diff --git a/nextflow.config b/nextflow.config
index b8a9ba8f..1b105692 100644
--- a/nextflow.config
+++ b/nextflow.config
@@ -135,7 +135,8 @@ params {
     krakenuniq_batch_size = 20
 
     // Bracken
-    run_bracken = false
+    run_bracken = false
+    bracken_save_intermediatekraken2 = false
 
     // centrifuge
     run_centrifuge = false
diff --git a/nextflow_schema.json b/nextflow_schema.json
index 3f7d9eec..0e4185b5 100644
--- a/nextflow_schema.json
+++ b/nextflow_schema.json
@@ -466,6 +466,11 @@
                     "description": "Turn on Bracken (and the required Kraken2 prerequisite step).",
                     "fa_icon": "fas fa-toggle-on"
                 },
+                "bracken_save_intermediatekraken2": {
+                    "type": "boolean",
+                    "fa_icon": "fas fa-save",
+                    "description": "Turn on the saving of the intermediate Kraken2 files used as input to Bracken itself into Kraken2 results folder"
+                },
                 "run_malt": {
                     "type": "boolean",
                     "fa_icon": "fas fa-toggle-on",
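
Reviewer note: the new `saveAs` closure in `conf/modules.config` is the only behavioural change in this diff, so a small model of its decision logic may help when checking the docs against it. The sketch below is a plain-Python stand-in, not pipeline code. The function name `save_as` and its arguments are hypothetical; they stand for the closure's `it`, `meta.tool`, and `params.bracken_save_intermediatekraken2`. It assumes the usual Nextflow behaviour that a `saveAs` closure returning `null` means the file is not published, and that the Groovy ternary binds more loosely than `&&`.

```python
from typing import Optional


def save_as(filename: str, tool: str, save_intermediate: bool) -> Optional[str]:
    """Hypothetical model of the saveAs closure added in conf/modules.config.

    Returns None when the file should not be published (mirrors `null`),
    otherwise returns the filename unchanged (mirrors `it`).
    """
    # Groovy: (!params.bracken_save_intermediatekraken2 && meta.tool == "bracken") ? null : it
    if not save_intermediate and tool == "bracken":
        return None
    return filename


if __name__ == "__main__":
    # Example filenames are made up for illustration only.
    for tool in ("kraken2", "bracken"):
        for flag in (False, True):
            print(tool, flag, save_as("sample.report.txt", tool, flag))
```

Under this reading, the Kraken2 step's files are withheld only when the step runs on behalf of Bracken and the flag is off. Plain Kraken2 runs are unaffected. Because the closure sits next to `pattern: '*.{txt,fastq.gz}'`, it would presumably apply to every file that pattern matches, not only the report.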