Merge pull request #491 from sofstam/bracken_inter_files
Add flag to publish intermediate bracken files
jfy133 authored May 31, 2024
2 parents dc285e0 + ee31ff0 commit a6475c5
Showing 5 changed files with 15 additions and 4 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -12,6 +12,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### `Fixed`

- [#484](https://github.com/nf-core/taxprofiler/pull/484) Improved input validation to immediately fail if run accession IDs within a given sample ID are not unique (❤️ to @sofstam for reporting, fixed by @jfy133)
+- [#491](https://github.com/nf-core/taxprofiler/pull/491) Added flag to publish intermediate bracken files (❤️ to @ewissel for reporting, fixed by @sofstam and @jfy133)
+- [489](https://github.com/nf-core/taxprofiler/pull/489) Fix KrakenUniq classified reads output format mismatch (❤️ to @SannaAb for reporting, fixed by @jfy133)

### `Dependencies`

3 changes: 2 additions & 1 deletion conf/modules.config
@@ -467,7 +467,8 @@ process {
publishDir = [
path: { "${params.outdir}/kraken2/${meta.db_name}/" },
mode: params.publish_dir_mode,
-pattern: '*.{txt,fastq.gz}'
+pattern: '*.{txt,fastq.gz}',
+saveAs: { !params.bracken_save_intermediatekraken2 && meta.tool == "bracken" ? null : it }
]
}
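
The `saveAs` closure added above returns `null` (do not publish) only when the flag is off and the file belongs to a Bracken run; in every other case it returns the filename unchanged. The following is a minimal Python sketch of that decision, not the pipeline's own code: the function name `save_as` and the filename `sample1_db1.kraken2.report.txt` are hypothetical and chosen only for illustration.

```python
from typing import Optional


def save_as(filename: str, tool: str, bracken_save_intermediatekraken2: bool) -> Optional[str]:
    """Model of the Groovy closure
    { !params.bracken_save_intermediatekraken2 && meta.tool == "bracken" ? null : it }.

    Returns None when the file should not be published, else the filename.
    """
    # The ternary binds more loosely than && in Groovy, so the whole
    # "flag is off AND tool is bracken" condition selects the null branch.
    if not bracken_save_intermediatekraken2 and tool == "bracken":
        return None
    return filename


# Walk through all four flag/tool combinations (hypothetical filename).
for flag in (False, True):
    for tool in ("kraken2", "bracken"):
        result = save_as("sample1_db1.kraken2.report.txt", tool, flag)
        status = "published" if result is not None else "skipped"
        print(f"flag={flag} tool={tool}: {status}")
```

Under these assumptions, plain Kraken2 runs are never affected by the flag; only Kraken2 outputs produced as the Bracken prerequisite step are withheld by default.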

6 changes: 4 additions & 2 deletions docs/output.md
@@ -376,11 +376,11 @@ The main taxonomic profiling file from Bracken is the `*.tsv` file. This provide

- `kraken2/`
- `<db_name>_combined_reports.txt`: A combined profile of all samples aligned to a given database (as generated by `krakentools`)
-- If you have also run Bracken, the original Kraken report (i.e., _before_ read re-assignment) will also be included in this directory with `-bracken` suffixed to your Bracken database name. For example: `kraken2-<mydatabase>-bracken.tsv`. However in most cases you want to use the actual Bracken file (i.e., `bracken_<mydatabase>.tsv`).
+- If you have also run Bracken, the original Kraken report (i.e., _before_ read re-assignment) will also be included in this directory with `-bracken` suffixed to your Bracken database name if you supply `--bracken_save_intermediatekraken2` to the run. For example: `kraken2-<mydatabase>-bracken.tsv`. However in most cases you want to use the actual Bracken file (i.e., `bracken_<mydatabase>.tsv`).
- `<db_name>/`
- `<sample_id>_<db_name>.classified.fastq.gz`: FASTQ file containing all reads that had a hit against a reference in the database for a given sample
- `<sample_id>_<db_name>.unclassified.fastq.gz`: FASTQ file containing all reads that did not have a hit in the database for a given sample
-- `<sample_id>_<db_name>.<kraken2/bracken2>report.txt`: A Kraken2 report that summarises the fraction abundance, taxonomic ID, number of Kmers, taxonomic path of all the hits in the Kraken2 run for a given sample. Will be 6 column rather than 8 if `--save_minimizers` specified.
+- `<sample_id>_<db_name>.<kraken2/bracken2>report.txt`: A Kraken2 report that summarises the fraction abundance, taxonomic ID, number of Kmers, taxonomic path of all the hits in the Kraken2 run for a given sample. Will be 6 column rather than 8 if `--save_minimizers` specified. This report will **only** be included if you supply `--bracken_save_intermediatekraken2` to the run.
- `<sample_id>_<db_name>.classifiedreads.txt`: A list of read IDs and the hits each read had against each database for a given sample

</details>
@@ -389,6 +389,8 @@ The main taxonomic classification file from Kraken2 is the `_combined_reports.tx

You will only receive the `.fastq` and `*classifiedreads.txt` file if you supply `--kraken2_save_reads` and/or `--kraken2_save_readclassifications` parameters to the pipeline.

+When running Bracken, you will only get the 'intermediate' Kraken2 report files in this directory if you supply `--bracken_save_intermediatekraken2` to the run.

### KrakenUniq

[KrakenUniq](https://github.com/fbreitwieser/krakenuniq) (formerly KrakenHLL) is an extension to the fast k-mer-based classification performed by [Kraken](https://github.com/DerrickWood/kraken) with an efficient algorithm for additionally assessing the coverage of unique k-mers found in each species in a dataset.
3 changes: 2 additions & 1 deletion nextflow.config
@@ -135,7 +135,8 @@ params {
krakenuniq_batch_size = 20

// Bracken
-run_bracken = false
+run_bracken = false
+bracken_save_intermediatekraken2 = false

// centrifuge
run_centrifuge = false
5 changes: 5 additions & 0 deletions nextflow_schema.json
@@ -466,6 +466,11 @@
"description": "Turn on Bracken (and the required Kraken2 prerequisite step).",
"fa_icon": "fas fa-toggle-on"
},
+"bracken_save_intermediatekraken2": {
+"type": "boolean",
+"fa_icon": "fas fa-save",
+"description": "Turn on the saving of the intermediate Kraken2 files used as input to Bracken itself into Kraken2 results folder"
+},
"run_malt": {
"type": "boolean",
"fa_icon": "fas fa-toggle-on",