Update modules #512

Merged
32 commits
5d3ee55
Merge pull request #499 from nf-core/dev
jfy133 Jun 20, 2024
aaea700
update modules
LilyAnderssonLee Aug 2, 2024
7707280
update ganon malt motus
LilyAnderssonLee Aug 2, 2024
e263afe
update filtlong,ganon,krakentools
LilyAnderssonLee Aug 7, 2024
c5a7932
update bracken
LilyAnderssonLee Aug 8, 2024
65899de
add diamond warning for paired-end reads
LilyAnderssonLee Aug 8, 2024
833f72b
update krona,nonpareil and krakentools
LilyAnderssonLee Aug 9, 2024
7cb4f72
update CHANGELOG.md
LilyAnderssonLee Aug 9, 2024
32d4faf
update megan/rma2info
LilyAnderssonLee Aug 9, 2024
d2bc3aa
update metaphlan to the lastest
LilyAnderssonLee Aug 27, 2024
65dbe75
Merge branch 'bouncy-basenji' into update_tools
LilyAnderssonLee Aug 27, 2024
2b1afc4
Update Multiqc to the latest
LilyAnderssonLee Aug 28, 2024
05db993
Add nanoq in the multiqc report
LilyAnderssonLee Aug 30, 2024
37b275e
Fix bug where Nanopore FASTA files would be picked up into the FASTQ …
jfy133 Sep 3, 2024
aaa8715
Use Ida's GH handle
jfy133 Sep 3, 2024
6149c42
Update CHANGELOG.md
jfy133 Sep 4, 2024
811d33a
Update docs/usage.md
jfy133 Sep 4, 2024
a90456d
Publish the ganon classify log file in the output
LilyAnderssonLee Sep 4, 2024
c4d5ed7
update motus/profile
LilyAnderssonLee Sep 4, 2024
81902b3
update modules, most changes happened in nf-test
LilyAnderssonLee Sep 4, 2024
d429463
update utils_nextflow_pipeline and utils_nfcore_pipeline
LilyAnderssonLee Sep 4, 2024
5e0d556
Merge pull request #518 from nf-core/fix-nanopore-fasta
LilyAnderssonLee Sep 5, 2024
8a42385
Merge branch 'nf-core:master' into update_tools
LilyAnderssonLee Sep 5, 2024
75efdb7
merge dev to the current branch
LilyAnderssonLee Sep 5, 2024
4e2b85f
separate fasta into fasta_shrt and fasta_long
LilyAnderssonLee Sep 5, 2024
2f3ea43
Update changelog
LilyAnderssonLee Sep 5, 2024
bd66c35
add nanoq is default option to the usage
LilyAnderssonLee Sep 5, 2024
4ef2579
unique check for fasta file names
LilyAnderssonLee Sep 6, 2024
54cb683
state that the pipeline is built on nextflow to README.md
LilyAnderssonLee Sep 6, 2024
ef5db33
Update CHANGELOG.md
LilyAnderssonLee Sep 6, 2024
9a23069
Update CHANGELOG.md
LilyAnderssonLee Sep 6, 2024
df25a1b
update modules
LilyAnderssonLee Sep 7, 2024
22 changes: 22 additions & 0 deletions CHANGELOG.md
@@ -12,11 +12,33 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [#505](https://github.com/nf-core/taxprofiler/pull/505) - Add small files to the file `tower.yml` (added by @LilyAnderssonLee)
- [#508](https://github.com/nf-core/taxprofiler/pull/508) - Add `nanoq` as a filtering tool for nanopore reads (added by @LilyAnderssonLee)
- [#511](https://github.com/nf-core/taxprofiler/pull/511) - Add `porechop_abi` as an alternative adapter removal tool for long reads nanopore data (added by @LilyAnderssonLee)
- [#512](https://github.com/nf-core/taxprofiler/pull/512) - Update all tools to the latest version and include nf-test (updated by @LilyAnderssonLee & @jfy133)

### `Fixed`

- [#518](https://github.com/nf-core/taxprofiler/pull/518) - Fixed a bug where Oxford Nanopore FASTA input files would not be processed (❤️ to @ikarls for reporting, fixed by @jfy133)

### `Dependencies`

| Tool | Previous version | New version |
| ------------- | ---------------- | ----------- |
| bbmap | 39.01 | 39.06 |
| bowtie2 | 2.4.4 | 2.5.2 |
| bracken | 2.7 | 2.9 |
| cat/fastq     | 8.30             |             |
| diamond | 2.0.15 | 2.1.8 |
| ganon | 1.5.1 | 2.0.0 |
| kraken2 | 2.1.2 | 2.1.3 |
| krona | 2.8 | 2.8.1 |
| megan | 6.24.20 | 6.25.9 |
| metaphlan | 4.0.6 | 4.1.1 |
| minimap2 | 2.24 | 2.28 |
| motus/profile | 3.0.3 | 3.1.0 |
| multiqc | 1.21 | 1.24.1 |
| nanoq | | 0.10.0 |
| samtools | 1.17 | 1.20 |
| untar | 4.7 | 4.8 |

### `Deprecated`

## v1.1.8 - Augmented Akita Patch [2024-06-20]
4 changes: 4 additions & 0 deletions README.md
@@ -23,6 +23,10 @@

**nf-core/taxprofiler** is a bioinformatics best-practice analysis pipeline for taxonomic classification and profiling of shotgun short- and long-read metagenomic data. It allows for in-parallel taxonomic identification of reads or taxonomic abundance estimation with multiple classification and profiling tools against multiple databases, and produces standardised output tables for facilitating results comparison between different tools and databases.

The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from [nf-core/modules](https://github.com/nf-core/modules) in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community!

On release, automated continuous integration tests run the pipeline on a full-sized dataset on the AWS cloud infrastructure. This ensures that the pipeline runs on AWS, has sensible resource allocation defaults set to run on real-world datasets, and permits the persistent storage of results to benchmark between pipeline releases and other analysis sources. The results obtained from the full-sized test can be viewed on the [nf-core website](https://nf-co.re/taxprofiler/results).

## Pipeline summary

![](docs/images/taxprofiler_tube.png)
7 changes: 7 additions & 0 deletions assets/multiqc_config.yml
@@ -33,6 +33,8 @@ report_section_order:
    order: 200
  filtlong:
    order: 100
  nanoq:
    order: 95
  bowtie2:
    order: 90
  samtools:
@@ -68,6 +70,7 @@ run_modules:
  - prinseqplusplus
  - porechop
  - filtlong
  - nanoq
  - bowtie2
  - minimap2
  - samtools
@@ -206,6 +209,8 @@ table_columns_placement:
    Middle Split Percent: 560
  Filtlong:
    Target bases: 600
  nanoq:
    Read N50: 700
  BBDuk:
    Input reads: 800
    Total Removed bases percent: 810
@@ -305,6 +310,8 @@ table_columns_visible:
    Middle Split Percent: True
  Filtlong:
    Target bases: True
  nanoq:
    ReadN50: True
  BBDuk:
    Input reads: False
    Total Removed bases Percent: False
3 changes: 3 additions & 0 deletions assets/schema_input.json
@@ -38,18 +38,21 @@
            "type": "string",
            "format": "file-path",
            "pattern": "^\\S+\\.f(ast)?q\\.gz$",
            "unique": true,
            "errorMessage": "FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'"
        },
        "fastq_2": {
            "type": "string",
            "format": "file-path",
            "pattern": "^\\S+\\.f(ast)?q\\.gz$",
            "unique": true,
            "errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'. If not applicable, leave it empty."
        },
        "fasta": {
            "type": "string",
            "format": "file-path",
            "pattern": "^\\S+\\.(f(ast)?q|fa(sta)?)\\.gz$",
            "unique": true,
            "errorMessage": "FastA file must be provided, cannot contain spaces and must have extension '.fa.gz' or '.fasta.gz'. If not applicable, leave it empty."
        }
    },
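
To make the effect of the `pattern` and `unique` keys in this hunk concrete, here is a minimal Python sketch. It is not how the pipeline validates samplesheets; it only replays the two regular expressions from the schema (with the JSON escaping removed) and a simple duplicate check. The file names are hypothetical, and skipping empty cells in the duplicate check is an assumption made for illustration.

```python
import re

# Patterns copied from the schema hunk above (JSON escaping removed).
FASTQ_PATTERN = re.compile(r"^\S+\.f(ast)?q\.gz$")
FASTA_PATTERN = re.compile(r"^\S+\.(f(ast)?q|fa(sta)?)\.gz$")


def find_duplicates(values):
    """Return the non-empty values that occur more than once, in first-seen order."""
    seen, dupes = set(), []
    for value in values:
        if not value:
            continue  # assumption: empty cells are not compared with each other
        if value in seen and value not in dupes:
            dupes.append(value)
        seen.add(value)
    return dupes


# Hypothetical file names, for illustration only.
fastq_1_column = ["s1_R1.fastq.gz", "s2_R1.fq.gz", "s1_R1.fastq.gz"]

# Which names satisfy the FASTQ pattern, and which are repeated?
print([bool(FASTQ_PATTERN.match(name)) for name in fastq_1_column])
print(find_duplicates(fastq_1_column))
```

Note that the `fasta` pattern, as written in the hunk, also matches names ending in `.fq.gz` or `.fastq.gz`, although its error message only mentions `.fa.gz` and `.fasta.gz`.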
2 changes: 1 addition & 1 deletion conf/modules.config
@@ -813,7 +813,7 @@ process {
         publishDir = [
             path: { "${params.outdir}/ganon/${meta.db_name}/" },
             mode: params.publish_dir_mode,
-            pattern: '*.{tre,rep,lca,all,unc}'
+            pattern: '*.{tre,rep,lca,all,unc,log}'
         ]
     }
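
The one-line change to `pattern` can be illustrated with a small Python sketch that approximates the glob with `fnmatch`. This is an approximation under stated assumptions, not Nextflow's own matcher: `expand_braces` and `is_published` are hypothetical helpers, only a single non-nested brace group is handled, and the output file names are invented.

```python
import re
from fnmatch import fnmatchcase


def expand_braces(pattern):
    """Expand a single, non-nested {a,b,c} group into plain glob patterns."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[: match.start()], pattern[match.end():]
    return [head + option + tail for option in match.group(1).split(",")]


def is_published(filename, pattern):
    """True if the file name matches any expansion of the glob pattern."""
    return any(fnmatchcase(filename, p) for p in expand_braces(pattern))


OLD = "*.{tre,rep,lca,all,unc}"      # pattern before this PR
NEW = "*.{tre,rep,lca,all,unc,log}"  # pattern after this PR

# Hypothetical ganon output file names, for illustration only.
for name in ["sample1.tre", "sample1.log"]:
    print(name, is_published(name, OLD), is_published(name, NEW))
```

Under these assumptions, a `.log` file is matched only by the new pattern, which is what commit a90456d ("Publish the ganon classify log file in the output") is after.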

8 changes: 7 additions & 1 deletion docs/usage.md
@@ -97,6 +97,10 @@ While one can include both short-read and long-read data in one run, we recommen

An [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline.

:::warning
FASTA input will not go through any preprocessing steps, and will go directly to profiling.
:::
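
As a rough illustration of the warning above, the following Python sketch labels each samplesheet row by whether it would skip preprocessing. It is not pipeline code. Only the `fastq_1`, `fastq_2` and `fasta` column names come from the input schema changed in this PR; the `sample` column, the file names and the `planned_route` helper are assumptions for illustration.

```python
import csv
import io

# Hypothetical samplesheet; only the fastq_1, fastq_2 and fasta column names
# are taken from the schema in this PR, everything else is illustrative.
SAMPLESHEET = """sample,fastq_1,fastq_2,fasta
s1,s1_R1.fastq.gz,s1_R2.fastq.gz,
s2,,,s2.fasta.gz
"""


def planned_route(row):
    """Describe, per the warning above, which stages a row would pass through."""
    if row["fasta"]:
        return "profiling only"
    return "preprocessing, then profiling"


for row in csv.DictReader(io.StringIO(SAMPLESHEET)):
    print(row["sample"], "->", planned_route(row))
```

Here "preprocessing" stands for whichever preprocessing steps are enabled for the run; rows with a FASTA file bypass all of them.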

### Full database sheet

nf-core/taxprofiler supports multiple databases being classified/profiled against in parallel for each tool.
@@ -299,7 +303,7 @@ Complexity filtering is primarily a run-time optimisation step. It is not necess

There are currently three options for short-read complexity filtering: [`bbduk`](https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/bbduk-guide/), [`prinseq++`](https://github.com/Adrian-Cantu/PRINSEQ-plus-plus), and [`fastp`](https://github.com/OpenGene/fastp#low-complexity-filter).

-There are two options for long-read quality filtering: [`Filtlong`](https://github.com/rrwick/Filtlong) and [`nanoq`](https://github.com/esteinig/nanoq).
+There are two options for long-read quality filtering: [`Filtlong`](https://github.com/rrwick/Filtlong) and [`nanoq`](https://github.com/esteinig/nanoq), with `nanoq` being the default option.

The tools offer different algorithms and parameters for removing low complexity reads and quality filtering. We therefore recommend reviewing the pipeline's [parameter documentation](https://nf-co.re/taxprofiler/parameters) and the documentation of the tools (see links above) to decide on optimal methods and parameters for your dataset.

@@ -363,6 +367,8 @@ Centrifuge currently does not accept FASTA files as input, therefore no output w

##### DIAMOND

DIAMOND can only accept a single input read file. To run DIAMOND on paired-end reads, please merge the reads (e.g., using `--shortread_qc_mergepairs`).

DIAMOND only allows output of a single file format at a time. Supplying a parameter such as `--diamond_save_reads` will therefore produce only aligned reads in SAM format, and no taxonomic profiles will be available. Be aware of this when setting up your pipeline runs, depending on your particular use case.
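
The paired-end restriction described for DIAMOND can be checked before launching a run. The sketch below is a hypothetical pre-flight helper, not part of the pipeline: `samples_needing_merge` and the row dictionaries are assumptions, and `merge_pairs` simply stands for whether `--shortread_qc_mergepairs` is supplied.

```python
def samples_needing_merge(rows, merge_pairs):
    """Return sample names whose paired-end reads would reach DIAMOND unmerged."""
    if merge_pairs:
        return []  # pairs are merged upstream, so DIAMOND sees a single file
    return [row["sample"] for row in rows if row.get("fastq_2")]


# Hypothetical samplesheet rows, for illustration only.
rows = [
    {"sample": "s1", "fastq_1": "s1_R1.fq.gz", "fastq_2": "s1_R2.fq.gz"},
    {"sample": "s2", "fastq_1": "s2.fq.gz", "fastq_2": ""},
]

print(samples_needing_merge(rows, merge_pairs=False))
print(samples_needing_merge(rows, merge_pairs=True))
```

A non-empty result suggests either enabling pair merging or leaving DIAMOND out of that run.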

##### Kaiju