nf-core · LilyAnderssonLee · Sep 9, 2024 · Jun 20, 2024 · Aug 2, 2024 · Aug 2, 2024
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -12,11 +12,33 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - [#505](https://github.com/nf-core/taxprofiler/pull/505) - Add small files to the file `tower.yml` (added by @LilyAnderssonLee)
 - [#508](https://github.com/nf-core/taxprofiler/pull/508) - Add `nanoq` as a filtering tool for nanopore reads (added by @LilyAnderssonLee)
 - [#511](https://github.com/nf-core/taxprofiler/pull/511) - Add `porechop_abi` as an alternative adapter removal tool for long reads nanopore data (added by @LilyAnderssonLee)
+- [#512](https://github.com/nf-core/taxprofiler/pull/512) - Update all tools to the latest version and include nf-test (updated by @LilyAnderssonLee & @jfy133)
 
 ### `Fixed`
 
+- [#518](https://github.com/nf-core/taxprofiler/pull/518) - Fixed a bug where Oxford Nanopore FASTA input files would not be processed (❤️ to @ikarls for reporting, fixed by @jfy133)
+
 ### `Dependencies`
 
+| Tool          | Previous version | New version |
+| ------------- | ---------------- | ----------- |
+| bbmap         | 39.01            | 39.06       |
+| bowtie2       | 2.4.4            | 2.5.2       |
+| bracken       | 2.7              | 2.9         |
+| cat/fastq     | 8.30             |             |
+| diamond       | 2.0.15           | 2.1.8       |
+| ganon         | 1.5.1            | 2.0.0       |
+| kraken2       | 2.1.2            | 2.1.3       |
+| krona         | 2.8              | 2.8.1       |
+| megan         | 6.24.20          | 6.25.9      |
+| metaphlan     | 4.0.6            | 4.1.1       |
+| minimap2      | 2.24             | 2.28        |
+| motus/profile | 3.0.3            | 3.1.0       |
+| multiqc       | 1.21             | 1.24.1      |
+| nanoq         |                  | 0.10.0      |
+| samtools      | 1.17             | 1.20        |
+| untar         | 4.7              | 4.8         |
+
 ### `Deprecated`
 
 ## v1.1.8 - Augmented Akita Patch [2024-06-20]

diff --git a/README.md b/README.md
@@ -23,6 +23,10 @@
 
 **nf-core/taxprofiler** is a bioinformatics best-practice analysis pipeline for taxonomic classification and profiling of shotgun short- and long-read metagenomic data. It allows for in-parallel taxonomic identification of reads or taxonomic abundance estimation with multiple classification and profiling tools against multiple databases, and produces standardised output tables for facilitating results comparison between different tools and databases.
 
+The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from [nf-core/modules](https://github.com/nf-core/modules) in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community!
+
+On release, automated continuous integration tests run the pipeline on a full-sized dataset on the AWS cloud infrastructure. This ensures that the pipeline runs on AWS, has sensible resource allocation defaults set to run on real-world datasets, and permits the persistent storage of results to benchmark between pipeline releases and other analysis sources. The results obtained from the full-sized test can be viewed on the [nf-core website](https://nf-co.re/taxprofiler/results).
+
 ## Pipeline summary
 
 ![](docs/images/taxprofiler_tube.png)

diff --git a/assets/multiqc_config.yml b/assets/multiqc_config.yml
@@ -33,6 +33,8 @@ report_section_order:
     order: 200
   filtlong:
     order: 100
+  nanoq:
+    order: 95
   bowtie2:
     order: 90
   samtools:
@@ -68,6 +70,7 @@ run_modules:
   - prinseqplusplus
   - porechop
   - filtlong
+  - nanoq
   - bowtie2
   - minimap2
   - samtools
@@ -206,6 +209,8 @@ table_columns_placement:
     Middle Split Percent: 560
   Filtlong:
     Target bases: 600
+  nanoq:
+    Read N50: 700
   BBDuk:
     Input reads: 800
     Total Removed bases percent: 810
@@ -305,6 +310,8 @@ table_columns_visible:
     Middle Split Percent: True
   Filtlong:
     Target bases: True
+  nanoq:
+    Read N50: True
   BBDuk:
     Input reads: False
     Total Removed bases Percent: False
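
The `order` values in `report_section_order` decide where the new `nanoq` section lands in the MultiQC report. The sketch below is a minimal illustration of that placement, assuming that MultiQC shows sections with a higher `order` value first; the helper name `report_order` and the three-entry excerpt are hypothetical and only mirror the values visible in this hunk.

```python
# Excerpt of the `order` values from report_section_order after this change.
# Only the three entries visible in the hunk are reproduced here.
section_order = {"filtlong": 100, "nanoq": 95, "bowtie2": 90}


def report_order(order_by_section):
    """Return section names sorted from highest to lowest `order`.

    Assumption of this sketch: a higher `order` places a section earlier
    in the report.
    """
    return sorted(
        order_by_section,
        key=lambda name: order_by_section[name],
        reverse=True,
    )


if __name__ == "__main__":
    for position, name in enumerate(report_order(section_order), start=1):
        print(position, name)
```

Under that assumption, choosing 95 puts `nanoq` directly after `filtlong` and before `bowtie2`, which keeps the two long-read filtering tools adjacent.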

diff --git a/assets/schema_input.json b/assets/schema_input.json
@@ -38,18 +38,21 @@
                 "type": "string",
                 "format": "file-path",
                 "pattern": "^\\S+\\.f(ast)?q\\.gz$",
+                "unique": true,
                 "errorMessage": "FastQ file for reads 1 must be provided, cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'"
             },
             "fastq_2": {
                 "type": "string",
                 "format": "file-path",
                 "pattern": "^\\S+\\.f(ast)?q\\.gz$",
+                "unique": true,
                 "errorMessage": "FastQ file for reads 2 cannot contain spaces and must have extension '.fq.gz' or '.fastq.gz'. If not applicable, leave it empty."
             },
             "fasta": {
                 "type": "string",
                 "format": "file-path",
                 "pattern": "^\\S+\\.(f(ast)?q|fa(sta)?)\\.gz$",
+                "unique": true,
                 "errorMessage": "FastA file must be provided, cannot contain spaces and must have extension '.fa.gz' or '.fasta.gz'. If not applicable, leave it empty."
             }
         },
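
To illustrate what the `pattern` and the newly added `unique` constraints check, the following Python sketch applies the same regular expressions to a few hypothetical samplesheet rows. The function name `check_rows`, the row layout, and the choice to exempt empty cells from the uniqueness check are assumptions made for illustration; the pipeline relies on its own schema validation, not on this code.

```python
import re
from collections import Counter

# Patterns copied from assets/schema_input.json (JSON string escaping removed).
PATTERNS = {
    "fastq_1": re.compile(r"^\S+\.f(ast)?q\.gz$"),
    "fastq_2": re.compile(r"^\S+\.f(ast)?q\.gz$"),
    "fasta": re.compile(r"^\S+\.(f(ast)?q|fa(sta)?)\.gz$"),
}


def check_rows(rows):
    """Return a list of human-readable problems found in samplesheet rows.

    `rows` is a list of dicts keyed by column name (hypothetical layout).
    Empty cells are skipped, which is an assumption of this sketch.
    """
    problems = []
    for column, pattern in PATTERNS.items():
        filled = [row.get(column, "") for row in rows]
        filled = [value for value in filled if value]
        # Pattern check: no whitespace and an accepted file extension.
        for value in filled:
            if not pattern.match(value):
                problems.append(f"{column}: '{value}' does not match pattern")
        # Uniqueness check: a path must not appear twice in the same column.
        for value, count in Counter(filled).items():
            if count > 1:
                problems.append(f"{column}: '{value}' appears {count} times")
    return problems


if __name__ == "__main__":
    example_rows = [
        {"fastq_1": "a_R1.fastq.gz", "fastq_2": "a_R2.fq.gz", "fasta": ""},
        {"fastq_1": "a_R1.fastq.gz", "fastq_2": "", "fasta": ""},
        {"fastq_1": "", "fastq_2": "", "fasta": "my genome.fa.gz"},
    ]
    for problem in check_rows(example_rows):
        print(problem)
```

In this example the second row reuses the `fastq_1` path of the first row, and the third row has a space in its `fasta` path, so both are reported.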

diff --git a/conf/modules.config b/conf/modules.config
@@ -813,7 +813,7 @@ process {
         publishDir = [
             path: { "${params.outdir}/ganon/${meta.db_name}/" },
             mode: params.publish_dir_mode,
-            pattern: '*.{tre,rep,lca,all,unc}'
+            pattern: '*.{tre,rep,lca,all,unc,log}'
         ]
     }
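
The effect of adding `log` to the `publishDir` pattern can be sketched in Python. Nextflow performs its own glob matching, so this is only an approximation: the helpers `expand_braces` and `is_published` and the file names are hypothetical, and the brace expansion handles non-nested groups only.

```python
import re
from fnmatch import fnmatchcase

OLD_PATTERN = "*.{tre,rep,lca,all,unc}"
NEW_PATTERN = "*.{tre,rep,lca,all,unc,log}"


def expand_braces(pattern):
    """Expand non-nested {a,b,c} groups into a list of plain glob patterns."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[: match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        # Recurse so that any later brace group in `tail` is expanded too.
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def is_published(filename, pattern):
    """Approximate whether `filename` would be selected by `pattern`."""
    return any(fnmatchcase(filename, glob) for glob in expand_braces(pattern))


if __name__ == "__main__":
    # Hypothetical output file names, for illustration only.
    for name in ["sample1.tre", "sample1.log", "sample1.txt"]:
        print(name, is_published(name, OLD_PATTERN), is_published(name, NEW_PATTERN))
```

With the old pattern a `.log` file is left in the work directory; with the new pattern it is selected for publishing alongside the other ganon outputs.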
 

diff --git a/docs/usage.md b/docs/usage.md
@@ -97,6 +97,10 @@ While one can include both short-read and long-read data in one run, we recommen
 
 An [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline.
 
+:::warning
+FASTA input will not go through any preprocessing steps, and will go directly to profiling.
+:::
+
 ### Full database sheet
 
 nf-core/taxprofiler supports multiple databases being classified/profiled against in parallel for each tool.
@@ -299,7 +303,7 @@ Complexity filtering is primarily a run-time optimisation step. It is not necess
 
 There are currently three options for short-read complexity filtering: [`bbduk`](https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/bbduk-guide/), [`prinseq++`](https://github.com/Adrian-Cantu/PRINSEQ-plus-plus), and [`fastp`](https://github.com/OpenGene/fastp#low-complexity-filter).
 
-There are two options for long-read quality filtering: [`Filtlong`](https://github.com/rrwick/Filtlong) and [`nanoq`](https://github.com/esteinig/nanoq).
+There are two options for long-read quality filtering: [`Filtlong`](https://github.com/rrwick/Filtlong) and [`nanoq`](https://github.com/esteinig/nanoq), with `nanoq` being the default option.
 
 The tools offer different algorithms and parameters for removing low complexity reads and quality filtering. We therefore recommend reviewing the pipeline's [parameter documentation](https://nf-co.re/taxprofiler/parameters) and the documentation of the tools (see links above) to decide on optimal methods and parameters for your dataset.
 
@@ -363,6 +367,8 @@ Centrifuge currently does not accept FASTA files as input, therefore no output w
 
 ##### DIAMOND
 
+DIAMOND can only accept a single input read file. To run DIAMOND on paired-end reads, please merge the reads (e.g., using `--shortread_qc_mergepairs`).
+
 DIAMOND only allows output of a single file format at a time. Supplying parameters such as `--diamond_save_reads` will therefore result in only aligned reads in SAM format being produced, and no taxonomic profiles will be available. Be aware of this when setting up your pipeline runs, depending on your particular use case.
 
 ##### Kaiju