Adding plasflow module #3555

Closed
wants to merge 84 commits into from
Changes from 55 commits
13c52e9
New module for the cram-size command of samtools.
limrp Jun 24, 2023
9d24d79
"Adding module for cram-size command of the samtools tool"
limrp Jun 24, 2023
d64477f
Merge branch 'master' into master
limrp Jun 29, 2023
65b3a9a
Update modules/nf-core/samtools/cramsize/main.nf
limrp Jul 6, 2023
99c3260
Update modules/nf-core/samtools/cramsize/main.nf
limrp Jul 6, 2023
20c1d63
Update modules/nf-core/samtools/cramsize/main.nf
limrp Jul 6, 2023
43fa886
Stub added to the main.nf script.
limrp Jul 8, 2023
dda123d
Adding generated test.yml file.
limrp Jul 8, 2023
62af573
Merge branch 'master' of https://github.com/nf-core/modules
limrp Jul 8, 2023
7506ff9
Updating my master branch with the changes made in cramsize-branch:
limrp Jul 8, 2023
e80fa18
Adding pytest.
limrp Jul 9, 2023
afda96d
Adding a new nf-core module called plasflow.
limrp Jul 10, 2023
7b0714f
Merging branch with new module plasflow
limrp Jul 10, 2023
e6aebc5
Update manta modules (#3615)
maxulysse Jul 10, 2023
21d77d8
add varlociraptor call subcommand (#3529)
FriederikeHanssen Jul 11, 2023
9f28ebe
add new module iCount-mini/metagene (#3612)
CharlotteAnne Jul 12, 2023
9a7a2ec
Newmodule ngmerge (#3626)
CharlotteAnne Jul 12, 2023
c076b4d
NEW MODULE: PURECLIP (#3624)
CharlotteAnne Jul 12, 2023
c69b9ca
fix integer overflow (#3620)
sstrong99 Jul 12, 2023
c7e83fa
Fix metaeuk_easypredict when using mmseqs database (#3525)
prototaxites Jul 12, 2023
40a29ea
3630 add unstitched reads as emitted output of ngmerge (#3632)
CharlotteAnne Jul 12, 2023
b971c6a
Update vardictjava (#3633)
nvnieuwk Jul 13, 2023
10085b1
Adding modules from #scrnaseq for simpleaf (#3619)
Jul 13, 2023
1e42df3
update icount mini version (#3640)
CharlotteAnne Jul 14, 2023
4fcf729
Bump taxpasta version (biocontainer not yet available) (#3639)
jfy133 Jul 14, 2023
8f61632
Purecn/run (#3140)
aldosr Jul 14, 2023
4592ece
new module: samtools/import (#3642)
matthdsm Jul 14, 2023
e335edf
Adding seqkit sliding command (#3637)
DLBPointon Jul 17, 2023
60c0ee5
Hicexplorer hicpca (#2933)
jianhong Jul 17, 2023
ad367dc
Fix bug in pyrodigal module (#3643)
jasmezz Jul 19, 2023
76e88d8
Update gecco/run (#3652)
jasmezz Jul 19, 2023
9134e17
New module: instrain/compare (#3623)
CarsonJM Jul 19, 2023
5f017ea
Update LAST modules to v1453 (#3303)
johannesnicolaus Jul 20, 2023
4b0d05a
Hisat2-build: index construction w/ truly optional inputs (#3649)
JackCurragh Jul 20, 2023
fb4126c
update metaphlan module (#3661)
LilyAnderssonLee Jul 21, 2023
35a1d9d
CNVKit 0.9.10 update (#3651)
adamrtalbot Jul 21, 2023
664e3c4
Update gatk3 modules (#3660)
TCLamnidis Jul 21, 2023
2347ffa
Update tiddit to v3.6.1 (#3663)
asp8200 Jul 23, 2023
5281289
Update stubs for samtools flagstat (#3657)
ramsainanduri Jul 24, 2023
291f9bf
R differential modules control over output prefixes (#3664)
Jul 25, 2023
5033ada
Update stubs for samtools idxstats (#3665)
ramsainanduri Jul 25, 2023
adb910a
Bump ensembl-vep modules to 110.0 (#3658)
maxulysse Jul 25, 2023
14ed454
Remove containers with cache with snpEff too (#3670)
maxulysse Jul 25, 2023
aae7de5
Update stubs for bwa mem (#3671)
ramsainanduri Jul 26, 2023
e84144d
Additional optional output to homer/annotatepeaks (#3667)
lnblum Jul 26, 2023
879be43
Hisat2 align splicesitesoptional (#3656)
JackCurragh Jul 27, 2023
674c31a
Fix invalid GATK container (#3673)
adamrtalbot Jul 27, 2023
c381236
Update: bamtools, gstama, ultra (#3655)
sguizard Jul 27, 2023
0f30afb
Fix centrifuge: reduce number of input channels (#3674)
jfy133 Jul 27, 2023
e950f93
Remove unused ublast module
limrp Jul 28, 2023
a34398f
Checking tests that were failing from the samtools/cramsize module.
limrp Jul 28, 2023
926f0c5
Writing some errors.
limrp Jul 28, 2023
ca0df04
Checking failed test of the samtools/cram-size module.
limrp Jul 28, 2023
4216797
Merging remote-tracking branch 'upstream/master'
limrp Jul 28, 2023
d9d1cb7
Merging branch 'cramsize' with up-to-date master branch
limrp Jul 28, 2023
d8bd46c
Checking failed test on yml file.
limrp Jul 29, 2023
8d6df3c
Fixing test.yml file.
limrp Jul 29, 2023
0fa899e
Merge remote-tracking branch 'upstream/master'
limrp Jul 29, 2023
1cc38e5
Moving test.yml from plasflow module.
limrp Jul 29, 2023
573f18c
Fixing test.yml from plasflow module.
limrp Jul 29, 2023
e52e586
Removing trailing whitespaces from files.
limrp Jul 29, 2023
9f6c9b5
Fixing test.yml from plasflow module.
limrp Jul 29, 2023
fe506d2
Removing Plasflow from conda. It doesn't work.
limrp Jul 30, 2023
2605c97
Running prettier on test.yml from plasflow.
limrp Jul 30, 2023
1eaa0fb
Details of the conda environment.
limrp Jul 31, 2023
3e45482
Merge remote-tracking branch 'upstream/master'
limrp Jul 31, 2023
d73b1c9
Merge branch 'pf' with latest change about the conda environment for
limrp Jul 31, 2023
99edf70
Adding specific Tensorflow version for Plasflow.
limrp Aug 1, 2023
b9199b7
Merge branch 'pf' with Tensorflow version in conda environment.
limrp Aug 1, 2023
eeb8cba
Adding channel for python 3.5
limrp Aug 3, 2023
beb5d82
Merge remote-tracking branch 'upstream/master'
limrp Aug 3, 2023
5d1f7d1
Merge branch 'pf' with channel for python 3.5 specified
limrp Aug 3, 2023
31d0b5d
Fixing the channel for python 3.5 for plasflow module.
limrp Aug 3, 2023
05337f4
Merge branch 'pf'
limrp Aug 3, 2023
78e7c83
Changing the channel for python 3.5 (to conda-forge) for plasflow mod…
limrp Aug 3, 2023
5f8751a
Merge branch 'pf'
limrp Aug 3, 2023
bdb7df9
Merge branch 'master' into master
SPPearce Apr 30, 2024
dedbeac
Merge branch 'master' into master
SPPearce May 10, 2024
4407fb4
Update cramsize
SPPearce May 10, 2024
e630347
Excluse cramsize from pytest
SPPearce May 10, 2024
be0e4c4
Merge branch 'master' into master
SPPearce Jun 4, 2024
39f9030
Delete modules/nf-core/samtools/cramsize_old/main.nf
SPPearce Jun 4, 2024
38347b0
Delete modules/nf-core/samtools/cramsize_old/meta.yml
SPPearce Jun 4, 2024
3459a65
Update environment.yml
SPPearce Jun 4, 2024
76 changes: 76 additions & 0 deletions modules/nf-core/plasflow/main.nf
@@ -0,0 +1,76 @@
process PLASFLOW {
    tag "$meta.id"
    label 'process_medium'

    conda "bioconda::plasflow=1.1.0"
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/plasflow:1.1.0--py35_0':
        'biocontainers/plasflow:1.1.0--py35_0' }"

    input:
    tuple val(meta), path(assembly)

    output:
    tuple val(meta), path("*.tsv")                  , emit: tsv
    tuple val(meta), path("*_chromosomes.fasta.gz") , emit: chromosomes
    tuple val(meta), path("*_plasmids.fasta.gz")    , emit: plasmids
    tuple val(meta), path("*_unclassified.fasta.gz"), emit: unclassified
    path "versions.yml"                             , emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    def args = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "${meta.id}"
    def VERSION = '1.1' // WARN: Version information not provided by tool on CLI. Please update this string when bumping container versions.
    """
    if [[ "$assembly" == *.gz ]]; then
        gunzip -c $assembly > ${prefix}.fasta
        PlasFlow.py \\
            $args \\
            --input ${prefix}.fasta \\
            --output ${prefix}.tsv
    else
        PlasFlow.py \\
            $args \\
            --input $assembly \\
            --output ${prefix}.tsv
    fi

    if [ -f ${prefix}.tsv_chromosomes.fasta ]; then
        mv ${prefix}.tsv_chromosomes.fasta ${prefix}_chromosomes.fasta
        gzip -n ${prefix}_chromosomes.fasta
    fi

    if [ -f ${prefix}.tsv_plasmids.fasta ]; then
        mv ${prefix}.tsv_plasmids.fasta ${prefix}_plasmids.fasta
        gzip -n ${prefix}_plasmids.fasta
    fi

    if [ -f ${prefix}.tsv_unclassified.fasta ]; then
        mv ${prefix}.tsv_unclassified.fasta ${prefix}_unclassified.fasta
        gzip -n ${prefix}_unclassified.fasta
    fi

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        PlasFlow: $VERSION
    END_VERSIONS
    """

    stub:
    def prefix = task.ext.prefix ?: "${meta.id}"
    def VERSION = '1.1' // Must be defined here as well: the stub block does not see variables from the script block.
    """
    touch ${prefix}.tsv
    touch ${prefix}_chromosomes.fasta.gz
    touch ${prefix}_plasmids.fasta.gz
    touch ${prefix}_unclassified.fasta.gz

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        PlasFlow: $VERSION
    END_VERSIONS
    """
}
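The three rename-and-compress stanzas in the script block follow one pattern, so they could arguably be written as a single loop. The sketch below is only an illustration of that pattern outside Nextflow: the scratch directory, the `sample` prefix and the dummy FASTA records are hypothetical, and PlasFlow itself is not run.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical scratch directory and prefix, used only for this illustration.
workdir=$(mktemp -d)
cd "$workdir"
prefix="sample"

# Simulate a run in which only two of the three FASTA files were written.
printf '>contig_1\nACGT\n' > "${prefix}.tsv_chromosomes.fasta"
printf '>contig_2\nTTGA\n' > "${prefix}.tsv_plasmids.fasta"

# Same rename-then-compress step as the module, expressed once per class.
for class in chromosomes plasmids unclassified; do
    raw="${prefix}.tsv_${class}.fasta"
    if [ -f "$raw" ]; then
        mv "$raw" "${prefix}_${class}.fasta"
        gzip -n "${prefix}_${class}.fasta"
    fi
done

ls
```

In this simulated run no `sample_unclassified.fasta.gz` is produced. Inside the module, the three FASTA outputs are declared without `optional: true`, so a run that skips one class would likely still fail at output collection; whether that can happen with real PlasFlow output is not something this sketch establishes.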
59 changes: 59 additions & 0 deletions modules/nf-core/plasflow/meta.yml
@@ -0,0 +1,59 @@
name: "plasflow"
description: Uses PlasFlow for prediction of plasmid sequences in metagenomic contigs.
keywords:
  - plasmid
  - chromosomes
  - metagenomes
  - contigs
tools:
  - "PlasFlow":
      description: |
        PlasFlow is a set of scripts used for prediction of plasmid sequences in metagenomic contigs.
        It relies on neural network models trained on full genome and plasmid sequences and is able
        to differentiate between plasmids and chromosomes with accuracy reaching 96%. It outperforms
        other available solutions for plasmid recovery from metagenomes and incorporates thresholding,
        which allows for the exclusion of uncertain predictions.
      homepage: https://github.com/smaegol/PlasFlow
      documentation: https://github.com/smaegol/PlasFlow
      tool_dev_url: https://github.com/smaegol/PlasFlow
      doi: 10.1093/nar/gkx1321
      licence: ["GPL v3"]
input:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. `[ id:'test', single_end:false ]`
  - assembly:
      type: file
      description: FASTA file of assembled contigs, optionally gzip-compressed
      pattern: "*.{gz,fasta,fa,fna}"
output:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. `[ id:'test', single_end:false ]`
  - tsv:
      type: file
      description: TSV file containing the classification of each sequence
      pattern: "*.tsv"
  - chromosomes:
      type: file
      description: Gzipped FASTA file containing chromosome sequences
      pattern: "*_chromosomes.fasta.gz"
  - plasmids:
      type: file
      description: Gzipped FASTA file containing plasmid sequences
      pattern: "*_plasmids.fasta.gz"
  - unclassified:
      type: file
      description: Gzipped FASTA file containing unclassified sequences
      pattern: "*_unclassified.fasta.gz"
  - versions:
      type: file
      description: File containing software versions
      pattern: "versions.yml"

authors:
  - "@limrp"
46 changes: 46 additions & 0 deletions modules/nf-core/samtools/cramsize/main.nf
@@ -0,0 +1,46 @@
process SAMTOOLS_CRAMSIZE {
    tag "$meta.id"
    label 'process_medium'

    conda "bioconda::samtools=1.17"
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/samtools:1.17--h00cdaf9_0':
        'biocontainers/samtools:1.17--h00cdaf9_0' }"

    input:
    tuple val(meta), path(cram)

    output:
    tuple val(meta), path("*.size"), emit: size
    path "versions.yml"            , emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    def args = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "$meta.id"
    """
    samtools \\
        cram-size \\
        $args \\
        -o ${prefix}.size \\
        $cram

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
    END_VERSIONS
    """

    stub:
    def prefix = task.ext.prefix ?: "$meta.id"
    """
    touch ${prefix}.size

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
    END_VERSIONS
    """
}
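The `versions.yml` heredoc relies on a short `sed` pipeline to extract the version number. The sketch below runs the same two substitutions on a fixed string instead of calling samtools; the `banner` value is a hypothetical stand-in for the flattened `samtools --version` output, not captured from a real run.

```shell
# Hypothetical stand-in for `echo $(samtools --version 2>&1)`, which
# collapses the multi-line banner onto a single line.
banner='samtools 1.17 Using htslib 1.17 Copyright (C) 2023 Genome Research Ltd.'

# Same two substitutions as the module, without the Nextflow `\$` escaping:
#   1. drop everything up to and including "samtools "
#   2. drop everything from "Using" to the end of the line
version=$(echo "$banner" | sed 's/^.*samtools //; s/Using.*$//')

echo "samtools: ${version}"
```

The space that preceded `Using` survives the second substitution, so the extracted value carries a trailing space.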
SPPearce marked this conversation as resolved.
42 changes: 42 additions & 0 deletions modules/nf-core/samtools/cramsize/meta.yml
@@ -0,0 +1,42 @@
name: samtools_cramsize
description: List CRAM Content-ID and Data-Series sizes
keywords:
  - cram-size
  - cram
  - size
tools:
  - samtools:
      description: |
        SAMtools is a set of utilities for interacting with and post-processing
        short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng Li.
        These files are generated as output by short read aligners like BWA.
      homepage: http://www.htslib.org/
      documentation: http://www.htslib.org/doc/samtools.html
      doi: 10.1093/bioinformatics/btp352
      licence: ["MIT"]
input:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - cram:
      type: file
      description: CRAM file
      pattern: "*.cram"
output:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - size:
      type: file
      description: Size information file
      pattern: "*.size"
  - versions:
      type: file
      description: File containing software versions
      pattern: "versions.yml"
authors:
  - "@limrp"
12 changes: 12 additions & 0 deletions tests/config/pytest_modules.yml
@@ -2799,6 +2799,10 @@ pirate:
  - modules/nf-core/pirate/**
  - tests/modules/nf-core/pirate/**

plasflow:
  - modules/nf-core/plasflow/**
  - tests/modules/nf-core/plasflow/**

plasmidfinder:
  - modules/nf-core/plasmidfinder/**
  - tests/modules/nf-core/plasmidfinder/**
@@ -3119,6 +3123,10 @@ samtools/coverage:
  - modules/nf-core/samtools/coverage/**
  - tests/modules/nf-core/samtools/coverage/**

samtools/cramsize:
  - modules/nf-core/samtools/cramsize/**
  - tests/modules/nf-core/samtools/cramsize/**

samtools/depth:
  - modules/nf-core/samtools/depth/**
  - tests/modules/nf-core/samtools/depth/**
@@ -3868,6 +3876,10 @@ trinity:
  - modules/nf-core/trinity/**
  - tests/modules/nf-core/trinity/**

ublast:
  - modules/nf-core/ublast/**
  - tests/modules/nf-core/ublast/**

ucsc/bedclip:
  - modules/nf-core/ucsc/bedclip/**
  - tests/modules/nf-core/ucsc/bedclip/**
15 changes: 15 additions & 0 deletions tests/modules/nf-core/plasflow/main.nf
@@ -0,0 +1,15 @@
#!/usr/bin/env nextflow

nextflow.enable.dsl = 2

include { PLASFLOW } from '../../../../modules/nf-core/plasflow/main.nf'

workflow test_plasflow {

    input = [
        [ id:'test', single_end:false ], // meta map
        file(params.test_data['bacteroides_fragilis']['illumina']['test1_contigs_fa_gz'], checkIfExists: true)
    ]

    PLASFLOW ( input )
}
5 changes: 5 additions & 0 deletions tests/modules/nf-core/plasflow/nextflow.config
@@ -0,0 +1,5 @@
process {

    publishDir = { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }

}
19 changes: 19 additions & 0 deletions tests/modules/nf-core/plasflow/test.yml
@@ -0,0 +1,19 @@
- name: test_plasflow
  command: nextflow run ./tests/modules/nf-core/plasflow -entry test_plasflow -c ./tests/config/nextflow.config -c ./tests/modules/nf-core/plasflow/nextflow.config
  tags:
    - metagenome
    - chromosome
    - plasmid
  files:
    - path: output/plasflow/test.tsv
      md5sum: a7c0ee75bca40f7ae5000a120d3f2099
    - path: output/plasflow/test_chromosomes.fasta.gz
      contains:
        - Fasta file containing chromosome sequences
    - path: output/plasflow/test_plasmids.fasta.gz
      contains:
        - Fasta file containing plasmid sequences
    - path: output/plasflow/test_unclassified.fasta.gz
      contains:
        - Fasta file containing unclassified sequences
    - path: output/plasflow/versions.yml
15 changes: 15 additions & 0 deletions tests/modules/nf-core/samtools/cramsize/main.nf
@@ -0,0 +1,15 @@
#!/usr/bin/env nextflow

nextflow.enable.dsl = 2

include { SAMTOOLS_CRAMSIZE } from '../../../../../modules/nf-core/samtools/cramsize/main.nf'

workflow test_samtools_cramsize {

    input = [
        [ id:'test', single_end:false ], // meta map
        file(params.test_data['homo_sapiens']['illumina']['test_paired_end_sorted_cram'], checkIfExists: true)
    ]

    SAMTOOLS_CRAMSIZE ( input )
}
5 changes: 5 additions & 0 deletions tests/modules/nf-core/samtools/cramsize/nextflow.config
@@ -0,0 +1,5 @@
process {

    publishDir = { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }

}
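The `publishDir` closure derives the output subdirectory from the process name, which is why the cramsize results are expected under `output/samtools/` rather than `output/samtools/cramsize/`. As a rough shell analogue of that Groovy expression (not the code Nextflow runs), assuming a hypothetical fully qualified process name of `test_samtools_cramsize:SAMTOOLS_CRAMSIZE`:

```shell
# Hypothetical value of task.process for the test workflow.
process_name='test_samtools_cramsize:SAMTOOLS_CRAMSIZE'

# tokenize(':')[-1]  -> keep only the text after the last colon
last_component=${process_name##*:}

# tokenize('_')[0]   -> keep only the text before the first underscore
first_token=${last_component%%_*}

# toLowerCase()
publish_subdir=$(printf '%s' "$first_token" | tr '[:upper:]' '[:lower:]')

echo "output/${publish_subdir}"
```

The same reasoning applied to a process named `PLASFLOW` gives `plasflow`, consistent with the `output/plasflow/` paths in that module's `test.yml`.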
8 changes: 8 additions & 0 deletions tests/modules/nf-core/samtools/cramsize/test.yml
@@ -0,0 +1,8 @@
- name: samtools cramsize test_samtools_cramsize
  command: nextflow run ./tests/modules/nf-core/samtools/cramsize -entry test_samtools_cramsize -c ./tests/config/nextflow.config -c ./tests/modules/nf-core/samtools/cramsize/nextflow.config
  tags:
    - samtools/cramsize
  files:
    - path: output/samtools/test.size
      md5sum: aff113286a3368bcb9c5b708bdf5f777
    - path: output/samtools/versions.yml