Adding single-read functionality to PROFILE #84

Merged
63 commits merged on Jan 6, 2025
Changes shown from 57 of 63 commits

Commits
c4b1f62
Updated subsetReads to optionally take in single reads. Added PROFILE…
simonleandergrimm Oct 28, 2024
193dd22
Added BBDUK single read script.
simonleandergrimm Oct 28, 2024
0e081f2
Added subsetting for single reads/
simonleandergrimm Oct 28, 2024
4e140aa
Added selection of single or paired end reads.
simonleandergrimm Oct 28, 2024
d270d3f
WIP edits to taxonomy main scripts (commenting, subselecting processe…
simonleandergrimm Oct 28, 2024
bad2787
Adding kraken path to run_dev_se.nf
simonleandergrimm Oct 28, 2024
79f9222
Adding WIP changes to test/nextflow.config to run on Simon's S3.
simonleandergrimm Oct 28, 2024
50adde1
fixed wrong indent in run_dev_se.nf
simonleandergrimm Nov 7, 2024
daa6c3e
Slight edit to inline commments in taxonomy/main.nf. Reverted emit st…
simonleandergrimm Nov 7, 2024
39c797c
Revert "Adding WIP changes to test/nextflow.config to run on Simon's …
simonleandergrimm Nov 7, 2024
2da99f1
removed white space in two files
simonleandergrimm Nov 7, 2024
1d30115
Removed unneeded comments in taxonomy/main.nf
simonleandergrimm Nov 7, 2024
0f8636e
Merge branch 'single-read-raw-clean' into single-read-profile
simonleandergrimm Nov 19, 2024
d4b7239
dropped params.*
simonleandergrimm Nov 20, 2024
487c6c5
added selection of correct concat_group process in extractViralReads
simonleandergrimm Nov 20, 2024
dd71ea8
added paired paired and single version for concat_group
simonleandergrimm Nov 20, 2024
329acb1
Adapted profile to be cleaner and take in changes introduced by Will.
simonleandergrimm Nov 20, 2024
ea74362
Cleaned up taxonomy and took into account changes of v2.5.0
simonleandergrimm Nov 20, 2024
bbced47
Fixed single-end flags in run.nf
simonleandergrimm Nov 20, 2024
4a1edda
added "params" to the single_end flags to run_dev_se.nf
simonleandergrimm Nov 20, 2024
753982d
Fixing run_dev_se.config file
simonleandergrimm Nov 20, 2024
5c3cb73
updated config in paired end test dataset
simonleandergrimm Nov 20, 2024
5c29398
added params to single end.
simonleandergrimm Nov 22, 2024
9b47532
Merge branch 'single-read-raw-clean' into single-read-profile
simonleandergrimm Nov 30, 2024
8fa88fa
Dropped params.* for single_end.
simonleandergrimm Nov 30, 2024
abe95bf
moved single_end parameter definition to .config files.
simonleandergrimm Dec 2, 2024
15facd1
moved order of single_end definition.
simonleandergrimm Dec 2, 2024
8f5afe8
Reformatted tests/run-dev-se.config to be thesame as tests/run.config
simonleandergrimm Dec 3, 2024
b0d6737
removed unneded whitespace
simonleandergrimm Dec 3, 2024
5827044
reset test-data/nextflowconfig to have settings set to will defaults.
simonleandergrimm Dec 3, 2024
8302f25
removed test-se configfile.
simonleandergrimm Dec 3, 2024
ac10124
placeholder comiit
simonleandergrimm Dec 3, 2024
30a2a6e
Merge pull request #116 from naobservatory/harmon_fix_copying_file_bug
willbradshaw Nov 27, 2024
cc5fd19
Changed version of nextflow, hopefully this works
harmonbhasin Dec 3, 2024
ab78257
Updated label to be correct
harmonbhasin Dec 3, 2024
1db7ba8
updated end-to-end.yml and run dev se config.
simonleandergrimm Dec 3, 2024
1751288
test commit
simonleandergrimm Dec 3, 2024
1aef130
Merge branch 'single-read-raw-clean' into single-read-profile
simonleandergrimm Dec 3, 2024
236175b
Merge remote-tracking branch 'origin/harmon_fix_gh_actions_test' into…
simonleandergrimm Dec 3, 2024
2159431
Merge branch 'single-read-profile' of https://github.com/naobservator…
simonleandergrimm Dec 3, 2024
764cf9c
Merge remote-tracking branch 'origin/harmon_fix_gh_actions_test' into…
simonleandergrimm Dec 4, 2024
162fb19
Merge branch 'single-read-raw-clean' into single-read-profile
simonleandergrimm Dec 4, 2024
6216f77
Merge branch 'single-read-raw-clean' into single-read-profile
simonleandergrimm Dec 4, 2024
3648257
Merge branch 'single-read-raw-clean' into single-read-profile
simonleandergrimm Dec 4, 2024
aa62e5a
Merge branch 'single-read-raw-clean' into single-read-profile
simonleandergrimm Dec 4, 2024
5567568
dropped read type info from config files where its not needed.
simonleandergrimm Dec 4, 2024
a3af7db
Made rundevse index and outputs look the same as run.nf
simonleandergrimm Dec 4, 2024
d465db7
fixed memory issue in bbduk
simonleandergrimm Dec 5, 2024
4e74d7c
Fixing setup of run_dev_se test config.
simonleandergrimm Dec 5, 2024
d061c1d
fixed imports in taxonomy/main.nf
simonleandergrimm Dec 9, 2024
db187e6
Removed redundancy
simonleandergrimm Dec 9, 2024
ce74e2c
Merge branch 'single-read-raw-clean' into single-read-profile
simonleandergrimm Dec 9, 2024
fcc2f4a
Merge branch 'single-read-raw-clean' into single-read-profile
simonleandergrimm Dec 10, 2024
8a09b08
delete unexpected cahracter
simonleandergrimm Dec 10, 2024
11f2145
Merge branch 'single-read-raw-clean' into single-read-profile
simonleandergrimm Dec 11, 2024
55f75f8
moved kraken db location
simonleandergrimm Dec 11, 2024
1210a3c
Update main.nf
simonleandergrimm Dec 12, 2024
9113fd4
Update end-to-end.yml
simonleandergrimm Dec 20, 2024
dca3c30
Merge branch 'dev' into single-read-profile. Fixed remaining issues i…
simonleandergrimm Dec 20, 2024
4bfbc46
updated changelog. updated rundevse to have correct loadsamplesheet p…
simonleandergrimm Dec 20, 2024
d2826d2
removed whitespace
simonleandergrimm Dec 20, 2024
6b1da71
Merge branch 'single-read-profile' of https://github.com/naobservator…
simonleandergrimm Dec 20, 2024
ae1fe8a
Merge branch 'dev' into single-read-profile
willbradshaw Dec 21, 2024
3 changes: 2 additions & 1 deletion .github/workflows/end-to-end.yml
@@ -82,4 +82,5 @@ jobs:
sudo mv nf-test /usr/local/bin/

- name: Run run_validation workflow
-run: nf-test test --tag validation --verbose
\ No newline at end of file
+run: nf-test test --tag validation --verbose

31 changes: 30 additions & 1 deletion modules/local/bbduk/main.nf
Contributor:

@simonleandergrimm any particular reason you're using BBDUK rather than BBDUK_HITS here? The main pipeline uses the latter.
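
For context: the two processes differ in how a read is deemed to match the contaminant reference. BBDuk's minkmerfraction requires a given fraction of a read's k-mers to hit the reference, while BBDUK_HITS uses minkmerhits, an absolute count of matching k-mers. A minimal sketch of the distinction (filenames and thresholds here are illustrative, not taken from this pipeline):

# Fraction-based: reads with >= 60% of their k-mers matching the reference are routed to outm
bbduk.sh in=reads.fastq.gz ref=contaminants.fasta out=pass.fastq.gz outm=fail.fastq.gz minkmerfraction=0.6 k=27

# Count-based: a single matching k-mer is enough to route a read to outm
bbduk.sh in=reads.fastq.gz ref=contaminants.fasta out=pass.fastq.gz outm=fail.fastq.gz minkmerhits=1 k=27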

@@ -1,5 +1,5 @@
// Detection and removal of contaminant reads
-process BBDUK {
+process BBDUK_PAIRED {
label "large"
label "BBTools"
input:
@@ -31,6 +31,35 @@ process BBDUK {
'''
}

process BBDUK_SINGLE {
label "large"
label "BBTools"
input:
tuple val(sample), path(reads)
path(contaminant_ref)
val(min_kmer_fraction)
val(k)
val(suffix)
output:
tuple val(sample), path("${sample}_${suffix}_bbduk_pass.fastq.gz"), emit: reads
tuple val(sample), path("${sample}_${suffix}_bbduk_fail.fastq.gz"), emit: fail
tuple val(sample), path("${sample}_${suffix}_bbduk.stats.txt"), emit: log
shell:
'''
# Define input/output
in=!{reads}
op=!{sample}_!{suffix}_bbduk_pass.fastq.gz
of=!{sample}_!{suffix}_bbduk_fail.fastq.gz
stats=!{sample}_!{suffix}_bbduk.stats.txt
ref=!{contaminant_ref}
io="in=${in} ref=${ref} out=${op} outm=${of} stats=${stats}"
# Define parameters
par="minkmerfraction=!{min_kmer_fraction} k=!{k} t=!{task.cpus} -Xmx!{task.memory.toGiga()}g"
# Execute
bbduk.sh ${io} ${par}
'''
}
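
Relative to the paired-end process, the single-end variant simply drops the second read file from the io string; a paired invocation additionally carries in2=, out2=, and outm2= (a sketch using standard BBDuk flags; filenames are illustrative):

# Paired-end: the second mate is tracked with in2/out2/outm2
bbduk.sh in=r1.fastq.gz in2=r2.fastq.gz ref=ref.fasta out=pass_1.fastq.gz out2=pass_2.fastq.gz outm=fail_1.fastq.gz outm2=fail_2.fastq.gz stats=stats.txt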

// Detection and removal of contaminant reads (use minkmerhits instead of minkmerfraction)
process BBDUK_HITS {
label "large"
18 changes: 17 additions & 1 deletion modules/local/concatGroup/main.nf
@@ -1,5 +1,5 @@
// Copy a file to a new location with a custom path
-process CONCAT_GROUP {
+process CONCAT_GROUP_PAIRED {
label "coreutils"
label "single"
input:
@@ -14,3 +14,19 @@ process CONCAT_GROUP {
cat ${fastq_2_list.join(' ')} > ${group}_R2.fastq.gz
"""
}


process CONCAT_GROUP_SINGLE {
label "base"
label "single"
input:
tuple val(samples), path(fastq_list), val(group)

output:
tuple val(group), path("${group}.fastq.gz")

script:
"""
cat ${fastq_list.join(' ')} > ${group}.fastq.gz
"""
}
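
A usage note on the cat approach: gzip streams are concatenable, so joining per-sample .fastq.gz files with a plain cat yields one valid compressed FASTQ. Roughly (hypothetical sample and group names):

# Two gzipped FASTQs concatenated remain one valid gzip stream
cat sampleA.fastq.gz sampleB.fastq.gz > groupA.fastq.gz
zcat groupA.fastq.gz | head -n 4   # first record comes from sampleA
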
61 changes: 60 additions & 1 deletion modules/local/subsetReads/main.nf
@@ -26,6 +26,31 @@ process SUBSET_READS_PAIRED {
'''
}

// Subsample reads with seqtk (single-end)
process SUBSET_READS_SINGLE {
label "seqtk"
label "single"
input:
tuple val(sample), path(reads)
val readFraction
val suffix
output:
tuple val(sample), path("${sample}_subset.${suffix}.gz")
shell:
'''
# Define input/output
in=!{reads}
out=!{sample}_subset.!{suffix}.gz
# Count reads for validation
echo "Input reads: $(zcat ${in} | wc -l | awk '{ print $1/4 }')"
# Carry out subsetting
seed=${RANDOM}
seqtk sample -s ${seed} ${in} !{readFraction} | gzip -c > ${out}
# Count reads for validation
echo "Output reads: $(zcat ${out} | wc -l | awk '{ print $1/4 }')"
'''
}

// Subsample reads with seqtk (no sample name)
process SUBSET_READS_PAIRED_MERGED {
label "seqtk"
@@ -54,7 +79,7 @@ process SUBSET_READS_PAIRED_MERGED {
'''
}

-// Subsample reads with seqtk with an autocomputed read fraction
+// Subsample reads with seqtk with an autocomputed read fraction (paired-end)
process SUBSET_READS_PAIRED_TARGET {
label "seqtk"
label "single"
@@ -91,3 +116,37 @@ process SUBSET_READS_PAIRED_TARGET {
echo "Output reads: $(zcat ${out1} | wc -l | awk '{ print $1/4 }')"
'''
}

// Subsample reads with seqtk with an autocomputed read fraction (single-end)
process SUBSET_READS_SINGLE_TARGET {
label "seqtk"
label "single"
input:
tuple val(sample), path(reads)
val readTarget
val suffix
output:
tuple val(sample), path("${sample}_subset.${suffix}.gz")
shell:
'''
# Define input/output
in=!{reads}
out=!{sample}_subset.!{suffix}.gz
# Count reads and compute target fraction
n_reads=$(zcat ${in} | wc -l | awk '{ print $1/4 }')
echo "Input reads: ${n_reads}"
echo "Target reads: !{readTarget}"
if (( ${n_reads} <= !{readTarget} )); then
echo "Target larger than input; returning all reads."
cp ${in} ${out}
else
frac=$(awk -v a=${n_reads} -v b=!{readTarget} 'BEGIN {result = b/a; print (result > 1) ? 1.0 : result}')
echo "Read fraction for subsetting: ${frac}"
# Carry out subsetting
seed=${RANDOM}
seqtk sample -s ${seed} ${in} ${frac} | gzip -c > ${out}
fi
# Count reads for validation
echo "Output reads: $(zcat ${out} | wc -l | awk '{ print $1/4 }')"
'''
}
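
A worked example of the fraction computation above, with hypothetical counts (2,000,000 input reads, target of 500,000):

# frac = target / input, capped at 1.0
awk -v a=2000000 -v b=500000 'BEGIN {result = b/a; print (result > 1) ? 1.0 : result}'   # prints 0.25
# seqtk then keeps ~25% of reads; -s fixes the random seed for reproducibility
seqtk sample -s 42 input.fastq.gz 0.25 | gzip -c > input_subset.fastq.gz
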
9 changes: 7 additions & 2 deletions subworkflows/local/extractViralReads/main.nf
@@ -27,7 +27,11 @@ include { COLLAPSE_VIRUS_READS } from "../../../modules/local/collapseVirusReads
include { ADD_FRAG_DUP_TO_VIRUS_READS } from "../../../modules/local/addFragDupToVirusReads"
include { MAKE_VIRUS_READS_FASTA } from "../../../modules/local/makeVirusReadsFasta"
include { COUNT_VIRUS_CLADES } from "../../../modules/local/countVirusClades"
-include { CONCAT_GROUP } from "../../../modules/local/concatGroup"
+if (params.single_end) {
+    include { CONCAT_GROUP_SINGLE as CONCAT_GROUP } from "../../../modules/local/concatGroup"
+} else {
+    include { CONCAT_GROUP_PAIRED as CONCAT_GROUP } from "../../../modules/local/concatGroup"
+}

/***********
| WORKFLOW |
@@ -48,6 +52,7 @@ workflow EXTRACT_VIRAL_READS {
encoding
fuzzy_match
grouping
+single_end
main:
// Get reference paths
viral_genome_path = "${ref_dir}/results/virus-genomes-filtered.fasta.gz"
Expand Down Expand Up @@ -90,7 +95,7 @@ workflow EXTRACT_VIRAL_READS {
human_bbm_ch = BBMAP_HUMAN(other_bt2_ch.reads_unconc, bbm_human_index_path, "human")
other_bbm_ch = BBMAP_OTHER(human_bbm_ch.reads_unmapped, bbm_other_index_path, "other")
// Run Kraken on filtered viral candidates
-tax_ch = TAXONOMY(other_bbm_ch.reads_unmapped, kraken_db_ch, true, "F")
+tax_ch = TAXONOMY(other_bbm_ch.reads_unmapped, kraken_db_ch, true, "F", single_end)
// Process Kraken output and merge with Bowtie2 output across samples
kraken_output_ch = PROCESS_KRAKEN_VIRAL(tax_ch.kraken_output, virus_db_path, host_taxon)
bowtie2_kraken_merged_ch = MERGE_SAM_KRAKEN(kraken_output_ch.combine(bowtie2_sam_ch, by: 0))
Contributor:
Is this supposed to be an empty file? Why is it here?

Empty file.
50 changes: 37 additions & 13 deletions subworkflows/local/profile/main.nf
@@ -6,12 +6,23 @@
| MODULES AND SUBWORKFLOWS |
***************************/

-include { SUBSET_READS_PAIRED_TARGET; SUBSET_READS_PAIRED_TARGET as SUBSET_READS_PAIRED_TARGET_GROUP } from "../../../modules/local/subsetReads"
-include { BBDUK } from "../../../modules/local/bbduk"
+
+if (params.single_end) {
+    include { SUBSET_READS_SINGLE_TARGET as SUBSET_READS_TARGET } from "../../../modules/local/subsetReads"
+    include { BBDUK_SINGLE as BBDUK } from "../../../modules/local/bbduk"
+    include { CONCAT_GROUP_SINGLE as CONCAT_GROUP } from "../../../modules/local/concatGroup"
+    include { SUBSET_READS_SINGLE_TARGET; SUBSET_READS_SINGLE_TARGET as SUBSET_READS_TARGET_GROUP } from "../../../modules/local/subsetReads"
+} else {
+    include { SUBSET_READS_PAIRED_TARGET as SUBSET_READS_TARGET } from "../../../modules/local/subsetReads"
+    include { SUBSET_READS_PAIRED_TARGET; SUBSET_READS_PAIRED_TARGET as SUBSET_READS_TARGET_GROUP } from "../../../modules/local/subsetReads"
+    include { BBDUK_PAIRED as BBDUK } from "../../../modules/local/bbduk"
+    include { CONCAT_GROUP_PAIRED as CONCAT_GROUP } from "../../../modules/local/concatGroup"
+}

include { BBDUK_HITS } from "../../../modules/local/bbduk"
include { TAXONOMY as TAXONOMY_RIBO } from "../../../subworkflows/local/taxonomy"
include { TAXONOMY as TAXONOMY_NORIBO } from "../../../subworkflows/local/taxonomy"
include { MERGE_TAXONOMY_RIBO } from "../../../modules/local/mergeTaxonomyRibo"
-include { CONCAT_GROUP } from "../../../modules/local/concatGroup"

/****************
| MAIN WORKFLOW |
@@ -28,23 +39,36 @@ workflow PROFILE {
k
bbduk_suffix
grouping
+single_end
main:


// Randomly subset reads to target number
-    subset_ch = SUBSET_READS_PAIRED_TARGET(reads_ch, n_reads, "fastq")
+    subset_ch = SUBSET_READS_TARGET(reads_ch, n_reads, "fastq")

    if (grouping){
        // Join samplesheet with trimmed_reads and update fastq files
-        subset_group_ch = group_ch.join(subset_ch, by: 0)
-            .map { sample, group, reads -> tuple(sample, reads[0], reads[1], group) }
-            .groupTuple(by: 3)
+        if (single_end) {
+            subset_group_ch = group_ch.join(subset_ch, by: 0)
+                .map { sample, group, reads -> tuple(sample, reads, group) }
+                .groupTuple(by: 2)
+            // Single-sample groups are already subsetted to target number
+            single_sample_groups = subset_group_ch.filter { it[0].size() == 1 }
+                .map { samples, read_list, group -> tuple(group, [read_list[0]]) }
+
+        } else {
+            subset_group_ch = group_ch.join(subset_ch, by: 0)
+                .map { sample, group, reads -> tuple(sample, reads[0], reads[1], group) }
+                .groupTuple(by: 3)
+            single_sample_groups = subset_group_ch.filter { it[0].size() == 1 }
+                .map { samples, fwd_list, rev_list, group -> tuple(group, [fwd_list[0], rev_list[0]]) }
+        }
        // Split into multi-sample groups, these need to be subsetted to target number
        multi_sample_groups = subset_group_ch.filter { it[0].size() > 1 }
-        // These are already subsetted to target number
-        single_sample_groups = subset_group_ch.filter { it[0].size() == 1 }
-            .map { samples, fwd_list, rev_list, group -> tuple(group, [fwd_list[0], rev_list[0]]) }
        // Concatenate multi-sample groups
        grouped_samples = CONCAT_GROUP(multi_sample_groups)
        // Randomly subset multi-sample groups to target number
-        subset_grouped_ch = SUBSET_READS_PAIRED_TARGET_GROUP(grouped_samples, n_reads, "fastq")
+        subset_grouped_ch = SUBSET_READS_TARGET_GROUP(grouped_samples, n_reads, "fastq")
        // Mix with subsetted multi-sample group with already subsetted single-sample groups
        grouped_ch = subset_grouped_ch.mix(single_sample_groups)
    } else {
@@ -54,8 +78,8 @@
ribo_path = "${ref_dir}/results/ribo-ref-concat.fasta.gz"
ribo_ch = BBDUK(grouped_ch, ribo_path, min_kmer_fraction, k, bbduk_suffix)
// Run taxonomic profiling separately on ribo and non-ribo reads
-tax_ribo_ch = TAXONOMY_RIBO(ribo_ch.fail, kraken_db_ch, false, "D")
-tax_noribo_ch = TAXONOMY_NORIBO(ribo_ch.reads, kraken_db_ch, false, "D")
+tax_ribo_ch = TAXONOMY_RIBO(ribo_ch.fail, kraken_db_ch, false, "D", single_end)
+tax_noribo_ch = TAXONOMY_NORIBO(ribo_ch.reads, kraken_db_ch, false, "D", single_end)
// Merge ribo and non-ribo outputs
kr_ribo = tax_ribo_ch.kraken_reports.collectFile(name: "kraken_reports_ribo.tsv.gz")
kr_noribo = tax_noribo_ch.kraken_reports.collectFile(name: "kraken_reports_noribo.tsv.gz")
37 changes: 23 additions & 14 deletions subworkflows/local/taxonomy/main.nf
@@ -6,11 +6,11 @@
| MODULES AND SUBWORKFLOWS |
***************************/

-include { JOIN_FASTQ } from "../../../modules/local/joinFastq"
include { BBMERGE } from "../../../modules/local/bbmerge"
include { SUMMARIZE_BBMERGE } from "../../../modules/local/summarizeBBMerge"
include { SUMMARIZE_DEDUP } from "../../../modules/local/summarizeDedup"
include { CLUMPIFY_PAIRED } from "../../../modules/local/clumpify"
+include { JOIN_FASTQ } from "../../../modules/local/joinFastq"
include { CLUMPIFY_SINGLE } from "../../../modules/local/clumpify"
include { KRAKEN } from "../../../modules/local/kraken"
include { LABEL_KRAKEN_REPORTS } from "../../../modules/local/labelKrakenReports"
@@ -29,24 +29,33 @@ workflow TAXONOMY {
kraken_db_ch
dedup_rc
classification_level
+single_end
main:
-    // Deduplicate reads (if applicable)
-    if ( dedup_rc ){
-        paired_dedup_ch = CLUMPIFY_PAIRED(reads_ch)
+    if (single_end) {
+        // No merging in single read version
+        summarize_bbmerge_ch = Channel.empty()
+        single_read_ch = reads_ch
    } else {
-        paired_dedup_ch = reads_ch
+        // Deduplicate reads (if applicable)
+        if ( dedup_rc ){
+            paired_dedup_ch = CLUMPIFY_PAIRED(reads_ch)
+        } else {
+            paired_dedup_ch = reads_ch
+        }
+        // Prepare reads
+        merged_ch = BBMERGE(paired_dedup_ch)
+        // Only want to summarize the merged elements
+        summarize_bbmerge_ch = SUMMARIZE_BBMERGE(merged_ch.reads.map{sample, files -> [sample, files[0]]})
+        single_read_ch = JOIN_FASTQ(merged_ch.reads)
    }
-    // Prepare reads
-    merged_ch = BBMERGE(paired_dedup_ch)
-    // Only want to summarize the merged elements
-    summarize_bbmerge_ch = SUMMARIZE_BBMERGE(merged_ch.reads.map{sample, files -> [sample, files[0]]})
-    joined_ch = JOIN_FASTQ(merged_ch.reads)

-    // Deduplicate reads (if applicable)
-    if ( dedup_rc ){
-        dedup_ch = CLUMPIFY_SINGLE(joined_ch)
-    } else {
-        dedup_ch = joined_ch
+    if (dedup_rc) {
+        dedup_ch = CLUMPIFY_SINGLE(single_read_ch)
+    } else {
+        dedup_ch = single_read_ch
    }

    // Summarize last of the output
    summarize_dedup_ch = SUMMARIZE_DEDUP(dedup_ch)

2 changes: 2 additions & 0 deletions test-data/nextflow.config
@@ -9,6 +9,8 @@ params {
base_dir = "s3://nao-mgs-wb/test-batch" // Parent for working and output directories (can be S3)
ref_dir = "s3://nao-mgs-wb/index-20241113/output" // Reference/index directory (generated by index workflow)
Collaborator:

Noting that this is out-of-date, although I'm aware that the current index isn't functional.

Collaborator (Author):

This wasn't updated in master or dev; unsure if this is something we want to do here?

Collaborator (Author):

Need to wait for the new index anyway.

Contributor:

We have the new index now.

// Files
sample_sheet = "${launchDir}/samplesheet.csv" // Path to library TSV
adapters = "${projectDir}/ref/adapters.fasta" // Path to adapter file for adapter trimming
2 changes: 1 addition & 1 deletion test-data/single-end-samplesheet.csv
@@ -1,2 +1,2 @@
sample,fastq
-230926Esv_D23-14904-1,s3://nao-testing/gold-standard-test/raw/gold_standard_R1.fastq.gz
+gold_standard,s3://nao-testing/gold-standard-test/raw/gold_standard_R1.fastq.gz
1 change: 0 additions & 1 deletion tests/run.config
@@ -24,7 +24,6 @@ params {
quality_encoding = "phred33" // FASTQ quality encoding (probably phred33, maybe phred64)
fuzzy_match_alignment_duplicates = 0 // Fuzzy matching the start coordinate of reads for identification of duplicates through alignment (0 = exact matching; options are 0, 1, or 2)
host_taxon = "vertebrate"
-
blast_db_prefix = "nt_others"
}

2 changes: 2 additions & 0 deletions tests/run_dev_se.config
@@ -23,6 +23,8 @@ params {
quality_encoding = "phred33" // FASTQ quality encoding (probably phred33, maybe phred64)
fuzzy_match_alignment_duplicates = 0 // Fuzzy matching the start coordinate of reads for identification of duplicates through alignment (0 = exact matching; options are 0, 1, or 2)
host_taxon = "vertebrate"
+
+blast_db_prefix = "nt_others"
}

includeConfig "${projectDir}/configs/containers.config"
4 changes: 2 additions & 2 deletions workflows/run.nf
@@ -50,7 +50,7 @@ workflow RUN {
RAW(samplesheet_ch, params.n_reads_trunc, "2", "4 GB", "raw_concat", params.single_end)
CLEAN(RAW.out.reads, params.adapters, "2", "4 GB", "cleaned", params.single_end)
// Extract and count human-viral reads
-EXTRACT_VIRAL_READS(CLEAN.out.reads, group_ch, params.ref_dir, kraken_db_path, params.bt2_score_threshold, params.adapters, params.host_taxon, "1", "24", "viral", "${params.quality_encoding}", "${params.fuzzy_match_alignment_duplicates}", params.grouping)
+EXTRACT_VIRAL_READS(CLEAN.out.reads, group_ch, params.ref_dir, kraken_db_path, params.bt2_score_threshold, params.adapters, params.host_taxon, "1", "24", "viral", "${params.quality_encoding}", "${params.fuzzy_match_alignment_duplicates}", params.grouping, params.single_end)
// Process intermediate output for chimera detection
raw_processed_ch = EXTRACT_VIRAL_READS.out.bbduk_match.join(RAW.out.reads, by: 0)
EXTRACT_RAW_READS_FROM_PROCESSED(raw_processed_ch, "raw_viral_subset")
@@ -64,7 +64,7 @@ workflow RUN {
blast_paired_ch = Channel.empty()
}
// Taxonomic profiling
-PROFILE(CLEAN.out.reads, group_ch, kraken_db_path, params.n_reads_profile, params.ref_dir, "0.4", "27", "ribo", params.grouping)
+PROFILE(CLEAN.out.reads, group_ch, kraken_db_path, params.n_reads_profile, params.ref_dir, "0.4", "27", "ribo", params.grouping, params.single_end)
// Process output
qc_ch = RAW.out.qc.concat(CLEAN.out.qc)
PROCESS_OUTPUT(qc_ch)