Adding single-read functionality to RAW and CLEAN #80
base: harmon_fix_gh_actions_test
Changes from 77 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -9,4 +9,4 @@ test/.nextflow* | |
pipeline_report.txt | ||
|
||
.nf-test/ | ||
.nf-test.log | ||
.nf-test.log |
@harmonbhasin to review changes to this file

@harmonbhasin ping on this.
simonleandergrimm marked this conversation as resolved.
I'm not sure if you're interested in this, but if you want to turn this script into python, I wouldn't be mad lol

👀
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
/************************************************ | ||
| CONFIGURATION FILE FOR NAO VIRAL MGS WORKFLOW | | ||
************************************************/ | ||
|
||
params { | ||
mode = "run_dev_se" | ||
|
||
|
||
// Directories | ||
base_dir = "s3://nao-mgs-simon/test_single_read" // Parent for working and output directories (can be S3) | ||
ref_dir = "s3://nao-mgs-wb/index-20241113/output" // Reference/index directory (generated by index workflow) | ||
|
||
// Files | ||
sample_sheet = "${launchDir}/samplesheet.csv" // Path to library TSV | ||
adapters = "${projectDir}/ref/adapters.fasta" // Path to adapter file for adapter trimming | ||
|
||
// Whether the underlying data is paired-end or single-end | ||
single_end = new File(params.sample_sheet).text.readLines()[0].contains('fastq_2') ? false : true | ||
|
||
// Numerical | ||
grouping = false // Whether to group samples by 'group' column in samplesheet | ||
n_reads_trunc = 0 // Number of reads per sample to run through pipeline (0 = all reads) | ||
n_reads_profile = 1000000 // Number of reads per sample to run through taxonomic profiling | ||
bt2_score_threshold = 20 // Normalized score threshold for HV calling (typically 15 or 20) | ||
blast_hv_fraction = 0 // Fraction of putative HV reads to BLAST vs nt (0 = don't run BLAST) | ||
kraken_memory = "128 GB" // Memory needed to safely load Kraken DB | ||
quality_encoding = "phred33" // FASTQ quality encoding (probably phred33, maybe phred64) | ||
fuzzy_match_alignment_duplicates = 0 // Fuzzy matching the start coordinate of reads for identification of duplicates through alignment (0 = exact matching; options are 0, 1, or 2) | ||
host_taxon = "vertebrate" | ||
} | ||
|
||
includeConfig "${projectDir}/configs/logging.config" | ||
includeConfig "${projectDir}/configs/containers.config" | ||
includeConfig "${projectDir}/configs/resources.config" | ||
includeConfig "${projectDir}/configs/profiles.config" | ||
includeConfig "${projectDir}/configs/output.config" | ||
process.queue = "simon-batch-queue" // AWS Batch job queue |
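
The single_end setting in the config above infers the read layout from the samplesheet header: if the first line mentions fastq_2, the run is treated as paired-end. Below is a minimal Python sketch of the same idea, for illustration only; it is not the pipeline's code. The function name is_single_end and the sample and fastq_1 column names are hypothetical, and the sketch assumes a comma-separated samplesheet. Unlike the substring test in the config, it compares whole column names.

```python
import csv
import io


def is_single_end(samplesheet_text: str) -> bool:
    """Return True when the samplesheet header has no 'fastq_2' column.

    Hypothetical helper: assumes a comma-separated samplesheet whose
    first line is the header.
    """
    reader = csv.reader(io.StringIO(samplesheet_text))
    header = next(reader, [])  # empty input yields an empty header
    columns = {name.strip() for name in header}
    return "fastq_2" not in columns


# Illustrative samplesheets; column names other than 'fastq_2' are assumptions.
paired = "sample,fastq_1,fastq_2\nS1,a_1.fastq.gz,a_2.fastq.gz\n"
single = "sample,fastq_1\nS1,a_1.fastq.gz\n"
print(is_single_end(paired))
print(is_single_end(single))
```

Matching on exact column names avoids a false paired-end call if some other header field happens to contain the text fastq_2, at the cost of being stricter about header spelling.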
More so than for FASTP, I think this would be better done as a single process with a conditional statement, based either on a boolean

You could even just do a for loop iterating over every file in

Done
@simonleandergrimm can you sync up with @harmonbhasin re naming here? I think he's going to rename the test directory anyway due to conflict with nf-test.
FWIW I'd prefer something like test/single/... and test/paired/... to keep the main directory clean.
Also what are these?
@harmonbhasin What are your thoughts regarding having a test dataset for paired-end and single-end data? Could you rejig your test dataset by e.g., simply keeping the forward reads?
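
One way the "keep only the forward reads" idea could look at the samplesheet level is sketched below in Python. This is an assumption-laden illustration, not something from the PR: the helper name to_single_end and the sample and fastq_1 column names are hypothetical, and it assumes a comma-separated samplesheet in which the reverse reads live in a fastq_2 column.

```python
import csv
import io


def to_single_end(samplesheet_text: str) -> str:
    """Drop the 'fastq_2' column from a CSV samplesheet, keeping forward reads.

    Hypothetical helper: only the samplesheet is rewritten; the FASTQ files
    themselves are left untouched.
    """
    reader = csv.DictReader(io.StringIO(samplesheet_text))
    kept = [name for name in (reader.fieldnames or []) if name != "fastq_2"]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # the 'fastq_2' value is ignored on write
    return out.getvalue()


# Illustrative input; column names other than 'fastq_2' are assumptions.
paired = "sample,fastq_1,fastq_2\nS1,a_1.fastq.gz,a_2.fastq.gz\n"
print(to_single_end(paired))
```

Deriving the single-end sheet from the paired-end one would keep the two test datasets in sync, since both would point at the same forward-read files.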