Update nextflow_wgs to DSL2 #274

Merged 325 commits on Jan 31, 2025
19d4e21
simplify bam filter
alkc Dec 19, 2024
85b326a
even more simplifed filter
alkc Dec 19, 2024
7169173
fix wrong var expansion in haplogrep
alkc Dec 19, 2024
c046542
simpler fastq-filter
alkc Dec 19, 2024
db707f6
re-add copy_bam
alkc Dec 19, 2024
091f8a8
fix unbound var error
alkc Dec 19, 2024
2eaaf35
disable ped
alkc Dec 19, 2024
09a5a93
another bad haplo var
alkc Dec 19, 2024
1a155cb
groupTuple for mutect input
alkc Dec 19, 2024
8707d73
unquote subshell and quote other vars
alkc Dec 19, 2024
82b2fcb
add full ped workflow + inher_models
alkc Dec 19, 2024
a272b00
fix unqualified output
alkc Dec 19, 2024
608e6b7
re-add genmodscore and vcf_completion
alkc Dec 19, 2024
153c566
only allow proband vcf to peddy
alkc Dec 19, 2024
87562e7
enable peddy process
alkc Dec 19, 2024
d841351
fastgnomad and upd
alkc Dec 19, 2024
a56508b
fix upd trio param
alkc Dec 19, 2024
5253be1
add roh
alkc Dec 19, 2024
6381b6e
gatk
alkc Dec 19, 2024
d82e77b
debug view
alkc Dec 20, 2024
c8b6a65
add manta
alkc Dec 20, 2024
28e0ed5
rewrite postprocess gatk, update stubs
alkc Dec 20, 2024
364264d
more gatkpostprocess changes
alkc Dec 20, 2024
5bf7ad9
add postprocess gatk + input channel to workflow
alkc Dec 20, 2024
932647b
add svdb merge and tiddit
alkc Dec 20, 2024
d0dec8e
peddy troubleshoot stub
alkc Dec 20, 2024
84e4d0f
disable merge and gatk filter channel
alkc Dec 20, 2024
4f11782
fix evil val->path bug
alkc Dec 20, 2024
8c423f2
add cnvkit panel process
alkc Dec 20, 2024
e8b95c8
svdb merge indentation
alkc Dec 20, 2024
27d760e
rename paths, reenable: svdb_merge, filter_merge_gatk
alkc Dec 20, 2024
23ee786
add smn copy nbr caller
alkc Dec 20, 2024
0ba74d6
str calling
alkc Dec 20, 2024
ec50628
wip: qc
alkc Dec 20, 2024
e117638
more wip
alkc Dec 20, 2024
7a765c2
meta to samplesheet
alkc Dec 20, 2024
3868a96
re-add more qc processes
alkc Jan 7, 2025
2136eed
reenable gatkcov and overviewplot
alkc Jan 7, 2025
ca95aa7
no contamination check for non-wgs assays
alkc Jan 7, 2025
7a866dc
changelog
alkc Jan 7, 2025
26e6003
re-enable melt and other processes
alkc Jan 8, 2025
c78858b
fix dedupdummy control
alkc Jan 8, 2025
775b057
remove unused channel
alkc Jan 8, 2025
1bc6e0d
fix input channel declaration
alkc Jan 8, 2025
333d5b2
cnvkit enable
alkc Jan 8, 2025
e828a30
test stub
alkc Jan 8, 2025
11b8158
control gatkcov from workflow
alkc Jan 8, 2025
3ba34b8
re-add panel merge
alkc Jan 8, 2025
0ec38dc
move statement into correct place
alkc Jan 8, 2025
db8464e
fix output declaration
alkc Jan 8, 2025
0d94c58
if wgs for some wgs specifics
alkc Jan 8, 2025
25025fe
set annotate_only to false to kill annoying warn
alkc Jan 8, 2025
03b5575
mix into bam start channel
alkc Jan 8, 2025
f3fe121
test ifempty to skip proc
alkc Jan 8, 2025
3126978
missing parenthesis
alkc Jan 8, 2025
b416f1f
undo ifempty
alkc Jan 8, 2025
5a0ebf0
fix bam to bams var name
alkc Jan 9, 2025
18a5cfc
add loqusdb
alkc Jan 9, 2025
8969152
view melt qc val for debug
alkc Jan 9, 2025
f120ef2
fix pass process instead of output
alkc Jan 9, 2025
48fedd7
remove view
alkc Jan 9, 2025
e954f80
debug views
alkc Jan 9, 2025
27905ac
output id from svdb_merge_panel
alkc Jan 9, 2025
a64737d
add to loqusdb
alkc Jan 9, 2025
7c0f835
fix order of inputs to add to loqus
alkc Jan 9, 2025
2783591
annotsv
alkc Jan 9, 2025
ce2eafe
add annotsv and vep_sv
alkc Jan 9, 2025
df5cec3
move channel out of sv block
alkc Jan 9, 2025
1597b2e
forgout output sepc
alkc Jan 9, 2025
7a14de4
add sv annotation procs
alkc Jan 9, 2025
7f07acb
add prescore ped channel wizardry
alkc Jan 9, 2025
be0da72
fix join w/ multichannel
alkc Jan 9, 2025
ef38f6a
input order ruins the day again
alkc Jan 9, 2025
0a63b62
more input order fixes
alkc Jan 9, 2025
7c0b79b
view ped prescore
alkc Jan 9, 2025
a4ca400
remove view statements
alkc Jan 9, 2025
d313ebc
reorder map
alkc Jan 9, 2025
735b875
add some debug loggers
alkc Jan 9, 2025
4b26f3e
activate _fa and _ma peds?
alkc Jan 9, 2025
7785e5d
fix parenthesis paralysis
alkc Jan 9, 2025
02b403e
remove mix of already mixed ch
alkc Jan 9, 2025
6552f84
add script block to melt_qc_val
alkc Jan 9, 2025
0da231b
add up tp compound finder
alkc Jan 9, 2025
cde0526
melt_qc_val changes
alkc Jan 9, 2025
a53ccfe
convert path to file
alkc Jan 10, 2025
acd4f96
add bamtoyaml
alkc Jan 10, 2025
379fa6f
merge intersected melt vcf
alkc Jan 10, 2025
e4e029f
respect qc_json as file instead of path
alkc Jan 10, 2025
2adff15
forgot to add melt processes
alkc Jan 10, 2025
bb91add
mix of edits
alkc Jan 10, 2025
ad282fa
melt_qc_val debugging
alkc Jan 10, 2025
5a4d0c1
try emitting and inputting file
alkc Jan 10, 2025
c1cdf52
do not run verifybamid for non-wgs
alkc Jan 10, 2025
23abb1a
back to path and super explicitly to File
alkc Jan 10, 2025
1d9d1a6
try converting path to file again
alkc Jan 10, 2025
0377317
will changing the stagein make any difference?
alkc Jan 10, 2025
0f44f32
work directly on path
alkc Jan 10, 2025
0a740bc
convert to file again
alkc Jan 10, 2025
d2a2ccb
move melt_qc_vals into workflow
alkc Jan 10, 2025
1af32ef
new melt qc channel for cnvkit too
alkc Jan 10, 2025
d981966
debug log + new var name
alkc Jan 10, 2025
1e76485
some more debug loggers
alkc Jan 10, 2025
b20d77b
proof melt val qc code for stub runs
alkc Jan 10, 2025
661ec7b
disable loqusdb dummy sv process
alkc Jan 10, 2025
531e6bd
fix unexpected EOF
alkc Jan 10, 2025
58122ff
fix loqus echo statement
alkc Jan 10, 2025
62edbf5
switch annotsv and sv_vcf
alkc Jan 10, 2025
f9df234
remove melt_qc_val and dummy_svvcf processes
alkc Jan 10, 2025
d6ab7c3
view score_sv output
alkc Jan 13, 2025
9f786a0
workaround for group_score defined after output
alkc Jan 13, 2025
b87381a
fix evil rename of set to tuple? (why even)
alkc Jan 13, 2025
9028049
whitespace and docs
alkc Jan 13, 2025
6c3ac2e
mixed changes
alkc Jan 13, 2025
92c093d
add two info outs for testing
alkc Jan 14, 2025
92018d1
rename id2 to id
alkc Jan 14, 2025
461ac32
add .mix() to channel spec
alkc Jan 14, 2025
7fc995e
add smn to INFO out
alkc Jan 14, 2025
1caa29a
add vcfbreakmulti eh process
alkc Jan 14, 2025
6d4e749
add todo note
alkc Jan 14, 2025
8bbf32f
add missing path specs
alkc Jan 14, 2025
8e6854e
more info channels
alkc Jan 14, 2025
0c00d00
add overview_plot proc
alkc Jan 14, 2025
1c9293f
add remaining INFO outputs
alkc Jan 14, 2025
202a685
don't overload meta channel
alkc Jan 14, 2025
d9c9a4d
fix missing upd input to overview_plot
alkc Jan 14, 2025
cad5cdd
change var name to idx
alkc Jan 14, 2025
81286f5
send qc jsons to merger
alkc Jan 14, 2025
75721aa
add missing meta for qc_to_cdm
alkc Jan 14, 2025
259a84d
comment out cnvkit info
alkc Jan 14, 2025
120037d
add debug prints
alkc Jan 14, 2025
f77758c
emit group from qc merger
alkc Jan 14, 2025
e664ac6
forgot to spec group in qc_to_cdm
alkc Jan 14, 2025
2815168
prevent input filename collisions
alkc Jan 15, 2025
3bf560a
new channel for create_yml meta
alkc Jan 15, 2025
5c739ff
reorganize meta channels
alkc Jan 15, 2025
d883cf6
small whitespace edits
alkc Jan 16, 2025
fafc132
view scout_yaml channel
alkc Jan 16, 2025
1badb0b
remove most views
alkc Jan 16, 2025
d4bc086
join base ped on group only
alkc Jan 16, 2025
8e50de8
rework create_yml input spec + channels
alkc Jan 16, 2025
f05dfd4
create_yml debugging
alkc Jan 16, 2025
c807d26
fix wrong index. omg.
alkc Jan 16, 2025
bfbf05f
fix up gatkcov, expansionhunter meta channels
alkc Jan 16, 2025
5300bd1
fix bug where bam input was duplicated
alkc Jan 16, 2025
1f9b0e0
add first batch of versions
alkc Jan 16, 2025
a0761ac
fix reviewer process code
alkc Jan 16, 2025
18a0edd
fix mix?
alkc Jan 16, 2025
2575268
fix order of group and id
alkc Jan 16, 2025
ed578f1
better name for dedup_metrics
alkc Jan 16, 2025
efdc8b5
fix sentieon bam path declaration
alkc Jan 16, 2025
9d553fe
rename dedup_metrics
alkc Jan 16, 2025
d05b79d
move plot_pod conditions outside of process
alkc Jan 16, 2025
7ae1f8f
plot_pod channel fixes
alkc Jan 16, 2025
54c8ecd
fix correct type idx spec plot_pod
alkc Jan 17, 2025
8441e20
TODO-note edits
alkc Jan 17, 2025
be9a52e
move versions deeper into workflow
alkc Jan 17, 2025
b4dc2b9
view-cleanup
alkc Jan 17, 2025
282eab2
forgot to reassign channel
alkc Jan 17, 2025
e6b122e
view edits in ch_Versions
alkc Jan 17, 2025
cd33de9
debug
alkc Jan 17, 2025
a4f8ada
add .first() to all versions
alkc Jan 17, 2025
f1af84e
try collect
alkc Jan 17, 2025
df880eb
collect without index?
alkc Jan 17, 2025
78a1652
workaround for outputting with group name
alkc Jan 17, 2025
0458fb4
Merge branch 'master' into alkc/dsl2
alkc Jan 17, 2025
660e529
do not redeclare assay
alkc Jan 17, 2025
8e3e7c7
fix expansionhunter input spec
alkc Jan 17, 2025
aff21f6
add onComplete
alkc Jan 17, 2025
3cbc74f
try moving workflow.complete outside of def?
alkc Jan 20, 2025
45f6ef3
move onComplete outside of workflow
alkc Jan 20, 2025
aa9680a
convert csv to file to get that basename
alkc Jan 20, 2025
af86b1b
forgot to remove onComplete from the workflow
alkc Jan 20, 2025
8f52f45
clean up unneeded print/log statements
alkc Jan 20, 2025
cf108a3
conditional error report
alkc Jan 20, 2025
eff1f49
do not print versions
alkc Jan 20, 2025
667537e
move gatkref to top of workflow
alkc Jan 20, 2025
e987cfd
fix missing gatk_filter input to svdb_merge_panel
alkc Jan 20, 2025
8cc4f75
re-add svvcf_to_bed
alkc Jan 20, 2025
4d2adae
only run expansionhunter etc for proband
alkc Jan 20, 2025
a127364
add to-do note
alkc Jan 20, 2025
a60c90c
indent shell block
alkc Jan 20, 2025
984fafc
re-add generate gens data
alkc Jan 20, 2025
7b9bbb0
remote debug print statement
alkc Jan 20, 2025
2ba0549
clean up vcfbreakmulti inputs
alkc Jan 20, 2025
6c8f262
join meta by group and id for expansionhunter
alkc Jan 21, 2025
6ca630e
split_normalize_mito fix proband_id
alkc Jan 21, 2025
f03f312
left-join dedup metrics into process
alkc Jan 21, 2025
507cf16
save id in stub
alkc Jan 21, 2025
fe63e5f
add to-do note
alkc Jan 21, 2025
5e4e44a
minor fixes
alkc Jan 21, 2025
58e868c
join peddy inputs
alkc Jan 21, 2025
124e827
join gatkcov channels
alkc Jan 21, 2025
82b06f6
do not redefine id
alkc Jan 21, 2025
8fb64d2
remove stray comma
alkc Jan 21, 2025
b8f8d97
reorganize overview_plot inputs
alkc Jan 21, 2025
b91db5f
fix generate_gens_data inputs
alkc Jan 21, 2025
93c0238
improve .command.sh readability
alkc Jan 21, 2025
7cdc048
fix add_to_loqusdb inputs
alkc Jan 21, 2025
bc2d0c5
add documentation + todo for mito false positive
alkc Jan 21, 2025
3f4415c
fix missing output name
alkc Jan 21, 2025
083a2ef
group qc json output
alkc Jan 21, 2025
3b0bce4
add to-do note
alkc Jan 21, 2025
54dd75d
fix mother of all diffs
alkc Jan 23, 2025
0a6a927
fix parenthesis
alkc Jan 23, 2025
5834d21
fix slashes and param notation
alkc Jan 23, 2025
a2e9c25
emit multibreak vcf in own output channel
alkc Jan 23, 2025
97cedc7
join eklipse input channels
alkc Jan 23, 2025
c89d319
mixes and fixes
alkc Jan 23, 2025
8bdec04
Merge branch 'master' into alkc/dsl2
alkc Jan 23, 2025
21dafa0
re-add check for reference index
alkc Jan 28, 2025
6f7470c
clean-out re-implemented channels
alkc Jan 28, 2025
996ee73
minor edits
alkc Jan 28, 2025
6e62142
add missing versions
alkc Jan 28, 2025
e612d23
onco versions adjustments
alkc Jan 28, 2025
175cc7d
fix weird transpose
alkc Jan 28, 2025
f0afc94
try re-enabling cnvkit info
alkc Jan 28, 2025
878750c
update changelog
alkc Jan 28, 2025
10d2ecb
uncomment comment
alkc Jan 29, 2025
524a45b
access elements by key rather than idx
alkc Jan 29, 2025
ed1679c
Merge branch 'alkc/dsl2' of github.com:SMD-Bioinformatics-Lund/nextfl…
alkc Jan 29, 2025
c128a7d
convert idx to var name
alkc Jan 29, 2025
b1cbbd3
fix wrong iter var
alkc Jan 29, 2025
d2a12b6
will putting filter first fix it?
alkc Jan 29, 2025
4c50af7
add split normalize meta
alkc Jan 29, 2025
d569272
more meta mixups
alkc Jan 29, 2025
88660e3
debug view
alkc Jan 29, 2025
d46e8cb
one more debug view
alkc Jan 29, 2025
6e897a8
add todo
alkc Jan 29, 2025
97fdf6b
clarify todo
alkc Jan 29, 2025
19ea876
tidy up sentieon qc postprocess inputs
alkc Jan 29, 2025
493d910
another channel spec clean
alkc Jan 29, 2025
454128a
indent reviewer cmd
alkc Jan 29, 2025
133dba6
fix indentation in annotsv docstring
alkc Jan 29, 2025
8fefe52
uncomment more comments
alkc Jan 29, 2025
6021832
uncomment one more comment
alkc Jan 29, 2025
26a06b6
and another one
alkc Jan 29, 2025
9de84b3
remove comment
alkc Jan 29, 2025
b52fe2d
remove another comment
alkc Jan 29, 2025
f2fef0b
rude commit
alkc Jan 30, 2025
7b4fa88
restore single backslashes to reviewer
alkc Jan 30, 2025
c426a62
add todo-note for different version stubs
alkc Jan 30, 2025
ac9e452
somewhat better name for ch_ped_trio
alkc Jan 30, 2025
36958a6
silence unused var warning
alkc Jan 30, 2025
71b8e5e
fix no-sv loqus dummy svvcf file
alkc Jan 31, 2025
7f618cb
another attempt at dsl2 loqus sv dummy
alkc Jan 31, 2025
d138bd8
remove debug .view()
alkc Jan 31, 2025
aebe8e7
clean up of old channel conf setup
alkc Jan 31, 2025
15a8a1d
remove todo
alkc Jan 31, 2025
06b7504
remove stray log
alkc Jan 31, 2025
77bea11
Merge branch 'master' into alkc/dsl2
alkc Jan 31, 2025
12 changes: 11 additions & 1 deletion CHANGELOG.md
@@ -1,11 +1,21 @@
# CHANGELOG

### 3.15.0

[Review comment, Contributor] Could maybe be version 4.0.0 even? A small step for the output, but a big step for the code.

* Rewrite main.nf to DSL2
* Refactor everything into one workflow `NEXTFLOW_WGS`.
* Remove contamination check for non-wgs profiles
* Remove `melt_qc_val` process (now exists in main workflow)
* Remove `dummy_svvcf_for_loqusdb` process
* Temp disable annotation-only runs (will be re-added later)

### 3.14.5
* Include --case-id flag with group-ID in Gens load command

### 3.14.4
* Routine update of bed intersect file


### 3.14.3
* Fix rankscore parsing in `cnv2bed.pl`

@@ -15,7 +25,7 @@
### 3.14.1

* Adds basic flake8-based linting
-* Removes unused scripts from /
+* Removes unused scripts from `bin/`
* Fixes wrong var name in `bin/normalize_caller_names_in_svdb_fields.py`
* Fix wrong var assignment in `bin/normalize_caller_names_in_svdb_fields.py` that led to caller names not being normalized for wgs trios.
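
To illustrate what the 3.15.0 entry above means by rewriting `main.nf` to DSL2 and wrapping everything in one `NEXTFLOW_WGS` workflow, here is a minimal sketch. It is not taken from the pipeline: the process names `ALIGN` and `CALL_VARIANTS`, the placeholder `touch` commands, the `params.csv` samplesheet parameter, and the `group`, `id` and `read1` column names are all assumptions made for illustration. It only shows the general DSL2 pattern that several commit messages in this PR refer to (tuple inputs, `groupTuple`, `.first()` on version channels).

```nextflow
// Hypothetical sketch only: names and commands are illustrative, not from main.nf.
nextflow.enable.dsl = 2

process ALIGN {
    tag "${id}"

    input:
    tuple val(group), val(id), path(reads)

    output:
    tuple val(group), val(id), path("${id}.bam"), emit: bam
    path "versions.yml", emit: versions

    script:
    """
    touch ${id}.bam
    echo "ALIGN: placeholder" > versions.yml
    """
}

process CALL_VARIANTS {
    tag "${group}"

    input:
    tuple val(group), val(ids), path(bams)

    output:
    tuple val(group), path("${group}.vcf"), emit: vcf

    script:
    """
    touch ${group}.vcf
    """
}

workflow NEXTFLOW_WGS {
    take:
    ch_samples  // items shaped as tuple(group, id, reads)

    main:
    ALIGN(ch_samples)

    // groupTuple() groups on the first tuple element (group), so each
    // family arrives at the caller as a single item with lists of ids and BAMs.
    ch_grouped = ALIGN.out.bam.groupTuple()
    CALL_VARIANTS(ch_grouped)

    emit:
    vcf      = CALL_VARIANTS.out.vcf
    versions = ALIGN.out.versions.first()  // one versions file is enough
}

workflow {
    // Assumed samplesheet layout: a CSV with group, id and read1 columns.
    ch_samples = Channel
        .fromPath(params.csv)
        .splitCsv(header: true)
        .map { row -> tuple(row.group, row.id, file(row.read1)) }

    NEXTFLOW_WGS(ch_samples)
}
```

In DSL2, the order of elements in each tuple must match the order of the process `input:` declaration. Commit messages in this PR such as "fix order of inputs to add to loqus" and "input order ruins the day again" point at that kind of mismatch.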

6 changes: 3 additions & 3 deletions configs/nextflow.hopper.config
@@ -26,8 +26,7 @@ params {

// SENTIEON CONFIGS //
sentieon_model = '/fs1/resources/ref/sw/sentieon/SentieonDNAscopeModelBeta0.4a-201808.05.model'
-bwa_shards = 8
-shardbwa = false
+bwa_K_size = 100000000
git = "$baseDir/git.hash"

// CPU counts //
@@ -63,7 +62,7 @@ params {
PHASTCONS = "${refpath}/annotation_dbs/hg38.phastCons100way.bw"

// ANNOTATION DBS GENERAL //
-KNOWN = "${refpath}/annotation_dbs/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz"
+KNOWN_SITES = "${refpath}/annotation_dbs/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz"
CLINVAR = "${refpath}/annotation_dbs/clinvar38_latest.vcf.gz"
OMIM_GENES = "${refpath}/annotation_dbs/omim/2020-10-05_omim_genes.tsv"

@@ -113,6 +112,7 @@ params {

run_chanjo2 = true
reanalyze = false
+annotate_only = false
}

process {
2 changes: 1 addition & 1 deletion docs/annotation_files.md
@@ -9,7 +9,7 @@ To run the pipeline, you need to setup a number of annotation files. Default val
| `rCRS_fasta` | Reference genome | `fasta` | Mitochondrial FASTA reference sequence found [here](https://www.ncbi.nlm.nih.gov/nuccore/251831106) |
| `bwa_shards` | Alignment | integer | Number of shards to split the reads into prior to alignment |
| `shardbwa` | Alignment | boolean | Boolean specifying whether to do alignment in sharded mode |
-| `KNOWN` | Alignment | `vcf` | Gold-standard indels used in Sentieon's base quality score recalibration (BQSR) |
+| `KNOWN_SITES` | Alignment | `vcf` | Gold-standard indels used in Sentieon's base quality score recalibration (BQSR) |
| `VEP_CACHE` | Annotation (VEP) | folder | VEP files for offline run. Instructions on how to setup a cache can be found [here](https://www.ensembl.org/info/docs/tools/vep/script/vep_cache.html#cache). |
| `VEP_FASTA` | Annotation (VEP) | `fasta` | Reference sequence used for optimization within the VEP cache |
| `SYNONYMS` | Annotation (VEP) | `tsv` | Chromosome synonyms used by VEP (for instance to recognize `M` as `MT`) |