PON for gens #17

Merged: 17 commits, Apr 3, 2024
64 changes: 64 additions & 0 deletions conf/modules/gens_pon.config
@@ -0,0 +1,64 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Config file for defining DSL2 per module options and publishing paths
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Available keys to override module options:
ext.args = Additional arguments appended to command in module.
ext.args2 = Second set of arguments appended to command in module (multi-tool modules).
ext.args3 = Third set of arguments appended to command in module (multi-tool modules).
ext.prefix = File name prefix for output files.
----------------------------------------------------------------------------------------
*/

process {

withName: '.*GENS_PON.*' {
publishDir = [
enabled: false
]
}

withName: '.*GENS_PON:SAMTOOLS_FAIDX' {
ext.when = { params.fai.equals(null) }
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/gens_pon/references" },
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}

withName: '.*GENS_PON:PICARD_CREATESEQUENCEDICTIONARY' {
ext.when = { params.dict.equals(null) }
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/gens_pon/references" },
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}

withName: '.*GENS_PON:GATK4_PREPROCESSINTERVALS' {
ext.args = { ["--imr OVERLAPPING_ONLY",
"--bin-length ${params.gens_bin_length}"].join(" ")
}
}

withName: '.*GENS_PON:GATK4_COLLECTREADCOUNTS' {
ext.args = {"--format ${params.gens_readcount_format} --imr OVERLAPPING_ONLY"}
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/gens_pon/readcounts" },
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}

withName: '.*GENS_PON:GATK4_CREATEREADCOUNTPANELOFNORMALS' {
ext.args = { ["--minimum-interval-median-percentile 10.0",
"--maximum-chunk-size 29349635"].join(" ")}
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/gens_pon/createreadcountpanelofnormals" },
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}

}
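
The header comment of this file lists the `ext.*` keys that can be overridden per module. As a minimal sketch of such an override, a user-supplied config could reuse the same `withName` selector; the file name `custom.config` and the percentile value below are hypothetical and not part of this PR:

```groovy
// custom.config (hypothetical file name), passed to Nextflow with `-c custom.config`
process {
    // Same selector as in conf/modules/gens_pon.config.
    // A config loaded later is expected to replace ext.args as a whole,
    // so every flag that should stay must be repeated here.
    withName: '.*GENS_PON:GATK4_CREATEREADCOUNTPANELOFNORMALS' {
        ext.args = { [
            "--minimum-interval-median-percentile 5.0", // illustrative value, not a recommendation
            "--maximum-chunk-size 29349635"             // kept from the pipeline default above
        ].join(" ") }
    }
}
```

Because `publishDir` is not touched in this sketch, the publishing settings defined in `conf/modules/gens_pon.config` would be expected to stay in effect.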
6 changes: 3 additions & 3 deletions conf/modules/germlinecnvcaller_cohort.config
@@ -46,8 +46,8 @@ process {

withName: '.*GERMLINECNVCALLER_COHORT:GATK4_PREPROCESSINTERVALS' {
ext.args = { ["--imr OVERLAPPING_ONLY",
- "--padding ${params.padding}",
- "--bin-length ${params.bin_length}"].join(" ")
+ "--padding ${params.gcnv_padding}",
+ "--bin-length ${params.gcnv_bin_length}"].join(" ")
}
}

@@ -71,7 +71,7 @@ process {
}

withName: '.*GERMLINECNVCALLER_COHORT:GATK4_INTERVALLISTTOOLS' {
- ext.args = {"--SUBDIVISION_MODE INTERVAL_COUNT --SCATTER_CONTENT ${params.scatter_content}"}
+ ext.args = {"--SUBDIVISION_MODE INTERVAL_COUNT --SCATTER_CONTENT ${params.gcnv_scatter_content}"}
}

withName: '.*GERMLINECNVCALLER_COHORT:GATK4_DETERMINEGERMLINECONTIGPLOIDY' {
4 changes: 2 additions & 2 deletions conf/test.config
@@ -26,8 +26,8 @@ params {
tools = 'cnvkit'

//Germlinecnvcaller options
- scatter_content = 2
- ploidy_priors = "https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/gatk/contig_ploidy_priors_table.tsv"
+ gcnv_scatter_content = 2
+ gcnv_ploidy_priors = "https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/gatk/contig_ploidy_priors_table.tsv"

// Small reference genome
genome = null
49 changes: 30 additions & 19 deletions docs/usage.md
@@ -107,25 +107,6 @@ If you wish to share such profile (such as upload as supplementary material for

## Workflow specific arguments

- ### germlinecnvcaller
-
- If you are running the pipeline to generate references for the GATK's germlinecnvcalling workflow, you should ensure that you have provided all the mandatory options specified in the table below.
-
- | Mandatory | Optional |
- | ------------------------- | --------------------------------- |
- | fasta/genomes | fai |
- | ploidy_priors<sup>1</sup> | dict |
- | | target_bed/target_interval_list |
- | | exclude_bed/exclude_interval_list |
- | | bin_length |
- | | mappable_regions |
- | | padding |
- | | readcount_format |
- | | scatter_content |
- | | segmental_duplications |
-
- <sup>1</sup> To learn more about this file, see [this comment](https://gatk.broadinstitute.org/hc/en-us/community/posts/360074399831/comments/13441240230299) on GATK forum.<br />
-
### cnvkit

If you are running the pipeline to generate references for the CNVkit variant calling workflow, you should consider that currently the default method for this pipeline is whole-genome. In order to use the CNVkit default, i.e. hybrid capture, when the user is creating a background for targeted capture sequencing (most commonly, exomes or panels), the user should
@@ -144,6 +125,36 @@ process {

2. provide the `--cnvkit_target` parameter (optional) as a .bed file for the targets

+ ### gens
+
+ If you are running the pipeline to generate references for the gens workflow, you should ensure that you have provided all the mandatory options specified in the table below.
+
+ | Mandatory | Optional |
+ | ------------- | --------------------- |
+ | fasta/genomes | fai |
+ | | dict |
+ | | gens_bin_length |
+ | | gens_readcount_format |
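
As a rough illustration of the gens table above, the options could be collected in a `params` block in the style of `conf/test.config`. This is only a sketch: the `tools` value and the FASTA path are assumptions that do not appear in this PR, while the two `gens_*` values are the defaults added to `nextflow.config`.

```groovy
// Sketch of a profile-style config for the gens workflow (values partly hypothetical)
params {
    tools                 = 'gens'                  // assumption: value that selects the gens workflow
    fasta                 = '/path/to/genome.fasta' // placeholder path; mandatory unless a genome key is used

    // fai and dict are optional: conf/modules/gens_pon.config runs SAMTOOLS_FAIDX and
    // PICARD_CREATESEQUENCEDICTIONARY only when the corresponding parameter is null
    fai                   = null
    dict                  = null

    gens_bin_length       = 100                     // default from nextflow.config
    gens_readcount_format = 'HDF5'                  // default from nextflow.config
}
```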

+ ### germlinecnvcaller
+
+ If you are running the pipeline to generate references for the GATK's germlinecnvcalling workflow, you should ensure that you have provided all the mandatory options specified in the table below.
+
+ | Mandatory | Optional |
+ | ------------------------------ | ------------------------------------------- |
+ | fasta/genomes | fai |
+ | gcnv_ploidy_priors<sup>1</sup> | dict |
+ | | gcnv_target_bed/gcnv_target_interval_list |
+ | | gcnv_exclude_bed/gcnv_exclude_interval_list |
+ | | gcnv_bin_length |
+ | | gcnv_mappable_regions |
+ | | gcnv_padding |
+ | | gcnv_readcount_format |
+ | | gcnv_scatter_content |
+ | | gcnv_segmental_duplications |
+
+ <sup>1</sup> To learn more about this file, see [this comment](https://gatk.broadinstitute.org/hc/en-us/community/posts/360074399831/comments/13441240230299) on GATK forum.<br />
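
Since this PR renames the germlinecnvcaller options with a `gcnv_` prefix, an existing configuration would need the new keys. The sketch below lists the renamed parameters next to their former names as they appear in this diff; the `tools` value and both file paths are placeholders assumed for illustration.

```groovy
// Sketch of a params block using the renamed germlinecnvcaller options
params {
    tools                 = 'germlinecnvcaller'          // assumption: value that selects this workflow
    fasta                 = '/path/to/genome.fasta'      // placeholder path
    gcnv_ploidy_priors    = '/path/to/ploidy_priors.tsv' // placeholder path; formerly ploidy_priors

    gcnv_bin_length       = 1000    // formerly bin_length
    gcnv_padding          = 0       // formerly padding
    gcnv_readcount_format = 'HDF5'  // formerly readcount_format
    gcnv_scatter_content  = 5000    // formerly scatter_content
}
```

The numeric and format values are the defaults from `nextflow.config`, so they only need to be set when a different value is wanted.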

## Core Nextflow arguments

:::note
17 changes: 10 additions & 7 deletions main.nf
@@ -30,13 +30,16 @@ include { getGenomeAttribute } from './subworkflows/local/utils_nfcore_crea

// This is an example of how to use getGenomeAttribute() to fetch parameters
// from igenomes.config using `--genome`
- params.fasta = getGenomeAttribute('fasta')
- params.fai = getGenomeAttribute('fai')
- params.dict = getGenomeAttribute('dict')
- params.target_bed = getGenomeAttribute('target_bed')
- params.target_interval_list = getGenomeAttribute('target_interval_list')
- params.exclude_bed = getGenomeAttribute('exclude_bed')
- params.exclude_interval_list = getGenomeAttribute('exclude_interval_list')
+ params.fasta = getGenomeAttribute('fasta')
+ params.fai = getGenomeAttribute('fai')
+ params.dict = getGenomeAttribute('dict')
+ params.gcnv_exclude_bed = getGenomeAttribute('gcnv_exclude_bed')
+ params.gcnv_exclude_interval_list = getGenomeAttribute('gcnv_exclude_interval_list')
+ params.gcnv_mappable_regions = getGenomeAttribute('gcnv_mappable_regions')
+ params.gcnv_target_bed = getGenomeAttribute('gcnv_target_bed')
+ params.gcnv_target_interval_list = getGenomeAttribute('gcnv_target_interval_list')
+ params.gcnv_ploidy_priors = getGenomeAttribute('gcnv_ploidy_priors')
+ params.gcnv_segmental_duplications = getGenomeAttribute('gcnv_segmental_duplications')

/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
5 changes: 5 additions & 0 deletions modules.json
@@ -25,6 +25,11 @@
"git_sha": "3f5420aa22e00bd030a2556dfdffc9e164ec0ec5",
"installed_by": ["modules"]
},
+ "gatk4/createreadcountpanelofnormals": {
+ "branch": "master",
+ "git_sha": "3f5420aa22e00bd030a2556dfdffc9e164ec0ec5",
+ "installed_by": ["modules"]
+ },
"gatk4/determinegermlinecontigploidy": {
"branch": "master",
"git_sha": "3f5420aa22e00bd030a2556dfdffc9e164ec0ec5",

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

55 changes: 55 additions & 0 deletions modules/nf-core/gatk4/createreadcountpanelofnormals/main.nf

45 changes: 45 additions & 0 deletions modules/nf-core/gatk4/createreadcountpanelofnormals/meta.yml

29 changes: 16 additions & 13 deletions nextflow.config
@@ -23,23 +23,25 @@ params {
tools = null // No default, must be specified

// Germlinecnvcaller options
- analysis_type = 'wgs'
- bin_length = 1000
- mappable_regions = null
- padding = 0
- ploidy_priors = null
- readcount_format = 'HDF5'
- scatter_content = 5000
- segmental_duplications = null
+ gcnv_analysis_type = 'wgs'
+ gcnv_bin_length = 1000
+ gcnv_padding = 0
+ gcnv_readcount_format = 'HDF5'
+ gcnv_scatter_content = 5000
+ gcnv_segmental_duplications = null
+
+ // Gens options
+ gens_bin_length = 100
+ gens_readcount_format = 'HDF5'

// CNVkit options
- cnvkit_targets = null
+ cnvkit_targets = null

// MultiQC options
- multiqc_config = null
- multiqc_title = null
- multiqc_logo = null
- max_multiqc_email_size = '25.MB'
+ multiqc_config = null
+ multiqc_title = null
+ multiqc_logo = null
+ max_multiqc_email_size = '25.MB'
multiqc_methods_description = null

// Boilerplate options
@@ -258,6 +260,7 @@ manifest {
includeConfig 'conf/modules/base.config'
includeConfig 'conf/modules/cnvkit.config'
includeConfig 'conf/modules/germlinecnvcaller_cohort.config'
+ includeConfig 'conf/modules/gens_pon.config'

// Function to ensure that resource requirements don't go beyond
// a maximum limit