
Commit

Merge pull request #164 from LouisLeNezet/dev
Update documentation and miscellaneous modifications for release
LouisLeNezet authored Dec 2, 2024
2 parents d092479 + dbad1ba commit 05a2a4c
Showing 38 changed files with 1,064 additions and 543 deletions.
8 changes: 6 additions & 2 deletions CHANGELOG.md
@@ -3,10 +3,10 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## v1.0.0 - Black Labrador [2024-10-28]
## v1.0.0 - Black Labrador [2024-11-30]

Initial release of nf-core/phaseimpute, created with the [nf-core](https://nf-co.re/) template.
Special thanks to [Matthias Hörtenhuber](https://github.com/mashehu) and [Mazzalab](https://github.com/mazzalab) for the review of this release.
Special thanks to [Matthias Hörtenhuber](https://github.com/mashehu), [Mazzalab](https://github.com/mazzalab) and [Sofia Stamouli](https://github.com/sofstam) for the review of this release.

### `Added`

@@ -25,6 +25,7 @@ Special thanks to [Matthias Hörtenhuber](https://github.com/mashehu) and [Mazza
- [#131](https://github.com/nf-core/phaseimpute/pull/131) - Set normalisation as optional. Fix extension detection function. Add support for validation with vcf files. Concatenate vcf only if more than one file. Change `--phased` to `--phase` for consistency.
- [#143](https://github.com/nf-core/phaseimpute/pull/143) - Improve contigs warning and error logging. The number of chromosomes contigs is summarized if above `max_chr_names`.
- [#146](https://github.com/nf-core/phaseimpute/pull/146) - Add `seed` parameter for `QUILT`.
- [#164](https://github.com/nf-core/phaseimpute/pull/164) - Add additional requirement on input schema `"uniqueEntries": ["panel", "chr"]` and `end` should be greater than `start` in regions.
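
The two input constraints named in the #164 entry can be illustrated outside the pipeline. The following is a minimal Python sketch, not the validation code the pipeline actually runs (that is handled by the schema validator); the helper names `duplicate_keys` and `region_is_valid` and the sample rows are hypothetical.

```python
from collections import Counter


def duplicate_keys(rows, keys=("panel", "chr")):
    """Return the key combinations that appear in more than one row.

    Mirrors the intent of "uniqueEntries": ["panel", "chr"]: each
    (panel, chr) pair should occur at most once in a samplesheet.
    """
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return [combo for combo, n in counts.items() if n > 1]


def region_is_valid(start, end):
    """A region is accepted only if `end` is strictly greater than `start`."""
    return end > start


# Hypothetical samplesheet rows: the third repeats the (panel, chr) pair of the first.
rows = [
    {"panel": "1000GP", "chr": "chr21", "file": "chunks_chr21.txt"},
    {"panel": "1000GP", "chr": "chr22", "file": "chunks_chr22.txt"},
    {"panel": "1000GP", "chr": "chr21", "file": "chunks_chr21_bis.txt"},
]

duplicates = duplicate_keys(rows)
```

Here `duplicates` holds the single repeated pair `("1000GP", "chr21")`, which is the kind of samplesheet the new requirement is meant to reject.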

### `Changed`

Expand Down Expand Up @@ -65,6 +66,7 @@ Special thanks to [Matthias Hörtenhuber](https://github.com/mashehu) and [Mazza
- [#157](https://github.com/nf-core/phaseimpute/pull/157) - Add `chunk_model` as parameter for better control over `GLIMPSE2_CHUNK` and set window size in `GLIMPSE1_CHUNK` and `GLIMPSE2_chunk` to 4mb to reduce number of chunks (empirical).
- [#160](https://github.com/nf-core/phaseimpute/pull/160) - Improve `CHANGELOG.md` and add details to `usage.md`
- [#158](https://github.com/nf-core/phaseimpute/pull/158) - Remove frequency computation and phasing from full test to reduce cost and computational time.
- [#164](https://github.com/nf-core/phaseimpute/pull/164) - Rename `BAM_REGION_SAMTOOLS` to `BAM_EXTRACT_REGION_SAMTOOLS`. Remove `GLIMPSE2_SPLITREFERENCE` as it is not used. Add more steps to `test_all` profile for more exhaustivity.

### `Fixed`

@@ -78,6 +80,7 @@ Special thanks to [Matthias Hörtenhuber](https://github.com/mashehu) and [Mazza
- [#153](https://github.com/nf-core/phaseimpute/pull/153) - Fix getFileExtension function. Fix image in `usage.md`. Fix small warnings and errors with updated language server. `def` has been added when necessary, `:` use instead of `,` in assertions, `_` added to variables not used in closures, `for` loop replaced by `.each{}`, remove unused code / input.
- [#161](https://github.com/nf-core/phaseimpute/pull/161) - Fix `VCF_SPLIT_BCFTOOLS` when only one sample present by updating `BCFTOOLS_PLUGINSPLIT` and adding `BCFTOOLS_QUERY` to get truth samples names for renaming the resulting files.
- [#162](https://github.com/nf-core/phaseimpute/pull/162) - Fix `fai` usage when provided by `genomes` parameter.
- [#164](https://github.com/nf-core/phaseimpute/pull/164) - Improve documentation writing

### `Dependencies`

Expand Down Expand Up @@ -110,3 +113,4 @@ Special thanks to [Matthias Hörtenhuber](https://github.com/mashehu) and [Mazza
[Nicolas Schcolnicov](https://github.com/nschcolnicov)
[Hemanoel Passarelli](https://github.com/hemanoel)
[Matthias Hörtenhuber](https://github.com/mashehu)
[Sofia Stamouli](https://github.com/sofstam)
4 changes: 2 additions & 2 deletions README.md
@@ -21,7 +21,7 @@

**nf-core/phaseimpute** is a bioinformatics pipeline to phase and impute genetic data.

<img src="docs/images/metro/phaseimpute.drawio.png" alt="metromap"/>
<img src="docs/images/metro/MetroMap_animated.svg" alt="metromap"/>

The whole pipeline consists of five main steps, each of which can be run separately and independently. Users are not required to run all steps sequentially and can select specific steps based on their needs:

Expand Down Expand Up @@ -103,7 +103,7 @@ We thank the following people for their extensive assistance in the development

## Contributions and Support

If you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md).
If you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md). Further development tips can be found in the [development documentation](docs/development.md).

For further information or help, don't hesitate to get in touch on the [Slack `#phaseimpute` channel](https://nfcore.slack.com/channels/phaseimpute) (you can join with [this invite](https://nf-co.re/join/slack)).

3 changes: 2 additions & 1 deletion assets/schema_chunks.json
@@ -25,6 +25,7 @@
"errorMessage": "File with chunks per chromosome must be provided. Must have .txt or .bin extension"
}
},
"required": ["panel", "chr", "file"]
"required": ["panel", "chr", "file"],
"uniqueEntries": ["panel", "chr"]
}
}
3 changes: 2 additions & 1 deletion assets/schema_input_panel.json
@@ -30,6 +30,7 @@
"errorMessage": "Panel index file must be provided, cannot contain spaces and must have extension '.vcf' or '.bcf' with optional '.gz' extension and with '.csi' or '.tbi' extension"
}
},
"required": ["panel", "chr", "vcf", "index"]
"required": ["panel", "chr", "vcf", "index"],
"uniqueEntries": ["panel", "chr"]
}
}
2 changes: 1 addition & 1 deletion assets/schema_map.json
@@ -16,7 +16,7 @@
"map": {
"type": "string",
"pattern": "^\\S+\\.(g)?map(\\.gz)?$",
"errorMessage": "Map file must be provided, cannot contain spaces and must have extension '.map' or '.gmap' with optional 'gz' extension"
"errorMessage": "Map file must be provided, cannot contain spaces and must have extension '.map' or '.gmap' with optional '.gz' extension"
}
},
"required": ["chr", "map"]
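
To see which file names the `map` pattern above accepts, the same regular expression can be tried in isolation. This is a small Python sketch under the assumption that Python's `re` module treats this particular pattern the same way as the schema validator's regex engine; `MAP_PATTERN` and `is_valid_map_path` are hypothetical names used only for illustration.

```python
import re

# Same expression as in schema_map.json, with the JSON string escaping removed.
MAP_PATTERN = re.compile(r"^\S+\.(g)?map(\.gz)?$")


def is_valid_map_path(path):
    """Return True if `path` has no whitespace and ends in .map or .gmap, optionally followed by .gz."""
    return MAP_PATTERN.search(path) is not None


# Hypothetical file names used to probe the pattern.
examples = ["chr21.map", "chr21.gmap.gz", "chr21.map.bz2", "my map.map"]
results = {name: is_valid_map_path(name) for name in examples}
```

The first two names match; the third fails because of the `.bz2` suffix, and the fourth fails because it contains a space.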
3 changes: 2 additions & 1 deletion assets/schema_posfile.json
@@ -40,6 +40,7 @@
"errorMessage": "Legend file can be provided, cannot contain spaces and must have extension '.legend' with '.gz' extension"
}
},
"required": ["panel", "chr"]
"required": ["panel", "chr"],
"uniqueEntries": ["panel", "chr"]
}
}
2 changes: 1 addition & 1 deletion conf/steps/chrcheck.config
@@ -30,6 +30,6 @@ process {
}

withName: 'NFCORE_PHASEIMPUTE:CHRCHECK_.*:VCF_CHR_RENAME_BCFTOOLS:BCFTOOLS_ANNOTATE' {
ext.args = ["-Oz", "--no-version", "--write-index=tbi"].join(' ')
ext.args = ["-Oz", "--no-version", "--write-index=tbi"].join(' ')
}
}
10 changes: 0 additions & 10 deletions conf/steps/panel_prep.config
@@ -206,14 +206,4 @@ process {
]
}

withName: 'NFCORE_PHASEIMPUTE:PHASEIMPUTE:VCF_CHUNK_GLIMPSE:GLIMPSE2_SPLITREFERENCE' {
ext.prefix = { "${meta.id}_${meta.chr}_chunks_glimpse2" }
publishDir = [
path: { "${params.outdir}/prep_panel/chunks/glimpse2/" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
enabled: true
]
}

}
8 changes: 4 additions & 4 deletions conf/steps/simulation.config
@@ -12,20 +12,20 @@

process {
// Optional subworkflow to extract regions
withName: 'NFCORE_PHASEIMPUTE:PHASEIMPUTE:BAM_REGION_SAMTOOLS:.*' {
withName: 'NFCORE_PHASEIMPUTE:PHASEIMPUTE:BAM_EXTRACT_REGION_SAMTOOLS:.*' {
publishDir = [ enabled: false ]
tag = {"${meta.id} ${meta.chr}"}
}
withName: 'NFCORE_PHASEIMPUTE:PHASEIMPUTE:BAM_REGION_SAMTOOLS:SAMTOOLS_VIEW' {
withName: 'NFCORE_PHASEIMPUTE:PHASEIMPUTE:BAM_EXTRACT_REGION_SAMTOOLS:SAMTOOLS_VIEW' {
ext.args = ["--output-fmt bam", "--write-index"].join(' ')
ext.prefix = { "${meta.id}_R${meta.region.replace(':','_')}" }
}

withName: 'NFCORE_PHASEIMPUTE:PHASEIMPUTE:BAM_REGION_SAMTOOLS:SAMTOOLS_MERGE' {
withName: 'NFCORE_PHASEIMPUTE:PHASEIMPUTE:BAM_EXTRACT_REGION_SAMTOOLS:SAMTOOLS_MERGE' {
ext.prefix = { "${meta.id}" }
tag = {"${meta.id} ${meta.chr}"}
}
withName: 'NFCORE_PHASEIMPUTE:PHASEIMPUTE:BAM_REGION_SAMTOOLS:SAMTOOLS_INDEX' {
withName: 'NFCORE_PHASEIMPUTE:PHASEIMPUTE:BAM_EXTRACT_REGION_SAMTOOLS:SAMTOOLS_INDEX' {
ext.args = ""
}

5 changes: 4 additions & 1 deletion conf/test_all.config
@@ -29,11 +29,14 @@ params {

// Genome references
fasta = params.pipelines_testdata_base_path + "hum_data/reference_genome/GRCh38.s.fa.gz"

// Panel preparation
panel = "${projectDir}/tests/csv/panel.csv"
phase = true
normalize = true
compute_freq = false
compute_freq = true
chunk_model = "recursive"
remove_samples = "NA12878,NA12891,NA12892"

// Pipeline steps
steps = "all"
26 changes: 26 additions & 0 deletions docs/development.md
@@ -0,0 +1,26 @@
# Tips for development

## Channel management and combination

All channels need to be identified by a meta map. To follow which information is available, the `meta` argument
is suffixed with a combination of the following capital letters:

- I : individual id
- P : panel id
- R : region used
- M : map used
- T : tool used
- G : reference genome used (is it needed ?)
- S : simulation (depth or genotype array)

Therefore, the following channel operation example includes a meta map containing the panel id with the region and tool used:

```nextflow
ch_panel_for_impute.map {
metaPRT, vcf, index -> ...
}
```
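
As a rough aid for reading such names, the suffix convention listed above can be decoded with a small helper. This is a hypothetical Python sketch written purely for illustration; `META_SUFFIX_LETTERS` and `describe_meta` are assumed names and are not part of the pipeline.

```python
# Letter-to-meaning table copied from the list above.
META_SUFFIX_LETTERS = {
    "I": "individual id",
    "P": "panel id",
    "R": "region used",
    "M": "map used",
    "T": "tool used",
    "G": "reference genome used",
    "S": "simulation (depth or genotype array)",
}


def describe_meta(name):
    """Translate a name such as 'metaPRT' into the fields its meta map should carry."""
    if not name.startswith("meta"):
        raise ValueError(f"{name!r} does not start with 'meta'")
    suffix = name[len("meta"):]
    unknown = [letter for letter in suffix if letter not in META_SUFFIX_LETTERS]
    if unknown:
        raise ValueError(f"unknown suffix letters in {name!r}: {unknown}")
    return [META_SUFFIX_LETTERS[letter] for letter in suffix]


fields = describe_meta("metaPRT")
```

For `metaPRT`, `fields` lists the panel id, the region used and the tool used, matching the description of the channel operation example.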

## Release names

The names of releases are composed of a color and a dog breed.
2 changes: 1 addition & 1 deletion docs/images/metro/MetroMap.xml

Large diffs are not rendered by default.
