Merge pull request #202 from nf-core/dev
3.2 release candidate
chris-cheshire authored Aug 31, 2023
2 parents 42502fb + 6afd8a2 commit 506a325
Showing 151 changed files with 6,670 additions and 2,021 deletions.
2 changes: 1 addition & 1 deletion .editorconfig
@@ -8,7 +8,7 @@ trim_trailing_whitespace = true
indent_size = 4
indent_style = space

[*.{md,yml,yaml,html,css,scss,js,cff}]
[*.{md,yml,yaml,html,css,scss,js}]
indent_size = 2

# These files are edited and tested upstream in nf-core/modules
1 change: 0 additions & 1 deletion .github/CONTRIBUTING.md
@@ -116,4 +116,3 @@ To get started:
Devcontainer specs:

- [DevContainer config](.devcontainer/devcontainer.json)
- [Dockerfile](.devcontainer/Dockerfile)
4 changes: 2 additions & 2 deletions .github/ISSUE_TEMPLATE/bug_report.yml
@@ -42,9 +42,9 @@ body:
attributes:
label: System information
description: |
* Nextflow version _(eg. 22.10.1)_
* Nextflow version _(eg. 23.04.0)_
* Hardware _(eg. HPC, Desktop, Cloud)_
* Executor _(eg. slurm, local, awsbatch)_
* Container engine: _(e.g. Docker, Singularity, Conda, Podman, Shifter or Charliecloud)_
* Container engine: _(e.g. Docker, Singularity, Conda, Podman, Shifter, Charliecloud, or Apptainer)_
* OS _(eg. CentOS Linux, macOS, Linux Mint)_
* Version of nf-core/cutandrun _(eg. 1.1, 1.5, 1.8.2)_
3 changes: 2 additions & 1 deletion .github/PULL_REQUEST_TEMPLATE.md
@@ -15,7 +15,8 @@ Learn more about contributing: [CONTRIBUTING.md](https://github.com/nf-core/cuta

- [ ] This comment contains a description of changes (with reason).
- [ ] If you've fixed a bug or added code that should be tested, add tests!
- [ ] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/nf-core/cutandrun/tree/master/.github/CONTRIBUTING.md)- [ ] If necessary, also make a PR on the nf-core/cutandrun _branch_ on the [nf-core/test-datasets](https://github.com/nf-core/test-datasets) repository.
- [ ] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/nf-core/cutandrun/tree/master/.github/CONTRIBUTING.md)
- [ ] If necessary, also make a PR on the nf-core/cutandrun _branch_ on the [nf-core/test-datasets](https://github.com/nf-core/test-datasets) repository.
- [ ] Make sure your code lints (`nf-core lint`).
- [ ] Ensure the test suite passes (`nextflow run . -profile test,docker --outdir <OUTDIR>`).
- [ ] Usage Documentation in `docs/usage.md` is updated.
11 changes: 8 additions & 3 deletions .github/workflows/awsfulltest.yml
@@ -14,18 +14,23 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Launch workflow via tower
uses: nf-core/tower-action@v3
uses: seqeralabs/action-tower-launch@v2
with:
workspace_id: ${{ secrets.TOWER_WORKSPACE_ID }}
access_token: ${{ secrets.TOWER_ACCESS_TOKEN }}
compute_env: ${{ secrets.TOWER_COMPUTE_ENV }}
revision: ${{ github.sha }}
workdir: s3://${{ secrets.AWS_S3_BUCKET }}/work/cutandrun/work-${{ github.sha }}
parameters: |
{
"hook_url": "${{ secrets.MEGATESTS_ALERTS_SLACK_HOOK_URL }}",
"outdir": "s3://${{ secrets.AWS_S3_BUCKET }}/cutandrun/results-${{ github.sha }}"
}
profiles: test_full,aws_tower
profiles: test_full

- uses: actions/upload-artifact@v3
with:
name: Tower debug log file
path: tower_action_*.log
path: |
tower_action_*.log
tower_action_*.json
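
The `workdir` and `outdir` values in the two AWS test workflows are plain string templates over the bucket secret and the commit SHA. A minimal Python sketch of how they expand, assuming a hypothetical bucket name and a helper `tower_paths` that is not part of the workflows:

```python
import json


def tower_paths(bucket, sha, full=True):
    """Expand the S3 path templates used by the AWS test workflows.

    `bucket` stands in for secrets.AWS_S3_BUCKET and `sha` for github.sha;
    both values used below are placeholders.
    """
    # awsfulltest.yml writes to results-<sha>, awstest.yml to results-test-<sha>.
    results = f"results-{sha}" if full else f"results-test-{sha}"
    return {
        "workdir": f"s3://{bucket}/work/cutandrun/work-{sha}",
        "outdir": f"s3://{bucket}/cutandrun/{results}",
    }


# github.sha is the full commit SHA in a real run; the short form is used here for brevity.
paths = tower_paths("example-bucket", "506a325")
print(json.dumps(paths, indent=2))
```

Keying both paths on the commit SHA keeps the work directory and results of each run separate from those of other commits.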
10 changes: 7 additions & 3 deletions .github/workflows/awstest.yml
@@ -12,18 +12,22 @@ jobs:
steps:
# Launch workflow using Tower CLI tool action
- name: Launch workflow via tower
uses: nf-core/tower-action@v3
uses: seqeralabs/action-tower-launch@v2
with:
workspace_id: ${{ secrets.TOWER_WORKSPACE_ID }}
access_token: ${{ secrets.TOWER_ACCESS_TOKEN }}
compute_env: ${{ secrets.TOWER_COMPUTE_ENV }}
revision: ${{ github.sha }}
workdir: s3://${{ secrets.AWS_S3_BUCKET }}/work/cutandrun/work-${{ github.sha }}
parameters: |
{
"outdir": "s3://${{ secrets.AWS_S3_BUCKET }}/cutandrun/results-test-${{ github.sha }}"
}
profiles: test,aws_tower
profiles: test

- uses: actions/upload-artifact@v3
with:
name: Tower debug log file
path: tower_action_*.log
path: |
tower_action_*.log
tower_action_*.json
2 changes: 1 addition & 1 deletion .github/workflows/branch.yml
@@ -13,7 +13,7 @@ jobs:
- name: Check PRs
if: github.repository == 'nf-core/cutandrun'
run: |
{ [[ ${{github.event.pull_request.head.repo.full_name }} == nf-core/cutandrun ]] && [[ $GITHUB_HEAD_REF = "dev" ]]; } || [[ $GITHUB_HEAD_REF == "patch" ]]
{ [[ ${{github.event.pull_request.head.repo.full_name }} == nf-core/cutandrun ]] && [[ $GITHUB_HEAD_REF == "dev" ]]; } || [[ $GITHUB_HEAD_REF == "patch" ]]
# If the above check failed, post a comment on the PR explaining the failure
# NOTE - this doesn't currently work if the PR is coming from a fork, due to limitations in GitHub actions secrets
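
The shell test in this hunk passes when the PR head is the `dev` branch of `nf-core/cutandrun` itself, or when the head branch is named `patch` regardless of repository. Inside `[[ ]]`, `=` and `==` are equivalent, so the edit is stylistic. A minimal Python sketch of the same boolean logic, with a hypothetical function name:

```python
def pr_check_passes(head_repo, head_ref):
    """Mirror the boolean structure of the `Check PRs` shell test (illustrative only)."""
    # { repo == nf-core/cutandrun && ref == dev; } || ref == patch
    from_upstream_dev = head_repo == "nf-core/cutandrun" and head_ref == "dev"
    from_patch = head_ref == "patch"
    return from_upstream_dev or from_patch


print(pr_check_passes("nf-core/cutandrun", "dev"))   # upstream dev branch
print(pr_check_passes("someone/cutandrun", "dev"))   # dev branch on a fork
print(pr_check_passes("someone/cutandrun", "patch")) # patch branch on a fork
```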
13 changes: 10 additions & 3 deletions .github/workflows/ci.yml
@@ -28,7 +28,7 @@ jobs:
CAPSULE_LOG: none
strategy:
matrix:
NXF_VER: ["22.10.1", ""]
NXF_VER: ["23.04.0", ""]
steps:
- name: Check out pipeline code
uses: actions/checkout@v3
@@ -70,7 +70,7 @@ jobs:
strategy:
matrix:
# Nextflow versions: check pipeline minimum and current latest
NXF_VER: ["22.10.1", ""]
NXF_VER: ["23.04.0", ""]
steps:
- name: Check out pipeline code
uses: actions/checkout@v3
@@ -148,7 +148,7 @@ jobs:
strategy:
fail-fast: false
matrix:
NXF_VER: ["22.10.1", ""]
NXF_VER: ["23.04.0", ""]
tags:
- test_genome_options
- test_genome_options_spikein
@@ -180,13 +180,20 @@ jobs:
- verify_output_skip_fastqc
- verify_output_save_ref
- verify_output_align_only_align
- verify_output_align_only_align_end_to_end
- verify_output_align_only_align_local
- verify_output_align_intermed
- verify_output_align_save_spikein_align
- verify_output_align_save_unaligned
- verify_output_align_duplicates_mark
- verify_output_align_duplicates_remove
- verify_output_align_duplicates_remove_target
- verify_output_align_linear_duplicates_remove
- verify_output_align_linear_duplicates_remove_target
- verify_output_only_filtering
- verify_output_only_filtering_with_mitochondrial_reads
- verify_output_only_filtering_without_mitochondrial_reads
- verify_output_only_filtering_without_mitochondrial_reads_mito_name_null
- verify_output_peak_calling_only_peak_calling
- verify_output_reporting_skip_preseq_false
- verify_output_reporting_skip_preseq_true
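
A GitHub Actions matrix runs one job per combination of its axes, so each new tag above adds two jobs: one on the pipeline's minimum Nextflow version and one on the latest. A minimal Python sketch of that expansion, using only two of the tags shown in this diff:

```python
from itertools import product

# "" is the workflow's placeholder for the current latest Nextflow release.
nxf_versions = ["23.04.0", ""]
# A two-tag subset of the matrix; the real list is much longer.
tags = ["test_genome_options", "test_genome_options_spikein"]

# One job per (version, tag) pair, as in a matrix build.
jobs = [{"NXF_VER": version, "tags": tag} for version, tag in product(nxf_versions, tags)]

for job in jobs:
    label = job["NXF_VER"] or "latest"
    print(f"{job['tags']} @ Nextflow {label}")
```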
24 changes: 24 additions & 0 deletions .github/workflows/clean-up.yml
@@ -0,0 +1,24 @@
name: "Close user-tagged issues and PRs"
on:
schedule:
- cron: "0 0 * * 0" # Once a week

jobs:
clean-up:
runs-on: ubuntu-latest
permissions:
issues: write
pull-requests: write
steps:
- uses: actions/stale@v7
with:
stale-issue-message: "This issue has been tagged as awaiting-changes or awaiting-feedback by an nf-core contributor. Remove stale label or add a comment otherwise this issue will be closed in 20 days."
stale-pr-message: "This PR has been tagged as awaiting-changes or awaiting-feedback by an nf-core contributor. Remove stale label or add a comment if it is still useful."
close-issue-message: "This issue was closed because it has been tagged as awaiting-changes or awaiting-feedback by an nf-core contributor and then staled for 20 days with no activity."
days-before-stale: 30
days-before-close: 20
days-before-pr-close: -1
any-of-labels: "awaiting-changes,awaiting-feedback"
exempt-issue-labels: "WIP"
exempt-pr-labels: "WIP"
repo-token: "${{ secrets.GITHUB_TOKEN }}"
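
The settings above give a tagged issue 30 days of inactivity before it is marked stale and a further 20 days before it is closed, while `days-before-pr-close: -1` means pull requests are marked stale but never closed. A minimal Python sketch of that timeline, with a hypothetical helper name and ignoring the label filters:

```python
from datetime import date, timedelta

DAYS_BEFORE_STALE = 30
DAYS_BEFORE_CLOSE = 20
DAYS_BEFORE_PR_CLOSE = -1  # a negative value disables automatic closing


def stale_timeline(last_activity, is_pr):
    """Return the earliest dates on which an item is marked stale and closed."""
    stale_on = last_activity + timedelta(days=DAYS_BEFORE_STALE)
    close_after = DAYS_BEFORE_PR_CLOSE if is_pr else DAYS_BEFORE_CLOSE
    close_on = None if close_after < 0 else stale_on + timedelta(days=close_after)
    return {"stale_on": stale_on, "close_on": close_on}


print(stale_timeline(date(2023, 8, 31), is_pr=False))
print(stale_timeline(date(2023, 8, 31), is_pr=True))
```

Because the workflow runs only once a week, these are the earliest possible dates; the label or closure is applied at the first scheduled run on or after them.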
2 changes: 1 addition & 1 deletion .github/workflows/linting.yml
@@ -78,7 +78,7 @@ jobs:

- uses: actions/setup-python@v4
with:
python-version: "3.7"
python-version: "3.8"
architecture: "x64"

- name: Install dependencies
5 changes: 5 additions & 0 deletions .gitpod.yml
@@ -1,4 +1,9 @@
image: nfcore/gitpod:latest
tasks:
- name: Update Nextflow and setup pre-commit
command: |
pre-commit install --install-hooks
nextflow self-update
vscode:
extensions: # based on nf-core.nf-core-extensionpack
5 changes: 5 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,5 @@
repos:
- repo: https://github.com/pre-commit/mirrors-prettier
rev: "v2.7.1"
hooks:
- id: prettier
44 changes: 40 additions & 4 deletions CHANGELOG.md
@@ -3,12 +3,48 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [3.2] - 2023-08-31

### Major Changes

- [[#189](https://github.com/nf-core/cutandrun/pull/189)] - Duplicates arising from linear amplification can now be removed by setting `--remove_linear_duplicates true`. `false` is default. [Linear amplification](https://doi.org/10.1186/1471-2164-4-19) is used in the [TIPseq protocol](https://doi.org/10.1083/jcb.202103078), in which genomic DNA is cut with Tn5 loaded with a T7 promoter sequence that gets inserted into the cut DNA fragment. The T7 promoter sequence is then used to perform in vitro transcription to produce copies of the cut fragment. These duplicates are referred to as linear duplicates. Recent iterations of the CUT&Tag protocol, such as [nano-CUT&Tag](https://doi.org/10.1038/s41587-022-01535-4), have also been modified to include a linear amplification step. Credit to teemuronkko for this.
- [[#208](https://github.com/nf-core/cutandrun/issues/208)] - Updated the genome blacklists file to more accurate CUT&RUN specific regions rather than the old ChIP-Seq ENCODE blacklist. This should improve mapping rates and reduce spurious peaks. Credit to Adrija K for this. [[The CUT&RUN suspect list of problematic regions of the genome](https://doi.org/10.1186/s13059-023-03027-3)]

### Enhancements

- Updated pipeline template to nf-core/tools `2.8`.
- [[#189](https://github.com/nf-core/cutandrun/pull/189)] - Mitochondrial reads can be filtered before peak calling by setting `--remove_mitochondrial_reads true`. `false` is default. If using a custom reference genome, the user can specify the string that denotes the mitochondrial reads in the reference using the `--mito_name` parameter.
- [[#189](https://github.com/nf-core/cutandrun/pull/189)] - The user can now specify explicitly whether the `end-to-end` or `local` mode of Bowtie2 should be used by setting `--end_to_end` to `true` or `false`. `true` is default. In the `end-to-end` mode, all read characters are included when optimising an alignment. If the `local` mode is specified, Bowtie2 might exclude characters from one or both ends of the read to maximise alignment scores.
- [[#189](https://github.com/nf-core/cutandrun/pull/189)] - Added the name of the peak caller in the consensus peaks to make it clearer which peaks were used in the downstream reporting steps.
- [[#196](https://github.com/nf-core/cutandrun/pull/196)] - Extended documentation for most common alternative spike-in genomes, i.e. yeast and fruit fly. Credit to smoe for this.
- The Preseq module `lcextrap` was moved from `local` to `nf-core`
- Updated all nf-core modules to latest versions.
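
The parameters introduced in this release can be combined on a single command line. A minimal Python sketch that assembles such a command, assuming the release is tagged `3.2`, reusing the `test,docker` profile from the PR template, and using `chrM` as a hypothetical mitochondrial contig name; `build_command` is an illustrative helper, not part of the pipeline:

```python
import shlex


def build_command(outdir, *, remove_linear_duplicates=False,
                  remove_mitochondrial_reads=False, mito_name=None,
                  end_to_end=True):
    """Assemble an illustrative `nextflow run` argument list."""
    def flag(value):
        # Booleans are written as true/false, matching the changelog entries.
        return "true" if value else "false"

    cmd = ["nextflow", "run", "nf-core/cutandrun", "-r", "3.2",
           "-profile", "test,docker", "--outdir", outdir,
           "--remove_linear_duplicates", flag(remove_linear_duplicates),
           "--remove_mitochondrial_reads", flag(remove_mitochondrial_reads),
           "--end_to_end", flag(end_to_end)]
    # --mito_name is only needed for custom references.
    if mito_name is not None:
        cmd += ["--mito_name", mito_name]
    return cmd


cmd = build_command("results", remove_mitochondrial_reads=True,
                    mito_name="chrM", end_to_end=False)
print(shlex.join(cmd))
```

The keyword defaults mirror the defaults stated above: linear duplicate and mitochondrial read removal are off, and `end-to-end` alignment is on.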

### Fixes

- Standardised channel structure for the nf-core Bowtie2 `align` module in the local `align_bowtie2` and `prepare_genome` subworkflows to prevent file errors.
- Fixed error caused by altered channel structure of the nf-core `bedtools_intersect` module.

### Software dependencies

Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own [Biocontainer](https://biocontainers.pro/#/registry). This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.

| Dependency | Old version | New version |
| ---------- | ----------- | ----------- |
| `bedtools` | 2.30.0 | 2.31.0 |
| `multiqc` | 1.13 | 1.14 |
| `samtools` | 1.16.1 | 1.17 |

> **NB:** Dependency has been **updated** if both old and new version information is present.
> **NB:** Dependency has been **added** if just the new version information is present.
> **NB:** Dependency has been **removed** if version information isn't present.
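
The three notes above amount to a simple rule over the old and new version columns. A minimal Python sketch of that rule, reading "version information isn't present" as "no new version is listed", which is an assumption; `classify` and the `example-tool` rows are hypothetical:

```python
def classify(old_version, new_version):
    """Classify a dependency table row by which version columns are filled in."""
    if old_version and new_version:
        return "updated"
    if new_version:
        return "added"
    # Assumed reading: no new version listed means the tool was dropped.
    return "removed"


print(classify("1.16.1", "1.17"))  # the samtools row from the table
print(classify(None, "1.0"))       # hypothetical row with only a new version
print(classify("1.0", None))       # hypothetical row with only an old version
```
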
## [3.1] - 2023-02-20

### Major Changes

- IgG controls will now be analysed by the deeptools QC subworkflow giving greater visibility on the quality of control samples.
- Updated the MACS2 default parameters to better process PA-Tn5/PA-Mnase based experiments. The new defaults use the q-value of `0.01` as the default cutoff in place of the p-value. The defaults have also been updated to keep duplicate reads int he peak finding process and also to shift the model to better account for nucleosome positioning `--nomodel --shift -75 --extsize 150 --keep-dup all`
- Updated the MACS2 default parameters to better process PA-Tn5/PA-Mnase based experiments. The new defaults use the q-value of `0.01` as the default cutoff in place of the p-value. The defaults have also been updated to keep duplicate reads in the peak finding process and also to shift the model to better account for nucleosome positioning `--nomodel --shift -75 --extsize 150 --keep-dup all`
- Deeptools plotHeatmap will now run for all samples as well as for singles. This can be disabled using the parameter `--dt_calc_all_matrix false`
- Bowtie2 default parameters have been updated to use the `--dovetail` option. After careful consideration and literature review, we have decided that overlapping mates can occur in CUT&RUN data and are still valid reads. This is also the agreed parameterisation in similar pipelines and also on the 4D nucleome portal.

@@ -24,7 +60,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Fixed deeptools correlation plots that were showing low levels of correlation even in test data by changing the plot to use Pearson correlation.
- Corrected the SEACR p-value parameter description.
- Fixed output of Picard mark/remove duplicate files so that the sorted, indexed bams for all files always output to the results folder.
- Spikein genome processes and checks no longer run when the normalisation mode is set to something other than `SpikeIn`.
- Spike-in genome processes and checks no longer run when the normalisation mode is set to something other than `SpikeIn`.
- Pipeline will now fail gracefully when single-end reads are detected.

### Software dependencies
@@ -148,7 +184,7 @@ Note, since the pipeline is now using Nextflow DSL2, each process will be run wi
- Added support for GFF files in IGV session generation
- [[#57](https://github.com/nf-core/cutandrun/issues/57), [#66](https://github.com/nf-core/cutandrun/issues/66)] - Upgraded version reporting in multiqc to support both software version by module and unique software versions. This improves detection of multi-version software usage in the pipeline
- [[#54](https://github.com/nf-core/cutandrun/issues/54)] - Fixed pipeline error where dots in sample ids inside the sample sheet were not correctly handled
- [[#75](https://github.com/nf-core/cutandrun/issues/75)] - Fixed error caused by emtpy peak files being passed to the `CALCULATE_FRIP` and `CALCULATE_PEAK_REPROD` python reporting modules
- [[#75](https://github.com/nf-core/cutandrun/issues/75)] - Fixed error caused by empty peak files being passed to the `CALCULATE_FRIP` and `CALCULATE_PEAK_REPROD` python reporting modules
- [[#83]](https://github.com/nf-core/cutandrun/issues/83) - Fixed error in violin chart generation with cast to int64

### Software dependencies
@@ -184,7 +220,7 @@ We thank Harshil Patel ([@drpatelh](https://github.com/drpatelh)) and everyone i
5. Alignment to both target and spike-in genomes ([`Bowtie 2`](http://bowtie-bio.sourceforge.net/bowtie2/index.shtml))
6. Filter on quality, sort and index alignments ([`samtools`](https://sourceforge.net/projects/samtools/files/samtools/))
7. Duplicate read marking ([`picard`](https://broadinstitute.github.io/picard/))
8. Create bedGraph files ([`bedtools`](https://github.com/arq5x/bedtools2/)
8. Create bedGraph files ([`bedtools`](https://github.com/arq5x/bedtools2/))
9. Create bigWig coverage files ([`bedGraphToBigWig`](http://hgdownload.soe.ucsc.edu/admin/exe/))
10. Peak calling specifically tailored for low background noise experiments ([`SEACR`](https://github.com/FredHutch/SEACR))
11. Consensus peak merging and reporting ([`bedtools`](https://github.com/arq5x/bedtools2/))
5 changes: 5 additions & 0 deletions CITATIONS.md
@@ -12,6 +12,8 @@

- [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)

> Andrews, S. (2010). FastQC: A Quality Control Tool for High Throughput Sequence Data [Online]. Available online https://www.bioinformatics.babraham.ac.uk/projects/fastqc/.
- [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/)

> Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.
@@ -55,5 +57,8 @@
- [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)

> Merkel, D. (2014). Docker: lightweight linux containers for consistent development and deployment. Linux Journal, 2014(239), 2. doi: 10.5555/2600239.2600241.
- [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/)

> Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.