Commit

Merge pull request #39 from sanger-tol/dev
Release 0.10.0
ksenia-krasheninnikova authored Apr 18, 2024
2 parents 300735f + 0276f81 commit 31b508a
Showing 30 changed files with 793 additions and 245 deletions.
14 changes: 12 additions & 2 deletions .github/workflows/ci.yml
@@ -22,13 +22,19 @@ jobs:
name: Run pipeline with test data
# Only run on push if this is the nf-core dev branch (merged PRs)
if: "${{ github.event_name != 'push' || (github.event_name == 'push' && github.repository == 'sanger-tol/genomeassembly') }}"
-runs-on: ubuntu2204-16c
+runs-on: ubuntu2204-4c
strategy:
matrix:
NXF_VER:
- "22.10.1"
- "latest-everything"
steps:
- name: Login to GitHub Container Registry
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Check out pipeline code
uses: actions/checkout@v3

@@ -37,9 +43,13 @@ jobs:
with:
version: "${{ matrix.NXF_VER }}"

- name: Set up nextflow secrets
run: |
nextflow secrets set NCBI_API_KEY ${{ secrets.NCBI_API_KEY }}
- name: Download test data
run: |
-curl https://darwin.cog.sanger.ac.uk/genomeassembly_test_data.tar.gz | tar xzf -
+curl https://tolit.cog.sanger.ac.uk/test-data/resources/genomeassembly/genomeassembly_test_data.tar.gz | tar xzf -
- name: Setup apptainer
uses: eWaterCycle/setup-apptainer@main
23 changes: 23 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,28 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [[0.10.0](https://github.com/sanger-tol/genomeassembly/releases/tag/0.10.0)] - Hideous Zippleback - [2024-04-16]

### Enhancements & fixes

- OATK module is added into the ORGANELLES subworkflow
- ORGANELLES subworkflow is now called once in the main workflow and runs MITOHIFI in read and assembly mode along with OATK
- ORGANELLES module is now tested in github CI
- NCBI API secret introduced to run MITOHIFI_FINDMITOREFERENCE module
- hifiasm haplotigs are not purged anymore
- Longranger container version is updated

### Software dependencies

Note, since the pipeline is using Nextflow DSL2, each process will be run with its own [Biocontainer](https://biocontainers.pro/#/registry). This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.

| Dependency | Old version | New version |
| ---------- | ----------- | ----------- |
| mitohifi | 3.0.0 | 3.1.1 |
| oatk | | 1.0 |

**NB:** Dependency has been **added** if just the new version information is present.

## [[0.9.0](https://github.com/sanger-tol/genomeassembly/releases/tag/0.9.0)] - Night Fury - [2023-12-15]

Initial release of sanger-tol/genomeassembly, created with the [nf-core](https://nf-co.re/) template.
@@ -50,6 +72,7 @@ Note, since the pipeline is using Nextflow DSL2, each process will be run with i
| fastk | | f18a4e6d2207539f7b84461daebc54530a9559b0 |
| freebayes | | 1.3.6 |
| gatk4 | | 4.4.0.0 |
| genescope | | 380815c420f50171f9234a0fd1ff426b39829b91 |
| gfastats | | 1.3.5 |
| GNU Awk | | 5.1.0 |
| hifiasm | | 0.19.3-r572 |
3 changes: 2 additions & 1 deletion README.md
@@ -35,7 +35,8 @@ While the steps are described in a sequential order, many of them can be execute
1. Illumina 10X reads to the joined primary and alt contigs.
2. polish initial assembly based on the aligment produced in [9i]. Set polished primary contigs as the primary assembly and polished haplotigs as the haplotig assembly.
3. produce numerical stats, BUSCO score and QV, completeness metrics, and kmer spectra for [9ii].
-10. Run organelles subworkflow on the joined primary and haplotigs contigs.
+10. If <code>organelles_on</code>
+1. Run organelles subworkflow on the raw HiFi read data and the joined primary and haplotigs contigs.
11. Map HiC data onto primary contigs.
12. Run scaffolding for primary contigs.
13. Produce numerical stats, BUSCO score and QV, completeness metrics, and kmer spectra for [12].
34 changes: 34 additions & 0 deletions assets/schema_input.json
@@ -44,6 +44,40 @@
"errorMessage": "busco lineage to run"
}
}
},
"mito": {
"type": "object",
"properties": {
"species": {
"type": "string",
"errorMessage": "Latin name"
},
"min_length": {
"type": "string",
"errorMessage": "Minimal allowed length of the mito reference"
},
"email": {
"type": "string",
"errorMessage": "email to query NCBI"
},
"code": {
"type": "string",
"errorMessage": "Mitochondrial code"
},
"fam": {
"type": "string",
"errorMessage": "Path to mitochondrial HMM for OATK"
}
}
},
"plastid": {
"type": "object",
"properties": {
"fam": {
"type": "string",
"errorMessage": "Path to plastid HMM for OATK"
}
}
}
},
"required": ["dataset", "busco"]
5 changes: 4 additions & 1 deletion assets/test.yaml
@@ -4,7 +4,7 @@ dataset:
reads: /lustre/scratch123/tol/resources/nextflow/test-data/Undibacterium_unclassified/genomic_data/baUndUnlc1/10x/
pacbio:
reads:
-- reads: /lustre/scratch123/tol/resources/nextflow/test-data/Undibacterium_unclassified/genomic_data/baUndUnlc1/pacbio/fasta/HiFi.reads.fasta
+- reads: /lustre/scratch124/tol/projects/darwin/users/kk16/development/test/test/HiFi.reads.BIG.fasta
HiC:
reads:
- reads: /lustre/scratch123/tol/resources/nextflow/test-data/Undibacterium_unclassified/genomic_data/baUndUnlc1/hic-arima2/41741_2#7.sub.cram
@@ -15,3 +15,6 @@ mito:
species: Caradrina clavipalpis
min_length: 15000
code: 5
fam: /lustre/scratch124/tol/projects/darwin/users/cz3/organelle_asm/hmm_db/insecta_mito.fam
plastid:
fam: /lustre/scratch124/tol/projects/darwin/users/cz3/organelle_asm/hmm_db/acrogymnospermae_pltd.fam
1 change: 1 addition & 0 deletions assets/test_github.yaml
@@ -15,3 +15,4 @@ mito:
species: Caradrina clavipalpis
min_length: 15000
code: 5
fam: /home/runner/work/genomeassembly/genomeassembly/Undibacterium_unclassified/hmm_db/insecta_mito.fam
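
The `mito` and `plastid` objects added to `assets/schema_input.json` in this commit declare every property as `"type": "string"`, and neither object is listed under `required`. As a rough illustration of what that fragment permits, here is a minimal Python sketch that checks an already-parsed input mapping against those two blocks. The names `ORGANELLE_SCHEMA` and `check_organelle_config`, and the placeholder path `/path/to/insecta_mito.fam`, are hypothetical and not part of the pipeline. How the pipeline itself validates or coerces values is not shown in this diff, so this is an assumption-laden sketch rather than the pipeline's behaviour.

```python
# Hypothetical helper for illustration only; not part of sanger-tol/genomeassembly.
# Property names and error messages are copied from the schema fragment in this commit.
ORGANELLE_SCHEMA = {
    "mito": {
        "species": "Latin name",
        "min_length": "Minimal allowed length of the mito reference",
        "email": "email to query NCBI",
        "code": "Mitochondrial code",
        "fam": "Path to mitochondrial HMM for OATK",
    },
    "plastid": {
        "fam": "Path to plastid HMM for OATK",
    },
}


def check_organelle_config(config):
    """Return error strings for mito/plastid values that are not strings."""
    errors = []
    for section, properties in ORGANELLE_SCHEMA.items():
        block = config.get(section)
        if block is None:
            continue  # neither section is listed under "required"
        if not isinstance(block, dict):
            errors.append(f"{section}: expected an object")
            continue
        for key, message in properties.items():
            if key in block and not isinstance(block[key], str):
                errors.append(f"{section}.{key}: {message}")
    return errors


# Example input with every value given as a string (the path is a placeholder).
example = {
    "mito": {
        "species": "Caradrina clavipalpis",
        "min_length": "15000",
        "code": "5",
        "fam": "/path/to/insecta_mito.fam",
    },
}
print(check_organelle_config(example))
print(check_organelle_config({"mito": {"code": 5}}))
```

The sketch checks only the string type and skips absent sections, mirroring the fact that the schema's `required` list still names only `dataset` and `busco`.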