Merge pull request #20 from sanger-tol/docs_and_code_review
Merged into dev after review
ksenia-krasheninnikova authored Dec 5, 2023
2 parents 814f6cd + c9d0d05 commit c2abdad
Showing 64 changed files with 2,995 additions and 773 deletions.
30 changes: 0 additions & 30 deletions .github/workflows/awsfulltest.yml

This file was deleted.

25 changes: 0 additions & 25 deletions .github/workflows/awstest.yml

This file was deleted.

2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -41,4 +41,4 @@ jobs:
       - name: Run pipeline with test data
         run: |
-          nextflow run ${GITHUB_WORKSPACE} -profile test_github,docker -c conf/hifiasm.config --outdir ./results
+          nextflow run ${GITHUB_WORKSPACE} -profile test_github,docker -c conf/hifiasm_test.config --outdir ./results
2 changes: 1 addition & 1 deletion .github/workflows/linting.yml
@@ -22,7 +22,7 @@ jobs:
         run: npm install -g editorconfig-checker

       - name: Run ECLint check
-        run: editorconfig-checker -exclude README.md $(find .* -type f | grep -v '.git\|.py\|.md\|json\|yml\|yaml\|html\|css\|work\|.nextflow\|build\|nf_core.egg-info\|log.txt\|Makefile')
+        run: editorconfig-checker -exclude README.md $(find .* -type f | grep -v '.git\|.py\|.md\|json\|yml\|yaml\|html\|css\|work\|.nextflow\|build\|nf_core.egg-info\|log.txt\|Makefile\|drawio')

   Prettier:
     runs-on: ubuntu-latest
29 changes: 29 additions & 0 deletions .github/workflows/sanger_test.yml
@@ -0,0 +1,29 @@
name: sanger-tol LSF tests

on:
  workflow_dispatch:
jobs:
  run-tower:
    name: Run LSF tests
    runs-on: ubuntu-latest
    steps:
      - name: Launch workflow via tower
        uses: seqeralabs/action-tower-launch@v2
        with:
          workspace_id: ${{ secrets.TOWER_WORKSPACE_ID }}
          access_token: ${{ secrets.TOWER_ACCESS_TOKEN }}
          compute_env: ${{ secrets.TOWER_COMPUTE_ENV }}
          revision: ${{ github.sha }}
          workdir: ${{ secrets.TOWER_WORKDIR_PARENT }}/work/${{ github.repository }}/work-${{ github.sha }}
          parameters: |
            {
              "outdir": "${{ secrets.TOWER_WORKDIR_PARENT }}/results/${{ github.repository }}/results-${{ github.sha }}",
            }
          profiles: test,sanger,singularity,cleanup

      - uses: actions/upload-artifact@v3
        with:
          name: Tower debug log file
          path: |
            tower_action_*.log
            tower_action_*.json
43 changes: 43 additions & 0 deletions .github/workflows/sanger_test_full.yml
@@ -0,0 +1,43 @@
name: sanger-tol LSF full size tests

on:
  push:
    branches:
      - main
      - dev
  workflow_dispatch:
jobs:
  run-tower:
    name: Run LSF full size tests
    runs-on: ubuntu-latest
    steps:
      - name: Sets env vars for push
        run: |
          echo "REVISION=${{ github.sha }}" >> $GITHUB_ENV
        if: github.event_name == 'push'

      - name: Sets env vars for workflow_dispatch
        run: |
          echo "REVISION=${{ github.sha }}" >> $GITHUB_ENV
        if: github.event_name == 'workflow_dispatch'

      - name: Launch workflow via tower
        uses: seqeralabs/action-tower-launch@v2
        with:
          workspace_id: ${{ secrets.TOWER_WORKSPACE_ID }}
          access_token: ${{ secrets.TOWER_ACCESS_TOKEN }}
          compute_env: ${{ secrets.TOWER_COMPUTE_ENV }}
          revision: ${{ env.REVISION }}
          workdir: ${{ secrets.TOWER_WORKDIR_PARENT }}/work/${{ github.repository }}/work-${{ env.REVISION }}
          parameters: |
            {
              "outdir": "${{ secrets.TOWER_WORKDIR_PARENT }}/results/${{ github.repository }}/results-${{ env.REVISION }}",
            }
          profiles: test_full,sanger,singularity,cleanup

      - uses: actions/upload-artifact@v3
        with:
          name: Tower debug log file
          path: |
            tower_action_*.log
            tower_action_*.json
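
Review note: in both sanger_test.yml and sanger_test_full.yml the `parameters` value ends its only entry with a trailing comma before the closing brace. Whether that matters depends on how the launch action parses the string, which this diff does not show. As a rough check, the sketch below uses Python's standard `json` module as a stand-in for a strict JSON parser; the paths are hypothetical placeholders for what the secrets and commit SHA would expand to.

```python
import json

# Hypothetical rendered values: the real ones are built from repository
# secrets and the commit SHA, which are not visible in this diff.
with_trailing_comma = '{\n  "outdir": "/parent/results/org/repo/results-abc123",\n}'
without_trailing_comma = '{\n  "outdir": "/parent/results/org/repo/results-abc123"\n}'


def is_strict_json(text: str) -> bool:
    """Return True if `text` parses under Python's strict JSON parser."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


if __name__ == "__main__":
    print("with trailing comma:", is_strict_json(with_trailing_comma))
    print("without trailing comma:", is_strict_json(without_trailing_comma))
```

If the action turns out to use a strict parser, dropping the comma after the `outdir` entry would be the smallest fix; if it uses a lenient one, the workflows can stay as committed.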
84 changes: 76 additions & 8 deletions CITATIONS.md
@@ -2,18 +2,86 @@

## [nf-core](https://pubmed.ncbi.nlm.nih.gov/32055031/)

> Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031.
> Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: https://doi.org/10.1038/s41587-020-0439-x. PubMed PMID: 32055031.
## [Nextflow](https://pubmed.ncbi.nlm.nih.gov/28398311/)

> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.
> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: https://doi.org/10.1038/nbt.3820. PubMed PMID: 28398311.
## Pipeline tools

- [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)
- [Hifiasm](https://hifiasm.readthedocs.io/en/latest/)

- [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/)
> Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.
> Cheng, H., Concepcion, G.T., Feng, X. et al. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat Methods 18, 170–175 (2021). doi:
> https://doi.org/10.1038/s41592-020-01056-5
- [purge_dups](https://pubmed.ncbi.nlm.nih.gov/31971576/)

> Guan D, McCarthy SA, Wood J, Howe K, Wang Y, Durbin R. Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics. 2020 May 1;36(9):2896-2898. doi: https://doi.org/10.1093/bioinformatics/btaa025. PMID: 31971576; PMCID: PMC7203741.
- [Longranger](https://github.com/10XGenomics/longranger)

- [Freebayes](https://arxiv.org/abs/1207.3907)

> Garrison E, Marth G. Haplotype-based variant detection from short-read sequencing. arXiv preprint arXiv:1207.3907 [q-bio.GN] 2012
- [bwa-mem2](https://ieeexplore.ieee.org/document/8820962)

> Vasimuddin Md, Sanchit Misra, Heng Li, Srinivas Aluru. Efficient Architecture-Aware Acceleration of BWA-MEM for Multicore Systems. IEEE Parallel and Distributed Processing Symposium (IPDPS), 2019. doi: https://doi.org/10.1109/IPDPS.2019.00041
- [YaHS](https://academic.oup.com/bioinformatics/article/39/1/btac808/6917071)

> Chenxi Zhou and others, YaHS: yet another Hi-C scaffolding tool, Bioinformatics, Volume 39, Issue 1, January 2023, btac808, doi: https://doi.org/10.1093/bioinformatics/btac808
- [Minimap2](https://pubmed.ncbi.nlm.nih.gov/34623391/)

> Li H. New strategies to improve minimap2 alignment accuracy. Bioinformatics. 2021 Oct 8;37(23):4572–4. doi: https://doi.org/10.1093/bioinformatics/btab705. Epub ahead of print. PMID: 34623391; PMCID: PMC8652018.
- [Samtools](https://pubmed.ncbi.nlm.nih.gov/33590861/)

> Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, Li H. Twelve years of SAMtools and BCFtools. Gigascience. 2021 Feb 16;10(2):giab008. doi: https://doi.org/10.1093/gigascience/giab008. PMID: 33590861; PMCID: PMC7931819.
- [Bcftools](https://samtools.github.io/bcftools/bcftools.html)

> Danecek P, Bonfield JK, et al. Twelve years of SAMtools and BCFtools. Gigascience (2021) 10(2):giab008 link
- [GATK4](https://gatk.broadinstitute.org/hc/en-us)

> Van der Auwera GA & O'Connor BD. (2020). Genomics in the Cloud: Using Docker, GATK, and WDL in Terra (1st Edition). O'Reilly Media.
- [Bedtools](https://bedtools.readthedocs.io/en/latest/)

> Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010 Mar 15;26(6):841-2. doi:
> https://doi.org/10.1093/bioinformatics/btq033. Epub 2010 Jan 28. PMID: 20110278; PMCID: PMC2832824.
- [Juicer](https://github.com/aidenlab/juicer)

> Durand NC, Shamim MS, Machol I, Rao SS, Huntley MH, Lander ES, Aiden EL. Juicer Provides a One-Click System for Analyzing Loop-Resolution Hi-C Experiments. Cell Syst. 2016 Jul;3(1):95-8. doi: https://doi.org/10.1016/j.cels.2016.07.002. PMID: 27467249; PMCID: PMC5846465.
- [PretextMap](https://github.com/wtsi-hpag/PretextMap)

- [Cooler](https://github.com/open2c/cooler)
> Abdennur N, Mirny LA. Cooler: scalable storage for Hi-C data and other genomically labeled arrays. Bioinformatics. 2020 Jan 1;36(1):311-316. doi: https://doi.org/10.1093/bioinformatics/btz540. PMID: 31290943; PMCID: PMC8205516.
- [MitoHiFi](https://github.com/marcelauliano/MitoHiFi)

> MitoHiFi: a python pipeline for mitochondrial genome assembly from PacBio High Fidelity reads Marcela Uliano-Silva, João Gabriel R. N. Ferreira, Ksenia Krasheninnikova, Darwin Tree of Life Consortium, Giulio Formenti, Linelle Abueg, James Torrance, Eugene W. Myers, Richard Durbin, Mark Blaxter, Shane A. McCarthy bioRxiv 2022.12.23.521667; doi: https://doi.org/10.1101/2022.12.23.521667
- [MitoFinder](https://github.com/RemiAllio/MitoFinder)

> Allio, R, Schomaker‐Bastos, A, Romiguier, J, Prosdocimi, F, Nabholz, B, Delsuc, F. MitoFinder: Efficient automated large‐scale extraction of mitogenomic data in target enrichment phylogenomics. Mol Ecol Resour. 2020; 00: 1– 14. doi: https://doi.org/10.1111/1755-0998.13160
- [MITOS](https://anaconda.org/bioconda/mitos)

> M. Bernt, A. Donath, F. Jühling, F. Externbrink, C. Florentz, G. Fritzsch, J. Pütz, M. Middendorf, P. F. Stadler MITOS: Improved de novo Metazoan Mitochondrial Genome Annotation Molecular Phylogenetics and Evolution 2013, 69(2):313-319.
- [MerquryFK](https://github.com/thegenemyers/MERQURY.FK)

- [BUSCO](https://busco.ezlab.org)

> Mosè Manni, Matthew R Berkeley, Mathieu Seppey, Felipe A Simão, Evgeny M Zdobnov, BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes. Molecular Biology and Evolution, Volume 38, Issue 10, October 2021, Pages 4647–4654
- [GFASTATS](https://github.com/vgl-hub/gfastats)
> Giulio Formenti and others, Gfastats: conversion, evaluation and manipulation of genome sequences using assembly graphs, Bioinformatics, Volume 38, Issue 17, September 2022, Pages 4214–4216, doi: https://doi.org/10.1093/bioinformatics/btac460
## Software packaging/containerisation tools

@@ -23,13 +91,13 @@
- [Bioconda](https://pubmed.ncbi.nlm.nih.gov/29967506/)

> Grüning B, Dale R, Sjödin A, Chapman BA, Rowe J, Tomkins-Tinch CH, Valieris R, Köster J; Bioconda Team. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods. 2018 Jul;15(7):475-476. doi: 10.1038/s41592-018-0046-7. PubMed PMID: 29967506.
> Grüning B, Dale R, Sjödin A, Chapman BA, Rowe J, Tomkins-Tinch CH, Valieris R, Köster J; Bioconda Team. Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods. 2018 Jul;15(7):475-476. doi: https://doi.org/10.1038/s41592-018-0046-7. PubMed PMID: 29967506.
- [BioContainers](https://pubmed.ncbi.nlm.nih.gov/28379341/)

> da Veiga Leprevost F, Grüning B, Aflitos SA, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Alvarez RV, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: 10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671.
> da Veiga Leprevost F, Grüning B, Aflitos SA, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Alvarez RV, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: https://doi.org/10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671.
- [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)

- [Singularity](https://pubmed.ncbi.nlm.nih.gov/28494014/)
> Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.
> Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: https://doi.org/10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.