Commit

Merge pull request galaxyproject#200 from AnnaSyme/polishing-wfs
Added new workflow to polish assembly
mvdbeek authored Sep 7, 2023
2 parents b01ea1f + a340831 commit ffdc8a0
Showing 8 changed files with 2,000 additions and 0 deletions.
11 changes: 11 additions & 0 deletions workflows/genome-assembly/polish-with-long-reads/.dockstore.yml
@@ -0,0 +1,11 @@
version: 1.2
workflows:
  - name: main
    subclass: Galaxy
    publish: true
    primaryDescriptorPath: /Assembly-polishing-with-long-reads.ga
    testParameterFiles:
      - /Assembly-polishing-with-long-reads-tests.yml
    authors:
      - name: Anna Syme
        orcid: 0000-0002-9906-0673
@@ -0,0 +1,16 @@
- doc: Test outline for Assembly-polishing-with-long-reads
  job:
    Assembly to be polished:
      class: File
      path: test-data/assembly.fasta
      filetype: fasta
    long reads:
      class: File
      path: test-data/long_reads.fastqsanger.gz
      filetype: fastqsanger.gz
    'minimap setting (for long reads) ': map-ont
  outputs:
    Assembly polished by long reads using Racon:
      path: test-data/assembly_polished_by_long_reads.fasta
      compare: sim_size
      delta_frac: 0.2
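
The `compare: sim_size` and `delta_frac: 0.2` settings above ask for a size-based comparison rather than an exact content match. As a rough illustration only, a fractional size check could look like the sketch below. The function name and the exact rule are assumptions; the comparison Galaxy actually performs (including any absolute byte limit it may also apply) is not shown in this commit.

```python
def size_within_fraction(expected_size: int, actual_size: int, delta_frac: float) -> bool:
    """Hypothetical helper: True if actual_size differs from expected_size
    by at most delta_frac (a fraction of the expected size)."""
    allowed = expected_size * delta_frac
    return abs(actual_size - expected_size) <= allowed


if __name__ == "__main__":
    # With delta_frac = 0.2, a 1000-byte reference tolerates sizes from 800 to 1200 bytes.
    print(size_within_fraction(1000, 1150, 0.2))
    print(size_within_fraction(1000, 1300, 0.2))
```

A tolerance like this suits a polished assembly, whose exact sequence may vary slightly between tool versions while its overall size stays close to the reference.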

Large diffs are not rendered by default.

9 changes: 9 additions & 0 deletions workflows/genome-assembly/polish-with-long-reads/CHANGELOG.md
@@ -0,0 +1,9 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.1] 2023-07-15
First release.
30 changes: 30 additions & 0 deletions workflows/genome-assembly/polish-with-long-reads/README.md
@@ -0,0 +1,30 @@
# Assembly polishing with Racon workflow

## Inputs

- Long sequencing reads, in one of these formats: fastq, fastq.gz, fastqsanger or fastqsanger.gz
- Genome assembly to be polished, in fasta format

## What the workflow does

- After long reads have been assembled into a genome (contigs), the assembly can be polished with the same long reads.
- This workflow uses minimap2 to map the long reads back to the assembly, and then uses Racon to polish the assembly based on those mappings.
- The mapping and polishing steps are then repeated a further 3 times, giving 4 rounds of polishing in total.

In more detail:

- minimap2: long reads are mapped to the assembly => overlaps.paf
- overlaps, long reads, assembly => Racon => polished assembly 1
- using polished assembly 1 as input, repeat minimap2 + Racon => polished assembly 2
- using polished assembly 2 as input, repeat minimap2 + Racon => polished assembly 3
- using polished assembly 3 as input, repeat minimap2 + Racon => polished assembly 4
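
The four rounds listed above can be sketched outside Galaxy as a short Python plan builder. This is only an illustration: it assumes the standard command-line forms `minimap2 -x <preset> <target> <reads>` and `racon <reads> <overlaps> <target>`, both of which write their result to standard output. The intermediate file names and the `plan_rounds` helper are hypothetical, and the workflow's actual tool parameters live in the `.ga` file, which is not shown here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PolishingRound:
    minimap2_cmd: List[str]  # its stdout would be redirected to `overlaps`
    overlaps: str            # hypothetical name for the PAF overlaps file
    racon_cmd: List[str]     # its stdout would be redirected to `polished`
    polished: str            # hypothetical name for the polished assembly


def plan_rounds(assembly: str, reads: str, preset: str = "map-ont",
                rounds: int = 4) -> List[PolishingRound]:
    """Build the commands for each round; nothing is executed here."""
    plan = []
    target = assembly  # round 1 polishes the original assembly
    for i in range(1, rounds + 1):
        overlaps = f"overlaps_{i}.paf"
        polished = f"polished_assembly_{i}.fasta"
        plan.append(PolishingRound(
            minimap2_cmd=["minimap2", "-x", preset, target, reads],
            overlaps=overlaps,
            racon_cmd=["racon", reads, overlaps, target],
            polished=polished,
        ))
        target = polished  # the next round polishes this round's output
    return plan


if __name__ == "__main__":
    for r in plan_rounds("assembly.fasta", "long_reads.fastqsanger.gz"):
        print(" ".join(r.minimap2_cmd), ">", r.overlaps)
        print(" ".join(r.racon_cmd), ">", r.polished)
```

Building the commands as data rather than running them keeps the sketch safe to execute without either tool installed, and makes the chaining of each round's output into the next round's input explicit.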

## Settings

- Run as-is or change parameters at runtime.
- For the input "minimap setting (for long reads)", enter map-pb for PacBio reads, map-hifi for PacBio HiFi reads, or map-ont for Oxford Nanopore reads.
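
As a small illustration of the preset choice described above, the mapping from read type to minimap2 preset can be written as a lookup. The dictionary keys and the `minimap_preset` helper are hypothetical names chosen for this sketch; only the three preset values come from this README.

```python
# Read type (hypothetical labels) -> minimap2 preset named in this README.
PRESETS = {
    "pacbio": "map-pb",
    "pacbio-hifi": "map-hifi",
    "nanopore": "map-ont",
}


def minimap_preset(read_type: str) -> str:
    """Return the preset for a read type, or raise ValueError if unknown."""
    key = read_type.strip().lower()
    if key not in PRESETS:
        known = ", ".join(sorted(PRESETS))
        raise ValueError(f"unknown read type {read_type!r}; expected one of: {known}")
    return PRESETS[key]


if __name__ == "__main__":
    print(minimap_preset("nanopore"))
```

Rejecting unknown read types up front avoids passing a mistyped preset to the mapping step.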

## Outputs

There is one output: the polished assembly in fasta format.

1,395 changes: 1,395 additions & 0 deletions workflows/genome-assembly/polish-with-long-reads/test-data/assembly.fasta

Large diffs are not rendered by default.

Large diffs are not rendered by default.

Binary file not shown.
