#summary Assemble sequences in the stream using IDBA-UD.
assemble_seq_idba assembles the sequences in the stream using IDBA-UD and outputs the contig sequences.
An assembly directory must be specified, and assemble_seq_idba leaves the original assembly files in this directory.
Consult the IDBA-UD documentation for more information.
IDBA-UD must be installed in order for assemble_seq_idba to work.
Read more here:
... | assemble_seq_idba [[options]] <dir>
[-? | --help] # Print full usage description.
[-d <dir> | --dir=<dir>] # Assembly directory.
[-k <uint> | --kmer_min=<uint>] # Minimum k-mer value - Default=20
[-K <uint> | --kmer_max=<uint>] # Maximum k-mer value - Default=100
[-c <uint> | --count_min=<uint>] # Filtering threshold for each k-mer - Default=2
[-p <uint> | --pairs_min=<uint>] # Minimum number of pair-end connections to join two contigs - Default=3
[-P <uint> | --prefix_len=<uint>] # Length of the prefix of k-mer used to split k-mer table - Default=3
[-C <uint> | --cpus=<uint>] # Number of CPUs - Default=0 (all)
[-X | --clean] # Remove directory upon completed assembly.
[-I <file!> | --stream_in=<file!>] # Read input from stream file - Default=STDIN
[-O <file> | --stream_out=<file>] # Write output to stream file - Default=STDOUT
[-v | --verbose] # Verbose output.
The example below illustrates a de novo assembly of a Lactococcus lactis strain. The sequences are read with read_fastq and trimmed with trim_seq before being piped to assemble_seq_idba. Following the assembly, the contigs are written to a file in FASTA format with write_fasta, and finally the contig sequences are analyzed with analyze_assembly:
read_fastq -i Lactococcus_NCDO0505.fq |
trim_seq |
assemble_seq_idba -d IDBA -v |
write_fasta -o Lactococcus_NCDO0505.contigs |
analyze_assembly -x
N50: 5296
MAX: 35366
MIN: 50
MEAN: 533
TOTAL: 2833428
COUNT: 5308
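For readers unfamiliar with the N50 statistic reported above, the following is a minimal sketch of how it is commonly computed from contig lengths. It assumes the usual definition: the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the total assembly length. Whether analyze_assembly uses exactly this definition is an assumption. The helper names fasta_lengths and n50 are hypothetical and not part of Biopieces.

```shell
# Hypothetical helper: print the length of each FASTA record, one per line.
# Reads from the files given as arguments, or from STDIN if none are given.
fasta_lengths() {
  awk '/^>/ { if (seen) print n; n = 0; seen = 1; next }
       { n += length($0) }
       END { if (seen) print n }' "$@"
}

# Hypothetical helper: compute N50 from contig lengths on STDIN (one per line).
# Assumes the common definition of N50 described above.
n50() {
  sort -rn | awk '
    { len[NR] = $1; total += $1 }
    END {
      half = total / 2
      for (i = 1; i <= NR; i++) {
        run += len[i]                       # running total, longest first
        if (run >= half) { print len[i]; exit }
      }
    }'
}

# Toy input: lengths 2..6 sum to 20; 6 + 5 = 11 reaches half (10), so N50 is 5.
printf '%s\n' 2 3 4 5 6 | n50   # prints 5
```

On the contig file written in the example, the same helpers could be applied as `fasta_lengths Lactococcus_NCDO0505.contigs | n50`.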
Note that verbose output from assemble_seq_idba is enabled with the -v switch.
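The other options in the usage list can be combined in the same pipeline. The sketch below wraps the example in a shell function and sets the k-mer range, the number of CPUs, and the clean-up switch explicitly. The function name assemble_reads and the option values (30, 90, 4) are illustrative assumptions, not recommended settings.

```shell
# Hypothetical wrapper around the pipeline shown above.
# $1: FASTQ file with reads, $2: output FASTA file for the contigs.
assemble_reads() {
  reads=$1
  contigs=$2
  read_fastq -i "$reads" |
  # -k/-K: k-mer range, -C: CPUs, -X: remove the IDBA directory when done.
  assemble_seq_idba -d IDBA -k 30 -K 90 -C 4 -X -v |
  write_fasta -o "$contigs" |
  analyze_assembly -x
}

# Usage (requires Biopieces and IDBA-UD to be installed):
#   assemble_reads Lactococcus_NCDO0505.fq Lactococcus_NCDO0505.contigs
```

Because -X removes the assembly directory after the run, omit it if the original IDBA-UD files are needed afterwards.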
Martin Asser Hansen - Copyright (C) - All rights reserved.
August 2012
GNU General Public License version 2
assemble_seq_idba is part of the Biopieces framework.