Gennady Fedonin edited this page Mar 1, 2019 · 8 revisions

VirGenA is a reference-guided assembler of highly variable viral genomes. It is based on iterative mapping and de novo reassembly of highly variable regions, and its specially designed read mapper lets it handle a distant reference sequence. VirGenA can separate mixtures of strains belonging to different intraspecies genetic groups (genotypes, subtypes, clades, etc.) and assemble a separate consensus sequence for each group in the mixture.

If provided with a multiple sequence alignment (MSA) of target references, VirGenA selects an optimal reference set, sorts reads to the selected references, and outputs the consensus sequences corresponding to these references. For each consensus sequence, the multiple sequence alignment of its constituent reads is written in BAM format.

If no MSA is provided, VirGenA works in single-reference mode and uses the user-provided reference.

Multi-fragment references are supported in single-reference mode.

You can use VirGenA for full genome assembly or just to find the optimal reference set for given FASTQ files with Illumina paired-end reads.

To run VirGenA type:

java -jar "path to VirGenA.jar"

VirGenA provides several subcommands, which are described below.
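Every subcommand shares the same `java -jar` prefix, so a small shell wrapper can save typing. The sketch below is not part of VirGenA: the function name `virgena`, the `VIRGENA_JAR` variable, and the `DRY_RUN` switch are assumptions made for illustration only.

```shell
# Hypothetical convenience wrapper; not shipped with VirGenA.
# VIRGENA_JAR is an assumed variable holding the path to VirGenA.jar.
VIRGENA_JAR="${VIRGENA_JAR:-VirGenA.jar}"

virgena() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only print the command (quoting is not preserved in this preview).
        echo "java -jar $VIRGENA_JAR $*"
    else
        # Pass the subcommand and any further arguments straight through.
        java -jar "$VIRGENA_JAR" "$@"
    fi
}
```

After sourcing this snippet, `virgena assemble`, `virgena type`, and `virgena map` would be equivalent to the full commands shown on this page.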

Assembly

To run the assembler type:

java -jar "path to VirGenA.jar" assemble

and follow the instructions. Configuration is explained here.

Reference selection (genotyping)

You can genotype your sample(s) by running the reference selection procedure with a suitable MSA of a reference set containing sequences of different intraspecies genetic groups (genotypes, subtypes, clades, etc.).

To run the reference selector type:

java -jar "path to VirGenA.jar" type

and follow the instructions. Configuration is explained here.

Read mapping

You can use VirGenA's internal read mapper to map your short reads to a given reference sequence. To do so, run:

java -jar "path to VirGenA.jar" map

and follow the instructions. Configuration is explained here.

Scaffolding using SSPACE

To run SSPACE on a VirGenA assembly you need to:

  1. Download SSPACE from https://github.com/nsoranzo/sspace_basic
  2. Run "java -jar "path to VirGenA.jar" tab" and follow the usage instructions. You'll need to provide the path to the VirGenA assembly folder and the path to a valid VirGenA config file. In the output folder you should expect 3 files: a TAB file, a LIB file containing the absolute path to the TAB file, and a FASTA file with all contigs produced by VirGenA, whose name ends with "_contig.fasta".
  3. Run SSPACE using the generated LIB file with the contig extension option switched off: "perl pathToSSPACE/SSPACE_Basic_v2.0.pl -l pathToLibFile.lib -s pathToContigs.fasta -x 0"
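The three steps above can be sketched as a small shell script. This is only an illustration under stated assumptions: the variable names, the `run` helper, and the `DRY_RUN` switch are hypothetical, and all paths are placeholders. The arguments of the `tab` subcommand are not guessed here, because the tool prints its own usage instructions. The SSPACE script name is the one given on this page and may differ in your download.

```shell
# Placeholders; adjust to your setup (all names here are assumptions).
VIRGENA_JAR="${VIRGENA_JAR:-VirGenA.jar}"
SSPACE_DIR="${SSPACE_DIR:-sspace_basic}"
SSPACE_SCRIPT="${SSPACE_SCRIPT:-SSPACE_Basic_v2.0.pl}"  # name as given on this page
LIB_FILE="${LIB_FILE:-pathToLibFile.lib}"               # LIB file written by "tab"
CONTIGS="${CONTIGS:-pathToContigs.fasta}"               # "*_contig.fasta" written by "tab"

# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: download SSPACE unless the target folder already exists.
fetch_sspace() {
    if [ ! -d "$SSPACE_DIR" ]; then
        run git clone https://github.com/nsoranzo/sspace_basic "$SSPACE_DIR"
    fi
}

# Step 2: start the "tab" subcommand and follow its usage instructions.
show_tab_usage() {
    run java -jar "$VIRGENA_JAR" tab
}

# Step 3: scaffold with the contig extension option switched off (-x 0).
run_sspace() {
    run perl "$SSPACE_DIR/$SSPACE_SCRIPT" -l "$LIB_FILE" -s "$CONTIGS" -x 0
}
```

Call `fetch_sspace`, then `show_tab_usage`, and run `run_sspace` only once the TAB, LIB, and FASTA files exist. Setting `DRY_RUN=1` first prints each command without executing it.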