iva options

martinghunt edited this page Jul 9, 2015 · 10 revisions

Required options

You must give paired reads either as two separate files with

-f reads_1.fastq -r reads_2.fastq

or in a single interleaved file with

--fr reads.fastq

and you must give an output directory. So the minimal usage is either:

iva -f reads_1.fastq -r reads_2.fastq Output_dir

or:

iva --fr reads.fastq Output_dir
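
When IVA is launched from a script, the two input modes can be kept apart with a small helper. The following is a minimal Python sketch and not part of IVA: the function name build_iva_command and its parameters are hypothetical, and only the flags -f, -r and --fr come from this page. It also assumes the two input modes are not meant to be combined.

```python
def build_iva_command(outdir, fwd=None, rev=None, interleaved=None):
    """Return the argument list for a minimal IVA run (hypothetical helper)."""
    if interleaved is not None:
        # Assumption: --fr and -f/-r are alternatives, not meant to be mixed.
        if fwd is not None or rev is not None:
            raise ValueError("use either --fr or -f/-r, not both")
        reads = ["--fr", interleaved]
    elif fwd is not None and rev is not None:
        reads = ["-f", fwd, "-r", rev]
    else:
        raise ValueError("give both -f and -r, or a single --fr file")
    # The output directory comes last, as in the usage lines above.
    return ["iva"] + reads + [outdir]


print(" ".join(build_iva_command("Output_dir", fwd="reads_1.fastq", rev="reads_2.fastq")))
print(" ".join(build_iva_command("Output_dir", interleaved="reads.fastq")))
```

The returned list can be passed to subprocess.run on a machine where IVA is installed.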

Input/Output

--keep_files

Keep intermediate files, of which there can be many. By default, all intermediate files are deleted and the only output files are the contigs FASTA file contigs.fasta and a file info.txt, which records how IVA was run and the versions of all third-party programs it used.

--contigs filename[.gz]

You can give a file of contigs that IVA will try to extend, instead of generating a de novo assembly. This option is incompatible with --reference. See also --make_new_seeds and --ctg_first_trim.

--reference filename[.gz]

You can provide a reference genome, in which case IVA will try to assemble one contig per sequence in this file. For each reference sequence it will generate a seed sequence from the reads mapped to the middle 500bp of the reference sequence. This option is incompatible with --contigs.
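
Because --contigs and --reference cannot be used together, a wrapper script may want to check this before calling IVA. The sketch below is illustrative only: seed_source_args is a hypothetical helper, and the file names in the usage lines are placeholders.

```python
def seed_source_args(contigs=None, reference=None):
    """Return the extra arguments that select how assembly starts (hypothetical helper)."""
    if contigs is not None and reference is not None:
        raise ValueError("--contigs and --reference are incompatible")
    if contigs is not None:
        return ["--contigs", contigs]      # extend existing contigs
    if reference is not None:
        return ["--reference", reference]  # one contig per reference sequence
    return []                              # neither given: de novo assembly


print(seed_source_args(contigs="contigs.fasta.gz"))
print(seed_source_args(reference="ref.fasta"))
```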

--max_contigs INT

Maximum number of contigs allowed in the assembly. No more seeds are generated once this cutoff is reached. Default: 50.

-v or --verbose

Be verbose by printing messages to stdout. Use up to four times for increasing verbosity. You probably don't want to go above 2; level 4 is meant for debugging.


Miscellaneous options

-i INT or --max_insert INT

Maximum insert size (includes read length). Reads with an inferred insert size greater than the maximum will not be used to extend contigs. Default: 800.

-t INT or --threads INT

Number of threads to use. More than one thread can be used when preprocessing the reads and when running SMALT mapping. Default: 1.

--version

Show program's version number and exit.


Read trimming

--trimmomatic FILENAME

Provide the location of the trimmomatic.jar file to enable read trimming before assembly. This option is required if --adapters is used.

--adapters FILENAME

FASTA file of adapter sequences to be trimmed off reads. If used, you must also use --trimmomatic. Default: use adapters file bundled with IVA.

--min_trimmed_length INT

Minimum length of read after trimming. Only applies if --trimmomatic is used. Default: 50.

--pcr_primers FILENAME

FASTA file of primers. The first perfect match found to a sequence in the primers file will be trimmed off the start of each read. This is run after Trimmomatic (if --trimmomatic is used).
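
The dependencies between the trimming options can be checked before IVA is called. This is a minimal sketch under stated assumptions: trimming_args is a hypothetical helper, and the choice to silently drop --min_trimmed_length when --trimmomatic is absent is this sketch's own, based on the note that the option only applies with --trimmomatic.

```python
def trimming_args(trimmomatic=None, adapters=None, min_trimmed_length=None, pcr_primers=None):
    """Return read-trimming arguments for IVA (hypothetical helper)."""
    if adapters is not None and trimmomatic is None:
        raise ValueError("--adapters requires --trimmomatic")
    args = []
    if trimmomatic is not None:
        args += ["--trimmomatic", trimmomatic]
        if adapters is not None:
            args += ["--adapters", adapters]
        if min_trimmed_length is not None:
            args += ["--min_trimmed_length", str(min_trimmed_length)]
    # Primer trimming does not depend on Trimmomatic; it runs after it if both are used.
    if pcr_primers is not None:
        args += ["--pcr_primers", pcr_primers]
    return args


print(trimming_args(trimmomatic="trimmomatic.jar", min_trimmed_length=50))
```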


Advanced: SMALT mapping

-k INT or --smalt_k INT

The kmer hash length used when mapping reads with SMALT (the -k option in smalt index). Higher numbers will increase speed at the cost of reduced sensitivity. Default: 19.

-s INT or --smalt_s INT

The kmer hash step length when mapping reads with SMALT (the -s option in smalt index). Higher numbers will increase speed at the cost of reduced sensitivity. Default: 11.

-y FLOAT or --smalt_id FLOAT

The minimum identity threshold for a mapping to be reported by SMALT (the -y option in smalt map). This must be between 0 and 1. The default of 0.5 means that at least half of each read must map, so up to half of a read can hang off the end of a contig and be used to extend it. Default: 0.5.
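
As a rough illustration of how the identity threshold limits extension, the sketch below computes the largest number of bases that could hang off a contig end for a given read length. The function max_overhang is hypothetical, and rounding the mapped length up is an assumption; how SMALT itself rounds is not stated on this page.

```python
import math


def max_overhang(read_length, smalt_id=0.5):
    """Largest number of unmapped bases a read may have (illustrative only)."""
    if not 0 <= smalt_id <= 1:
        raise ValueError("smalt_id must be between 0 and 1")
    # Assumption: the mapped part must cover at least smalt_id of the read, rounded up.
    min_mapped = math.ceil(smalt_id * read_length)
    return read_length - min_mapped


print(max_overhang(100))       # default threshold of 0.5
print(max_overhang(100, 1.0))  # whole read must map, so nothing can overhang
```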


Advanced: contigs and extending

--ctg_first_trim INT

Number of bases to trim off the end of every contig before extending for the first time. Only relevant if --contigs is used. Default: 25.

--ctg_iter_trim INT

During iterative extension, number of bases to trim off the end of a contig when extension fails (then try extending again before generating a new seed). Default: 10.

--ext_min_cov INT

Minimum kmer depth needed to use that kmer to extend a contig. Default: 5.

--ext_min_ratio FLOAT

Sets N, where the kmer used for extension must be at least N times more abundant than the next most common kmer. Default: 2.
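
The way --ext_min_cov and --ext_min_ratio work together can be sketched as follows. This is one reading of the two descriptions above, not IVA's actual code: pick_extension_kmer is a hypothetical function, and the input is assumed to be the list of candidate kmers taken from reads that overhang a contig end.

```python
from collections import Counter


def pick_extension_kmer(kmers, min_cov=5, min_ratio=2.0):
    """Return the kmer to extend with, or None if no kmer passes both cutoffs."""
    top_two = Counter(kmers).most_common(2)
    if not top_two:
        return None
    best, best_count = top_two[0]
    # Depth cutoff (--ext_min_cov).
    if best_count < min_cov:
        return None
    # Ratio cutoff (--ext_min_ratio) against the next most common kmer, if any.
    if len(top_two) > 1 and best_count < min_ratio * top_two[1][1]:
        return None
    return best


print(pick_extension_kmer(["ACG"] * 10 + ["ACT"] * 4))  # passes both cutoffs
print(pick_extension_kmer(["ACG"] * 10 + ["ACT"] * 6))  # fails the ratio cutoff
```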

--ext_max_bases INT

Maximum number of bases to try to extend on each iteration. Default: 100.

--ext_min_clip INT

Minimum number of bases of a read that must hang off a contig end for those bases to be used for extension. Default: 3.


Advanced: seed generation

--make_new_seeds

When no more contigs can be extended, generate a new seed. This is forced to be true when --contigs is not used.

--seed_start_length INT

When making a seed sequence, use the most common kmer of this length. Warning: it is not recommended to set this higher than 95. Default: min(median read length, 95).

--seed_stop_length INT

Stop extending seed using perfect matches from reads when this length is reached. Future extensions are then made by treating the seed as a contig. Default: 0.9*max_insert.
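
The two computed defaults above can be worked through for a concrete read set. This is a sketch only: default_seed_lengths is a hypothetical function, and truncating the median and rounding 0.9*max_insert down to an integer are assumptions not stated on this page.

```python
import statistics


def default_seed_lengths(read_lengths, max_insert=800):
    """Return (seed_start_length, seed_stop_length) following the documented defaults."""
    # Default: min(median read length, 95).
    seed_start_length = min(int(statistics.median(read_lengths)), 95)
    # Default: 0.9 * max_insert, done in integer arithmetic (rounding down is an assumption).
    seed_stop_length = max_insert * 9 // 10
    return seed_start_length, seed_stop_length


print(default_seed_lengths([100, 100, 150]))            # long reads, default max_insert
print(default_seed_lengths([76, 76, 76], max_insert=500))
```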

--seed_min_kmer_cov INT

Minimum frequency of kmer needed to be used as a seed. Default: 25.

--seed_max_kmer_cov INT

Maximum frequency of kmer allowed for it to be used as a seed. Default: 1000000.
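
One way to picture how --seed_start_length, --seed_min_kmer_cov and --seed_max_kmer_cov interact is the sketch below. It is an interpretation, not IVA's implementation: pick_seed_kmer is a hypothetical function that counts every kmer of the given length in the reads and returns the most common one whose frequency lies within the two cutoffs.

```python
from collections import Counter


def pick_seed_kmer(reads, seed_start_length, min_kmer_cov=25, max_kmer_cov=1000000):
    """Return the most common kmer within the frequency cutoffs, or None."""
    counts = Counter(
        read[i:i + seed_start_length]
        for read in reads
        for i in range(len(read) - seed_start_length + 1)
    )
    # most_common() yields kmers from most to least frequent.
    for kmer, count in counts.most_common():
        if min_kmer_cov <= count <= max_kmer_cov:
            return kmer
    return None


reads = ["ACGT"] * 5 + ["GGCC"] * 2
print(pick_seed_kmer(reads, 4, min_kmer_cov=3))
```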

--seed_ext_max_bases INT

Maximum number of bases to try to extend a seed by on each iteration. Default: 50.

--seed_overlap_length INT

Number of overlapping bases needed between read and seed to use that read to extend. Default: seed_start_length.

--seed_ext_min_cov INT

Minimum kmer depth needed to use that kmer to extend a seed. Default: 5.

--seed_ext_min_ratio FLOAT

Sets N, where the kmer used for extension must be at least N times more abundant than the next most common kmer. Default: 2.