Commit
velvet_optimizer: add missing kmer_step to command (galaxyproject#4227)
* add kmer_step to command which was missing

also add argument to all inputs

* add validators for odd/even integer params

* add min/max for kmer parameters

same as in velveth
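
For illustration only, the checks expressed by the new validators and min/max attributes can be sketched in plain Python. This is not Galaxy's validator code: the function name `check_kmer_params` is hypothetical, the bounds are copied from the param definitions in this commit, and the parity tests reuse the validator expressions from the diff.

```python
def check_kmer_params(start_kmer, end_kmer, kmer_step):
    """Return a list of error messages; an empty list means every check passed."""
    errors = []

    # Range limits mirror the min/max attributes added in this commit.
    if not 11 <= start_kmer <= 191:
        errors.append("start_kmer: must be between 11 and 191")
    if not 11 <= end_kmer <= 191:
        errors.append("end_kmer: must be between 11 and 191")
    if not 2 <= kmer_step <= 189:
        errors.append("kmer_step: must be between 2 and 189")

    # Parity checks use the same expressions as the <validator> elements.
    if not int(start_kmer) % 2 == 1:
        errors.append("start_kmer: Value needs to be odd")
    if not int(end_kmer) % 2 == 1:
        errors.append("end_kmer: Value needs to be odd")
    if not int(kmer_step) % 2 == 0:
        errors.append("kmer_step: Value needs to be even")

    return errors


# The tool's default values pass every check.
print(check_kmer_params(31, 191, 2))  # []
```

With an even start k-mer and an odd step, for example `check_kmer_params(30, 191, 3)`, the sketch reports both parity errors.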
bernt-matthias authored Nov 27, 2021
1 parent e74f0c1 commit 87fc411
Showing 1 changed file with 39 additions and 38 deletions.
77 changes: 39 additions & 38 deletions tools/velvet_optimiser/velvetoptimiser.xml
@@ -1,4 +1,4 @@
- <tool id="velvetoptimiser" name="VelvetOptimiser" version="2.2.6+galaxy1">
+ <tool id="velvetoptimiser" name="VelvetOptimiser" version="2.2.6+galaxy2">
<description>Automatically optimize Velvet assemblies</description>
<xrefs>
<xref type="bio.tools">velvetoptimiser</xref>
@@ -18,6 +18,7 @@
-t "\${GALAXY_SLOTS:-1}"
-s $start_kmer
-e $end_kmer
+ -x $kmer_step
-d out
-f "
#for $i in $files:
@@ -64,22 +65,26 @@
#end for
"
- ##if str($advanced.advanced_select) == "advanced"
- $advanced.verbose
- -k '$advanced.optFuncKmer'
- -c '$advanced.optFuncCov'
- #if str($advanced.velvetg_opts) != ""
- -o '$advanced.velvetg_opts'
- #end if
- -m $advanced.minCutoff
- -z $advanced.maxCutoff
- ##end if
+ $advanced.verbose
+ -k '$advanced.optFuncKmer'
+ -c '$advanced.optFuncCov'
+ #if str($advanced.velvetg_opts) != ""
+ -o '$advanced.velvetg_opts'
+ #end if
+ -m $advanced.minCutoff
+ -z $advanced.maxCutoff
]]></command>
<inputs>
- <param name="start_kmer" type="integer" value="31" label="Start k-mer size" help="Odd integer, Lower limit of k-mer size range to search for optimum value" />
- <param name="end_kmer" type="integer" value="191" label="End k-mer size" help="Odd integer, Upper limit of k-mer size range to search for optimum value" />
- <param name="kmer_step" type="integer" value="2" label="K-mer search step size" help="Even integer, the k-mer value step size when searching the range" />
+ <param name="start_kmer" argument="-s" type="integer" value="31" min="11" max="191" label="Start k-mer size" help="Odd integer, Lower limit of k-mer size range to search for optimum value">
+ <validator type="expression" message="Value needs to be odd">int(value) % 2 == 1</validator>
+ </param>
+ <param name="end_kmer" argument="-e" type="integer" value="191" min="11" max="191" label="End k-mer size" help="Odd integer, Upper limit of k-mer size range to search for optimum value">
+ <validator type="expression" message="Value needs to be odd">int(value) % 2 == 1</validator>
+ </param>
+ <param name="kmer_step" argument="-x" type="integer" value="2" min="2" max="189" label="K-mer search step size" help="Even integer, the k-mer value step size when searching the range">
+ <validator type="expression" message="Value needs to be even">int(value) % 2 == 0</validator>
+ </param>

<repeat name="files" title="Input files" min="1">
<param name="filetype" label="Input file type" type="select" help="Input file type">
@@ -111,12 +116,12 @@
</repeat>

<section name="advanced" title="Advanced Options" expanded="false">
- <param name="verbose" type="boolean" checked="false" truevalue="-v" falsevalue="" label="Verbose" help="Include verbose velvet output in log file" />
- <param name="optFuncKmer" type="text" value="n50" label="K-mer optimisation function" help="See help below for possibilities!"/>
- <param name="optFuncCov" type="text" value="Lbp" label="Coverage cutoff optimisation function" help="See help below for possibilities!"/>
- <param name="velvetg_opts" type="text" value="" label="Other velvetg options" help="Add any other required velvetg options from the advanced set"/>
- <param name="minCutoff" type="integer" value="0" label="Minimum coverage cutoff" help="The minimum coverage cutoff to consider in the optimisation"/>
- <param name="maxCutoff" type="float" value="0.8" label="Maximum coverage cutoff" help="The maximum coverage cutoff to consider expressed as a fraction of the calculated expected coverage."/>
+ <param name="verbose" argument="-v" type="boolean" checked="false" truevalue="-v" falsevalue="" label="Verbose" help="Include verbose velvet output in log file" />
+ <param name="optFuncKmer" argument="-k" type="text" value="n50" label="K-mer optimisation function" help="See help below for possibilities!"/>
+ <param name="optFuncCov" argument="-c" type="text" value="Lbp" label="Coverage cutoff optimisation function" help="See help below for possibilities!"/>
+ <param name="velvetg_opts" argument="-o" type="text" value="" label="Other velvetg options" help="Add any other required velvetg options from the advanced set"/>
+ <param name="minCutoff" argument="-m" type="integer" value="0" label="Minimum coverage cutoff" help="The minimum coverage cutoff to consider in the optimisation"/>
+ <param name="maxCutoff" argument="-z" type="float" value="0.8" label="Maximum coverage cutoff" help="The maximum coverage cutoff to consider expressed as a fraction of the calculated expected coverage."/>
</section>
</inputs>

@@ -239,10 +244,12 @@
The hash length, also known as k-mer length, corresponds to the length, in base pairs, of the words being hashed.
- The hash length is the length of the k-mers being entered in the hash table. Firstly, you must observe three technical constraints::
- - it must be an odd number, to avoid palindromes. If you put in an even number, Velvet will just decrement it and proceed.
- - it must be below or equal to MAXKMERHASH length (cf. 2.3.3, by default 31bp), because it is stored on 64 bits
- - it must be strictly inferior to read length, otherwise you simply will not observe any overlaps between reads, for obvious reasons.
+ The hash length is the length of the k-mers being entered in the hash table. Firstly, you must observe three technical constraints:
+ - it must be an odd number, to avoid palindromes. If you put in an even number, Velvet will just decrement it and proceed.
+ - it must be below or equal to MAXKMERHASH length (cf. 2.3.3, by default 31bp), because it is stored on 64 bits
+ - it must be strictly inferior to read length, otherwise you simply will not observe any overlaps between reads, for obvious reasons.
Now you still have quite a lot of possibilities. As is often the case, it's a trade-off between specificity and sensitivity. Longer kmers bring you more specificity (i.e. less spurious overlaps) but lowers coverage (cf. below)... so there's a sweet spot to be found with time and experience.
We like to think in terms of "k-mer coverage", i.e. how many times has a k-mer been seen among the reads. The relation between k-mer coverage Ck and standard (nucleotide-wise) coverage C is Ck = C * (L - k + 1)/L where k is your hash length, and L you read length.
@@ -277,21 +284,15 @@
Velvet works mainly with fasta and fastq formats. For paired-end reads, the assumption is that each read is next to its mate
read. In other words, if the reads are indexed from 0, then reads 0 and 1 are paired, 2 and 3, 4 and 5, etc.
- Supported file formats are::
- fasta
- fastq
- bam
- Read categories are::
- short (default)
- shortPaired
- long (for Sanger, 454 or even reference sequences)
- longPaired
- reference (for pre-mapped sam or bam files - see Velvet manual for details on how to use this option)
+ Supported file formats are: fasta, fastq, bam
+ Read categories are:
+ - short (default)
+ - shortPaired
+ - long (for Sanger, 454 or even reference sequences)
+ - longPaired
+ - reference (for pre-mapped sam or bam files - see Velvet manual for details on how to use this option)
]]></help>
<citations>
<citation type="bibtex">@UNPUBLISHED{GLADMAN2012,
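
The help text in the diff gives the relation between k-mer coverage and nucleotide coverage as Ck = C * (L - k + 1)/L. A minimal Python sketch of that formula follows. The function name `kmer_coverage` and the sample numbers are assumptions chosen for illustration, and rejecting k >= L is one reading of the help text's constraint that the hash length be strictly below the read length.

```python
def kmer_coverage(coverage, read_length, k):
    """Compute Ck = C * (L - k + 1) / L, as stated in the tool's help text."""
    # The help text requires the hash length to be strictly below the read length.
    if k >= read_length:
        raise ValueError("k must be strictly smaller than the read length")
    return coverage * (read_length - k + 1) / read_length


# Hypothetical example: 30x nucleotide coverage, 100 bp reads, k = 31.
print(kmer_coverage(30, 100, 31))  # 21.0
```

Raising k lowers Ck for a fixed C and L, which is the specificity versus coverage trade-off the help text describes.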
