Add bio.tools IDs to tool wrappers #616

Open
wants to merge 9 commits into
base: main
from
3 changes: 3 additions & 0 deletions tool_collections/taxonomy/gi2taxonomy/gi2taxonomy.xml
Original file line number Diff line number Diff line change
@@ -3,6 +3,9 @@
<requirements>
<requirement type="package" version="1.0.0">taxonomy</requirement>
</requirements>
<xrefs>
<xref type="bio.tools">gi2taxonomy</xref>
</xrefs>
Member

I think this tool should be moved to deprecated instead: it doesn't seem to be installed on the big 3 servers, and its requirement is no longer installable (no conda package, sources on BitBucket removed).

<command interpreter="python">gi2taxonomy.py $input $giField $idField $out_file1 ${GALAXY_DATA_INDEX_DIR}</command>
<inputs>
<param format="tabular" name="input" type="data" label="Show taxonomic representation for"></param>
3 changes: 3 additions & 0 deletions tool_collections/taxonomy/kraken2tax/kraken2tax.xml
@@ -4,6 +4,9 @@
<requirement type="package" version="5.1.0">gawk</requirement>
<requirement type="package" version="1.0.1">gb_taxonomy_tools</requirement>
</requirements>
<xrefs>
<xref type="bio.tools">kraken2tax</xref>
</xrefs>
<command>
<![CDATA[
awk '{ print \$${read_name}, \$${tax_id} }' OFS="\t" "${input}" | taxonomy-reader "${ncbi_taxonomy.fields.path}/names.dmp" "${ncbi_taxonomy.fields.path}/nodes.dmp" 1 > "${out_file}"
157 changes: 80 additions & 77 deletions tool_collections/taxonomy/lca_wrapper/lca.xml
@@ -1,46 +1,49 @@
<tool id="lca1" name="Find lowest diagnostic rank" version="1.0.1">
<description></description>
<requirements>
<requirement type="package" version="1.0.0">taxonomy</requirement>
</requirements>
<requirements>
<requirement type="package" version="1.0.0">taxonomy</requirement>
</requirements>
<xrefs>
<xref type="bio.tools">lca1</xref>
</xrefs>
<command interpreter="python">
lca.py $input1 $out_file1 $rank_bound
</command>
<inputs>
<param format="taxonomy" name="input1" type="data" label="for taxonomy dataset"/>
<param name="rank_bound" label="require the lowest rank to be at least" type="select">
<option value="0">No restriction</option>
<option value="3">Superkingdom</option>
<option value="4">Kingdom</option>
<option value="5">Subkingdom</option>
<option value="6">Superphylum</option>
<option value="7">Phylum</option>
<option value="8">Subphylum</option>
<option value="9">Superclass</option>
<option value="10">Class</option>
<option value="11">Subclass</option>
<option value="12">Superorder</option>
<option value="13">Order</option>
<option value="14">Suborder</option>
<option value="15">Superfamily</option>
<option value="16">Family</option>
<option value="17">Subfamily</option>
<option value="18">Tribe</option>
<option value="19">Subtribe</option>
<option value="20">Genus</option>
<option value="21">Subgenus</option>
<option value="22">Species</option>
<option value="23">Subspecies</option>
<param format="taxonomy" name="input1" type="data" label="for taxonomy dataset"/>
<param name="rank_bound" label="require the lowest rank to be at least" type="select">
<option value="0">No restriction</option>
<option value="3">Superkingdom</option>
<option value="4">Kingdom</option>
<option value="5">Subkingdom</option>
<option value="6">Superphylum</option>
<option value="7">Phylum</option>
<option value="8">Subphylum</option>
<option value="9">Superclass</option>
<option value="10">Class</option>
<option value="11">Subclass</option>
<option value="12">Superorder</option>
<option value="13">Order</option>
<option value="14">Suborder</option>
<option value="15">Superfamily</option>
<option value="16">Family</option>
<option value="17">Subfamily</option>
<option value="18">Tribe</option>
<option value="19">Subtribe</option>
<option value="20">Genus</option>
<option value="21">Subgenus</option>
<option value="22">Species</option>
<option value="23">Subspecies</option>
</param>
</inputs>
<outputs>
<data format="taxonomy" name="out_file1" metadata_source="input1" />
</outputs>
</outputs>
<tests>
<test>
<param name="input1" value="lca_input.taxonomy" ftype="taxonomy"/>
<param name="rank_bound" value="0" />
<output name="out_file1" file="lca_output.taxonomy" ftype="taxonomy"/>
<test>
<param name="input1" value="lca_input.taxonomy" ftype="taxonomy"/>
<param name="rank_bound" value="0" />
<output name="out_file1" file="lca_output.taxonomy" ftype="taxonomy"/>
</test>
<test>
<param name="input1" value="lca_input2.taxonomy" ftype="taxonomy"/>
@@ -53,48 +56,48 @@
<param name="input1" value="lca_input3.taxonomy" ftype="taxonomy"/>
<param name="rank_bound" value="10" />
<output name="out_file1" file="lca_output3.taxonomy" ftype="taxonomy"/>
</test>
</tests>

<help>

**What it does**

This tool identifies the lowest taxonomic rank for which a metagenomic sequencing read is diagnostic. It takes datasets produced by the *Fetch Taxonomic Ranks* tool (aka Taxonomy format) as the input.

-------

**Example**

Suppose you have two reads, **read_1** and **read_2**, with the following taxonomic profiles (scroll sideways to see the entire dataset)::

read_1 1 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum1 subphylum1 superclass1 class1 subclass1 superorder1 order1 suborder1 superfamily1 family1 subfamily1 tribe1 subtribe1 genus1 subgenus1 species1 subspecies1
read_1 2 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum1 subphylum1 superclass1 class1 subclass1 superorder1 order1 suborder1 superfamily1 family1 subfamily1 tribe1 subtribe1 genus2 subgenus2 species2 subspecies2
read_2 3 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum3 subphylum3 superclass3 class3 subclass3 superorder3 order3 suborder3 superfamily3 family3 subfamily3 tribe3 subtribe3 genus3 subgenus3 species3 subspecies3
read_2 4 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum4 subphylum4 superclass4 class4 subclass4 superorder4 order4 suborder4 superfamily4 family4 subfamily4 tribe4 subtribe4 genus4 subgenus4 species4 subspecies4

For **read_1** taxonomic labels are consistent until the genus level, where the taxonomy splits into two branches, one ending with *subspecies1* and the other with *subspecies2*. This implies **that the lowest taxonomic rank read_1 can identify is SUBTRIBE**. Similarly, read_2 is diagnostic up until the **superphylum** level. As a result, the output of this tool will be::

read_1 2 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum1 subphylum1 superclass1 class1 subclass1 superorder1 order1 suborder1 superfamily1 family1 subfamily1 tribe1 subtribe1 n n n n
read_2 3 root superkingdom1 kingdom1 subkingdom1 superphylum1 n n n n n n n n n n n n n n n n n

where **n** means *EMPTY*.

--------

**What's up with the drop down?**

Why do we need the *require the lowest rank to be at least* dropdown? Let's look at the above example again. Suppose you need to find only those reads that are diagnostic on at least phylum level. To do this you need to set the *require the lowest rank to be at least* to **phylum**. As a result your output will look like this::

read_1 2 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum1 subphylum1 superclass1 class1 subclass1 superorder1 order1 suborder1 superfamily1 family1 subfamily1 tribe1 subtribe1 n n n n

.. class:: infomark

Note that **read_2** is now omitted as it matches two phyla (**phylum3** and **phylum4**) and therefore is not diagnostic (but rather cosmopolitan) at the *phylum* level.





</help>
</tool>
</test>
</tests>

<help>
**What it does**
This tool identifies the lowest taxonomic rank for which a metagenomic sequencing read is diagnostic. It takes datasets produced by the *Fetch Taxonomic Ranks* tool (aka Taxonomy format) as the input.
-------
**Example**
Suppose you have two reads, **read_1** and **read_2**, with the following taxonomic profiles (scroll sideways to see the entire dataset)::
read_1 1 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum1 subphylum1 superclass1 class1 subclass1 superorder1 order1 suborder1 superfamily1 family1 subfamily1 tribe1 subtribe1 genus1 subgenus1 species1 subspecies1
read_1 2 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum1 subphylum1 superclass1 class1 subclass1 superorder1 order1 suborder1 superfamily1 family1 subfamily1 tribe1 subtribe1 genus2 subgenus2 species2 subspecies2
read_2 3 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum3 subphylum3 superclass3 class3 subclass3 superorder3 order3 suborder3 superfamily3 family3 subfamily3 tribe3 subtribe3 genus3 subgenus3 species3 subspecies3
read_2 4 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum4 subphylum4 superclass4 class4 subclass4 superorder4 order4 suborder4 superfamily4 family4 subfamily4 tribe4 subtribe4 genus4 subgenus4 species4 subspecies4
For **read_1** taxonomic labels are consistent until the genus level, where the taxonomy splits into two branches, one ending with *subspecies1* and the other with *subspecies2*. This implies **that the lowest taxonomic rank read_1 can identify is SUBTRIBE**. Similarly, read_2 is diagnostic up until the **superphylum** level. As a result, the output of this tool will be::
read_1 2 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum1 subphylum1 superclass1 class1 subclass1 superorder1 order1 suborder1 superfamily1 family1 subfamily1 tribe1 subtribe1 n n n n
read_2 3 root superkingdom1 kingdom1 subkingdom1 superphylum1 n n n n n n n n n n n n n n n n n
where **n** means *EMPTY*.
--------
**What's up with the drop down?**
Why do we need the *require the lowest rank to be at least* dropdown? Let's look at the above example again. Suppose you need to find only those reads that are diagnostic on at least phylum level. To do this you need to set the *require the lowest rank to be at least* to **phylum**. As a result your output will look like this::
read_1 2 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum1 subphylum1 superclass1 class1 subclass1 superorder1 order1 suborder1 superfamily1 family1 subfamily1 tribe1 subtribe1 n n n n
.. class:: infomark
Note that **read_2** is now omitted as it matches two phyla (**phylum3** and **phylum4**) and therefore is not diagnostic (but rather cosmopolitan) at the *phylum* level.
</help>
</tool>
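
The help text in lca.xml describes the collapsing rule only by example, so here is a minimal Python sketch of that rule as the example states it. This is not the code in lca.py. The names `lowest_diagnostic_rank`, `example_row`, `RANKS` and `FIRST_RANK_COL` are hypothetical, and two points are assumptions: the second column is copied from the first row of each read (the help example is not consistent on which value is kept), and an `n` already present in the input gets no special handling.

```python
from collections import OrderedDict

# Rank columns as listed in the rank_bound select; "root" sits in column 2,
# so superkingdom is column 3 and subspecies is column 23.
RANKS = [
    "superkingdom", "kingdom", "subkingdom", "superphylum", "phylum",
    "subphylum", "superclass", "class", "subclass", "superorder", "order",
    "suborder", "superfamily", "family", "subfamily", "tribe", "subtribe",
    "genus", "subgenus", "species", "subspecies",
]
FIRST_RANK_COL = 2  # assumption: column 0 is the read name, column 1 an id
EMPTY = "n"


def lowest_diagnostic_rank(rows, rank_bound=0):
    """Keep, per read, only the ranks on which all of its rows agree."""
    groups = OrderedDict()
    for row in rows:
        groups.setdefault(row[0], []).append(row)

    result = []
    for members in groups.values():
        first = members[0]
        merged = list(first[:FIRST_RANK_COL])  # assumption: id of first row
        diverged = False
        for col in range(FIRST_RANK_COL, len(first)):
            if len({m[col] for m in members}) > 1:
                diverged = True  # every rank from here on is blanked
            merged.append(EMPTY if diverged else first[col])
        # rank_bound is a column index, matching the option values (7 = phylum)
        if rank_bound and merged[rank_bound] == EMPTY:
            continue
        result.append(merged)
    return result


def example_row(read, ident, split_rank, tail):
    """Build a row like the help example: suffix 1 up to split_rank, then tail."""
    split = RANKS.index(split_rank)
    labels = [r + ("1" if i < split else tail) for i, r in enumerate(RANKS)]
    return [read, ident, "root"] + labels


rows = [
    example_row("read_1", "1", "genus", "1"),
    example_row("read_1", "2", "genus", "2"),
    example_row("read_2", "3", "phylum", "3"),
    example_row("read_2", "4", "phylum", "4"),
]

for line in lowest_diagnostic_rank(rows, rank_bound=7):
    print("\t".join(line))
```

With `rank_bound=7` (Phylum) only read_1 is printed, ending in `subtribe1 n n n n`; with the default `rank_bound=0` read_2 is also kept and is blanked from the phylum column onward.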
3 changes: 3 additions & 0 deletions tool_collections/taxonomy/t2ps/t2ps_wrapper.xml
@@ -3,6 +3,9 @@
<requirements>
<requirement type="package" version="1.0.0">taxonomy</requirement>
</requirements>
<xrefs>
<xref type="bio.tools">t2ps</xref>
</xrefs>
<command interpreter="python">t2ps_wrapper.py $input $out_file1 $max_tree_level $font_size $max_leaves 1</command>
<inputs>
<param format="taxonomy" name="input" type="data" label="Draw phylogram for"></param>
3 changes: 3 additions & 0 deletions tool_collections/taxonomy/t2t_report/t2t_report.xml
@@ -3,6 +3,9 @@
<requirements>
<requirement type="package" version="1.0.0">taxonomy</requirement>
</requirements>
<xrefs>
<xref type="bio.tools">t2t_report</xref>
</xrefs>
<command>taxonomy2tree $input 0 /dev/null $out_file1 0</command>
<inputs>
<param format="taxonomy" name="input" type="data" label="Summarize taxonomic representation for"/>
5 changes: 4 additions & 1 deletion tools/cd_hit_dup/cd_hit_dup.xml
@@ -5,6 +5,9 @@
<requirements>
<requirement type="package" version="0.5-2012-03-07-fix-dan-gh-0.0.1">cd-hit-auxtools</requirement>
</requirements>
<xrefs>
<xref type="bio.tools">cd-hit</xref>
</xrefs>
<stdio>
<exit_code range="1:" />
<exit_code range=":-1" />
@@ -122,4 +125,4 @@ cd-hit-dup provides a number of options to tune how the duplicates are removed::
<citations>
<citation type="doi">10.1093/bioinformatics/bts565</citation>
</citations>
</tool>
</tool>
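
To check which bio.tools ID a wrapper declares after a change like this, the `<xrefs>` block can be read back with Python's standard library. The sketch below is illustrative only: `biotools_ids` and `EXAMPLE_WRAPPER` are hypothetical names, and the tool `id`, `name` and `version` in the trimmed wrapper are placeholders rather than values from cd_hit_dup.xml. Only the `<requirements>` and `<xrefs>` blocks mirror the hunk above.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical wrapper: the tool id, name and version are
# placeholders; only <requirements> and <xrefs> mirror the hunk above.
EXAMPLE_WRAPPER = """\
<tool id="example_tool" name="Example" version="0.0.1">
    <requirements>
        <requirement type="package" version="0.5-2012-03-07-fix-dan-gh-0.0.1">cd-hit-auxtools</requirement>
    </requirements>
    <xrefs>
        <xref type="bio.tools">cd-hit</xref>
    </xrefs>
</tool>
"""


def biotools_ids(tool_xml):
    """Return the text of every <xref type="bio.tools"> directly under <xrefs>."""
    root = ET.fromstring(tool_xml)
    return [
        xref.text.strip()
        for xref in root.findall("./xrefs/xref")
        if xref.get("type") == "bio.tools" and xref.text
    ]


print(biotools_ids(EXAMPLE_WRAPPER))  # ['cd-hit']
```

A wrapper without an `<xrefs>` block yields an empty list, which makes the function usable as a quick filter over a directory of tool XML files.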
@@ -1,5 +1,8 @@
<tool id="multispecies_orthologous_microsats" name="Extract orthologous microsatellites" version="1.0.0">
<description> for multiple (>2) species alignments</description>
<xrefs>
<xref type="bio.tools">multispecies_orthologous_microsats</xref>
</xrefs>
Member

This tool should probably also be moved to deprecated/.

<command interpreter="perl">
multispecies_MicrosatDataGenerator_interrupted_GALAXY.pl
$input1
3 changes: 3 additions & 0 deletions tools/quality_filter/quality_filter.xml
@@ -4,6 +4,9 @@
<requirement type="package" version="0.7.1">bx-python</requirement>
<requirement type="package" version="1.7.1">numpy</requirement>
</requirements>
<xrefs>
<xref type="bio.tools">qualityfilter</xref>
</xrefs>
<command interpreter="python">
quality_filter.py
$input