Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Integration of ProteoRE galaxy components in Galaxy-P #535

Open
wants to merge 1,061 commits into
base: master
Choose a base branch
from

Conversation

vloux
Copy link
Contributor

@vloux vloux commented Jan 6, 2021

Hello Galaxy-P team,
all the best for this new year.
As discussed in usegalaxy-eu/usegalaxy-eu-tools#289 , I transfered proteore components from http://github.com/ifb-git/ProteoRE to the Galaxy-P repository.

This consists in 19 tools :

  • proteore_id_converter
  • proteore_filter_keywords_values
  • proteore_venn_diagram
  • proteore_heatmap_visualization
  • proteore_prot_features_3orga
  • proteore_prot_features
  • proteore_expression_levels_by_tissue
  • proteore_expression_rnaseq_abbased
  • proteore_tissue_specific_expression_data
  • proteore_ms_observation_pepatlas
  • proteore_get_unique_peptide_srm_method
  • proteore_goprofiles
  • proteore_clusterprofiler
  • proteore_go_terms_enrich_comparison
  • proteore_topgo
  • proteore_reactome
  • proteore_maps_visualization
  • proteore_kegg_pathways_coverage
  • proteore_build_protein_interaction_maps

and one data_manager :

  • proteore_data_manager

I made one "big" PR with all the components, but tell me if you prefer one PR by tool.
Thank you for the review !

yvandenb and others added 30 commits March 7, 2019 14:23
big modif for the make_dotplot function
import of 2 functions from clusterProfiler . Goal : length of the y-legend of the dotplot is ok
UserDoc section now written - please visually check format and read carefully content - before deploiement on dev-migale
transmb domains nb in nextprot 2019 was wrong.
correction og this,  see "get all tm domain" in the code
version number update caus change in the pyhton associated script
@vloux vloux self-assigned this Jan 6, 2021
 ERROR: Invalid XML found in fille: ERROR:SCHEMASV:SCHEMAV_ELEMENT_CONTENT: Element 'requirement': This element is not expected.]
@vloux vloux changed the title Integration of ProteoRE galaxy components in Galaxy-P [WIP] Integration of ProteoRE galaxy components in Galaxy-P Jan 7, 2021
@vloux vloux added the wip label Jan 7, 2021
Copy link
Member

@bgruening bgruening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi all!

What a great start. I have included some very high level and general comments inline.

A few more suggestions and hints here:

  • are those args.rds files needed?
  • tools/proteore_clusterprofiler/test-data/GGO_MF_bar-plot --> what are those files?
  • tools/proteore_clusterprofiler/tool_test_output.html --> should be removed.
  • tools/proteore_filter_keywords_values/tool_test_output.json --> should be removed
  • maybe it's better to create separate PRs for every tool? That could help to review and getting single tools accepted sooner.

Thanks a lot,
Bjoern

@@ -0,0 +1,3 @@
<?xml version="1.0"?>
<tool_dependency>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be removed I think

long_description: A tool for finding tissue in which your proteins are expressed (or not) - Human Protein Atlas
name: proteore_expression_levels_by_tissue
owner: proteore
include:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can remove the include. Or is there any need for it?

<requirement type="package" version="3.6.1">r-base</requirement>
<requirements>
</requirements>
<stdio>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this simple stdio should be relaced by detect_errors on the command tag.

<stdio>
<exit_code range="1:" />
</stdio>
<command interpreter="Rscript">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interpreter is deprecated, you should use Rscript in the command section.

<exit_code range="1:" />
</stdio>
<command interpreter="Rscript">
$__tool_directory__/get_expression_profiles.R
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

single quotes around all paths, e.g. '$__tool_directory__/get_expression_profiles.R'

</param>
</when>
<when value="file">
<param name="genelist" type="data" format="txt,tabular" label="Select your file" help=""/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

labels should be improved ... like Please select your genelist file

</when>
<when value="file">
<param name="genelist" type="data" format="txt,tabular" label="Select your file" help=""/>
<param name="column" type="text" label="Column IDs (e.g : Enter c1 for column n°1)" value="c1">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be of type="data_column" ?

<description>[Reactome]</description>
<requirements>
</requirements>
<stdio>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use detect_errors

</stdio>
<command><![CDATA[

python $__tool_directory__/reactome_analysis.py
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

single-quotes

@@ -0,0 +1,187 @@
#!/usr/bin/env python2.7
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python 2 is depreated ...

Can we replace this tool wth https://github.com/peterjc/pico_galaxy/blob/master/tools/venn_list/venn_list.xml?

@bernt-matthias
Copy link
Collaborator

I would suggest to put all those tools in a single directory. Makes it easier to use macros and also simplifies the shed.yml file.

Copy link
Collaborator

@bernt-matthias bernt-matthias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also prefer a tool by tool review. This gigantic effort seem hard to manage for reviewers. I guess we can move forward faster and more focused this way.

@@ -0,0 +1,311 @@
<tool id="build_protein_interaction_maps" name="Build Protein interaction network" version="2020.02.06">
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use a recent profile version of the tools, e.g. profile="20.01"

<when value="file" >
<param name="file" type="data" format="txt,tabular" label="Select your file" help="" />
<param name="header" type="boolean" checked="true" truevalue="true" falsevalue="false" label="Does file contain header?" />
<param name="ncol" type="text" value="c1" label="Column number of IDs to map" help='For example, fill in "c1" if it is the first column, "c2" if it is the second column and so on'>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a type="data_column" parameter.

<conditional name="database">
<param name="ref" value="bioplex"/>
<param name="id_type" value="UniProt-AC"/>
<param name="ref_file" value="tool-data/Human_bioplex_2019-03-01"/>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should restructure the loc files:

Instead of the columns: id, release, name, species, value you should rename id to value (guessing that id is a unique identifier) and value to path. Then in cheetah you can access the path with $ref_file.fields.path and in the test you can specify the id.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the test data (Human_bioplex_2019-03-01, ...) shoud be in test-data. tool-data should contain loc.sample files

<data name="output" format="data_manager_json"/>
</outputs>

<tests>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are needed.

@vloux
Copy link
Contributor Author

vloux commented Jan 13, 2021

Thank you Bjoern and Mathias for your helpful comments !
I realize that we have quite a lot of polishing to do on all these tools.
Indeed I will withdraw this PR and, after correction, manage to do one PR for each tool.
This will be really simpler to manage for all.

@bernt-matthias
Copy link
Collaborator

after correction, manage to do one PR for each tool

please feel free to just open the PRs before polishing .. I guess we can help a bit with early comments.

Please note that I'm working on kegg_pathway_visualizations .. because one of my users just sent in an error report :) I will PR to your repo.

@bernt-matthias
Copy link
Collaborator

@vloux are you on gitter? I would have qustions regarding kegg_pathway_visualizations

@foellmelanie
Copy link
Contributor

Great to see that you started migrating your tools :)

For the following tools you could consider using already existing and installed on EU server Galaxy tools:

proteore_id_converter: https://toolshed.g2.bx.psu.edu/repository?repository_id=c8774310981b07c5

proteore_filter_keywords_values: https://usegalaxy.eu/root?tool_id=Grep1 and https://usegalaxy.eu/root?tool_id=Filter1, I cannot find them in the toolshed but I think they come automatically with each Galaxy installation.

proteore_venn_diagram: https://toolshed.g2.bx.psu.edu/repository?repository_id=ccb47ef9d02bfc83

HTH,
Melanie

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants