Add a script to reorganize tool data based on the new layout for genomic Data Managers #19728
In #19013 I mentioned that I would write a script to move data from the old layout used by the most common genomic DMs to the new standardized layout - here it is.
Caveat: The `__dbkeys__` and `twobit` tables do not have the dbkey/value (variant) distinction; in these cases the dbkey would be the variant. So in the (rare) case that you have a variant in these tables, you would need to do some manual post-processing.

The DM layout changes were made in galaxyproject/tools-iuc#6489.
How to test the changes?
python ./scripts/reorganize_tool_data.py --tool-data-path /path/to/data --prune-dirs ./config/tool_data_table_conf.xml
python ./scripts/reorganize_tool_data.py --tool-data-path /path/to/data --prune-dirs --commit ./config/tool_data_table_conf.xml
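For anyone who wants to script the two invocations above, here is a minimal sketch of a wrapper that runs the first form, asks for confirmation, and only then runs the `--commit` form. It assumes that the run without `--commit` only previews the planned changes; that is an inference from the flag name and is not stated in this description. The helper names `build_command` and `run_two_pass` are hypothetical and not part of the PR.

```python
import subprocess
import sys

# Script path as used in the commands above, relative to the Galaxy root.
SCRIPT = "./scripts/reorganize_tool_data.py"


def build_command(tool_data_path, table_conf, commit=False):
    """Build the argv list for one invocation, mirroring the commands above."""
    cmd = [sys.executable, SCRIPT, "--tool-data-path", tool_data_path, "--prune-dirs"]
    if commit:
        cmd.append("--commit")
    cmd.append(table_conf)
    return cmd


def run_two_pass(tool_data_path, table_conf, confirm, runner=subprocess.run):
    """Run without --commit first, then with --commit if confirm() returns True.

    ASSUMPTION: the first pass changes nothing on disk. Check the script's
    own help output before relying on this.
    """
    runner(build_command(tool_data_path, table_conf), check=True)
    if not confirm():
        return False
    runner(build_command(tool_data_path, table_conf, commit=True), check=True)
    return True


if __name__ == "__main__":
    run_two_pass(
        "/path/to/data",
        "./config/tool_data_table_conf.xml",
        confirm=lambda: input("Apply these changes? [y/N] ").strip().lower() == "y",
    )
```

Passing `confirm` and `runner` as parameters keeps the wrapper easy to exercise without touching real data.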
I just used this to reorganize the brc.galaxyproject.org CVMFS repo - results can be found at http://datacache.galaxyproject.org/brc/ once publishing is complete and changes propagate.