Add a script to reorganize tool data based on the new layout for genomic Data Managers #19728

Open: wants to merge 5 commits into base: dev
@natefoo (Member) commented Mar 1, 2025

In #19013 I mentioned that I would write a script to move data from the old layout used by the most common genomic Data Managers (DMs) to the new standardized layout. Here it is.

Caveat: The __dbkeys__ and twobit tables do not have the dbkey/value (variant) distinction; in these tables the dbkey would be the variant. So in the (rare) case that you have a variant in these tables, you would need to do some manual post-processing.

The DM layout changes were made in galaxyproject/tools-iuc#6489.

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. Install some data using old versions of data managers
    2. Run python ./scripts/reorganize_tool_data.py --tool-data-path /path/to/data --prune-dirs ./config/tool_data_table_conf.xml
    3. Review the proposed changes and confirm they match what you expect
    4. Run python ./scripts/reorganize_tool_data.py --tool-data-path /path/to/data --prune-dirs --commit ./config/tool_data_table_conf.xml
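The two-pass workflow in the steps above (report first, then apply with --commit and clean up with --prune-dirs) can be sketched roughly as follows. This is a minimal illustration of the general pattern only, not the code in reorganize_tool_data.py: the `apply_moves` helper, its parameters, and the example paths are all hypothetical.

```python
from pathlib import Path


def apply_moves(root, moves, commit=False, prune_dirs=False):
    """Report, and optionally perform, a list of relative (src, dest) moves.

    Hypothetical helper for illustration; not taken from the actual script.
    """
    root = Path(root)
    actions = []
    for rel_src, rel_dest in moves:
        src, dest = root / rel_src, root / rel_dest
        actions.append(f"{rel_src} -> {rel_dest}")
        if not commit:
            # Dry run: only record what would happen.
            continue
        dest.parent.mkdir(parents=True, exist_ok=True)
        src.rename(dest)
        if prune_dirs:
            # Remove source directories left empty by the move,
            # walking upward but never removing the root itself.
            parent = src.parent
            while parent != root and not any(parent.iterdir()):
                parent.rmdir()
                parent = parent.parent
    return actions
```

Calling `apply_moves(root, moves)` returns the planned moves without touching the filesystem; passing `commit=True` performs them, and `prune_dirs=True` additionally removes directories under `root` that the moves left empty.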

I just used this to reorganize the brc.galaxyproject.org CVMFS repo - results can be found at http://datacache.galaxyproject.org/brc/ once publishing is complete and changes propagate.

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@github-actions (bot) added this to the 25.0 milestone Mar 1, 2025
@natefoo changed the title from "Reorganize tool data" to "Add a script to reorganize tool data based on the new layout for genomic Data Managers" Mar 1, 2025