Gen3 Validator is a Python toolkit designed to make working with Gen3 metadata schemas and data validation straightforward for developers.
With this tool, you can:
- Resolve and flatten Gen3 JSON schemas so you can work with them programmatically.
- Validate JSON metadata files against Gen3 schemas, catching schema violations early in your pipeline.
- Check linkage integrity between data nodes (e.g., ensuring all sample-to-subject references are valid).
- Parse Excel-based metadata templates and convert them to JSON for Gen3 ingestion.
- Get detailed validation results and summary stats as Python data structures or pandas DataFrames, making it easy to integrate with your own scripts or reporting tools.
*Note: I recommend cloning this repo and walking through the examples on the usage page. The usage examples load data from the tests/data directory, so you can see how the data is structured.*
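To make the linkage-integrity idea from the feature list concrete, here is a minimal plain-Python sketch. It does not use the gen3_validator API. The function name `find_broken_links`, the record layout, and the `subjects` link field are assumptions chosen for illustration; see the usage page for the toolkit's actual calls.

```python
# Illustrative sketch only: NOT the gen3_validator API.
# Assumes each record has a "submitter_id" and that a child record points at
# its parent through a link field shaped like {"submitter_id": "<parent id>"}.
from typing import Any, Dict, List


def find_broken_links(
    children: List[Dict[str, Any]],
    parents: List[Dict[str, Any]],
    link_field: str,
    key: str = "submitter_id",
) -> List[str]:
    """Return the keys of child records whose link matches no parent record."""
    parent_keys = {parent[key] for parent in parents}
    broken = []
    for child in children:
        # A missing link field is treated the same as a link to an unknown parent.
        link = child.get(link_field) or {}
        if link.get(key) not in parent_keys:
            broken.append(child[key])
    return broken


# Hypothetical sample data, made up for this example.
subjects = [{"submitter_id": "subject_1"}, {"submitter_id": "subject_2"}]
samples = [
    {"submitter_id": "sample_1", "subjects": {"submitter_id": "subject_1"}},
    {"submitter_id": "sample_2", "subjects": {"submitter_id": "subject_9"}},
    {"submitter_id": "sample_3"},  # no link at all
]

broken = find_broken_links(samples, subjects, link_field="subjects")
print(broken)  # ['sample_2', 'sample_3']
```

Collecting the parent keys into a set first keeps each lookup constant-time, which matters once the child node holds many thousands of records.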
pip install gen3_validator
pip show gen3_validator
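If you would rather confirm the install from inside Python than with `pip show`, the standard library can report the installed version. This is a small sketch; the helper name `installed_version` is hypothetical, and it returns `None` when the distribution is not found in the current environment.

```python
# Sketch using only the standard library (Python 3.8+).
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("gen3_validator"))
```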
- Make sure you have poetry installed.
- Clone the repository.
- Run the following command to activate the virtual environment.
eval $(poetry env activate)
- Run the following command to install the dependencies.
poetry install
- Run the following command to run the tests.
pytest -vv tests/
See the license page for more information.