Data Validation Prototyping

Part of the mini-POC. Scope...

Installation

Miniconda is the preferred Python distribution. Download the Linux installer, then verify its checksum and run it:

sha256sum Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
conda update conda
python --version
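The sha256sum step above only prints a digest; it is up to you to compare it with the value published on the download page. A minimal sketch of automating that comparison follows. The helper name verify_sha256 and the EXPECTED_HASH placeholder are assumptions for illustration and are not part of conda or Miniconda.

```shell
#!/usr/bin/env bash
# Sketch: check a downloaded file against a known SHA-256 digest.
# verify_sha256 is a hypothetical helper, not a conda or Miniconda command.

verify_sha256() {
    local file="$1"
    local expected="$2"
    local actual
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual="$(sha256sum "$file" | awk '{print $1}')"
    if [ -n "$actual" ] && [ "$actual" = "$expected" ]; then
        echo "OK: $file"
        return 0
    fi
    echo "MISMATCH: $file" >&2
    return 1
}

# Usage (EXPECTED_HASH is a placeholder; copy the real value from the download page):
#   verify_sha256 Miniconda3-latest-Linux-x86_64.sh "$EXPECTED_HASH" \
#       && bash Miniconda3-latest-Linux-x86_64.sh
```

Chaining the installer behind && means it only runs when the digest matches.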

It's much faster to install the required packages using mamba (a C++ reimplementation of the conda package manager):

conda install mamba -c conda-forge

Create a conda environment called 'valid', install the packages into it, and activate it:

mamba create -n valid -c conda-forge root pandas pyarrow jupyterlab jupytext matplotlib seaborn plotly streamlit pydantic
conda activate valid

To run the Data Schema Generation tool, run the following from a terminal with the 'valid' environment active:

streamlit run appwiz.py

The app is then served at localhost:8501, which is Streamlit's default port.

The port can be changed by adding --server.port to the command line, for example: streamlit run appwiz.py --server.port 8502
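Streamlit exposes the port as its server.port option, which is passed on the command line as --server.port. If the port needs to vary between machines, one option is a small launcher that reads it from an environment variable and falls back to the default. This is only a sketch: the helper name resolve_port and the PORT variable are assumptions, not something this repository defines.

```shell
#!/usr/bin/env bash
# Sketch: pick the port for the Streamlit app, defaulting to 8501.
# resolve_port is a hypothetical helper, not part of Streamlit.

resolve_port() {
    local port="${1:-8501}"   # fall back to Streamlit's default port
    # Reject anything that is not a plain run of digits.
    case "$port" in
        *[!0-9]*) echo "invalid port: $port" >&2; return 1 ;;
    esac
    # Reject values outside the valid TCP port range.
    if [ "${#port}" -gt 5 ] || [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
        echo "port out of range: $port" >&2
        return 1
    fi
    echo "$port"
}

# Usage (PORT is an assumed environment variable):
#   port="$(resolve_port "$PORT")" && streamlit run appwiz.py --server.port "$port"
```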
