This project is a collection of blueprints, patterns, and a toolchain (in the form of a Python SDK and CLI) for leveraging OCI Artifacts and containers to store ML models and their metadata.
Documentation: https://containers.github.io/omlmd
GitHub repository: https://github.com/containers/omlmd
YouTube video playlist: https://www.youtube.com/watch?v=W4GwIRPXE8E&list=PLdbdefeRIj9SRbg6Hkr15GeyPH0qpk_ww
PyPI distribution: https://pypi.org/project/omlmd
Tip
We recommend checking out the Getting Started tutorial in the documentation; the instructions below provide only a quick overview.
In your Python environment, use:
pip install omlmd
Store the ML model file model.joblib and its metadata in the OCI repository at localhost:8080:
from omlmd.helpers import Helper
omlmd = Helper()
omlmd.push("localhost:8080/matteo/ml-artifact:latest", "model.joblib", name="Model Example", author="John Doe", license="Apache-2.0", accuracy=9.876543210)
Fetch everything in a single pull:
omlmd.pull(target="localhost:8080/matteo/ml-artifact:latest", outdir="tmp/b")
Or fetch only the ML model assets:
omlmd.pull(target="localhost:8080/matteo/ml-artifact:latest", outdir="tmp/b", media_types=["application/x-mlmodel"])
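To illustrate what filtering by media type means, here is a minimal sketch that selects layers from an OCI manifest by their mediaType field. It is not the omlmd implementation: the manifest contents, the digests, the application/x-config media type, and the select_layers helper are all hypothetical and serve only to show the idea.

```python
# Hypothetical OCI manifest, trimmed to the fields used in this sketch.
# The digests and the "application/x-config" media type are placeholders.
manifest = {
    "schemaVersion": 2,
    "layers": [
        {"mediaType": "application/x-mlmodel", "digest": "sha256:aaa", "size": 1024},
        {"mediaType": "application/x-config", "digest": "sha256:bbb", "size": 256},
    ],
}


def select_layers(manifest, media_types=None):
    """Return the layers whose mediaType is in media_types (all layers if None)."""
    layers = manifest.get("layers", [])
    if media_types is None:
        return list(layers)
    wanted = set(media_types)
    return [layer for layer in layers if layer.get("mediaType") in wanted]


# Keep only the ML model layer, mirroring the media_types argument above.
model_layers = select_layers(manifest, ["application/x-mlmodel"])
print([layer["digest"] for layer in model_layers])
```

Only the layers that match would then need to be downloaded, which is why a filtered pull can be cheaper than fetching everything.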
These features can be composed to expose higher-level capabilities, such as retrieving only the metadata information. The implementation intends to follow the OCI Artifact convention.
md = omlmd.get_config(target="localhost:8080/matteo/ml-artifact:latest")
print(md)
Crawl the metadata of several artifacts client-side.
Note: a server-side analogue is coming soon; see the blueprints for reference.
crawl_result = omlmd.crawl([
"localhost:8080/matteo/ml-artifact:v1",
"localhost:8080/matteo/ml-artifact:v2",
"localhost:8080/matteo/ml-artifact:v3"
])
The crawling results can be integrated with a query tool (in this case, jq).
Of the crawled ML OCI artifacts, which one exhibits the maximum accuracy?
import jq
jq.compile( "max_by(.config.customProperties.accuracy).reference" ).input_text(crawl_result).first()
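If the jq package is not available, the same question can be answered with the standard library alone. The sketch below assumes the crawl result is JSON text holding a list of objects with reference and config.customProperties.accuracy fields, a shape inferred from the jq expression above; the sample data and the best_by_accuracy helper are hypothetical.

```python
import json

# Hypothetical crawl output; the shape is inferred from the jq expression
# and the accuracy values are made up for illustration.
crawl_result = json.dumps([
    {
        "reference": "localhost:8080/matteo/ml-artifact:v1",
        "config": {"customProperties": {"accuracy": 0.91}},
    },
    {
        "reference": "localhost:8080/matteo/ml-artifact:v2",
        "config": {"customProperties": {"accuracy": 0.95}},
    },
    {
        "reference": "localhost:8080/matteo/ml-artifact:v3",
        "config": {"customProperties": {"accuracy": 0.93}},
    },
])


def best_by_accuracy(crawl_json):
    """Return the reference of the entry with the highest accuracy."""
    entries = json.loads(crawl_json)
    best = max(entries, key=lambda e: e["config"]["customProperties"]["accuracy"])
    return best["reference"]


best_reference = best_by_accuracy(crawl_result)
print(best_reference)
```

With this sample data, the v2 artifact has the highest accuracy and is the one returned.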
Don't forget to check out the documentation website for more information!