From 4cd2c29baf8f180905e55af0e33959d310344e48 Mon Sep 17 00:00:00 2001
From: caozhuozi <543481992@qq.com>
Date: Mon, 23 Dec 2024 22:54:10 +0800
Subject: [PATCH 1/2] fix some grammar errors and refine some expressions

Signed-off-by: caozhuozi <543481992@qq.ccom>
---
 docs/spec.md | 87 +++++++++++++++++++++++-----------------------------
 1 file changed, 39 insertions(+), 48 deletions(-)

diff --git a/docs/spec.md b/docs/spec.md
index 48ec42d..1a52920 100644
--- a/docs/spec.md
+++ b/docs/spec.md
@@ -1,80 +1,71 @@
 # Model Format Specification
 
-The specification defines an open standard Artifacial Intelligence model. It is defined through the artifact extension based on [the OCI image specification](https://github.com/opencontainers/image-spec/blob/main/spec.md#image-format-specification), and extends model features through `artifactType` and `annotations`. Model storage and distribution can be optimized based on artifact extension.
+The specification defines an open standard for packaging and distributing Artificial Intelligence models as OCI artifacts, adhering to [the OCI image specification](https://github.com/opencontainers/image-spec/blob/main/spec.md#image-format-specification).
 
-The goal of this specification is to package models in an OCI artifact to take advantage of OCI distribution and ensure efficient model deployment.
+The goal of this specification is to outline a blueprint and enable the creation of interoperable solutions for packaging and retrieving AI/ML models by leveraging the existing OCI ecosystem, thereby facilitating efficient model management, deployment and serving in cloud-native environments.
 
-The model specification needs to consider two factors:
+## Use Cases
 
-1. The model needs to be stored in the OCI registry and display the parameters of the model. So that the model should use
-   the [artifact extension](https://github.com/opencontainers/image-spec/blob/main/artifacts-guidance.md) to
-   packaging content other than OCI image specification.
-2. The model needs to be mounted by the container runtime as
-   [read only volumes based on the OCI Artifacts in Kubernetes 1.31+](https://kubernetes.io/blog/2024/08/16/kubernetes-1-31-image-volume-source/).
-   Container runtimes can only pull OCI artifact that follow the OCI image specification.
-
-Therefore, the model specification must be defined through the artifact extension based on the [OCI image specification](https://github.com/opencontainers/image-spec/blob/main/spec.md#image-format-specification). It can be better compatible with the kubernetes ecosystem.
+* An OCI Registry can store and manage AI/ML model artifacts with model versions, metadata, and parameters retrievable and displayable.
+* A Data Scientist can package models together with their metadata (e.g., format, precision) and upload them to a registry, facilitating collaboration with MLOps Engineers while streamlining the deployment process to efficiently deliver models into production.
+* A model serving/deployment platform can read model metadata (e.g., format, precision) from a registry to understand the AI/ML model details, identify the required server runtime
+  (as well as startup parameters, necessary resources, etc.), and serve the model in Kubernetes by [mounting it directly as a volume source](https://kubernetes.io/blog/2024/08/16/kubernetes-1-31-image-volume-source/)
+  without needing to pre-download it in an init-container or bundle it within the server runtime container.
 
 ## Overview
 
-The model specification is defined through the artifact extension based on the OCI image specification, and extend model features through `artifactType` and `annotations`. Model storage and distribution can be optimized based on artifact extension.
-
-![manifest](./img/manifest.svg)
+At a high level, the Model Format Specification is based on the [OCI Image Format Specification](https://github.com/opencontainers/image-spec/blob/main/spec.md#image-format-specification) and incorporates [all its components](https://github.com/opencontainers/image-spec/blob/main/spec.md#understanding-the-specification). The key distinction lies in extending the [OCI Image Manifest Specification](https://github.com/opencontainers/image-spec/blob/main/manifest.md) to accommodate artifact usage specifically tailored for AI/ML models.
 
-## Workflow
+### Extended OCI Image Manifest Specification For Model Artifacts
 
-The model specification running workflow is divided into two stages: `BUILD & PUSH` and `PULL & SERVE`.
+The image manifest of model artifacts follows the [OCI Image Manifest Specification](https://github.com/opencontainers/image-spec/blob/main/manifest.md) and adheres to the [guidelines for artifacts usage](https://github.com/opencontainers/image-spec/blob/main/manifest.md#guidelines-for-artifact-usage). Specifically, it leverages the extensible `artifactType` and `annotations` properties to define attributes specific to model artifacts.
 
-### BUILD & PUSH
+![manifest](./img/manifest.svg)
 
-Use tools(ORAS, Ollama, etc.) to build required resources in the model repository into artifact based on the model specification. Note that the model layer MUST NOT be compressed, because the files of model weight has been compressed. If the model layer is compressed, the container runtime will cost long time to decompress the model layer. Therefore, it's recommended to use the `application/vnd.oci.image.layer.v1.tar` format for the model layer to avoid compression
+- **`artifactType`** _string_ -![build-push](./img/build-and-push.png) + This REQUIRED property MUST be `application/vnd.cnai.model.manifest.v1+json`. -### PULL & SERVE +- **`layers`** _array of objects_ -The container runtime(containerd, CRI-O, etc) pulls the model artifact from the OCI registry, and mounts the model artifact as a read-only volume. Therefore, distributed model can use the P2P technology(Dragonfly, Kraken, etc) to reduce the pressure on the registry and preheat the model artifact into each node. If the model artifact is already present on the node, the container runtime can reuse the model artifact to mount different containers in the same node. + - **`mediaType`** _string_ -![pull-serve](./img/pull-and-serve.png) + This REQUIRED property MUST be one of the [OCI Image Media Types](https://github.com/opencontainers/image-spec/blob/main/media-types.md) designated for [layers](https://github.com/opencontainers/image-spec/blob/main/layer.md). + Otherwise, it will not be compatible with the container runtime. -## Understanding the Specification + - **`artifactType`** _string_ -The model specification is based on the [OCI image specification](https://github.com/opencontainers/image-spec/blob/main/spec.md) and focuses on defining the artifact extension according to the [Artifacts Guidance](https://github.com/opencontainers/image-spec/blob/main/artifacts-guidance.md). + This REQUIRED property MUST be at least the following media types: -### Image Manifest Extension Properties + - `application/vnd.cnai.model.layer.v1.tar`: The layer is a [tar archive](https://en.wikipedia.org/wiki/Tar_(computing)) that contains the model weight file. If the model has multiple weight files, they SHOULD be packaged into separate layers. + - `application/vnd.cnai.model.layer.v1.tar+gzip`: The layer is a [tar archive](https://en.wikipedia.org/wiki/Tar_(computing)) compressed with [gzip](https://datatracker.ietf.org/doc/html/rfc1952) that contains the model weight file. 
+      If the model has multiple weight files, they SHOULD be packaged in separate layers.
+
+      _Implementers note_: It is recommended to package weight files without compression to avoid unnecessary overhead of decompression by the container runtime as model weight files are typically already compressed.
+    - `application/vnd.cnai.model.doc.v1.tar`: The layer is a [tar archive](https://en.wikipedia.org/wiki/Tar_(computing)) that includes documentation files like `README.md`, `LICENSE`, etc.
+    - `application/vnd.cnai.model.config.v1.tar`: The layer is a [tar archive](https://en.wikipedia.org/wiki/Tar_(computing)) that includes additional configuration files such as `config.json`,`tokenizer.json`, `generation_config.json`, etc.
 
-- **`artifactType`** _string_
+  - **`annotations`** _string-string map_
 
-  This REQUIRED property MUST contain the media type `application/vnd.cnai.model.manifest.v1+json`.
+    This OPTIONAL property contains arbitrary attributes for the layer. For metadata specific to models, implementations SHOULD use the predefined annotation keys as outlined in the [Layer Annotation Keys](./annotations.md#layer-annotation-keys).
 
-- **`layers`** _array of objects_
+## Workflow
 
-  - **`mediaType`** _string_
+As the model format specification conforms to the [OCI Image Specification](https://github.com/opencontainers/image-spec/blob/main/layer.md), it naturally aligns with the standard [OCI distribution workflow](https://github.com/opencontainers/distribution-spec/blob/main/spec.md).
 
-    `mediaType` MUST follow the [OCI image specification](https://github.com/opencontainers/image-spec/blob/main/layer.md), because the model needs to be mounted
-    by the container runtime as [read only volumes based on the OCI Artifacts in Kubernetes 1.31+](https://kubernetes.io/blog/2024/08/16/kubernetes-1-31-image-volume-source/).
-    Container runtimes can only pull OCI artifact that follow the OCI image specification.
+This section outlines the typical workflow for a model OCI artifact, which consists of two main stages: `BUILD & PUSH` and `PULL & SERVE`.
 
-  - **`artifactType`** _string_
+### BUILD & PUSH
 
-    Implementations MUST support at least the following media types:
+Build tools can package required resources into an OCI artifact following the model format specification.
 
-    - `application/vnd.cnai.model.layer.v1.tar`: The layer is a tarball that contains the model weight file. If the model has multiple weight files,
-      need to package them in separate layers.
-    - `application/vnd.cnai.model.layer.v1.tar+gzip`: The layer is a tarball that contains the model weight file and is compressed by gzip.
-      If the model has multiple weight files, need to package them in separate layers. But recommended package model weight files without compressed to
-      avoid the container runtime decompressing the model layer. Because the model weight files have been compressed, the container runtime will
-      cost long time to decompress the model layer.
-    - `application/vnd.cnai.model.doc.v1.tar`: The layer is a tarball that contains the model documentation file, such as README.md, LICENSE, etc.
-    - `application/vnd.cnai.model.config.v1.tar`: The layer is a tarball that contains the model configuration file,
-      such as config.json, tokenizer.json, generation_config.json, etc.
+The generated artifact can then be pushed to OCI registries (e.g., Harbor, DockerHub) for storage and management.
 
-  - **`annotations`** _string-string map_
+![build-push](./img/build-and-push.png)
 
-    This OPTIONAL property contains arbitrary metadata for the layer. For model specification, SHOULD set the pre-defined annotation keys, refer to the [Layer Annotation Keys](./annotations.md#layer-annotation-keys).
+### PULL & SERVE
 
-- **`annotations`** _string-string map_
+Once the model artifact is stored in an OCI registry, the container runtime (e.g., containerd, CRI-O) can pull it from the OCI registry and mount it as a read-only volume during the model serving process, if required.
 
-  This OPTIONAL property contains arbitrary metadata for the image manifest. For model specification, SHOULD set the pre-defined annotation keys, refer to the [Manifest Annotation Keys](./annotations.md#manifest-annotation-keys).
+![pull-serve](./img/pull-and-serve.png)

From f2f734a9c4f4226223f908961f1da5a1ad26ec37 Mon Sep 17 00:00:00 2001
From: caozhuozi <543481992@qq.ccom>
Date: Fri, 27 Dec 2024 15:26:33 +0800
Subject: [PATCH 2/2] fix markdown format

Signed-off-by: caozhuozi <543481992@qq.com>
---
 docs/spec.md | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/docs/spec.md b/docs/spec.md
index 1a52920..c8db4c9 100644
--- a/docs/spec.md
+++ b/docs/spec.md
@@ -22,31 +22,30 @@ The image manifest of model artifacts follows the [OCI Image Manifest Specificat
 
 ![manifest](./img/manifest.svg)
 
-
-- **`artifactType`** _string_
+* **`artifactType`** *string*
 
   This REQUIRED property MUST be `application/vnd.cnai.model.manifest.v1+json`.
 
-- **`layers`** _array of objects_
+* **`layers`** *array of objects*
 
-  - **`mediaType`** _string_
+  * **`mediaType`** *string*
 
     This REQUIRED property MUST be one of the [OCI Image Media Types](https://github.com/opencontainers/image-spec/blob/main/media-types.md) designated for [layers](https://github.com/opencontainers/image-spec/blob/main/layer.md).
     Otherwise, it will not be compatible with the container runtime.
 
-  - **`artifactType`** _string_
+  * **`artifactType`** *string*
 
     This REQUIRED property MUST be at least the following media types:
 
-    - `application/vnd.cnai.model.layer.v1.tar`: The layer is a [tar archive](https://en.wikipedia.org/wiki/Tar_(computing)) that contains the model weight file. If the model has multiple weight files, they SHOULD be packaged into separate layers.
-    - `application/vnd.cnai.model.layer.v1.tar+gzip`: The layer is a [tar archive](https://en.wikipedia.org/wiki/Tar_(computing)) compressed with [gzip](https://datatracker.ietf.org/doc/html/rfc1952) that contains the model weight file.
+    * `application/vnd.cnai.model.layer.v1.tar`: The layer is a [tar archive](https://en.wikipedia.org/wiki/Tar_(computing)) that contains the model weight file. If the model has multiple weight files, they SHOULD be packaged into separate layers.
+    * `application/vnd.cnai.model.layer.v1.tar+gzip`: The layer is a [tar archive](https://en.wikipedia.org/wiki/Tar_(computing)) compressed with [gzip](https://datatracker.ietf.org/doc/html/rfc1952) that contains the model weight file.
       If the model has multiple weight files, they SHOULD be packaged in separate layers.
-
-      _Implementers note_: It is recommended to package weight files without compression to avoid unnecessary overhead of decompression by the container runtime as model weight files are typically already compressed.
-    - `application/vnd.cnai.model.doc.v1.tar`: The layer is a [tar archive](https://en.wikipedia.org/wiki/Tar_(computing)) that includes documentation files like `README.md`, `LICENSE`, etc.
-    - `application/vnd.cnai.model.config.v1.tar`: The layer is a [tar archive](https://en.wikipedia.org/wiki/Tar_(computing)) that includes additional configuration files such as `config.json`,`tokenizer.json`, `generation_config.json`, etc.
 
-  - **`annotations`** _string-string map_
+      *Implementers note*: It is recommended to package weight files without compression to avoid unnecessary overhead of decompression by the container runtime as model weight files are typically already compressed.
+    * `application/vnd.cnai.model.doc.v1.tar`: The layer is a [tar archive](https://en.wikipedia.org/wiki/Tar_(computing)) that includes documentation files like `README.md`, `LICENSE`, etc.
+    * `application/vnd.cnai.model.config.v1.tar`: The layer is a [tar archive](https://en.wikipedia.org/wiki/Tar_(computing)) that includes additional configuration files such as `config.json`,`tokenizer.json`, `generation_config.json`, etc.
+
+  * **`annotations`** *string-string map*
 
     This OPTIONAL property contains arbitrary attributes for the layer. For metadata specific to models, implementations SHOULD use the predefined annotation keys as outlined in the [Layer Annotation Keys](./annotations.md#layer-annotation-keys).
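
As a rough illustration of the manifest rules described in the spec text of this patch series, the Python sketch below checks a manifest dictionary against them. It is not part of either patch. The function name `check_manifest`, the `OCI_LAYER_MEDIA_TYPES` set (taken from the OCI image-spec media types list), and the placeholder digest are assumptions made for illustration. Because the spec says the layer `artifactType` "MUST be at least the following media types", the sketch only requires that the property is present and does not reject values outside the listed set.

```python
# Manifest-level artifact type quoted from the spec text in the patch.
MODEL_MANIFEST_ARTIFACT_TYPE = "application/vnd.cnai.model.manifest.v1+json"

# Layer artifact types listed in the spec text in the patch.
MODEL_LAYER_ARTIFACT_TYPES = {
    "application/vnd.cnai.model.layer.v1.tar",
    "application/vnd.cnai.model.layer.v1.tar+gzip",
    "application/vnd.cnai.model.doc.v1.tar",
    "application/vnd.cnai.model.config.v1.tar",
}

# Assumption: the current OCI layer media types from the OCI image-spec.
OCI_LAYER_MEDIA_TYPES = {
    "application/vnd.oci.image.layer.v1.tar",
    "application/vnd.oci.image.layer.v1.tar+gzip",
    "application/vnd.oci.image.layer.v1.tar+zstd",
}


def check_manifest(manifest: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no problem found."""
    problems = []
    if manifest.get("artifactType") != MODEL_MANIFEST_ARTIFACT_TYPE:
        problems.append("manifest artifactType must be " + MODEL_MANIFEST_ARTIFACT_TYPE)
    for i, layer in enumerate(manifest.get("layers", [])):
        # mediaType must be an OCI layer media type so container runtimes can pull it.
        if layer.get("mediaType") not in OCI_LAYER_MEDIA_TYPES:
            problems.append(f"layers[{i}]: mediaType is not an OCI layer media type")
        # artifactType is REQUIRED on each layer; unlisted values are tolerated here.
        if not isinstance(layer.get("artifactType"), str):
            problems.append(f"layers[{i}]: artifactType is required")
        # annotations is OPTIONAL but, when present, must be a string-string map.
        annotations = layer.get("annotations", {})
        if not (isinstance(annotations, dict) and all(
            isinstance(k, str) and isinstance(v, str) for k, v in annotations.items()
        )):
            problems.append(f"layers[{i}]: annotations must map strings to strings")
    return problems


# Hypothetical manifest with one uncompressed weight layer.
example = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "artifactType": MODEL_MANIFEST_ARTIFACT_TYPE,
    "layers": [
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar",
            "artifactType": "application/vnd.cnai.model.layer.v1.tar",
            "digest": "sha256:" + "0" * 64,  # placeholder, not a real digest
            "size": 0,
        },
    ],
}

print(check_manifest(example))
```

The example layer uses the uncompressed `tar` media types, in line with the implementers note that weight files are typically already compressed.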