From bc37e15972f530ef7ec4a668dbb633baf7a313aa Mon Sep 17 00:00:00 2001
From: Gorkem Ercan
Date: Sun, 11 Feb 2024 00:44:09 -0500
Subject: [PATCH 1/6] Add Jozu AI/ML Packaging Manifest Format Reference

---
 pkg/cmd/build/jozu-file.md | 154 +++++++++++++++++++++++++++++++++++++
 1 file changed, 154 insertions(+)
 create mode 100644 pkg/cmd/build/jozu-file.md

diff --git a/pkg/cmd/build/jozu-file.md b/pkg/cmd/build/jozu-file.md
new file mode 100644
index 00000000..4e51e8c6
--- /dev/null
+++ b/pkg/cmd/build/jozu-file.md
@@ -0,0 +1,154 @@
+# Jozu AI/ML Packaging Manifest Format Reference
+
+The Jozu manifest for AI/ML is a YAML file designed to encapsulate all the necessary information about the package, including code, datasets, models, and their metadata. This reference documentation outlines the structure and specifications of the manifest format.
+
+## Overview
+
+The manifest is structured into several key sections: `version`, `package`, and `artifacts`. Each section serves a specific purpose in describing the AI/ML package components and requirements.
+
+### `version`
+
+- **Description**: Specifies the manifest format version.
+- **Type**: String
+- **Example**: `1.0`
+
+### `package`
+
+This section provides general information about the AI/ML project.
+
+#### `name`
+
+- **Description**: The name of the AI/ML project.
+- **Type**: String
+
+#### `version`
+
+- **Description**: The current version of the project.
+- **Type**: String
+- **Example**: `1.2.3`
+
+#### `description`
+
+- **Description**: A brief overview of the project's purpose and capabilities.
+- **Type**: String
+
+#### `authors`
+
+- **Description**: A list of individuals or entities that have contributed to the project.
+- **Type**: Array of Strings
+
+#### `license`
+
+- **Description**: The SPDX identifier for the project's license.
+- **Type**: String
+- **Example**: `MIT`, `Apache-2.0`
+
+### `artifacts`
+
+This section details the artifacts included in the AI/ML package, such as code, datasets, and models.
+
+#### `code`
+
+- **Description**: Information about the source code.
+- **Type**: Object Array
+  - `path`: Location of the source code within the context.
+  - `description`: Description of what the code does.
+  - `license`: SPDX license identifier for the code.
+
+#### `datasets`
+
+- **Description**: Information about the datasets used.
+- **Type**: Object Array
+  - `name`: Name of the dataset.
+  - `path`: Location of the dataset file or directory.
+  - `description`: Overview of the dataset.
+  - `source`: Origin of the dataset.
+  - `license`: SPDX license identifier for the dataset.
+  - `preprocessing`: Reference to preprocessing steps.
+
+#### `models`
+
+- **Description**: Details of the trained models included in the package.
+- **Type**: Object Array
+  - `name`: Name of the model
+  - `path`: Location of the model
+  - `framework`: AI/ML framework
+  - `version`: Version of the model
+  - `description`: Overview of the model
+  - `license`: SPDX license identifier for the dataset.
+  - `training`:
+    - `dataset`: Name of the dataset
+    - `parameters`: name value pairs
+  - `validation`:
+    - `dataset`: Name of the dataset
+    - `metrics`: name value pairs
+
+
+## Example
+
+```yaml
+version: 1.0
+package:
+  name: AIProjectName
+  version: 1.2.3
+  description: >-
+    A brief description of the AI/ML project.
+  authors: [Author Name, Contributor Name]
+  license: MIT
+artifacts:
+  code:
+    - path: src/
+      description: Source code for the AI models.
+      license: Apache-2.0
+  datasets:
+    - name: DatasetName
+      path: data/dataset.csv
+      description: Description of the dataset.
+      source: URL
+      license: CC-BY-4.0
+      preprocessing: Preprocessing steps.
+  models:
+    - name: ModelName
+      path: models/model.h5
+      framework: TensorFlow
+      version: 1.0
+      description: Model description.
+      license: Apache-2.0
+      training:
+        dataset: DatasetName
+        parameters:
+          learning_rate: 0.001
+          epochs: 100
+          batch_size: 32
+      Validation:
+        - dataset: DatasetName
+          metrics:
+            accuracy: 0.95
+            f1_score: 0.94
+```
+
+
+## Future Considerations
+
+This section is for collecting future ideas.
+
+### `dependencies`
+
+**This is a possible future section that may be used for creating BOM.**
+
+- **Description**: Lists the project's external dependencies.
+- **Type**: Object Array
+  - `name`: Name of the dependency.
+  - `version`: Version of the dependency.
+  - `license`: SPDX license identifier for the dependency.
+
+##### Example for dependencies
+```yaml
+  dependencies:
+    - name: numpy
+      version: 1.19.2
+      license: BSD-3-Clause
+    - name: pandas
+      version: 1.1.3
+      license: BSD-3-Clause
+```
\ No newline at end of file

From 5c237a18f3c2c81374fd801af6d80c20657ac49e Mon Sep 17 00:00:00 2001
From: Gorkem Ercan
Date: Sun, 11 Feb 2024 16:09:46 -0500
Subject: [PATCH 2/6] remove artifacts section

Remove artifact section and clarify how path values are calculated
---
 pkg/cmd/build/jozu-file.md | 70 ++++++++++++++++++--------------------
 1 file changed, 33 insertions(+), 37 deletions(-)

diff --git a/pkg/cmd/build/jozu-file.md b/pkg/cmd/build/jozu-file.md
index 4e51e8c6..fba60fb2 100644
--- a/pkg/cmd/build/jozu-file.md
+++ b/pkg/cmd/build/jozu-file.md
@@ -4,7 +4,7 @@ The Jozu manifest for AI/ML is a YAML file designed to encapsulate all the neces

 ## Overview

-The manifest is structured into several key sections: `version`, `package`, and `artifacts`. Each section serves a specific purpose in describing the AI/ML package components and requirements.
+The manifest is structured into several key sections: `version`, `package`,`code`, `datasets` and `models`. Each section serves a specific purpose in describing the AI/ML package components and requirements.

 ### `version`

@@ -43,15 +43,12 @@ This section provides general information about the AI/ML project.
 - **Type**: String
 - **Example**: `MIT`, `Apache-2.0`

-### `artifacts`
-
-This section details the artifacts included in the AI/ML package, such as code, datasets, and models.

 #### `code`

 - **Description**: Information about the source code.
 - **Type**: Object Array
-  - `path`: Location of the source code within the context.
+  - `path`: Location of the source cod files or directory relative to the context
   - `description`: Description of what the code does.
   - `license`: SPDX license identifier for the code.

@@ -60,7 +57,7 @@ This section details the artifacts included in the AI/ML package, such as code,
 - **Description**: Information about the datasets used.
 - **Type**: Object Array
   - `name`: Name of the dataset.
-  - `path`: Location of the dataset file or directory.
+  - `path`: Location of the dataset file or directory relative to the context.
   - `description`: Overview of the dataset.
   - `source`: Origin of the dataset.
   - `license`: SPDX license identifier for the dataset.
@@ -71,7 +68,7 @@ This section details the artifacts included in the AI/ML package, such as code,
 - **Description**: Details of the trained models included in the package.
 - **Type**: Object Array
   - `name`: Name of the model
-  - `path`: Location of the model
+  - `path`: Location of the model file or directory relative to the context
   - `framework`: AI/ML framework
   - `version`: Version of the model
   - `description`: Overview of the model
@@ -95,36 +92,35 @@ package:
     A brief description of the AI/ML project.
   authors: [Author Name, Contributor Name]
   license: MIT
-artifacts:
-  code:
-    - path: src/
-      description: Source code for the AI models.
-      license: Apache-2.0
-  datasets:
-    - name: DatasetName
-      path: data/dataset.csv
-      description: Description of the dataset.
-      source: URL
-      license: CC-BY-4.0
-      preprocessing: Preprocessing steps.
-  models:
-    - name: ModelName
-      path: models/model.h5
-      framework: TensorFlow
-      version: 1.0
-      description: Model description.
-      license: Apache-2.0
-      training:
-        dataset: DatasetName
-        parameters:
-          learning_rate: 0.001
-          epochs: 100
-          batch_size: 32
-      Validation:
-        - dataset: DatasetName
-          metrics:
-            accuracy: 0.95
-            f1_score: 0.94
+code:
+  - path: src/
+    description: Source code for the AI models.
+    license: Apache-2.0
+datasets:
+  - name: DatasetName
+    path: data/dataset.csv
+    description: Description of the dataset.
+    source: URL
+    license: CC-BY-4.0
+    preprocessing: Preprocessing steps.
+models:
+  - name: ModelName
+    path: models/model.h5
+    framework: TensorFlow
+    version: 1.0
+    description: Model description.
+    license: Apache-2.0
+    training:
+      dataset: DatasetName
+      parameters:
+        learning_rate: 0.001
+        epochs: 100
+        batch_size: 32
+    Validation:
+      - dataset: DatasetName
+        metrics:
+          accuracy: 0.95
+          f1_score: 0.94
 ```



From 5b07b7a2f52a2228aea5f7b5b00e0102ef91789e Mon Sep 17 00:00:00 2001
From: Gorkem Ercan
Date: Mon, 12 Feb 2024 13:20:32 -0500
Subject: [PATCH 3/6] Remove some fields in jozu-file

---
 pkg/cmd/build/jozu-file.md | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/pkg/cmd/build/jozu-file.md b/pkg/cmd/build/jozu-file.md
index fba60fb2..2859e884 100644
--- a/pkg/cmd/build/jozu-file.md
+++ b/pkg/cmd/build/jozu-file.md
@@ -37,12 +37,6 @@ This section provides general information about the AI/ML project.
 - **Description**: A list of individuals or entities that have contributed to the project.
 - **Type**: Array of Strings

-#### `license`
-
-- **Description**: The SPDX identifier for the project's license.
-- **Type**: String
-- **Example**: `MIT`, `Apache-2.0`
-

 #### `code`

@@ -59,7 +53,6 @@ This section provides general information about the AI/ML project.
   - `name`: Name of the dataset.
   - `path`: Location of the dataset file or directory relative to the context.
   - `description`: Overview of the dataset.
-  - `source`: Origin of the dataset.
   - `license`: SPDX license identifier for the dataset.
   - `preprocessing`: Reference to preprocessing steps.

@@ -91,7 +84,6 @@ package:
   description: >-
     A brief description of the AI/ML project.
   authors: [Author Name, Contributor Name]
-  license: MIT
 code:
   - path: src/
     description: Source code for the AI models.
     license: Apache-2.0
@@ -100,7 +92,6 @@ datasets:
   - name: DatasetName
     path: data/dataset.csv
     description: Description of the dataset.
-    source: URL
     license: CC-BY-4.0
     preprocessing: Preprocessing steps.
 models:

From dc8555fe6d44425b4cfe58dad59b36f8435e59e0 Mon Sep 17 00:00:00 2001
From: Gorkem Ercan
Date: Mon, 12 Feb 2024 16:56:40 -0500
Subject: [PATCH 4/6] Update pkg/cmd/build/jozu-file.md

Co-authored-by: Angel Misevski
---
 pkg/cmd/build/jozu-file.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pkg/cmd/build/jozu-file.md b/pkg/cmd/build/jozu-file.md
index 2859e884..6cfa69a0 100644
--- a/pkg/cmd/build/jozu-file.md
+++ b/pkg/cmd/build/jozu-file.md
@@ -42,7 +42,7 @@ This section provides general information about the AI/ML project.

 - **Description**: Information about the source code.
 - **Type**: Object Array
-  - `path`: Location of the source cod files or directory relative to the context
+  - `path`: Location of the source code files or directory relative to the context
   - `description`: Description of what the code does.
   - `license`: SPDX license identifier for the code.

From f7f3c679c567674a37f088f92c8d1b3cab54eb49 Mon Sep 17 00:00:00 2001
From: Gorkem Ercan
Date: Mon, 12 Feb 2024 16:57:05 -0500
Subject: [PATCH 5/6] Update pkg/cmd/build/jozu-file.md

Co-authored-by: Angel Misevski
---
 pkg/cmd/build/jozu-file.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pkg/cmd/build/jozu-file.md b/pkg/cmd/build/jozu-file.md
index 6cfa69a0..cc860306 100644
--- a/pkg/cmd/build/jozu-file.md
+++ b/pkg/cmd/build/jozu-file.md
@@ -107,7 +107,7 @@ models:
         learning_rate: 0.001
         epochs: 100
         batch_size: 32
-    Validation:
+    validation:
       - dataset: DatasetName
         metrics:
           accuracy: 0.95

From 7a928bbee350036e72d7aa931d478773d36770e2 Mon Sep 17 00:00:00 2001
From: Gorkem Ercan
Date: Tue, 13 Feb 2024 11:02:47 -0500
Subject: [PATCH 6/6] Update manifest version key

---
 pkg/cmd/build/jozu-file.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/pkg/cmd/build/jozu-file.md b/pkg/cmd/build/jozu-file.md
index cc860306..dc183ea4 100644
--- a/pkg/cmd/build/jozu-file.md
+++ b/pkg/cmd/build/jozu-file.md
@@ -6,7 +6,7 @@ The Jozu manifest for AI/ML is a YAML file designed to encapsulate all the neces

 The manifest is structured into several key sections: `version`, `package`,`code`, `datasets` and `models`. Each section serves a specific purpose in describing the AI/ML package components and requirements.

-### `version`
+### `ManifestVersion`

 - **Description**: Specifies the manifest format version.
 - **Type**: String
@@ -77,7 +77,7 @@ This section provides general information about the AI/ML project.
 ## Example

 ```yaml
-version: 1.0
+manifestVersion: 1.0
 package:
   name: AIProjectName
   version: 1.2.3
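
Applied in order, the six patches should leave the example manifest in roughly the state sketched below. This is a reconstruction from the hunks in this series, not a copy of the final file. Two things are assumptions: the two-space YAML indentation, and the quoting of `manifestVersion`, which is added here because the reference gives the type as String while YAML parsers commonly resolve an unquoted `1.0` to a float.

```yaml
# Sketch of the example after PATCH 1/6 through 6/6 (indentation assumed).
manifestVersion: "1.0"   # quoted as an assumption, to keep the value a string
package:
  name: AIProjectName
  version: 1.2.3
  description: >-
    A brief description of the AI/ML project.
  authors: [Author Name, Contributor Name]
code:
  - path: src/           # paths are relative to the context
    description: Source code for the AI models.
    license: Apache-2.0
datasets:
  - name: DatasetName
    path: data/dataset.csv
    description: Description of the dataset.
    license: CC-BY-4.0
    preprocessing: Preprocessing steps.
models:
  - name: ModelName
    path: models/model.h5
    framework: TensorFlow
    version: 1.0         # left unquoted, as written in the patches
    description: Model description.
    license: Apache-2.0
    training:
      dataset: DatasetName
      parameters:
        learning_rate: 0.001
        epochs: 100
        batch_size: 32
    validation:
      - dataset: DatasetName
        metrics:
          accuracy: 0.95
          f1_score: 0.94
```

Compared with the first patch, the `artifacts` wrapper, `package.license`, and the dataset `source` field are gone, and `validation` is lowercase. The reference heading introduced in the last patch is spelled `ManifestVersion`, while the example key is `manifestVersion`.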