Add Dagger to docs #586

Merged
merged 8 commits into from
Nov 4, 2024
Changes from 7 commits
2 changes: 1 addition & 1 deletion docs/.vitepress/config.mts
@@ -85,7 +85,7 @@ export default defineConfig({
{ text: 'Get Started', link: '/docs/get-started' },
{ text: 'Next Steps', link: '/docs/next-steps' },
{ text: 'Deploy ModelKits', link: '/docs/deploy' },
{ text: 'Kit Dev', link: '/docs/dev-mode' },
{ text: 'Local LLM Dev', link: '/docs/dev-mode' },
{ text: 'Why KitOps?', link: '/docs/why-kitops' },
{ text: 'How it is Used', link: '/docs/use-cases' },
{ text: 'KitOps versus...', link: '/docs/versus' },
33 changes: 17 additions & 16 deletions docs/src/docs/get-started.md
@@ -39,9 +39,11 @@ After entering your username and password, you'll see `Log in successful`. If yo

### 3/ Get a Sample ModelKit

Let's use the [unpack command](./cli/cli-reference.md#kit-unpack) to pull a [sample ModelKit from Jozu Hub](https://jozu.ml/browse) to our machine that we can play with. In this case, we'll unpack the whole thing, but one of the great things about Kit is that you can also selectively unpack only the artifacts you need: just the model, the model and dataset, the code, the configuration...whatever you want. Check out the `unpack` [command reference](./cli/cli-reference.md#kit-unpack) for details.
Let's use the [unpack command](./cli/cli-reference.md#kit-unpack) to pull a [sample ModelKit from Jozu Hub](https://jozu.ml/organization/jozu-quickstarts) to our machine that we can play with. In this case, we'll unpack the whole thing, but one of the great things about Kit is that you can also selectively unpack only the artifacts you need: just the model, the model and dataset, the code, the configuration...whatever you want. Check out the `unpack` [command reference](./cli/cli-reference.md#kit-unpack) for details.

You can grab <a href="https://jozu.ml/discover"
If you have a model already on your machine you can use that instead.

You can grab <a href="https://jozu.ml/browse"
v-ga-track="{
category: 'link',
label: 'grab any of the ModelKits',
@@ -51,22 +53,22 @@ You can grab <a href="https://jozu.ml/discover"
The unpack command will unpack the ModelKit contents to the current directory by default. If you want it unpacked to a specific directory use the `-d /path/to/unpacked`.

```sh
kit unpack jozu.ml/jozu/fine-tuning:latest
kit unpack jozu.ml/jozu-quickstarts/fine-tuning:latest
```
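
The `-d` option described above can be sketched as follows. This is a minimal illustration that assumes the same sample ModelKit; the `./fine-tuning` target directory is a hypothetical name, and the fallback branch only prints the command when the `kit` CLI is not installed.

```sh
# Hypothetical target directory; any writable path works.
TARGET_DIR="./fine-tuning"
MODELKIT="jozu.ml/jozu-quickstarts/fine-tuning:latest"

# Create the directory up front so the sketch does not rely on
# kit creating missing directories (that behavior is not confirmed here).
mkdir -p "$TARGET_DIR"

# Unpack into the target directory when the kit CLI is available;
# otherwise print the command that would have been run.
if command -v kit >/dev/null 2>&1; then
  kit unpack "$MODELKIT" -d "$TARGET_DIR"
else
  echo "kit CLI not found; would run: kit unpack $MODELKIT -d $TARGET_DIR"
fi
```

Keeping the reference and directory in variables makes it easy to reuse the same step for a different ModelKit.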

You'll see a set of messages as Kit unpacks the configuration, code, datasets, and serialized model. Now list the directory contents:

```sh
ls
tree

.
├── Kitfile
├── README.md
├── llama3-8b-8B-instruct-q4_0.gguf
├── lora-adapter.gguf
└── training-data.txt
```

You'll see:
* A Kitfile
* A README file
* A Llama3 model in GGUF format
* A LoRA adapter in GGUF format
* A training dataset

The [Kitfile](./kitfile/kf-overview.md) is the manifest for our ModelKit, the serialized model, and a set of files or directories including the adapter, dataset, and docs. Every ModelKit has a Kitfile and you can use the info and inspect commands to view them from the CLI (there's more on this in our [Next Steps](next-steps.md) doc).

### 4/ Check the Local Repository
@@ -122,21 +124,20 @@ kit push jozu.ml/brad/quick-start:latest

Note that some registries, like Jozu Hub, don't automatically create a repository. If you receive an error from your `push` command, make sure you have created the repository in your target registry and that you have push rights to the repository.
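
As a rough sketch of handling the failure case described above, a script can check the result of `kit push` and print a reminder about the two likely causes. The `push_hint` helper is a hypothetical name introduced for illustration, and the exact error output from `kit push` is not reproduced here.

```sh
MODELKIT="jozu.ml/brad/quick-start:latest"

# Hypothetical helper: print a reminder about the two causes named above.
push_hint() {
  echo "Push of $1 failed: check that the repository exists in the registry and that you have push rights."
}

# Push when the kit CLI is available and show the hint if the push fails;
# otherwise print the command that would have been run.
if command -v kit >/dev/null 2>&1; then
  kit push "$MODELKIT" || push_hint "$MODELKIT" >&2
else
  echo "kit CLI not found; would run: kit push $MODELKIT"
fi
```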

### ModelKit to Container or Kubernetes

You can build a container or Kubernetes deployment that pulls artifacts directly from the ModelKit. This makes automating container creation and Kubernetes deployment simple. Read more in our [deployment documentation](./deploy.md).

### Congratulations

You've learned how to unpack a ModelKit, pack one up, and push it. Anyone with access to your remote repository can now pull your new ModelKit and start playing with your model using the `kit pull` or `kit unpack` commands.

If you'd like to learn more about using Kit, try our [Next Steps with Kit](./next-steps.md) document that covers:
* Creating a container or Kubernetes deployment from a ModelKit
* Signing your ModelKit
* Making your own Kitfile
* The power of `unpack`
* Tagging ModelKits
* Keeping your registry tidy

Or, if you want to run an LLM-based ModelKit locally try our [dev mode](./dev-mode.md)
Or, if you want to run an LLM-based ModelKit locally try our [dev mode](./dev-mode.md).

Finally, if you're building workflows using Dagger you can use KitOps through our [Daggerverse modules](https://daggerverse.dev/mod/github.com/jozu-ai/daggerverse/kit). Or get the [GitHub Action for Kit](https://github.com/marketplace/actions/setup-kit-cli).

Thanks for taking some time to play with Kit. We'd love to hear what you think. Feel free to drop us an [issue in our GitHub repository](https://github.com/jozu-ai/kitops/issues) or join [our Discord server](https://discord.gg/Tapeh8agYy).
20 changes: 15 additions & 5 deletions docs/src/docs/modelkit/compatibility.md
@@ -4,8 +4,14 @@ ModelKit packages can be pushed to any OCI 1.1-compliant registry, whether in th

ModelKits themselves use standards like JSON, YAML, and TAR files so whatever MLOps or DevOps tools you're using...they'll work with ModelKits.

If you've tried using Kit with your favorite tool and are having trouble, please [open an issue](https://github.com/jozu-ai/kitops/issues/new/choose) in our GitHub repository.

If you've used KitOps with a product or project we've missed, please open a pull request updating this file.

## Compliant OCI Registries

The most fully featured repository for ModelKits is [Jozu Hub](https://jozu.ml/). However, many users find it easiest to store their ModelKits in an existing enterprise container registry:

* Amazon Elastic Container Registry (ECR)
* Azure Container Registry
* Docker Hub
@@ -15,7 +21,16 @@ ModelKits themselves use standards like JSON, YAML, and TAR files so whatever ML
* Harbor
* IBM Cloud Container Registry
* JFrog Artifactory
* Jozu Hub
* Red Hat Quay.io
* Sonatype Nexus

## CI/CD & Pipeline Tools

### Pre-Built Workflows

* Dagger: see [Kit modules for Dagger](https://daggerverse.dev/mod/github.com/jozu-ai/daggerverse/kit) in the Daggerverse
* GitHub Actions: Kit CLI for [GitHub Actions](https://github.com/marketplace/actions/setup-kit-cli)

## Other Compatible Tools

@@ -61,12 +76,7 @@ ModelKits themselves use standards like JSON, YAML, and TAR files so whatever ML
* Red Hat OpenShift
* Red Hat OpenShift AI
* Seldon
* Sonatype Nexus
* Tensorflow Hub
* VMware
* Weights & Biases
* ZenML

If you've tried using Kit with your favorite tool and are having trouble, please [open an issue](https://github.com/jozu-ai/kitops/issues/new/choose) in our GitHub repository.

If you've used KitOps with a product or project we've missed, please open a pull request updating this file.
4 changes: 3 additions & 1 deletion docs/src/docs/modelkit/intro.md
@@ -4,7 +4,9 @@

ModelKit revolutionizes the way AI/ML artifacts are shared and managed throughout the lifecycle of AI/ML projects. As an OCI-compliant packaging format, ModelKit encapsulates datasets, code, configurations, and models into a single, standardized unit. This approach not only streamlines the development process but also ensures broad compatibility and integration with a vast array of tools and platforms.

<!-- Start with a [ModelKit Quick Start](TBD), -->See the [ModelKit spec](./spec.md), or look over the [tool compatibility list](./compatibility.md).
[Get started with ModelKits](../get-started.md) in less than 15 minutes.

[See how security-conscious organizations are using ModelKits](../use-cases.md) with their existing tools to develop AI/ML projects faster and more safely than ever before.

## Key Features of ModelKit:

5 changes: 5 additions & 0 deletions docs/src/docs/next-steps.md
@@ -1,12 +1,17 @@
# Next Steps with Kit

In this guide you'll learn how to:
* Deploy a ModelKit
* Sign your ModelKit
* Make your own Kitfile
* Use the power of `unpack`
* Read the Kitfile or manifest from a ModelKit
* Tag ModelKits and keep your registry tidy

## Deploying a ModelKit

You can create a container or Kubernetes deployment using a ModelKit. See our [deployment instructions](./deploy.md).

## Signing your ModelKit

Because ModelKits are OCI 1.1 artifacts, they can be signed like any other OCI artifact (you may already sign your containers, for example).
25 changes: 21 additions & 4 deletions docs/src/docs/overview.md
@@ -8,30 +8,47 @@ KitOps is an innovative open-source project designed to enhance collaboration am

At the heart of KitOps is the ModelKit, an OCI-compliant packaging format that enables the seamless sharing of all necessary artifacts involved in the AI/ML model lifecycle. This includes datasets, code, configurations, and the models themselves. By standardizing the way these components are packaged, ModelKit facilitates a more streamlined and collaborative development process that is compatible with nearly any tool. You can even [deploy ModelKits to containers or Kubernetes](./deploy.md).

### 📄 Kitfile
### 📄 Kitfile

Complementing the ModelKit is the Kitfile, a YAML-based configuration file that simplifies the sharing of model, dataset, and code configurations. The Kitfile is designed with both ease of use and security in mind, ensuring that configurations can be efficiently packaged and shared without compromising on safety or governance.

### 🖥️ Kit CLI
### 🖥️ Kit CLI

Bringing everything together is the Kit Command Line Interface (CLI). The Kit CLI is a powerful tool that enables users to create, manage, run, and deploy ModelKits using Kitfiles. Whether you are packaging a new model for development or deploying an existing model into production, the Kit CLI provides the necessary commands and functionalities to streamline your workflow.

## How KitOps is Used

KitOps is a key element in a platform engineering solution for AI/ML projects.

[See how security-conscious organizations are using ModelKits](../use-cases.md) with their existing tools to develop AI/ML projects faster and more safely than ever before.

## The Goal of KitOps

The primary goal of KitOps is to bridge the gaps between data science, software development, and operational deployment. By providing a standard packaging and versioning solution for AI/ML projects, KitOps drives greater speed, security, and collaboration for teams working with models.
The primary goal of KitOps is to become an open, vendor-neutral standard that simplifies and secures the packaging and versioning of AI/ML projects. In the same way that PDFs have helped people share documents, images, and diagrams between tools, KitOps makes it easy for teams to use the tools they prefer, but share the results safely and securely.

KitOps drives greater speed, security, and collaboration for teams working with models.

### 👩‍💻 For application developers

KitOps clears the path to use AI/ML with your existing tools and applications. You don't need to be an AI/ML expert: KitOps lets you concentrate on integrating AI/ML models into your applications, while Kit handles the packaging and sharing.

[Get Started](./get-started.md).

### 👷 For DevOps teams

ModelKits fit into your existing processes and the Kit CLI lets you pack or unpack ModelKit artifacts in the pipelines and automation you have proven over the last decade.

[Build a better golden path for AI/ML projects](./use-cases.md).
[Get Started](./get-started.md).


### 👩‍🔬 For data scientists

KitOps enables you to innovate in AI/ML without the usual infrastructure distractions. It simplifies dataset and model management and sharing, fostering closer collaboration with developers. With KitOps, you can spend more time experimenting and less time grappling with traditional software development tools.

[See how to use KitOps with Jupyter Notebooks](https://www.youtube.com/watch?v=OQPp7QEvk7Q).
[Get Started](./get-started.md).

## Benefits of KitOps

KitOps is not just another tool; it's a comprehensive CLI and packaging system specifically designed for the AI/ML workflow. It acknowledges the nuanced needs of AI/ML projects, such as:
@@ -46,7 +63,7 @@ One of the core strengths of KitOps is its ability to keep data and code version

### 🚀 Deployment Ready

Designed with a focus on deployment, ModelKits package assets in standard formats so you can depoloy them as [containers or to Kubernetes](./deploy.md). They're also [compatible with nearly any tool](./modelkit/compatibility.md) - helping you get your model to production faster and more efficiently.
Designed with a focus on deployment, ModelKits package assets in standard formats so you can deploy them as [containers or to Kubernetes](./deploy.md). They're also [compatible with nearly any tool](./modelkit/compatibility.md) - helping you get your model to production faster and more efficiently.

### 🏭 Standards-Based Approach

32 changes: 19 additions & 13 deletions docs/src/docs/use-cases.md
@@ -2,37 +2,43 @@

KitOps is the market's only open source, standards-based packaging and versioning system designed for AI/ML projects. Using the OCI standard allows KitOps to be painlessly adopted by any organization using containers and enterprise registries today (see a partial list of [compatible tools](./modelkit/compatibility.md)).

Organizations around the world are using KitOps as a "gate" in the [handoff between development and production](#level-1-handoff-from-development-to-production-).
Organizations around the world are using KitOps as a "gate" in the [handoff between development and production](#level-1-handoff-from-development-to-production-). This is often part of establishing golden paths and platform engineering around AI/ML projects.

Those who are concerned about end-to-end auditing of their model development, such as organizations in regulated industries or under the jurisdiction of the [EU AI Act](https://artificialintelligenceact.eu/), extend KitOps usage to security and development use cases (see the [Level 2](#level-2-adding-security-️) and [Level 3](#level-3-storage-for-all-ai-project-versions-) use cases below).

## Level 1: Handoff From Development to Production 🤝

Organizations are having AI teams build a [ModelKit](./modelkit/intro.md) for each version of the AI project that is going to staging, user acceptance testing (UAT), or production.

KitOps is ideally suited to CI/CD pipelines (e.g., using [KitOps in a GitHub Action](https://dev.to/kitops/introducing-the-new-github-action-for-using-kit-cli-on-mlops-pipelines-21ia)) either triggered manually by the model development team when they're ready to send the model to production, or automatically when a model or its artifacts are updated in their respective repositories.
KitOps is ideally suited to CI/CD pipelines (e.g., using KitOps in a GitHub Action, Dagger module, or other CI/CD pipeline flow) either triggered manually by the model development team when they're ready to send the model to production, or automatically when a model or its artifacts are updated in their respective repositories.

Security-conscious organizations often can't allow internal teams to use any publicly available model from Hugging Face because:
* Their licenses would put the organization at risk
* They don't match the organization's security testing requirements
* Their provenance isn't understood

In these cases teams may use a pipeline (with [GitHub Actions](https://github.com/marketplace/actions/setup-kit-cli), [Dagger](https://daggerverse.dev/mod/github.com/jozu-ai/daggerverse/kit), or [another tool](./modelkit/compatibility.md)) to pull models or sample datasets from Hugging Face, run them through a battery of tests, then publish them in tamper-proof and signed ModelKits to their private container registry.

This ensures that:
* __Operations teams have all the assets and information they need__ in order to determine how to test, deploy, audit, and manage these new workloads
* __Everyone has a library of safe, immutable, and signed ModelKits__ speeding development without compromising security
* __[Safe models can be deployed](./deploy.md)__ for development or production use cases
* __Operations teams have all the assets and information they need__ for testing, deploying, auditing, and managing AI/ML projects
* __AI versioned packages are held in the same enterprise registry__ as other production assets like containers making them easier to find, secure, and audit
* __Compliance teams have a catalogue of versioned models__ that can be used for [EU AI Act](https://artificialintelligenceact.eu/) or other regulatory reporting
* __Everyone has a library of immutable and signed ModelKits__ for intellectual property, progress tracking, or other requirements
* __Organizations are protected against vendor shifts__ in their MLOps and Serving Infrastructure domains (this also gives them negotiating leverage with vendors)
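
The pipeline flow described above (test first, then publish to a private registry) might be sketched like this. Everything named here is an assumption for illustration: the `registry.example.com/ml/curated-model:v1` reference, the `run_checks` placeholder, and the `publish_modelkit` wrapper are hypothetical, and the checks themselves (license review, security scans, provenance) are left as a stub. Signing is omitted because it depends on the signing tool an organization already uses.

```sh
# Hypothetical ModelKit reference and project directory; adjust for your registry.
MODELKIT="registry.example.com/ml/curated-model:v1"
PROJECT_DIR="."

# Placeholder for the organization's own battery of tests.
# It always succeeds here; replace the body with real checks.
run_checks() {
  echo "running checks on $1"
  return 0
}

# Pack and push only if the checks pass. When the kit CLI is not
# installed, print the commands instead of running them.
publish_modelkit() {
  if ! run_checks "$PROJECT_DIR"; then
    echo "checks failed; not publishing $MODELKIT" >&2
    return 1
  fi
  if command -v kit >/dev/null 2>&1; then
    kit pack "$PROJECT_DIR" -t "$MODELKIT" && kit push "$MODELKIT"
  else
    echo "kit CLI not found; would run: kit pack $PROJECT_DIR -t $MODELKIT"
    echo "kit CLI not found; would run: kit push $MODELKIT"
  fi
}
```

A pipeline step would call `publish_modelkit` after the project directory has been populated with the model, datasets, and Kitfile. Gating the push on the check result is what keeps untested artifacts out of the registry.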

Teams working on model development continue to use their disparate repositories during the development cycle at this stage. This is where most organizations start their usage of KitOps, but once they start most continue on...
**Get Started:**
* [Kit Dagger Modules](https://daggerverse.dev/mod/github.com/jozu-ai/daggerverse/kit): Kit Dagger modules make it easy to pack and selectively unpack ModelKits to speed pipelines.
* [Kit GitHub Action](https://github.com/marketplace/actions/setup-kit-cli): Our Kit GitHub Action is used to build hundreds of ModelKits every day as part of pipelines.
* [Learn to pack and unpack ModelKits](./get-started.md)
* [Create containers or Kubernetes deployments directly from ModelKits](./deploy.md)

This is where most organizations start their usage of KitOps, but once they start most continue on...

## Level 2: Adding Security 🛡️

Some organizations want to scan their models either before they enter the development phase (ideal), or before they are promoted beyond development. The open source [ModelScan project](https://github.com/protectai/modelscan) can help here.

### Creating a Curated Model Set 🧑‍🍳

Security-conscious organizations will often restrict the set of models that data science teams can use as a basis for their work. In these cases specific models can be pulled from public repositories like Hugging Face, then scanned with a tool like ModelScan, and finally packaged as a KitOps ModelKit stored in their enterprise registry.

This guarantees that models used by internal teams are safe and tamper-proof. By storing the models in the existing enterprise registry they're also easy for anyone to find or audit.

### Protecting Production 🚦

After model development has been completed, the resulting ModelKit and its artifacts can again be scanned by something like ModelScan and only allowed to move forward if they pass. Using ModelKits here again ensures that a model that passes the scan is packaged and signed so that it cannot be tampered with on its way to production.

Even with this level of scrutiny, however, there remain some risks since the varying repositories and versioning of artifacts during development can invite accidental or malicious tampering. This leads us to Level 3 adoption...