From 74062740416db8572627dda1f87925268ba2f1d3 Mon Sep 17 00:00:00 2001
From: Sam Stoelinga
Date: Fri, 6 Dec 2024 09:03:56 -0800
Subject: [PATCH] [Doc] add KubeAI to serving integrations (#10837)

Signed-off-by: Sam Stoelinga
---
 docs/source/serving/deploying_with_kubeai.rst | 17 +++++++++++++++++
 docs/source/serving/integrations.rst          |  1 +
 2 files changed, 18 insertions(+)
 create mode 100644 docs/source/serving/deploying_with_kubeai.rst

diff --git a/docs/source/serving/deploying_with_kubeai.rst b/docs/source/serving/deploying_with_kubeai.rst
new file mode 100644
index 0000000000000..ec3c065320fd9
--- /dev/null
+++ b/docs/source/serving/deploying_with_kubeai.rst
@@ -0,0 +1,17 @@
+.. _deploying_with_kubeai:
+
+Deploying with KubeAI
+=====================
+
+`KubeAI `_ is a Kubernetes operator that enables you to deploy and manage AI models on Kubernetes. It provides a simple and scalable way to deploy vLLM in production. Functionality such as scale-from-zero, load based autoscaling, model caching, and much more is provided out of the box with zero external dependencies.
+
+
+Please see the Installation Guides for environment specific instructions:
+
+* `Any Kubernetes Cluster `_
+* `EKS `_
+* `GKE `_
+
+Once you have KubeAI installed, you can
+`configure text generation models `_
+using vLLM.
\ No newline at end of file
diff --git a/docs/source/serving/integrations.rst b/docs/source/serving/integrations.rst
index f39997e0e44d9..0dd505a739863 100644
--- a/docs/source/serving/integrations.rst
+++ b/docs/source/serving/integrations.rst
@@ -6,6 +6,7 @@ Integrations
 
    run_on_sky
    deploying_with_kserve
+   deploying_with_kubeai
    deploying_with_triton
    deploying_with_bentoml
    deploying_with_cerebrium