Add Kubernetes deployment guide #899
Conversation
Hey, I did something similar a few days ago: https://github.com/ellistarn/llama-operator/blob/main/manifest/deployment.yaml
# Kubernetes Deployment Guide
There's an open question to me whether Kubernetes maps to the Llama Stack "distribution" concept. Many of these steps would not hold true when running with meta-reference or ollama. This feels like a sibling of https://github.com/meta-llama/llama-stack/blob/39c34dd25f9365b09000a07de5c46dbdba27e3cb/distributions/remote-vllm/compose.yaml.
However, there is substantial overlap between the compose files for the different distributions, so I think we can do better: we probably need to figure out how to draw the right boundaries between distributions and deployment options. In the meantime, WDYT about moving this guide there?
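
For context on that overlap, here is a rough sketch of how the vLLM-backed compose service might translate into a Kubernetes Deployment. The image name, port, and `VLLM_URL` value are assumptions for illustration, not taken from the linked compose file:

```yaml
# Hypothetical sketch: a vLLM-backed Llama Stack server as a Kubernetes
# Deployment. Image, port, and endpoint are illustrative assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llamastack
spec:
  replicas: 1
  selector:
    matchLabels:
      app: llamastack
  template:
    metadata:
      labels:
        app: llamastack
    spec:
      containers:
        - name: llamastack
          image: llamastack/distribution-remote-vllm  # assumed image name
          ports:
            - containerPort: 5000                     # assumed server port
          env:
            - name: VLLM_URL
              value: http://vllm-server:8000/v1       # assumed vLLM endpoint
```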
We were discussing the same idea on Discord. My initial thought was to first provide a guide so that others can follow a similar approach to deploy their selected providers to K8s (and make any changes needed for their use case). The next step is to provide a packaged YAML file templated for K8s deployment of each provider (e.g. remote:vllm), and then we can simplify this guide. I believe we'll need a guide anyway to call out provider-specific details.
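
As a sketch of what such a packaged, per-provider manifest could include beyond the Deployment above, here is a minimal Service exposing the server inside the cluster; the name and port are hypothetical and would be substituted per distribution:

```yaml
# Hypothetical sketch: a ClusterIP Service fronting the llamastack
# Deployment. Name, selector, and port mirror the assumed Deployment above.
apiVersion: v1
kind: Service
metadata:
  name: llamastack
spec:
  type: ClusterIP
  selector:
    app: llamastack
  ports:
    - name: http
      port: 5000        # assumed Llama Stack server port
      targetPort: 5000
```

Clients inside the cluster could then reach the stack at `http://llamastack:5000`; a templated package would swap in the provider image and endpoint for each distribution.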
I'll leave this call to the maintainers :)
@ellistarn Thanks for the review and suggestions! It's great to cross paths here again.
Signed-off-by: Yuan Tang <[email protected]>
Force-pushed from da127c6 to 3bcc778.
This PR moves some content from the recent blog post into this repo as a more official guide for users who'd like to deploy Llama Stack on Kubernetes.