Add Kubernetes deployment guide #899
Conversation
Hey, I did something similar a few days ago: https://github.com/ellistarn/llama-operator/blob/main/manifest/deployment.yaml
# Kubernetes Deployment Guide
There's an open question to me whether Kubernetes maps to the Llama Stack "distribution" concept. Many of these steps would not hold true when running with meta-reference or ollama. This feels like a sibling of https://github.com/meta-llama/llama-stack/blob/39c34dd25f9365b09000a07de5c46dbdba27e3cb/distributions/remote-vllm/compose.yaml.
However, there is substantial overlap between the compose files for the different distributions, so I think we can do better: we probably need to figure out how to draw the right boundaries between distributions and deployment options. In the meantime, WDYT about moving this guide there?
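
For context on that overlap, here is a rough sketch of how the vLLM-backed compose service might translate into a Kubernetes Deployment. The image name, port, and `VLLM_URL` value are assumptions for illustration, not taken from the linked compose file:

```yaml
# Hypothetical sketch: a vLLM-backed Llama Stack server as a Kubernetes
# Deployment. Image, port, and endpoint are illustrative assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llamastack
spec:
  replicas: 1
  selector:
    matchLabels:
      app: llamastack
  template:
    metadata:
      labels:
        app: llamastack
    spec:
      containers:
        - name: llamastack
          image: llamastack/distribution-remote-vllm  # assumed image name
          ports:
            - containerPort: 5000                     # assumed server port
          env:
            - name: VLLM_URL
              value: http://vllm-server:8000/v1       # assumed vLLM endpoint
```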
We were discussing the same idea on Discord. My initial thought was to first provide a guide so that others can follow a similar approach to deploy their selected providers to K8s (and make any changes needed for their use case). The next step is to provide a packaged YAML file templated for K8s deployment of each provider (e.g. remote:vllm), and then we can simplify this guide. I believe we'll need a guide anyway to call out provider-specific details.
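
As a sketch of what such a packaged, per-provider manifest could include beyond the Deployment above, here is a minimal Service exposing the server inside the cluster; the name and port are hypothetical and would be substituted per distribution:

```yaml
# Hypothetical sketch: a ClusterIP Service fronting the llamastack
# Deployment. Name, selector, and port mirror the assumed Deployment above.
apiVersion: v1
kind: Service
metadata:
  name: llamastack
spec:
  type: ClusterIP
  selector:
    app: llamastack
  ports:
    - name: http
      port: 5000        # assumed Llama Stack server port
      targetPort: 5000
```

Clients inside the cluster could then reach the stack at `http://llamastack:5000`; a templated package would swap in the provider image and endpoint for each distribution.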
I'll leave this call to the maintainers :)
@ellistarn Thanks for the review and suggestions! It's great to cross paths here again.
Signed-off-by: Yuan Tang <[email protected]>
Force-pushed from da127c6 to 3bcc778.
This PR moves some content from the recent blog post into this repo as a more official guide for users who'd like to deploy Llama Stack on Kubernetes.