add k8s docs for getting started, K8s Manifest and Helm #179

Open: 23 commits to merge into base: main

Contributor

@devpramod devpramod commented Sep 25, 2024

This PR contains the following docs:

  • Getting Started for k8s: covers installation and a basic introduction to k8s, with sections for Helm and K8s manifests. As more k8s deployment modes are added, corresponding sections will be added to this doc.

  • Deploy using Helm charts: a doc that follows the xeon.md template as closely as possible to deploy ChatQnA on k8s using Helm.

  • Deploy using K8s Manifest: a doc that follows the xeon.md template as closely as possible to deploy ChatQnA on k8s using a K8s manifest YAML.

Contributor

@dbkinder dbkinder left a comment

some suggested edits

Also, when you add new documents, they need to be linked into the table of contents structure. There's an index.rst file in this folder you can edit to add these two documents.

I'd suggest you add an edit to the index.rst doc in this deploy folder, and replace the existing Kubernetes section with this:

Kubernetes
**********

.. toctree::
   :maxdepth: 1

   k8s_getting_started
   TGI on Xeon with Helm Charts <k8s_helm>

* Xeon & Gaudi with GMC
* Xeon & Gaudi without GMC

Signed-off-by: devpramod <[email protected]>

Signed-off-by: devpramod <[email protected]>
Signed-off-by: devpramod <[email protected]>
Contributor

@dbkinder dbkinder left a comment

LGTM, thanks!

@tylertitsworth tylertitsworth left a comment

Some of what I see in the docs is just a tutorial on things that already have their own docs, like TGI/TEI, Helm, and Kubernetes. It feels like we're overexplaining concepts that could be covered by a link to the other tool's source docs plus a command showing how it applies to ChatQnA.

For reference, this is the most handholding I would do in the case of deploying TGI:


Configure Model Server

Before we deploy a model, we need to configure the model server with information such as which model to use and the maximum number of tokens. We will be using the tgi-on-intel Helm chart. This chart normally uses XPU to serve the model, but we are going to configure it to use Gaudi2 instead.

First, look at the configuration files in the tgi directory and add/remove any configuration options relevant to your workflow:

cd tgi
# Create a new configmap for your model server to use
kubectl apply -f cm.yaml

Tip

Here is the reference to the Huggingface Launcher Environment Variables and the TGI-Gaudi Environment Variables.

Deploy Model Server

Now that we have configured the model server, we can deploy it to Kubernetes. Using the provided config.yaml file in the tgi directory, we can deploy the model server.

Modify any values like resources or replicas in the config.yaml file to suit your needs. Then, deploy the model server:

# Encode HF Token for secret.encodedToken
echo -n '<token>' | base64
# Install Chart
git clone https://github.com/intel/ai-containers
helm install model-server -f config.yaml ai-containers/workflows/charts/tgi
# Check the pod status
kubectl get pod
kubectl logs -f <pod-name>
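
The encoding step above uses `echo -n` to keep a trailing newline out of the encoded value. A minimal sketch of why that matters, using a made-up token (`hf_example_token` is a placeholder, not a real credential) and `printf` as a portable stand-in for `echo -n`:

```shell
token='hf_example_token'  # hypothetical placeholder, not a real token

# Encode without a trailing newline (same effect as `echo -n`).
encoded=$(printf '%s' "$token" | base64)

# Encoding with a trailing newline yields a different string.
with_newline=$(printf '%s\n' "$token" | base64)

# Decode both and count bytes; the second value carries one extra byte.
clean_len=$(printf '%s' "$encoded" | base64 -d | wc -c)
extra_len=$(printf '%s' "$with_newline" | base64 -d | wc -c)

echo "without newline: $encoded ($((clean_len)) bytes decoded)"
echo "with newline:    $with_newline ($((extra_len)) bytes decoded)"
```

If the newline-bearing value ends up in a secret, the decoded token includes that extra byte, so it no longer matches the original token.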

Please use a tool like markdownlint to ensure consistent styling.

@dbkinder
Contributor

I've got a script in docs/scripts/checkmd.sh that uses pymarkdown (lint) to scan markdown files, with a bunch of checks disabled. Alas, I'm retiring today; adding a markdown linter to the CI checks was on my list. :)

Signed-off-by: devpramod <[email protected]>
@devpramod devpramod changed the title from "add k8s docs for getting started and helm" to "add k8s docs for getting started, K8s Manifest and Helm" on Nov 19, 2024
Contributor

@arun-gupta arun-gupta left a comment

The initial setup is confusing: the docs point to multiple ways of setting up k8s, whereas the tested configuration is minikube. We need to document that clearly. Beyond that, a few more clarifications are required.

Set a new [namespace](#create-and-set-namespace) and switch to it if needed

To enable UI, uncomment the lines `56-62` in `GenAIInfra/helm-charts/chatqna/values.yaml`:
Contributor

This should be lines 58-62:

56 # If you would like to switch to traditional UI image
57 # Uncomment the following lines
58 # chatqna-ui:
59 #   image:
60 #     repository: "opea/chatqna-ui"
61 #     tag: "1.1"
62 #   containerPort: "5173"

Also, just uncommenting may not be sufficient, as that messes up the formatting. There is an additional space that needs to be deleted too.

Contributor Author

@arun-gupta The lines that need to be uncommented have now shifted to 59-63. I'll remove the reference to line numbers and just say "uncomment the following".
Could you tell me which space needs to be deleted?
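
One plausible reading of the spacing issue (an assumption, since the thread does not spell it out): each commented line begins with `# `, so deleting only the `#` leaves a leading space that shifts the YAML indentation. A minimal sketch that strips both characters, run against a temporary copy of the quoted snippet rather than the real `values.yaml`; lines 3-7 refer to this sample, not to the chart file:

```shell
# Hypothetical stand-in for the commented block quoted in the review.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# If you would like to switch to traditional UI image
# Uncomment the following lines
# chatqna-ui:
#   image:
#     repository: "opea/chatqna-ui"
#     tag: "1.1"
#   containerPort: "5173"
EOF

# Remove the "# " prefix (hash plus one space) from sample lines 3-7 only,
# so "chatqna-ui:" starts in column 0 and nested keys keep their indentation.
uncommented=$(sed '3,7 s/^# //' "$sample")
printf '%s\n' "$uncommented"

rm -f "$sample"
```

The two explanatory comment lines at the top stay commented; only the `chatqna-ui` block is activated.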

Collaborator

@mkbhanda mkbhanda left a comment

Please address the edits.

As for the larger question of refactoring using the application, let's reserve that for a future refactoring effort!


**Helm Install Command:**

- `helm install [RELEASE_NAME] [CHART_NAME]`: This command deploys a Helm chart into your Kubernetes cluster, creating a new release. It is used to set up all the Kubernetes resources specified in the chart and track the version of the deployment.
Collaborator

Not sure what you mean by "creating a new release". Do you mean a new instance of the application? Perhaps just delete the sentence: "This command deploys a Helm chart into your Kubernetes cluster, creating a new release."

Contributor Author

@mkbhanda I used the terminology from here: https://helm.sh/docs/intro/using_helm/#:~:text=for%20Kubernetes%20packages.-,A%20Release,-is%20an%20instance

"A release is a fundamental concept in Helm. When you deploy a Helm chart into your Kubernetes cluster using the helm install command, it creates a new release"

I can change it if it's confusing.
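
To make the term concrete, here is a minimal sketch of how a release shows up after `helm install`. The function name `show_release` and its two arguments are hypothetical; the block only defines the helper, so nothing touches a cluster until it is called with a working `helm` and kubeconfig.

```shell
# Sketch only: defines a helper, does not run any helm command by itself.
show_release() {
  release_name="$1"  # hypothetical release name chosen by the user
  chart_ref="$2"     # path or reference to the chart being installed

  # Installing the chart creates a release tracked under release_name.
  helm install "$release_name" "$chart_ref" || return 1

  # The release now appears in the list of releases for the namespace.
  helm list --filter "^${release_name}\$"

  # Its current state and revision can be inspected by name.
  helm status "$release_name"
}
```

A call might look like `show_release chatqna GenAIInfra/helm-charts/chatqna`, where `chatqna` as the release name is an assumption. Installing the same chart again under a different name would produce a second, independent release.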

8 participants