tests_gaudi: Added L2 vllm workload #329

Open
wants to merge 1 commit into
base: main

vbedida79
Contributor

This PR adds the Gaudi L2 vLLM workload.

Signed-off-by: vbedida79 [email protected]

@@ -74,4 +74,83 @@ Welcome to HCCL demo
[BENCHMARK] NW Bandwidth : 258.209121 GB/s
[BENCHMARK] Algo Bandwidth : 147.548069 GB/s
####################################################################################################
```

## VLLM
Contributor

Please spell this "vLLM".

```

## VLLM
VLLM is a serving engine for LLM's. The following workloads deploys a VLLM server with an LLM using Intel Gaudi. Refer to [Intel Gaudi VLLM fork](https://github.com/HabanaAI/vllm-fork.git) for more details.
Contributor

Please spell this "vLLM" here as well.

Build the workload container image:
```
$ oc apply -f https://raw.githubusercontent.com/intel/intel-technology-enabling-for-openshift/main/tests/gaudi/l2/vllm_buildconfig.yaml
```
Contributor

Could we add an instruction that lets the user check whether the build succeeded? :-)
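
One possible way to check the build, as a rough sketch: stream the build log, then read the phase of the latest build and treat only `Complete` as success. The BuildConfig name `vllm-workload` below is a placeholder, since the name defined in vllm_buildconfig.yaml is not shown in this thread, and the helper function names are made up for illustration.

```shell
# Hypothetical BuildConfig name; replace it with the name defined in
# vllm_buildconfig.yaml.
BUILDCONFIG="${BUILDCONFIG:-vllm-workload}"

# Return 0 only when the given Build phase means the build finished
# successfully. Other phases (New, Pending, Running, Failed, Error,
# Cancelled) return 1.
build_phase_ok() {
  case "$1" in
    Complete) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the phase of the most recent build created from the BuildConfig.
latest_build_phase() {
  oc get builds -l "buildconfig=${BUILDCONFIG}" \
    --sort-by=.metadata.creationTimestamp \
    -o jsonpath='{.items[-1:].status.phase}'
}

# Usage (needs a logged-in oc session):
#   oc logs -f "buildconfig/${BUILDCONFIG}"    # stream the build log
#   build_phase_ok "$(latest_build_phase)" && echo "build succeeded"
```

A one-line note in the README along these lines would probably be enough; `oc get builds` alone also shows the status column.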

```
Deploy the workload:
* Update the hugging face token and the pvc according to your cluster setup
```
Contributor

Could we add some detail about setting the Hugging Face token, and a brief introduction to the model we are using? :-)
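
For the token part, something like the following could work as a starting point. It is only a sketch: the Secret name `hf-token` and the key `HF_TOKEN` are assumptions, and the deployment manifest would have to reference the same names. The token is read from the environment so that it does not need to be written into a file.

```shell
# Hypothetical names: the Secret "hf-token" and its key "HF_TOKEN" are
# placeholders; use whatever the deployment manifest actually references.
SECRET_NAME="${SECRET_NAME:-hf-token}"

# Create the Secret from the HF_TOKEN environment variable.
# Returns 1 without calling oc when the variable is empty.
create_hf_secret() {
  if [ -z "${HF_TOKEN:-}" ]; then
    echo "HF_TOKEN is not set; export your Hugging Face access token first" >&2
    return 1
  fi
  oc create secret generic "${SECRET_NAME}" \
    --from-literal=HF_TOKEN="${HF_TOKEN}"
}

# Usage (needs a logged-in oc session):
#   export HF_TOKEN=<your token>
#   create_hf_secret
#   oc get storageclass    # list the storage classes available for the PVC
```

The last usage line is there because the PVC has to match a storage class that exists on the user's cluster.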

runPolicy: "Serial"
source:
  git:
    uri: https://github.com/opendatahub-io/vllm.git
Contributor

After comparing:

  1. https://github.com/opendatahub-io/vllm.git - ODH fork of vLLM
  2. https://github.com/vllm-project/vllm - vLLM upstream
  3. https://github.com/HabanaAI/vllm-fork - Habana fork of vLLM

I think we should currently start from 3, plus the change in 1 (adding the UBI-based Dockerfile for RH OpenShift). Intel is upstreaming from 3 to 2, so in the long run we will use 2.

So I think we need to:

  1. Submit a PR that adds the UBI-based Dockerfile for RH and adds RH 9.4 support to the documents.
  2. Use repo 3.
  3. Rely on the owner of 3, who I think will also help upstream the UBI-based Dockerfile and docs to 2.
  4. After that, switch to 2, the upstream vLLM.

@vbedida79 any comments? :-)


2 participants