A quick tutorial for deploying a vLLM instance using the official Docker image. You will need:

- Conda
- An NVIDIA GPU
If you are going to use Python with vLLM, it is best practice to set up a dedicated environment. You can follow the steps below to set up your Python environment. This was tested on Ubuntu only, but it should also work on macOS and Windows (WSL).
- Clone this repo:

  ```bash
  git clone https://github.com/pandego/vLLM-deployment.git
  cd vLLM-deployment
  ```
- Set up the environment:

  ```bash
  conda env create -f environment.yml
  conda activate vllm
  ```
- Install dependencies:

  ```bash
  poetry install --no-root
  ```
- Let's keep things clean: first, copy `default.env` into `.env`:

  ```bash
  cp default.env .env
  ```
- You might need to edit the contents of the `.env` file to add your Hugging Face token.
- Deploy your container:

  ```bash
  docker compose --env-file .env up --build -d
  ```

  Alternatively, you can run the same OpenAI-compatible server directly, without Docker:

  ```bash
  python -m vllm.entrypoints.openai.api_server \
      --model NousResearch/Meta-Llama-3-8B-Instruct \
      --dtype auto \
      --api-key EMPTY
  ```
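Once the server is up, a quick way to confirm it is responding is to hit the `/v1/models` endpoint. Below is a minimal Python sketch; it assumes the API is mapped to `localhost:11435`, as in the curl examples that follow:

```python
# Minimal liveness check for the OpenAI-compatible server (a sketch;
# port 11435 is assumed from the curl examples below, adjust if your
# compose file maps a different one).
import requests

resp = requests.get("http://localhost:11435/v1/models")
resp.raise_for_status()
print(resp.json())  # should list NousResearch/Meta-Llama-3-8B-Instruct
```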
- You can run the following commands to test the vLLM instance (see the Python sketch after this list). Be sure to change the `model` if necessary:
  - Using `completions`:

    ```bash
    curl http://localhost:11435/v1/completions \
      -H "Content-Type: application/json" \
      -d '{
        "model": "NousResearch/Meta-Llama-3-8B-Instruct",
        "prompt": "San Francisco is a",
        "max_tokens": 7,
        "temperature": 0
      }'
    ```
  - Using `chat/completions`:

    ```bash
    curl http://localhost:11435/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "model": "NousResearch/Meta-Llama-3-8B-Instruct",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "Who won the world series in 2020?"}
        ]
      }'
    ```
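The same request can be made from Python with the `openai` client. This is a sketch, not code from the repo; it assumes `pip install openai` and reuses the endpoint and model from the curl examples above:

```python
# Sketch: the chat/completions request above, via the openai client
# (endpoint, api key, and model are assumptions taken from the curl examples).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11435/v1", api_key="EMPTY")

chat = client.chat.completions.create(
    model="NousResearch/Meta-Llama-3-8B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
    ],
)
print(chat.choices[0].message.content)
```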
- Run the following commands to test the deployed vLLM endpoint (a rough sketch of the LangChain approach follows this list):
  - `LangChain` example:

    ```bash
    python vLLM_example_LangChain.py
    ```

  - `OpenAI` example:

    ```bash
    python vLLM_example_OpenAI.py
    ```
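For a sense of what the LangChain script might contain, here is a hypothetical sketch (the actual `vLLM_example_LangChain.py` in the repo may differ; it assumes `pip install langchain-community` and the endpoint from the curl examples):

```python
# Hypothetical sketch of querying the vLLM endpoint through LangChain;
# the repo's vLLM_example_LangChain.py may do this differently.
from langchain_community.llms import VLLMOpenAI

llm = VLLMOpenAI(
    openai_api_key="EMPTY",
    openai_api_base="http://localhost:11435/v1",
    model_name="NousResearch/Meta-Llama-3-8B-Instruct",
)
print(llm.invoke("San Francisco is a"))
```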
Et Voilà ! 🎈
- You can check some more arguments in the `helper_args.json` file.
- Find more info in the vLLM documentation: https://docs.vllm.ai