vLLM

This repo is a fork of the official vLLM repo (vllm-project/vllm).

Usage

Pull the latest image from ECR:

bash docker/pull.sh vllm:latest

Run the container (here with Llama 3 8B Instruct):

docker run --runtime nvidia --gpus all \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    -p 8000:8000 \
    --ipc=host \
    vllm \
    --model meta-llama/Meta-Llama-3-8B-Instruct
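
Once the server is running, you can send it requests through its OpenAI-compatible API. Below is a minimal sketch in Python, assuming the openai client package is installed on the host and the server is listening on the default port 8000; the model name must match the --model flag above:

from openai import OpenAI

# Point the OpenAI client at the local vLLM server; the API key can be any placeholder.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[{"role": "user", "content": "Summarize vLLM in one sentence."}],
    max_tokens=64,
)
print(response.choices[0].message.content)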

Development

Set up dev mode

Clone the repo and, from the repo root, start a container from the NVIDIA PyTorch base image:

docker run --gpus all -it --rm --ipc=host \
	-v $(pwd):/workspace/vllm \
	-v ~/.cache/huggingface:/root/.cache/huggingface \
	-p 8000:8000 \
	nvcr.io/nvidia/pytorch:23.10-py3

Once the container is running, install vLLM in dev (editable) mode along with the dev requirements inside it:

cd vllm
export VLLM_INSTALL_PUNICA_KERNELS=1
export MAX_JOBS=8
pip install -e .
pip install -r requirements-dev.txt
pip install boto3
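
To sanity-check the editable install, you can run a quick offline-generation smoke test inside the container. This is a minimal sketch, assuming the Llama 3 weights are available in the mounted Hugging Face cache (or that you are authenticated to download them):

from vllm import LLM, SamplingParams

# Load the model through the freshly installed vLLM and generate a short completion.
llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct")
params = SamplingParams(temperature=0.0, max_tokens=32)
outputs = llm.generate(["vLLM is"], params)
print(outputs[0].outputs[0].text)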

The build will take a while. Once it finishes, open another terminal on the host and run:

docker commit <container_id> vllm_dev

This creates a new image, vllm_dev, with vLLM and its dev dependencies already installed, so you won't need to reinstall them each time you start a new container.

From now on, you can exit the initial container and run this command to enter the dev container:

docker run --gpus all -it --rm --ipc=host \
	-v $(pwd):/workspace/vllm \
	-v ~/.cache/huggingface:/root/.cache/huggingface \
	-p 8000:8000 \
	vllm_dev

Launch the server

Inside the vllm_dev container, run:

python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Meta-Llama-3-8B-Instruct
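
To confirm the server came up, you can query its OpenAI-compatible /v1/models endpoint. A minimal sketch using the requests package (an assumption; curl works just as well), with the server on the default port 8000:

import requests

# The OpenAI-compatible server lists the models it serves at /v1/models.
resp = requests.get("http://localhost:8000/v1/models")
resp.raise_for_status()
for model in resp.json()["data"]:
    print(model["id"])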

Format the code

Inside the vllm_dev container, run:

bash format.sh

Build the image

Once your changes are ready, build the production image. Run this on the host:

bash docker/build.sh

And deploy it to ECR:

bash docker/deploy.sh <version>

Upgrade version

You can upgrade the fork to a newer version of vLLM by rebasing it onto the official repo:

git clone https://github.com/lightonai/vllm
git remote add official https://github.com/vllm-project/vllm
git fetch official
git rebase <commit_sha> # Rebase onto a specific commit of the official repo (e.g. the commit SHA of the latest stable release)
git rebase --continue # After resolving conflicts (if any), continue the rebase
git push origin main --force

Deployment

To deploy a model on SageMaker, follow this README.
