- Clone the vLLM repository and check out v0.6.1.
git clone https://github.com/vllm-project/vllm.git
cd vllm
git checkout v0.6.1
- Build the Docker image.
DOCKER_BUILDKIT=1 docker build -f Dockerfile.rocm -t vllm-rocm .
- Run the Docker container.
docker run -it \
  --network=host \
  --group-add=video \
  --ipc=host \
  --cap-add=SYS_PTRACE \
  --security-opt seccomp=unconfined \
  --device /dev/kfd \
  --device /dev/dri \
  -v /home/test/sraskar/huggingface/:/huggingface/ \
  -v /home/test/sraskar/:/home/test/sraskar/ \
  vllm-rocm \
  bash
- Use the provided shell script `run-benchmark.sh` in this directory to run `benchmark_throughput.py` for various configurations of input lengths, output lengths, and batch sizes.
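For orientation, a sweep of this kind could look roughly like the sketch below. It is not the contents of `run-benchmark.sh`. The model path, the length and batch-size values, the `benchmarks/benchmark_throughput.py` location, and the use of `--num-prompts` as the batch size are all assumptions made for illustration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical sweep sketch; NOT the contents of run-benchmark.sh.
set -eu

# Assumed values, chosen only for illustration.
MODEL="${MODEL:-/huggingface/MODEL_DIR}"  # placeholder under the mounted /huggingface/ volume
INPUT_LENS="128 1024"
OUTPUT_LENS="128 1024"
BATCH_SIZES="1 16"

# Print the command by default; execute it only when RUN=1 is set.
run_or_print() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$@"
  fi
}

# Visit every (input length, output length, batch size) combination.
sweep() {
  for in_len in $INPUT_LENS; do
    for out_len in $OUTPUT_LENS; do
      for bs in $BATCH_SIZES; do
        # Assumption: the script path is relative to the vLLM checkout,
        # and the number of prompts stands in for the batch size.
        run_or_print python benchmarks/benchmark_throughput.py \
          --model "$MODEL" \
          --input-len "$in_len" \
          --output-len "$out_len" \
          --num-prompts "$bs"
      done
    done
  done
}

sweep
```

Running the sketch as-is lists one command per configuration. Setting `RUN=1` inside the container would execute them instead, provided the assumed path and model directory match your setup.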