We used GH200 systems on the JLSE testbeds at ALCF. We use Apptainer containers to set up llama.cpp.
- Build a container
$ source build-container.sh
This script builds an Apptainer image, llama-cpp-gh200.sif, from the definition file llama-cpp-gh200.def in the same directory.
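For orientation, the build step can be sketched as follows. This is not the contents of build-container.sh, only a minimal guess at the kind of command it wraps, assuming the standard `apptainer build <image> <definition>` form. The `RUN_BUILD` switch and the `build_cmd` helper are hypothetical names introduced for illustration; by default the sketch only prints the command.

```shell
#!/usr/bin/env bash
# Sketch only: an assumption about what the build step amounts to,
# not the actual build-container.sh.

IMAGE="llama-cpp-gh200.sif"   # image name taken from this README
DEF="llama-cpp-gh200.def"     # definition file taken from this README

# Hypothetical helper: print the build command without running it.
build_cmd() {
    echo "apptainer build ${IMAGE} ${DEF}"
}

# RUN_BUILD is a hypothetical switch; set RUN_BUILD=1 to actually build.
if [ "${RUN_BUILD:-0}" = "1" ]; then
    apptainer build "${IMAGE}" "${DEF}"
else
    build_cmd
fi
```

Depending on how the testbed is configured, the build may need extra options or privileges; check the provided script for the exact invocation.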
- Use the provided shell script rc-llama2-7b.sh in this directory to run the container, which runs llama2-7b.sh to invoke llama-bench for various configurations of input and output lengths and batch sizes.
$ qsub rc-llama2-7b.sh
To benchmark other Llama-like models, write similar scripts.
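A script for another model could follow the pattern below. This is a hedged sketch, not a copy of llama2-7b.sh: the model path, the sweep values, the `RUN_BENCH` switch, and the `bench_cmd` helper are all hypothetical, and it assumes that `llama-bench` is on the `PATH` inside the container. It relies on `llama-bench` accepting comma-separated lists for `-p` (prompt length), `-n` (generated tokens), and `-b` (batch size), and on Apptainer's `--nv` option for NVIDIA GPU access.

```shell
#!/usr/bin/env bash
# Sketch only: a hypothetical benchmark script for another model.
# Paths and sweep values are placeholders, not taken from this repository.

IMAGE="llama-cpp-gh200.sif"           # image built in the first step
MODEL="models/other-model.gguf"       # hypothetical model path
PROMPT_LENS="128,1024"                # input lengths (placeholder values)
GEN_LENS="128,1024"                   # output lengths (placeholder values)
BATCH_SIZES="512,2048"                # batch sizes (placeholder values)
GPU_LAYERS="99"                       # offload all layers to the GPU

# Hypothetical helper: print the benchmark command without running it.
bench_cmd() {
    echo "apptainer exec --nv ${IMAGE} llama-bench -m ${MODEL} -p ${PROMPT_LENS} -n ${GEN_LENS} -b ${BATCH_SIZES} -ngl ${GPU_LAYERS}"
}

# RUN_BENCH is a hypothetical switch; set RUN_BENCH=1 to run the benchmark.
if [ "${RUN_BENCH:-0}" = "1" ]; then
    apptainer exec --nv "${IMAGE}" llama-bench \
        -m "${MODEL}" -p "${PROMPT_LENS}" -n "${GEN_LENS}" \
        -b "${BATCH_SIZES}" -ngl "${GPU_LAYERS}"
else
    bench_cmd
fi
```

A matching submission script, analogous to rc-llama2-7b.sh, would then launch this file inside the container and be submitted with qsub.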