Experimentation framework from Graphcore Research, used to explore the machine learning performance of post-training model adaptation for accelerating LLM inference.
See: SparQ Attention.
See scripts/Eval.ipynb and scripts/Quantisation.ipynb for usage.
python3 -m venv .venv
# Append to .venv/bin/activate:
export PYTHONPATH="${PYTHONPATH}:$(dirname ${VIRTUAL_ENV})"
export TOKENIZERS_PARALLELISM=true
source .venv/bin/activate
pip install wheel
# On a CPU-only machine, you may need to run this before `pip install -r requirements.txt`
# pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install -r requirements.txt
# Optional - notebooks
git clone git@github.com:graphcore-research/llm-inference-research.git --branch notebooks notebooks/
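The "Append to .venv/bin/activate" step above can also be scripted. The following is a minimal sketch, not part of the repository: the `append_env` helper, the `ACTIVATE` variable and the marker comment are hypothetical names chosen for illustration. It assumes the virtual environment was created at `.venv` as in the setup steps, and it appends the two exports only once, so it is safe to re-run.

```shell
# Sketch only: append the two exports from the setup steps to the venv's
# activate script. ACTIVATE, MARKER and append_env are hypothetical names.
ACTIVATE="${ACTIVATE:-.venv/bin/activate}"  # assumed path, matching `python3 -m venv .venv`
MARKER="# llm-inference-research env"        # marker line used to detect an earlier run

append_env() {
    # $1 is the file to append to; do nothing if the marker is already present
    if grep -qF "$MARKER" "$1"; then
        return 0
    fi
    # Quoted heredoc delimiter: the lines are written verbatim, without expansion
    cat >> "$1" <<'EOF'
# llm-inference-research env
export PYTHONPATH="${PYTHONPATH}:$(dirname ${VIRTUAL_ENV})"
export TOKENIZERS_PARALLELISM=true
EOF
}

# Only touch the activate script if the venv has already been created
if [ -f "$ACTIVATE" ]; then
    append_env "$ACTIVATE"
fi
```

Run it from the repository root after `python3 -m venv .venv` and before `source .venv/bin/activate`, so that the exports take effect on activation.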
We use a script called dev to automate building, testing, etc.
./dev
./dev --help
Copyright (c) 2023 Graphcore Ltd. Licensed under the MIT License.
See NOTICE.md for further details.