graphcore-research/llm-inference-research

LLM inference research

Experimentation framework from Graphcore Research, used to explore the machine learning performance of post-training model adaptation for accelerating LLM inference.

See: SparQ Attention.

Setup

See scripts/Eval.ipynb and scripts/Quantisation.ipynb for usage.

python3 -m venv .venv
# Append to .venv/bin/activate:
    export PYTHONPATH="${PYTHONPATH}:$(dirname ${VIRTUAL_ENV})"
    export TOKENIZERS_PARALLELISM=true

source .venv/bin/activate
pip install wheel
# On a CPU-only machine, you may need to run this before `pip install -r requirements.txt`
# pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install -r requirements.txt

# Optional - notebooks
git clone git@github.com:graphcore-research/llm-inference-research.git --branch notebooks notebooks/
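
The "Append to .venv/bin/activate" step above is a manual edit. As a minimal sketch, it could be scripted so that re-running setup does not add the exports twice. The function name `append_env_exports` and the marker comment are hypothetical and are not part of this repository; the two `export` lines are the ones listed above.

```shell
# Hypothetical helper (not part of the repository): append the two exports
# from the setup instructions to an activate script, at most once.
append_env_exports() {
    activate_file="$1"
    # Assumed marker comment, used only to detect a previous run.
    marker="# llm-inference-research environment"
    if grep -qF "$marker" "$activate_file"; then
        return 0
    fi
    # The quoted 'EOF' keeps ${PYTHONPATH} and $(dirname ...) unexpanded, so
    # they are evaluated when the venv is activated, not when this runs.
    cat >> "$activate_file" <<'EOF'
# llm-inference-research environment
export PYTHONPATH="${PYTHONPATH}:$(dirname ${VIRTUAL_ENV})"
export TOKENIZERS_PARALLELISM=true
EOF
}

# Usage, after `python3 -m venv .venv`:
#   append_env_exports .venv/bin/activate
```

Because the lines are appended at the end of the activate script, `VIRTUAL_ENV` is already set by the time they run, so `$(dirname ${VIRTUAL_ENV})` resolves to the directory containing `.venv` (the repository root, if the venv was created there).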

Development

We use a script called dev to automate building, testing, etc.

./dev
./dev --help

License

Copyright (c) 2023 Graphcore Ltd. Licensed under the MIT License.

See NOTICE.md for further details.

About

An experimentation platform for LLM inference optimisation
