
Naked LLaMA

Build LLaMA inference compute from scratch, using only torch/numpy base ops.

Inspired by karpathy's awesome repo nanoGPT, I re-implemented a simple and clear LLaMA model from scratch.
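To give a flavor of the "base ops only" approach, here is a minimal sketch (not code from this repo; the function name is hypothetical) of LLaMA's RMSNorm written with plain torch tensor operations instead of `torch.nn` modules:

```python
import torch

def rms_norm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # LLaMA normalizes by the root-mean-square over the last dimension
    # (no mean subtraction, no bias), then applies a learned per-channel scale.
    variance = x.pow(2).mean(dim=-1, keepdim=True)
    x_normed = x * torch.rsqrt(variance + eps)
    return x_normed * weight

x = torch.randn(2, 4, 8)           # (batch, seq_len, hidden)
w = torch.ones(8)                  # learned scale, initialized to 1
out = rms_norm(x, w)
print(out.shape)                   # torch.Size([2, 4, 8])
```

Every layer in the model can be expressed this way: matmuls, elementwise ops, and reductions, with the weights loaded from the converted checkpoint.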

install

pip install "torch>=2.1.0"

# transformers is used to convert model weights and compare results
pip install "transformers>=4.35.2"

execute & result

git clone https://github.com/silencelamb/naked_llama.git

# convert huggingface model weights to a pkl file
python convert_hf_to_pkl.py  # default model_size is 7b

# default model_size is 7b
python naked_llama_forward.py

# run 70b
python naked_llama.py --model_size 70b
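The forward scripts above ultimately reduce each transformer layer to base tensor ops. As an illustration (a sketch for this README, not the repo's actual code), causal scaled-dot-product attention can be written with nothing but matmul, softmax, and an additive mask:

```python
import torch

def attention(q, k, v, mask=None):
    # Scaled dot-product attention from base ops only:
    # matmul + softmax, with an additive mask (0 = allowed, -inf = blocked).
    scale = q.shape[-1] ** -0.5
    scores = q @ k.transpose(-2, -1) * scale
    if mask is not None:
        scores = scores + mask
    weights = torch.softmax(scores, dim=-1)
    return weights @ v

T, d = 5, 16
q = k = v = torch.randn(1, T, d)
# causal mask: -inf strictly above the diagonal
causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
out = attention(q, k, v, causal)
print(out.shape)                   # torch.Size([1, 5, 16])
```

Results from such hand-rolled ops can then be checked against the reference implementation in transformers, which is why it is listed as a dependency.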

references
