A bunny that sits on top of a llama (and controls it).
bunny-llama is a tool that uses bun to interact with llama.cpp. It gives you a fast way to develop and test code against llama.cpp bindings. The project supports hot module reloading, so changes to your code show up in the running application without a manual restart. This also means the model does not have to be reloaded every time you change and recompile your custom prompting functions.
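To picture how that fits together: bun can bind directly to a shared library via bun:ffi. This is only a minimal sketch, assuming a hypothetical exported symbol llama_prompt; the real exports in api-llama.so will differ:

```ts
import { dlopen, FFIType, ptr } from "bun:ffi";

// Load the shared library built inside the llama.cpp dir.
// `llama_prompt` and its signature are assumptions for
// illustration, not the project's actual API.
const { symbols } = dlopen("./llama.cpp/api-llama.so", {
  llama_prompt: { args: [FFIType.ptr], returns: FFIType.cstring },
});

// Because bun hot-reloads this module, you can edit this function
// and save while the model stays resident in the running process.
export function ask(question: string): string {
  const prompt = Buffer.from(question + "\0", "utf8");
  return symbols.llama_prompt(ptr(prompt)).toString();
}
```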
Under the hood, a rebuild in the llama.cpp dir looks like:

make api-llama.so && curl localhost:1337

The bun make and bun make-cuda scripts already take care of the curl for you.
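One plausible reading of that curl (an assumption, not confirmed by the scripts): the running app listens on localhost:1337 and rebinds the freshly built library when poked, roughly like this:

```ts
// Hypothetical reload hook, assuming the curl to localhost:1337
// tells the running app that api-llama.so was just rebuilt.
Bun.serve({
  port: 1337,
  fetch() {
    console.log("api-llama.so rebuilt, rebinding symbols...");
    // ...re-dlopen the library here...
    return new Response("reloaded\n");
  },
});
```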
If you don't have bun yet, install it first:

curl -fsSL https://bun.sh/install | bash

Then, from the repo root:

bun clone
bun make
bun ride
bun clean
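Presumably bun ride maps to something like bun --hot ride.ts under the hood; bun's --hot flag is what provides the hot module reloading described above (an assumption based on bun's standard tooling, not confirmed by the scripts themselves).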
For people with NVIDIA GPUs:

Install conda, then create an environment with the CUDA toolkit:

conda create -n bunny
conda activate bunny
conda install cuda -c nvidia
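Optionally, sanity-check that the toolkit landed in the environment before building:

nvcc --version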
Then make the llama with CUDA, like so:

bun clone
bun make-cuda
bun ride

Now you have a special CUDA-enabled llama.
If you closed your shell and want to build the CUDA llama again, you need to activate the conda environment first:
conda activate bunny
bun make-cuda