2.3.40 Satellite llamaswap
av edited this page Mar 22, 2025 · 2 revisions
Handle: `llamaswap`
URL: http://localhost:34401
llama-swap is a lightweight, transparent proxy server that provides automatic model swapping to llama.cpp's server.
```bash
# [Optional] pre-pull the image
harbor pull llamaswap

# Run the service
harbor up llamaswap
```
- The `llamaswap` image in Harbor runs its own llama.cpp server, separate from the one running in the `llamacpp` service
- Harbor will connect `llamaswap` to Open WebUI when run together
- Harbor will mount the following local caches so they are available within the llama-swap container:
  - Ollama - `/root/.ollama`
  - Hugging Face - `/root/.cache/huggingface`
  - llama.cpp - `/root/.cache/llama.cpp`
  - vLLM - `/root/.cache/vllm`

The expected way to configure llama-swap is by editing the `config.yaml` file:
```bash
# Open in your default editor
open $(harbor home)/llamaswap/config.yaml
```
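As a rough sketch of the shape of that file, a llama-swap config maps model handles to the command that serves them and the local address to proxy to. The model name, port, and file path below are placeholders; consult the official configuration example for the full set of options:

```yaml
models:
  # Placeholder handle: this is the name clients pass in the "model"
  # field of API requests to select (and trigger a swap to) this model.
  "my-model":
    # Command llama-swap runs to start the backend server for this model
    cmd: >
      llama-server --port 9001
      -m /root/.cache/llama.cpp/my-model.gguf
    # Address llama-swap forwards requests to once the server is up
    proxy: "http://127.0.0.1:9001"
```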
See the official configuration example for reference.
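Since llama-swap exposes an OpenAI-compatible API, the model to load is selected by the `model` field of the request body. A minimal stdlib-only sketch of building such a request (the handle `"my-model"` is a placeholder for an entry in your `config.yaml`; the send step is left commented out so it only runs once the service is up):

```python
import json
import urllib.request

# llama-swap picks the backend to start (or swap to) based on the
# "model" field; it must match a key under "models:" in config.yaml.
payload = {
    "model": "my-model",  # placeholder: use a handle from your config
    "messages": [{"role": "user", "content": "Hello!"}],
}

req = urllib.request.Request(
    "http://localhost:34401/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment to send the request once the service is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```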