Run an LLM locally for development

Quickly run an LLM locally as a backend for development, along with a chat UI.

Using Ollama and LiteLLM.

Everything is installed via Docker Compose.

Requirements

Install

  1. Configure .env (a sample is sketched after this list).
  • COMPOSE_PROFILES: gpu (requires nvidia-container-toolkit to be installed) or cpu.
  2. Run docker compose.
docker compose up -d
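
For reference, a minimal .env sketch, assuming COMPOSE_PROFILES is the only variable you need to set:

# .env
COMPOSE_PROFILES=cpu   # or "gpu" if nvidia-container-toolkit is installed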

Access to the services

Other interesting commands

Common docker compose commands that are useful day to day:

  1. Download an Ollama model from the CLI:
docker compose exec ollama-gpu ollama pull <model_name>
  2. Stop the services:
docker compose stop
  3. Show logs:
docker compose logs -f
  4. Remove everything, including volumes:
docker compose down -v

Use your local LLM as an OpenAI replacement

Example using LangChain:

from langchain_openai import ChatOpenAI

# Point the OpenAI-compatible client at the local Ollama endpoint; an API key is required by the client but not validated locally.
llm = ChatOpenAI(openai_api_base="http://localhost:11434/v1", openai_api_key="ignored", model="<model_name>")  # e.g. "qwen2.5:0.5b"

print(llm.invoke("Who are you?"))

Run it with uv:

export MODEL=qwen2.5:0.5b
docker compose exec ollama-gpu ollama pull $MODEL
uv run --with 'langchain[openai]' test/simple.py
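
The same endpoint also works without LangChain. A minimal sketch using the plain openai client, assuming the openai package is available and the model has already been pulled as shown above:

from openai import OpenAI

# Same OpenAI-compatible Ollama endpoint as in the LangChain example above.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ignored")

response = client.chat.completions.create(
    model="qwen2.5:0.5b",  # assumes this model was pulled as shown above
    messages=[{"role": "user", "content": "Who are you?"}],
)
print(response.choices[0].message.content)

It can be run the same way, e.g. uv run --with openai your_script.py (your_script.py is a hypothetical file name).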