
Dockerfile for llm and service in docker compose #98

Merged
merged 15 commits on Feb 25, 2024
12 changes: 9 additions & 3 deletions README.md
@@ -71,12 +71,18 @@ Upon app startup, OpenAI-compatible embedding API will be available at:

Check the docs here: <http://172.16.3.101:5001/docs>

#### llamacpp
#### Download the LLM model (required for the llm service to work)

Download models (this can take >1h):
Download the model (file size: 3.6 GB):

```sh
wget https://huggingface.co/TheBloke/sheep-duck-llama-2-70B-v1.1-GGUF/resolve/main/sheep-duck-llama-2-70b-v1.1.Q4_K_S.gguf
curl -o ./llm/models/llama-2-7b.Q3_K_L.gguf -L https://huggingface.co/TheBloke/Llama-2-7B-GGUF/resolve/main/llama-2-7b.Q3_K_L.gguf
```

or

```sh
wget -P ./llm/models https://huggingface.co/TheBloke/Llama-2-7B-GGUF/resolve/main/llama-2-7b.Q3_K_L.gguf
```
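Either command should leave the file at the path the compose file mounts into the container. A minimal sanity check (a sketch, assuming you run it from the repository root; the path and expected size come from the commands above):

```sh
# The llm service mounts ./llm/models into the container as /models,
# so the file must land exactly here for the server to find it.
MODEL=./llm/models/llama-2-7b.Q3_K_L.gguf
if [ -f "$MODEL" ]; then
  du -h "$MODEL"   # expect roughly 3.6 GB
else
  echo "model missing: $MODEL" >&2
fi
```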

#### Starting app
13 changes: 11 additions & 2 deletions docker-compose.yml
@@ -1,6 +1,6 @@
services:
discord-bot:
profiles: ["dev","prod"]
profiles: [ "dev", "prod" ]
build: ./discord_bot
env_file: .env

@@ -16,7 +16,7 @@ services:
- .env

api:
profiles: ["dev","prod"]
profiles: [ "dev", "prod" ]
build:
context: ./api/
env_file:
@@ -27,3 +27,12 @@ services:
- "8000:8000"
depends_on:
- db

llm:
profiles: [ "dev", "prod" ]
build:
context: ./llm/
volumes:
- ./llm/models:/models
ports:
- "9000:9000"
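Since the container's entrypoint runs `llama_cpp.server`, the service speaks the OpenAI-compatible API on the mapped port 9000. A hypothetical smoke test from the host once the stack is up (the prompt and `max_tokens` values are illustrative):

```sh
# Query the OpenAI-compatible completions endpoint exposed by the llm service.
LLM_URL=http://localhost:9000
curl -s "$LLM_URL/v1/completions" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello", "max_tokens": 16}' \
  || echo "llm service not reachable at $LLM_URL" >&2
```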
5 changes: 5 additions & 0 deletions llm/.gitignore
@@ -0,0 +1,5 @@
models/
*.ipynb
llama-cpp-python/
.pytest_cache/
__pycache__/
28 changes: 28 additions & 0 deletions llm/Dockerfile
@@ -0,0 +1,28 @@
FROM python:3.11-buster as builder

RUN pip install poetry==1.6.1
RUN apt-get update && apt-get install -y git
RUN git clone --recurse-submodules https://github.com/abetlen/llama-cpp-python.git

ENV POETRY_NO_INTERACTION=1 \
POETRY_VIRTUALENVS_IN_PROJECT=1 \
POETRY_VIRTUALENVS_CREATE=1 \
    POETRY_CACHE_DIR=/tmp/poetry_cache

WORKDIR /app

COPY pyproject.toml poetry.lock ./
RUN touch README.md

RUN --mount=type=cache,target=$POETRY_CACHE_DIR poetry install

FROM python:3.11-slim-buster as runtime

ENV VIRTUAL_ENV=/app/.venv \
PATH="/app/.venv/bin:$PATH"

COPY --from=builder ${VIRTUAL_ENV} ${VIRTUAL_ENV}

RUN CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install "llama-cpp-python[server]"

ENTRYPOINT ["python3", "-m", "llama_cpp.server", "--host", "0.0.0.0", "--port", "9000", "--model", "models/llama-2-7b.Q3_K_L.gguf", "--n_gpu_layers", "9999999"]
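With this Dockerfile in place, the service can be built and started through the compose profiles declared above (a sketch, assuming it is run from the repository root and that the model file has already been downloaded into `./llm/models`):

```sh
# Build and start only the llm service; "dev" is one of the two
# profiles ("dev", "prod") declared in docker-compose.yml.
SERVICE=llm
if command -v docker >/dev/null 2>&1; then
  docker compose --profile dev up --build -d "$SERVICE" \
    || echo "compose up failed; check that ./llm and the model file exist" >&2
else
  echo "docker not installed" >&2
fi
```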