POC: Frontend docling library with fastapi and build the container im… #159

Closed
wants to merge 1 commit into from
5 changes: 5 additions & 0 deletions Makefile
@@ -47,6 +47,11 @@ ps-image: Containerfile.ps ## Build container image for the pathservice
	$(CMD_PREFIX) docker build -f Containerfile.ps -t ghcr.io/instructlab/ui/pathservice:$(TAG) .
	$(CMD_PREFIX) docker tag ghcr.io/instructlab/ui/pathservice:$(TAG) ghcr.io/instructlab/ui/pathservice:main

duckbill-image: duckbill/Containerfile ## Build container image for the duckbill (docling) service
	$(ECHO_PREFIX) printf " %-12s duckbill/Containerfile\n" "[docker]"
	$(CMD_PREFIX) docker build -f duckbill/Containerfile --platform linux/amd64 -t ghcr.io/instructlab/ui/duckbill:$(TAG) ./duckbill
	$(CMD_PREFIX) docker tag ghcr.io/instructlab/ui/duckbill:$(TAG) quay.io/instructlab-ui/docling:main

##@ Local Dev - Run the stack (UI and PathService) on your local machine
.PHONY: stop-dev-local
stop-dev-local: ## Stop the npm and pathservice local instances
34 changes: 34 additions & 0 deletions duckbill/Containerfile
@@ -0,0 +1,34 @@
FROM python:3.11-slim-bookworm

WORKDIR /duckbill
COPY ./requirements.txt /duckbill/requirements.txt

RUN pip install --upgrade pip
RUN pip install --no-cache-dir --upgrade -r /duckbill/requirements.txt

ENV GIT_SSH_COMMAND="ssh -o StrictHostKeyChecking=no"

RUN apt-get update \
    && apt-get install -y libgl1 libglib2.0-0 curl wget git \
    && apt-get clean

# This installs torch with *only* CPU support.
# Remove the --extra-index-url part if you want to install all the GPU requirements.
# For more details on the different torch distributions, visit https://pytorch.org/.
#RUN pip install --no-cache-dir docling --extra-index-url https://download.pytorch.org/whl/cpu
#RUN pip install --no-cache-dir docling fastapi

ENV HF_HOME=/tmp/
ENV TORCH_HOME=/tmp/

RUN python -c 'from deepsearch_glm.utils.load_pretrained_models import load_pretrained_nlp_models; load_pretrained_nlp_models(verbose=True);'
RUN python -c 'from docling.document_converter import DocumentConverter; artifacts_path = DocumentConverter.download_models_hf(force=True);'

# In container environments, always set a thread budget to avoid thread congestion.
ENV OMP_NUM_THREADS=4

COPY ./apiserver /duckbill/apiserver

EXPOSE 5000

CMD ["fastapi", "run", "apiserver/main.py", "--port", "5000"]
Empty file added duckbill/apiserver/__init__.py
24 changes: 24 additions & 0 deletions duckbill/apiserver/main.py
@@ -0,0 +1,24 @@
from fastapi import FastAPI
from pydantic import BaseModel
from docling.document_converter import DocumentConverter


app = FastAPI()


class UrlRequest(BaseModel):
    # URL of the document to fetch and convert
    url: str


@app.get("/")
def read_root():
    # Health check: respond with a JSON object
    return {"message": "Docling Service is up and running."}


@app.post("/simpleconvert")
def simpleconvert(request: UrlRequest):
    # Convert the document at the given URL and return it as Markdown
    converter = DocumentConverter()
    doc = converter.convert_single(request.url)
    markdown = doc.render_as_markdown()
    print(markdown)
    return {"markdown": markdown}
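For anyone who wants to exercise the `/simpleconvert` endpoint while reviewing, here is a minimal client sketch using only the standard library. It assumes the container is reachable at `http://localhost:5000` (the port the Containerfile exposes). The helper names `build_convert_request` and `convert` are hypothetical and not part of this PR. The response is returned as decoded JSON without assuming its shape.

```python
import json
import urllib.request

# Assumption: the service is published on localhost:5000, matching EXPOSE 5000.
DEFAULT_BASE_URL = "http://localhost:5000"


def build_convert_request(doc_url: str, base_url: str = DEFAULT_BASE_URL) -> urllib.request.Request:
    """Build a POST to /simpleconvert whose JSON body matches the UrlRequest model."""
    payload = json.dumps({"url": doc_url}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/simpleconvert",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def convert(doc_url: str, base_url: str = DEFAULT_BASE_URL, timeout: float = 300.0):
    """Send the request and decode the JSON response (needs a running service)."""
    request = build_convert_request(doc_url, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))

# Example call (hypothetical document URL, requires the container to be running):
# result = convert("https://example.com/sample.pdf")
```

Building the request separately from sending it keeps the payload easy to inspect without a live server. The generous default timeout is a guess, since conversion time depends on the document.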
3 changes: 3 additions & 0 deletions duckbill/requirements.txt
@@ -0,0 +1,3 @@
fastapi[standard]>=0.113.0,<0.114.0
pydantic>=2.7.0,<3.0.0
# apiserver/main.py uses the docling 1.x API (convert_single, render_as_markdown)
docling>=1.9.0,<2.0.0