This repo contains the source code for deploying Llama2 behind a REST API endpoint using FastAPI and Docker.
The project is intended as a foundation for deploying LLMs with FastAPI in production.
The steps below assume you have Pipenv installed on your machine.
Follow the steps below:

- Clone this repository.
- Install the dependencies:

  ```shell
  pipenv install
  ```

- Copy `.env-sample` to a new `.env` file and fill in the details.
- Start the services with Docker:

  ```shell
  docker-compose up -d
  ```

- Send a prompt to the `/prompt` endpoint to get a response.
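Once the stack is up, you can call the `/prompt` endpoint from any HTTP client. The sketch below uses only the Python standard library; the request payload key (`"prompt"`), the response being JSON, and the port (`8000`) are assumptions, so check the repo's FastAPI route for the actual schema and the compose file for the published port.

```python
# Minimal client sketch for the /prompt endpoint (standard library only).
# Assumptions: the endpoint accepts a JSON body like {"prompt": "..."} and
# returns a JSON response -- verify against the actual FastAPI route.
import json
import urllib.request


def send_prompt(base_url: str, prompt: str) -> dict:
    """POST a prompt to the API and return the decoded JSON response."""
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the containers running, `send_prompt("http://localhost:8000", "Tell me a joke.")` would return the decoded JSON response (assuming the API is published on port 8000).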