Deploying machine learning models as a RESTful API via BentoML
*Currently supported translations: en-id & id-en*
*Models from the Hugging Face Hub: https://huggingface.co/Helsinki-NLP*
- Clone the repository
- Create a virtual environment and activate it
python -m venv venv
source venv/bin/activate  # on Windows (Git Bash): source venv/Scripts/activate
- Install the required dependencies from requirements-dev.txt
pip install -r requirements-dev.txt
- Fetch the models from the Hugging Face Hub and save them as BentoML models. This step is only required once.
cd src
python save_model.py
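For reference, here is a minimal sketch of what save_model.py might look like; the exact Hub model names (Helsinki-NLP/opus-mt-en-id and Helsinki-NLP/opus-mt-id-en) and the BentoML tags are assumptions:
```python
# save_model.py -- hypothetical sketch; the real script may differ.
import bentoml
from transformers import pipeline

# Assumed Hub IDs for the two supported directions (en-id and id-en).
MODELS = {
    "en-id": "Helsinki-NLP/opus-mt-en-id",
    "id-en": "Helsinki-NLP/opus-mt-id-en",
}

for direction, hub_id in MODELS.items():
    translator = pipeline("translation", model=hub_id)
    # Save each pipeline to the local BentoML model store.
    bentoml.transformers.save_model(f"translator-{direction}", translator)
```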
- Start the BentoML development server
bentoml serve service:svc --api-workers 2
The --api-workers 2 flag caps the number of API worker processes to avoid hammering the CPU; set it to any number that suits your machine's CPU core count.
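For context, a minimal sketch of what the service:svc definition could look like; the runner names, model tags, and endpoint name are assumptions carried over from the sketch above:
```python
# service.py -- hypothetical sketch of the BentoML service definition.
import bentoml
from bentoml.io import JSON

# Load the saved models as runners (tags assumed from save_model.py above).
en_id_runner = bentoml.transformers.get("translator-en-id:latest").to_runner()
id_en_runner = bentoml.transformers.get("translator-id-en:latest").to_runner()

svc = bentoml.Service("multi-translator", runners=[en_id_runner, id_en_runner])

@svc.api(input=JSON(), output=JSON())
async def translate(payload: dict) -> dict:
    # Pick the runner that matches the requested language pair.
    direction = f"{payload['source_language']}-{payload['target_language']}"
    runner = en_id_runner if direction == "en-id" else id_en_runner
    result = await runner.async_run(payload["text"])
    return {"translation": result[0]["translation_text"]}
```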
- Open http://127.0.0.1:3000 in your browser and send a test request from the web UI. Make sure to specify the three required fields in the request body. For example:
{
  "text": "Hello world",
  "source_language": "en",
  "target_language": "id"
}
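The same request can also be sent from the command line; the /translate route below assumes the endpoint name from the service sketch above:
```bash
curl -X POST http://127.0.0.1:3000/translate \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello world", "source_language": "en", "target_language": "id"}'
```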
Requirements
- Docker
- Follow the first step in the API section above.
- Build the bento to prepare for deployment.
bentoml build
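bentoml build expects a bentofile.yaml in the working directory; here is a minimal sketch of what it might contain (the include list and requirements path are assumptions, while service:svc matches the serve command above):
```yaml
# bentofile.yaml -- hypothetical sketch
service: "service:svc"          # entry point: module service.py, object svc
include:
  - "*.py"
python:
  requirements_txt: "./requirements.txt"
```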
- Containerize the API as a Docker image
bentoml containerize multi-translator:latest
- Run the Docker image
docker run -p 3000:3000 multi-translator:hokdnioehogid2ci serve --production --api-workers 2
hokdnioehogid2ci is the unique ID of the bento; replace it with your own bento's ID, which is printed at the end of the containerization process.
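If you are unsure of the ID, list the bentos in your local store:
bentoml list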
- Run the web UI
cd gradio
python app.py
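A minimal sketch of what gradio/app.py might look like, assuming it calls the BentoML API over HTTP; the endpoint URL, field names, and widget layout are assumptions:
```python
# app.py -- hypothetical sketch of the Gradio front end.
import gradio as gr
import requests

API_URL = "http://127.0.0.1:3000/translate"  # assumed endpoint

def translate(text, source_language, target_language):
    # Forward the form inputs to the BentoML service and return its answer.
    response = requests.post(API_URL, json={
        "text": text,
        "source_language": source_language,
        "target_language": target_language,
    })
    response.raise_for_status()
    return response.json()["translation"]

demo = gr.Interface(
    fn=translate,
    inputs=[gr.Textbox(label="Text"),
            gr.Dropdown(["en", "id"], label="Source language"),
            gr.Dropdown(["en", "id"], label="Target language")],
    outputs=gr.Textbox(label="Translation"),
)

if __name__ == "__main__":
    demo.launch()
```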
Advantages of BentoML:
- Easy to deploy
- Easy to integrate with cloud providers (AWS, GCP, Azure)
- Easy to integrate with popular machine learning frameworks (PyTorch, TensorFlow, Keras, etc.)
- Handles the heavy lifting of model deployment, which can be controlled through the command line and configuration (adaptive batching, Kubernetes, Docker, etc.)
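As an example of that last point, adaptive batching is enabled per model at save time by marking a signature as batchable; a sketch built on the hypothetical save_model.py above:
```python
import bentoml
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-id")
bentoml.transformers.save_model(
    "translator-en-id",  # hypothetical tag, as in the sketch above
    translator,
    # Mark the pipeline call as batchable so BentoML can group concurrent
    # requests into one batch (batch_dim 0 = batch along the first axis).
    signatures={"__call__": {"batchable": True, "batch_dim": 0}},
)
```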