This is an ad-hoc project to quickly recap how to deploy a machine learning model. It uses a pre-trained language model (bert-base-NER) for the named-entity recognition (NER) task. See the model source for details: https://huggingface.co/dslim/bert-base-NER.
No fine-tuning is performed on the model and, to save time, the transformers pipeline is used for model inference (see more: https://huggingface.co/docs/transformers/main_classes/pipelines). Below is the output for the input "RGU is a university located in Aberdeen, south east part of Scotland". (PS: I know it's the north east part of Scotland XD)
The colours are set as follows: ORG (green), LOC (red), PER (blue) and MISC (purple).
Flask is used as the web application framework and Gunicorn is used as the server gateway.
- 1.1. Create a form in `index.html` (front-end) which serves as the text input.
- 1.2. Create an `app.py` (back-end) which takes the input from the front end and processes it as follows.
- Load the pre-trained model (`AutoModelForTokenClassification`) and tokenizer (`AutoTokenizer`) using HuggingFace.
- Tokenize the input and feed it to the model for inference using `pipeline` from the transformers library.
- Post-process the output result (see the `postProcessing` function). The raw result is as follows:
[{'entity': 'B-ORG', 'score': 0.99754566, 'index': 1, 'word': 'R', 'start': 0, 'end': 1}, {'entity': 'I-ORG', 'score': 0.9739576, 'index': 2, 'word': '##G', 'start': 1, 'end': 2}, {'entity': 'I-ORG', 'score': 0.9781341, 'index': 3, 'word': '##U', 'start': 2, 'end': 3}, {'entity': 'B-LOC', 'score': 0.99568594, 'index': 9, 'word': 'Aberdeen', 'start': 31, 'end': 39}, {'entity': 'B-LOC', 'score': 0.9995809, 'index': 15, 'word': 'Scotland', 'start': 60, 'end': 68}]
- Transform the processed JSON result into HTML by highlighting the keywords in different colours (see the `highlightWord` function).
- Return the transformed HTML
- Test the model on localhost (run the `flask run` command from a terminal).
- Once inference succeeds, the application is ready to deploy.
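The post-processing and highlighting steps above can be sketched in plain Python. This is only a minimal illustration, not the project's actual `postProcessing` or `highlightWord` code: the names `merge_tokens`, `highlight` and `COLOURS` are hypothetical, and the sample data is the raw result shown earlier with the `score` and `index` fields left out. It assumes the pipeline has already dropped non-entity (`O`) tokens, as in that raw result.

```python
from html import escape

# Colours from the legend above; the dict name is an assumption for this sketch.
COLOURS = {"ORG": "green", "LOC": "red", "PER": "blue", "MISC": "purple"}


def merge_tokens(raw):
    """Merge token-level predictions into entity spans using character offsets."""
    spans = []
    for tok in raw:
        # Tags look like 'B-ORG' / 'I-ORG'; 'O' tokens are assumed to be absent.
        prefix, label = tok["entity"].split("-", 1)
        last = spans[-1] if spans else None
        continues = (
            last is not None
            and last["label"] == label
            and (prefix == "I" or tok["word"].startswith("##"))
            and tok["start"] - last["end"] <= 1  # adjacent, or one space apart
        )
        if continues:
            last["end"] = tok["end"]
        else:
            spans.append({"label": label, "start": tok["start"], "end": tok["end"]})
    return spans


def highlight(text, spans):
    """Wrap each entity span in a coloured <span>, HTML-escaping all text."""
    out, cursor = [], 0
    for s in sorted(spans, key=lambda s: s["start"]):
        out.append(escape(text[cursor:s["start"]]))
        colour = COLOURS.get(s["label"], "black")
        word = escape(text[s["start"]:s["end"]])
        out.append(f'<span style="color:{colour}">{word}</span>')
        cursor = s["end"]
    out.append(escape(text[cursor:]))
    return "".join(out)


text = "RGU is a university located in Aberdeen, south east part of Scotland"
raw = [
    {"entity": "B-ORG", "word": "R", "start": 0, "end": 1},
    {"entity": "I-ORG", "word": "##G", "start": 1, "end": 2},
    {"entity": "I-ORG", "word": "##U", "start": 2, "end": 3},
    {"entity": "B-LOC", "word": "Aberdeen", "start": 31, "end": 39},
    {"entity": "B-LOC", "word": "Scotland", "start": 60, "end": 68},
]

spans = merge_tokens(raw)
html = highlight(text, spans)
print(html)
```

Working from the `start`/`end` offsets rather than the `word` strings avoids having to glue the `##` word pieces back together by hand. The transformers pipeline also has an `aggregation_strategy` option that can do a similar grouping inside the library.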
- The easiest way is to deploy via Heroku, as the process is very straightforward. However, the web server crashes because of the memory limit on the free tier.
- Therefore, AWS Elastic Beanstalk (EBS) is an ideal choice thanks to its scalability and simplicity. See more: https://aws.amazon.com/elasticbeanstalk/
- To deploy the Flask web app to EBS, the AWS CodePipeline service is used to automate the continuous delivery pipeline (via GitHub).
- To do so, first push the Flask application to GitHub.
- Create an EBS environment and application for the Flask application.
- Create a code pipeline with GitHub as the input (source) and the EBS application created earlier as the output (deploy).
- Scale up the resource (EC2 instance) if the web server crashes. The web application crashes on a free-tier instance (see the error below) but works well on a t2.medium instance.
web: RuntimeError: [enforce fail at alloc_cpu.cpp:75] err == 0. DefaultCPUAllocator: can't allocate memory: you tried to allocate 2359296 bytes. Error code 12 (Cannot allocate memory)
- Configure the DNS if needed.
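
A rough back-of-the-envelope check helps explain the memory error quoted above. The sketch below is only an estimate under stated assumptions: the parameter count is the commonly quoted approximate figure for BERT-base (not measured on this model), and the free-tier instance is assumed to be a t2.micro with 1 GiB of RAM, against 4 GiB on a t2.medium.

```python
BYTES_PER_FLOAT32 = 4
HIDDEN_SIZE = 768               # hidden size of BERT-base
FAILED_ALLOCATION = 2_359_296   # bytes, taken from the error message

# One 768 x 768 float32 matrix is exactly the size of the failed allocation.
matrix_bytes = HIDDEN_SIZE * HIDDEN_SIZE * BYTES_PER_FLOAT32

# Assumption: ~110 million parameters, the commonly quoted size of BERT-base.
APPROX_PARAMS = 110_000_000
weights_mib = APPROX_PARAMS * BYTES_PER_FLOAT32 / 2**20

print(matrix_bytes == FAILED_ALLOCATION)
print(f"weights alone: ~{weights_mib:.0f} MiB")
```

The failed request is only about 2.25 MiB, which suggests the instance was already nearly full by the time a single weight-sized tensor was allocated. With the weights alone taking a few hundred MiB, plus Python, PyTorch and the web server processes on top, 1 GiB leaves little headroom.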