Note: This repo is now archived and won't see further development
This repo offers detailed How-to about serving your tensorflow models with tensorflow-serving
.
It can be followed as a step-by-step tutorial to serve your tensorflow
model or other models with some adjustments.
I do not cover the parts of training or exporting models here. I focus solely on serving a model for inference.
Two tutorials:
-
A basic tutorial to get you quickly serve an object detection model with
tensorflow-serving
. Expect to get such an awesome object-detection up and running in less than 10 mins. -
An advanced tutorial to deploy your tensorflow server docker image on Google Cloud Platform. Unleash the power of GCP to build a scalable machine learning server running in a
kubernetes
cluster.
The proposed object detection model is here to get you started quickly, feel free to use yours for more fun!
The reader is expected to be familiar with tensorflow, knowing how to export a model for inference will be helpful (tensorflow-server
works with savedModel
).
Knowing docker or kubernetes are not pre-requisites since the commands used are simple and explained when needed.
Those tutorials are highly inspired by tensorflow-serving official documentation with some tips and more detail on the installation process.
The installation process is thoroughly described in the docs/setup.md. It covers everything you need to do prior being able to serve a model with tensorflow-serving.
Play Time!
To make sure the installation went smoothly, get your first inference result from the object detection model tensorflow serving. Follow the first tutorial to serve an object detection model on your machine docs/tf_server_local.md.
Now that we made sure our inference server works great on local, let's deploy it on the cloud at production scale. Let's dive in the docs/tf_server_k8s.md.
Additional information you may find useful
Google proposes a managed solution - Google Cloud ML Engine - to serve your saved_models.pb
models.
I do not focus on it since google documentation is
a more comprehensive source of information.
Pros
- Deploy new models easily
Cons
- Less flexibility
- As of today, you are limited in size for your
savedModel.pb
file to250 MB
. (That may change)
The object_detection
directory comes from the
tensorflow-model repository.
It offers useful utils
functions to tag the image returned from the model.
Feel free to investigate the models on the tensorflow-model
repo since they are well documented and often comes with useful tutorials.