Add GPU inference example (GoogleCloudPlatform#64)

* Add GPU inference example
* Feedback
* Noah's feedback
halio-g · Sep 6, 2019 · ef9c655
1 parent b8df3d5
commit ef9c655
Showing 2 changed files with 105 additions and 6 deletions.
diff --git a/prediction/tensorflow/README.md b/prediction/tensorflow/README.md
@@ -1,17 +1,66 @@
 # TensorFlow Estimator - Deploy model
 
 The purpose of this directory is to provide a sample for how you can deploy a
-TensorFlow trained model in AI Platform.
+TensorFlow trained model in AI Platform with GPU.
 
-*   Run the training example under /training/base/core/tensorflow using the
-    `aiplatform-submit-train-job.sh` or `local-train.sh` scripts.
-*   Run `aiplatform-deploy-model.sh`
+*   Run the training example under `/training/tensorflow/structured/base/scripts` using the
+    `train-cloud.sh` or `train-local.sh` scripts.
+*   Run either:
+    - `cloud-deploy-model.sh`
+    - `cloud-deploy-model-gpu.sh`
 
+## GPU support
 
-## Scripts:
+To deploy a new model that uses GPUs, you only need to define the
+machine type and select which accelerator you want to use for your new
+model.
 
-  [cloud-deploy-model.sh](scripts/cloud-deploy-model.sh)  This script deploys a model in 
+Upgrade to the latest version of the Google Cloud SDK:
+
+```
+gcloud components update
+```
+
+Define the machine type that will handle prediction requests. This example uses `n1-standard-4`, a standard machine type with 4 vCPUs and 15 GB of memory. The full list is available [here](https://cloud.google.com/compute/docs/machine-types).
+
+After you update to the new gcloud SDK version, the `--accelerator` option becomes available.
+The accelerator type must be one of the following:
+
+```
+nvidia-tesla-k80
+nvidia-tesla-p100
+nvidia-tesla-p4
+nvidia-tesla-t4 
+nvidia-tesla-v100
+tpu-v2 (Not covered in this document)
+```
+
+Create a new model deployment with GPU:
+
+```
+gcloud alpha ai-platform versions create gpu_v1 \
+ --model=model_inference \
+ --runtime-version=1.14 \
+ --python-version=3.5 \
+ --framework=tensorflow \
+ --machine-type="n1-standard-4" \
+ --accelerator=count=4,type=nvidia-tesla-t4 \
+ --origin=gs://google_cloud_bucket/model/
+```
+
+**Note:** This feature is in Alpha. To request access, contact: <[email protected]>
+
+## Scripts
+
+  [cloud-deploy-model.sh](structured/scripts/cloud-deploy-model.sh)  This script deploys a model in 
   AI platform Prediction. It expects a Saved Model in Google Cloud Storage.
+
+  [cloud-deploy-model-gpu.sh](structured/scripts/cloud-deploy-model-gpu.sh) This script deploys a model in 
+  AI Platform Prediction using GPUs. It expects a SavedModel in Google Cloud Storage.
 
 ## Versions
 Suitable for TensorFlow v1.13.1+
+
+## Feedback
+
+Let us know if you would like us to enable additional Compute Engine machine types. If you need a machine type that is not available here, please contact <[email protected]>
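
The accelerator list in the README hunk above can be checked before `gcloud` is ever called. The following is a minimal sketch, not part of this commit: `accelerator_flag` and `SUPPORTED_GPU_TYPES` are hypothetical names, and the list is copied from the README (`tpu-v2` is left out because the README does not cover it).

```bash
#!/bin/bash
# Sketch only: accelerator_flag and SUPPORTED_GPU_TYPES are hypothetical
# names, not part of this commit. The list mirrors the README.
SUPPORTED_GPU_TYPES="nvidia-tesla-k80 nvidia-tesla-p100 nvidia-tesla-p4 nvidia-tesla-t4 nvidia-tesla-v100"

# Print the --accelerator flag for a GPU type and count, or fail if the
# type is not in the README's list.
accelerator_flag() {
  local gpu_type="$1" count="$2"
  local candidate
  for candidate in ${SUPPORTED_GPU_TYPES}; do
    if [ "${candidate}" = "${gpu_type}" ]; then
      echo "--accelerator=count=${count},type=${gpu_type}"
      return 0
    fi
  done
  echo "Unsupported accelerator type: ${gpu_type}" >&2
  return 1
}
```

A deploy script could then call `accelerator_flag "${GPU_TYPE}" 4` and pass the printed flag to `gcloud alpha ai-platform versions create`, stopping early when the function returns a non-zero status.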
diff --git a/prediction/tensorflow/structured/scripts/cloud-deploy-model-gpu.sh b/prediction/tensorflow/structured/scripts/cloud-deploy-model-gpu.sh
@@ -0,0 +1,50 @@
+#!/bin/bash
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+REGION="us-central1" # choose a GCP region, e.g. "us-central1". Choose from https://cloud.google.com/ml-engine/docs/tensorflow/regions
+BUCKET="your-bucket-name" # change to your bucket name, e.g. "my-bucket"
+
+MODEL_NAME="your_model_name_gpu" # change to your model name, e.g. "estimator"
+MODEL_VERSION="v1" # change to your model version, e.g. "v1"
+
+# Model Binaries corresponds to the tf.estimator.FinalExporter configuration in trainer/experiment.py
+MODEL_BINARIES=$(gsutil ls gs://${BUCKET}/models/${MODEL_NAME}/export/estimate | tail -1)
+RUNTIME_VERSION=1.14
+GPU_TYPE="nvidia-tesla-t4"
+
+gsutil ls ${MODEL_BINARIES}
+
+# Delete the model version, if a previous model version exists.
+gcloud ai-platform versions delete ${MODEL_VERSION} --model=${MODEL_NAME}
+
+# Delete the model, if a previous model exists.
+gcloud ai-platform models delete ${MODEL_NAME}
+
+# Create the model resource on AI Platform
+gcloud ai-platform models create ${MODEL_NAME} --regions=${REGION}
+
+# Deploy model version
+gcloud alpha ai-platform versions create ${MODEL_VERSION} \
+ --model=${MODEL_NAME} \
+ --runtime-version=${RUNTIME_VERSION} \
+ --python-version=3.5 \
+ --framework=tensorflow \
+ --machine-type="n1-standard-4" \
+ --accelerator=count=4,type=${GPU_TYPE} \
+ --origin=${MODEL_BINARIES}
+
+
+# Invoke deployed model to make prediction given new data instances
+gcloud ai-platform predict --model=${MODEL_NAME} --version=${MODEL_VERSION} --json-instances=data/new-data.json
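
The two unconditional `delete` calls in the script above report an error on a first run, when nothing has been deployed yet, and they ask for confirmation interactively. One possible way to guard them is sketched below. `cleanup_previous_deployment` is a hypothetical helper, not part of this commit; it assumes `describe` exits non-zero for a missing resource and uses the global `--quiet` flag to skip the prompt.

```bash
#!/bin/bash
# Sketch only: cleanup_previous_deployment is a hypothetical helper,
# not part of this commit.

# Delete a model version and then its model, but only if each exists.
cleanup_previous_deployment() {
  local model_name="$1" model_version="$2"

  # Assumption: `describe` exits non-zero when the version is missing,
  # so it serves as an existence check.
  if gcloud ai-platform versions describe "${model_version}" \
      --model="${model_name}" >/dev/null 2>&1; then
    # --quiet suppresses the interactive confirmation prompt.
    gcloud ai-platform versions delete "${model_version}" \
      --model="${model_name}" --quiet
  fi

  # Same check for the model resource itself.
  if gcloud ai-platform models describe "${model_name}" >/dev/null 2>&1; then
    gcloud ai-platform models delete "${model_name}" --quiet
  fi
}
```

In the script this would be called as `cleanup_previous_deployment "${MODEL_NAME}" "${MODEL_VERSION}"` in place of the two `delete` lines. If the model still holds other versions, the model delete is likely to fail, so the sketch fits only the single-version setup used here.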