Instructions

Introduction

This repo shows an end-to-end example of how to use the Vision API Product Search. A high-level overview of the solution is provided in the README.md file.

Follow the steps in this guide to deploy your own Vision API Product Search solution!

Click the Start button to move to the next step.

Install Node.js v12

From Cloud Shell, execute the following to install Node.js v12:

nvm install 12
nvm use 12

Verify that this is installed correctly by executing:

node -v
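
If you would rather check the version in a script than by eye, the following is a minimal sketch. The helper names node_major and is_node_12 are hypothetical and not part of this repo; they only parse the version string that node -v prints.

```shell
# Extract the major version from `node -v` style output, e.g. "v12.22.12" -> "12".
node_major() {
  version="${1#v}"        # strip the leading "v"
  echo "${version%%.*}"   # keep everything before the first dot
}

# Succeeds only when the given version string is a 12.x release.
is_node_12() {
  [ "$(node_major "$1")" = "12" ]
}
```

A possible use: is_node_12 "$(node -v)" || echo "Expected Node.js v12, found $(node -v)"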

Install Terraform v0.14.4

From Cloud Shell, execute the following to install version 0.14.4 of Terraform:

export TERRAFORM_VERSION="0.14.4"
curl https://releases.hashicorp.com/terraform/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_linux_amd64.zip > terraform_${TERRAFORM_VERSION}_linux_amd64.zip && \
    sudo unzip -o terraform_${TERRAFORM_VERSION}_linux_amd64.zip -d /usr/local/bin && \
    rm -f terraform_${TERRAFORM_VERSION}_linux_amd64.zip

Verify that this is installed correctly by executing:

terraform -v
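
The install command above does not check the integrity of the downloaded archive. If you want to add that check before unzipping, a sketch like the one below may help. The helper name verify_sha256 is hypothetical, and it assumes the sha256sum utility is available. The expected digest would have to come from the checksum file that HashiCorp publishes with each release; its exact name and location are an assumption you should confirm on the releases page.

```shell
# Compare a file's SHA-256 digest against an expected hex string.
# Assumes the `sha256sum` utility (GNU coreutils) is on the PATH.
verify_sha256() {
  file="$1"
  expected="$2"
  # sha256sum prints "<digest>  <filename>"; keep only the digest.
  actual="$(sha256sum "$file" | awk '{print $1}')"
  [ "$actual" = "$expected" ]
}
```

A possible use, run after the curl step and before the unzip step: verify_sha256 "terraform_${TERRAFORM_VERSION}_linux_amd64.zip" "<digest from the published checksum file>" || echo "Checksum mismatch"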

Set the PROJECT_ROOT environment variable

Execute the following from the repository root to set an environment variable that will be used throughout the rest of these instructions:

export PROJECT_ROOT=$(pwd)
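
Because every later step relies on PROJECT_ROOT, it can be worth confirming that it points at the repository root. The sketch below checks for the four directories that the rest of this guide refers to (terraform, firebase, firestore-migrator and data). The function name check_project_root is hypothetical.

```shell
# Verify that a directory contains the sub-directories used later in this guide.
check_project_root() {
  root="$1"
  for dir in terraform firebase firestore-migrator data; do
    if [ ! -d "$root/$dir" ]; then
      echo "missing directory: $root/$dir" >&2
      return 1
    fi
  done
}
```

A possible use: check_project_root "$PROJECT_ROOT" && echo "PROJECT_ROOT looks correct"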

Create a variables.json file in the repository root

Open the variables.json.renameMe file.

The file should be in the following format. Replace each placeholder value with the value for your own project:

{
    "project_id": "PROJECT_PREFIX",
    "billing_account_id": "BILLING_ACCOUNT_ID",
    "region": "GCP_REGION",
    "app_engine_region": "APP_ENGINE_REGION"
}

When you have finished editing this file, save a copy of this file as variables.json.

Note: there are two variables for region and app_engine_region, because sometimes they are not the same. You can get a list of GCP regions by running gcloud compute regions list, and you can get a list of App Engine regions by running gcloud app regions list.
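
Before running Terraform, you may want to confirm that variables.json contains all four keys shown above. The following is a minimal sketch that uses only grep; the function name check_variables_file is hypothetical, and it checks only that the keys are present, not that their values are valid.

```shell
# Check that a variables.json file exists and mentions every required key.
check_variables_file() {
  file="$1"
  [ -f "$file" ] || { echo "not found: $file" >&2; return 1; }
  for key in project_id billing_account_id region app_engine_region; do
    # Match the quoted key followed by a colon, e.g. "region": ...
    if ! grep -q "\"$key\"[[:space:]]*:" "$file"; then
      echo "missing key: $key" >&2
      return 1
    fi
  done
}
```

A possible use: check_variables_file "$PROJECT_ROOT/variables.json"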

Deploy the infrastructure

We will use Terraform to automate the deployment of the infrastructure. Simply execute the following:

cd $PROJECT_ROOT/terraform
terraform init
terraform plan

Double check that the infrastructure to be deployed makes sense, and then execute:

terraform apply -auto-approve

Generate service account credentials for later use

As part of the deployed infrastructure, a service account and a corresponding key have been generated. While still in the terraform directory, execute the following to export the key for later use:

terraform output -raw vision_product_search_service_account_key | base64 --decode > $PROJECT_ROOT/firestore-migrator/credentials.json
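
If the export fails partway (for example, because the Terraform output is empty), credentials.json can end up empty or malformed. A quick sanity check is sketched below. It relies on the fact that Google Cloud service account key files contain a "type" field set to "service_account". The function name is hypothetical.

```shell
# Succeeds when the file is non-empty and declares itself a service account key.
looks_like_service_account_key() {
  file="$1"
  [ -s "$file" ] && grep -q '"type"[[:space:]]*:[[:space:]]*"service_account"' "$file"
}
```

A possible use: looks_like_service_account_key "$PROJECT_ROOT/firestore-migrator/credentials.json" || echo "credentials.json does not look like a service account key". Since the file holds a private key, you may also want to restrict its permissions with chmod 600.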

Deploy Firebase Function for image download and processing

Now we will deploy the Firebase Function which will download the required product images for us.

Execute the following to set the project context of the Firebase SDK, so it knows which project to deploy the function into:

cd $PROJECT_ROOT/firebase
sed -E "s/ADD_YOUR_PROJECT_HERE/$(terraform output -raw -state=$PROJECT_ROOT/terraform/terraform.tfstate project_id)/" .firebaserc.renameMe > .firebaserc

Your .firebaserc should now look like the following:

{
  "projects": {
    "default": "PROJECT_ID"
  }
}
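
To see what the sed substitution above does without touching any files, you can try it on a single line. This sketch wraps the same expression in a hypothetical helper, render_firebaserc, that reads the template from standard input; demo-project is a made-up project ID.

```shell
# Replace the ADD_YOUR_PROJECT_HERE placeholder with the given project ID.
# Reads the template from stdin and writes the result to stdout.
render_firebaserc() {
  project_id="$1"
  sed -E "s/ADD_YOUR_PROJECT_HERE/${project_id}/"
}

echo '    "default": "ADD_YOUR_PROJECT_HERE"' | render_firebaserc demo-project
# prints:     "default": "demo-project"
```

Project IDs consist of lowercase letters, digits and hyphens, so they are safe to interpolate into the sed expression.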

Next, install node dependencies:

cd $PROJECT_ROOT/firebase/functions
npm install

Now we're almost ready to deploy, but first we need to configure which Cloud Storage bucket the images should be downloaded into. This bucket was created by Terraform already.

cd $PROJECT_ROOT/firebase
firebase functions:config:set imagebucket.name="$(terraform output -raw -state=$PROJECT_ROOT/terraform/terraform.tfstate project_id)_images"

Now we can deploy the Firebase Function:

firebase deploy --only functions

Import CSV data into Firestore

We will use the firestore-migrator command line tool to import the product data into a Firestore collection called products. We will need to compile and build the tool first:

cd $PROJECT_ROOT/firestore-migrator
npm install
npm run-script build
npm link

Now we can use the fire-migrate CLI to import all the records into Firestore, which will then kick off the Firebase Function to download the images into GCS. Note that the import process will take a few minutes, which is normal.

fire-migrate import $PROJECT_ROOT/data/products_0.csv products
fire-migrate import $PROJECT_ROOT/data/products_1.csv products
fire-migrate import $PROJECT_ROOT/data/products_2.csv products
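
The three imports above can also be written as a loop that stops at the first failure. This is only a sketch: import_all_products is a hypothetical helper, and its optional second argument (the command to run) exists so that the loop can be dry-run with echo before calling fire-migrate for real.

```shell
# Import every products_<digit>.csv file in a directory into the "products" collection.
# $1 = data directory, $2 = importer command (defaults to fire-migrate).
import_all_products() {
  data_dir="$1"
  importer="${2:-fire-migrate}"
  # The [0-9] pattern matches products_0.csv etc. but not products_gcs_0.csv.
  for csv in "$data_dir"/products_[0-9].csv; do
    [ -e "$csv" ] || continue              # skip if the pattern matched nothing
    "$importer" import "$csv" products || return 1
  done
}
```

A possible dry run: import_all_products "$PROJECT_ROOT/data" echo. Dropping the second argument runs the actual imports.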

Download and Deploy the Test Harness App

Run the following to download and deploy the Test Harness App:

cd $PROJECT_ROOT
git clone https://github.com/zinjiggle/google-product-search-simple-ui
cd $PROJECT_ROOT/google-product-search-simple-ui
gcloud app deploy -q --project "$(terraform output -raw -state=$PROJECT_ROOT/terraform/terraform.tfstate project_id)"

When the app is successfully deployed, gcloud app deploy prints the URL of the deployed service. It should be accessible from:

http://<PROJECT_ID>ae.uc.r.appspot.com

Process and upload the bulk upload CSV file

The products CSV files need some processing first, because the Vision API requires GCS URIs for the images. Run the following commands to generate a new set of CSV files:

cd $PROJECT_ROOT/data
sed -E "s/http:\/\//gs:\/\/$(terraform output -raw -state=$PROJECT_ROOT/terraform/terraform.tfstate project_id)_images\//" products_0.csv > products_gcs_0.csv
sed -E "s/http:\/\//gs:\/\/$(terraform output -raw -state=$PROJECT_ROOT/terraform/terraform.tfstate project_id)_images\//" products_1.csv > products_gcs_1.csv
sed -E "s/http:\/\//gs:\/\/$(terraform output -raw -state=$PROJECT_ROOT/terraform/terraform.tfstate project_id)_images\//" products_2.csv > products_gcs_2.csv

The first few lines of products_gcs_0.csv should look something like this:

image-uri,image-id,product-set-id,product-id,product-category,product-display-name,labels,bounding-poly
gs://<image-bucket>/img.bbystatic.com/BestBuy_US/images/products/4390/43900_sa.jpg,,products,43900,general-v1,Duracell - AAA Batteries (4-Pack),,
gs://<image-bucket>/img.bbystatic.com/BestBuy_US/images/products/4853/48530_sa.jpg,,products,48530,general-v1,Duracell - AA 1.5V CopperTop Batteries (4-Pack),,
gs://<image-bucket>/img.bbystatic.com/BestBuy_US/images/products/1276/127687_sa.jpg,,products,127687,general-v1,Duracell - AA Batteries (8-Pack),,
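
To preview the rewrite on a single row before generating whole files, something like the following can be used. It is a sketch with a hypothetical helper name (to_gcs_uris) and a made-up bucket name (demo_images). Unlike the commands above, it anchors the match to the start of the line, which assumes that image-uri is the first column, as in the header shown.

```shell
# Rewrite a leading http:// into gs://<bucket>/ for each line read from stdin.
to_gcs_uris() {
  bucket="$1"
  # "|" is used as the sed delimiter so the slashes need no escaping.
  sed -E "s|^http://|gs://${bucket}/|"
}

echo 'http://img.bbystatic.com/BestBuy_US/images/products/4390/43900_sa.jpg,,products,43900,general-v1,Duracell - AAA Batteries (4-Pack),,' \
  | to_gcs_uris demo_images
# prints: gs://demo_images/img.bbystatic.com/BestBuy_US/images/products/4390/43900_sa.jpg,,products,43900,general-v1,Duracell - AAA Batteries (4-Pack),,
```

The header row contains no http:// prefix, so it passes through unchanged.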

Check that the generated CSV files look fine, and then upload them into the bulk load bucket by running:

gsutil cp $PROJECT_ROOT/data/products_gcs_* $(terraform output -raw -state=$PROJECT_ROOT/terraform/terraform.tfstate vision_product_search_buckload_bucket_url)

Index the Product Set using the Test Harness App

  • Download your credentials by right-clicking on the firestore-migrator/credentials.json in the Editor, then click on "Download"
  • Browse to the Test Harness application (https://<project-id>ae.uc.r.appspot.com)
  • Click on the yellow "Upload service account json file" button and select the service account key that you just downloaded
  • Choose an appropriate location for the model (this should be close to the region you chose earlier)
  • In the "Index images from CSVs" section, click on the + arrow twice to ensure there are three lines available
  • For each of the lines, enter the GCS URI for the bulk upload CSV files and click on each "Import" button. The GCS URIs should be:
    gs://<project-id>_bulkload/products_gcs_0.csv
    gs://<project-id>_bulkload/products_gcs_1.csv
    gs://<project-id>_bulkload/products_gcs_2.csv
    

Note: Each indexing operation takes approximately 15-30 minutes to be reported as "complete". The machine learning model may then need another 30-60 minutes to train in the background.

Perform a Product Search using a test image

  • Browse to the Test Harness application (https://<project-id>ae.uc.r.appspot.com)
  • Click on the "Search" link on the top navigation
  • Click on the yellow "Upload service account json file" button and select the service account key that you downloaded earlier
  • Choose the same location as the previous step
  • In the "Provide product set id to search" textbox, input products
  • Click on the "Upload an image to search" button, and upload an image of your choice
  • For the "Category", select general
  • (Optional) Draw a bounding box around your image for better accuracy
  • Click on the Search button

Clean up the infrastructure

When you have finished exploring, execute the following to destroy the GCP project so that you are no longer billed for usage:

cd $PROJECT_ROOT/terraform
terraform destroy -auto-approve