This repo shows an end-to-end example of how to use the Vision API Product Search. A high-level overview of the solution is provided in the README.md file.
Follow the steps in this guide to deploy your own Vision API Product Search solution!
Click the Start button to move to the next step.
From Cloud Shell, execute the following to install Node.js v12:
nvm install 12
nvm use 12
Verify that this is installed correctly by executing:
node -v
From Cloud Shell, execute the following to install version 0.14.4 of Terraform:
export TERRAFORM_VERSION="0.14.4"
curl https://releases.hashicorp.com/terraform/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_linux_amd64.zip > terraform_${TERRAFORM_VERSION}_linux_amd64.zip && \
sudo unzip -o terraform_${TERRAFORM_VERSION}_linux_amd64.zip -d /usr/local/bin && \
rm -f terraform_${TERRAFORM_VERSION}_linux_amd64.zip
Verify that this is installed correctly by executing:
terraform -v
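Later steps also call npm, gcloud, gsutil and firebase, so it can help to confirm up front that everything is on the PATH. The following is a minimal sketch; `require_tools` is a hypothetical helper name and not part of this repo.

```shell
# Hypothetical helper: report every command in the argument list that is not on PATH.
# Returns 0 when all tools are found, 1 otherwise.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# In Cloud Shell, after the installs above, you might run:
#   require_tools node npm terraform gcloud gsutil firebase
```

The helper only checks that each command exists; it does not compare versions, so `node -v` and `terraform -v` are still worth running.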
From the root of the repository, execute the following to set an environment variable that will be used throughout the rest of these instructions:
export PROJECT_ROOT=$(pwd)
Open the variables.json.renameMe file.
The file has the following format, and the fields to fill in are largely self-explanatory:
{
"project_id": "PROJECT_PREFIX",
"billing_account_id": "BILLING_ACCOUNT_ID",
"region": "GCP_REGION",
"app_engine_region": "APP_ENGINE_REGION"
}
When you have finished editing, save a copy of the file as variables.json.
Note: there are two separate variables, region and app_engine_region, because the two are not always the same. You can get a list of GCP regions by running gcloud compute regions list, and a list of App Engine regions by running gcloud app regions list.
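Before running Terraform, it may be worth checking that variables.json contains all four keys and that none of the template placeholders are left. This is a rough sketch using grep only; `check_variables_file` is a hypothetical helper, and it does not validate the JSON syntax itself.

```shell
# Hypothetical helper: fail if a required key is absent or a template
# placeholder value is still present in the given file.
check_variables_file() {
  file="$1"
  status=0
  for key in project_id billing_account_id region app_engine_region; do
    # Match the quoted key followed by a colon, e.g. "region":
    if ! grep -q "\"$key\"[[:space:]]*:" "$file"; then
      echo "missing key: $key" >&2
      status=1
    fi
  done
  # The placeholders from variables.json.renameMe are upper-case, the keys are not.
  if grep -Eq 'PROJECT_PREFIX|BILLING_ACCOUNT_ID|GCP_REGION|APP_ENGINE_REGION' "$file"; then
    echo "placeholder values still present in $file" >&2
    status=1
  fi
  return "$status"
}
```

Run it as `check_variables_file variables.json` from the directory that holds the file.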
We will use Terraform to automate the deployment of the infrastructure. Simply execute the following:
cd $PROJECT_ROOT/terraform
terraform init
terraform plan
Double check that the infrastructure to be deployed makes sense, and then execute:
terraform apply -auto-approve
As part of the deployed infrastructure, a service account and a corresponding key have been generated. Execute the following to export the key for later use:
terraform output -raw vision_product_search_service_account_key | base64 --decode > $PROJECT_ROOT/firestore-migrator/credentials.json
Now we will deploy the Firebase Function, which will download the required product images for us.
Execute the following to set the project context of the Firebase SDK, so it knows which project to deploy the function into:
cd $PROJECT_ROOT/firebase
sed -E "s/ADD_YOUR_PROJECT_HERE/$(terraform output -raw -state=$PROJECT_ROOT/terraform/terraform.tfstate project_id)/" .firebaserc.renameMe > .firebaserc
Your .firebaserc should now look like the following:
{
"projects": {
"default": "PROJECT_ID"
}
}
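If the sed substitution did not run as expected (for example because the Terraform state file was not found), the placeholder would remain in the generated file. A quick check could look like the sketch below; `firebaserc_is_filled` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the file exists and no longer
# contains the ADD_YOUR_PROJECT_HERE placeholder.
firebaserc_is_filled() {
  [ -f "$1" ] && ! grep -q 'ADD_YOUR_PROJECT_HERE' "$1"
}

# Example, from $PROJECT_ROOT/firebase:
#   firebaserc_is_filled .firebaserc || echo "placeholder not replaced" >&2
```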
Next, install node dependencies:
cd $PROJECT_ROOT/firebase/functions
npm install
Now we're almost ready to deploy, but first we need to configure which Cloud Storage bucket the images should be downloaded into. This bucket was created by Terraform already.
cd $PROJECT_ROOT/firebase
firebase functions:config:set imagebucket.name="$(terraform output -raw -state=$PROJECT_ROOT/terraform/terraform.tfstate project_id)_images"
Now we can deploy the Firebase Function:
firebase deploy --only functions
We will use the firestore-migrator command line tool to import the product data into a Firestore collection called products. We will need to compile and build the tool first:
cd $PROJECT_ROOT/firestore-migrator
npm install
npm run-script build
npm link
Now we can use the fire-migrate CLI to import all the records into Firestore, which will then kick off the Firebase Function to download the images into GCS. Note that the import process will take a few minutes, which is normal.
fire-migrate import $PROJECT_ROOT/data/products_0.csv products
fire-migrate import $PROJECT_ROOT/data/products_1.csv products
fire-migrate import $PROJECT_ROOT/data/products_2.csv products
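The three imports above can also be written as a loop, which stops at the first failure. This is a sketch only: `import_all_products` and the `MIGRATE_CMD` variable are assumptions introduced here so the loop can be dry-run, and are not part of firestore-migrator.

```shell
# Hypothetical wrapper: import every products_N.csv in a directory into one collection.
# MIGRATE_CMD is an assumption added for dry runs; it defaults to fire-migrate.
import_all_products() {
  data_dir="$1"
  collection="$2"
  # The glob requires a digit after "products_", so products_gcs_*.csv is skipped.
  for csv in "$data_dir"/products_[0-9]*.csv; do
    [ -e "$csv" ] || continue
    ${MIGRATE_CMD:-fire-migrate} import "$csv" "$collection" || return 1
  done
}

# Dry run (prints the commands instead of running them):
#   MIGRATE_CMD=echo import_all_products "$PROJECT_ROOT/data" products
# Real run:
#   import_all_products "$PROJECT_ROOT/data" products
```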
Run the following to download and deploy the Test Harness App:
cd $PROJECT_ROOT
git clone https://github.com/zinjiggle/google-product-search-simple-ui
cd $PROJECT_ROOT/google-product-search-simple-ui
gcloud app deploy -q --project "$(terraform output -raw -state=$PROJECT_ROOT/terraform/terraform.tfstate project_id)"
When the app is successfully deployed, it should be accessible from:
https://<project-id>ae.uc.r.appspot.com
We now need to process the products CSV files, because the Vision API requires GCS URIs for the images. Run the following commands to generate a new set of CSV files:
cd $PROJECT_ROOT/data
sed -E "s/http:\/\//gs:\/\/$(terraform output -raw -state=$PROJECT_ROOT/terraform/terraform.tfstate project_id)_images\//" products_0.csv > products_gcs_0.csv
sed -E "s/http:\/\//gs:\/\/$(terraform output -raw -state=$PROJECT_ROOT/terraform/terraform.tfstate project_id)_images\//" products_1.csv > products_gcs_1.csv
sed -E "s/http:\/\//gs:\/\/$(terraform output -raw -state=$PROJECT_ROOT/terraform/terraform.tfstate project_id)_images\//" products_2.csv > products_gcs_2.csv
The first few lines of products_gcs_0.csv should look something like this:
image-uri,image-id,product-set-id,product-id,product-category,product-display-name,labels,bounding-poly
gs://<image-bucket>/img.bbystatic.com/BestBuy_US/images/products/4390/43900_sa.jpg,,products,43900,general-v1,Duracell - AAA Batteries (4-Pack),,
gs://<image-bucket>/img.bbystatic.com/BestBuy_US/images/products/4853/48530_sa.jpg,,products,48530,general-v1,Duracell - AA 1.5V CopperTop Batteries (4-Pack),,
gs://<image-bucket>/img.bbystatic.com/BestBuy_US/images/products/1276/127687_sa.jpg,,products,127687,general-v1,Duracell - AA Batteries (8-Pack),,
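To see what the sed expression does in isolation, the sketch below applies the same substitution to a single row. `to_gcs_uri` is a hypothetical helper, the bucket name `my-project_images` is a made-up example, and the http:// input row is reconstructed from the expected output above rather than copied from the data files.

```shell
# Hypothetical helper: rewrite the first "http://" on each input line to "gs://<bucket>/".
# Lines without "http://" (such as the CSV header) pass through unchanged.
to_gcs_uri() {
  bucket="$1"
  sed -E "s/http:\/\//gs:\/\/${bucket}\//"
}

# Assumed input row, reconstructed from the expected output shown above.
sample='http://img.bbystatic.com/BestBuy_US/images/products/4390/43900_sa.jpg,,products,43900,general-v1,Duracell - AAA Batteries (4-Pack),,'
printf '%s\n' "$sample" | to_gcs_uri "my-project_images"
```

Note that the pattern matches only http:// and not https://, so any https:// image URL would be left as-is.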
Check that the generated CSV files look fine, and then upload them into the image bucket by running:
gsutil cp $PROJECT_ROOT/data/products_gcs_* $(terraform output -raw -state=$PROJECT_ROOT/terraform/terraform.tfstate vision_product_search_buckload_bucket_url)
- Download your credentials by right-clicking on firestore-migrator/credentials.json in the Editor, then click on "Download"
- Browse to the Test Harness application (https://<project-id>ae.uc.r.appspot.com)
- Click on the yellow "Upload service account json file" button, select the service account key that you just downloaded
- Choose an appropriate location for the model (this should be close to the region you've chosen earlier)
- In the "Index images from CSVs" section, click on the + arrow twice to ensure there are three lines available
- For each of the lines, enter the GCS URI for the bulk upload CSV files and click on each "Import" button. The GCS URIs should be:
gs://<project-id>_bulkload/products_gcs_0.csv
gs://<project-id>_bulkload/products_gcs_1.csv
gs://<project-id>_bulkload/products_gcs_2.csv
Note: The indexing operation takes approximately 15-30 minutes to be reported as "complete". The machine learning model can then take another 30-60 minutes to train in the background.
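To avoid typing the three GCS URIs listed above by hand, they can be printed for a given project ID and copied into the form. A minimal sketch, where `bulkload_uris` is a hypothetical helper that assumes the `<project-id>_bulkload` bucket naming shown above:

```shell
# Hypothetical helper: print the three bulk-upload CSV URIs for a project ID.
bulkload_uris() {
  for i in 0 1 2; do
    printf 'gs://%s_bulkload/products_gcs_%s.csv\n' "$1" "$i"
  done
}

# Example, reading the project ID from the Terraform state as in the earlier steps:
#   bulkload_uris "$(terraform output -raw -state=$PROJECT_ROOT/terraform/terraform.tfstate project_id)"
```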
- Browse to the Test Harness application (https://<project-id>ae.uc.r.appspot.com)
- Click on the "Search" link on the top navigation
- Click on the yellow "Upload service account json file" button, select the service account key that you just downloaded
- Choose the same location as the previous step
- In the "Provide product set id to search" textbox, input products
- Click on the "Upload an image to search" button, and upload an image of your choice
- For the "Category", select general
- (Optional) Draw a bounding box around your image for better accuracy
- Click on the Search button
When you have finished exploring, execute the following to destroy the GCP project so that you are no longer billed for usage:
cd $PROJECT_ROOT/terraform
terraform destroy -auto-approve