Table of Contents (🔎 Click to expand/collapse)
Quest Outline (🔎 Click to expand/collapse)
Level | Code | Name | Note |
---|---|---|---|
VIDEO | What is the What-If Tool? | ||
VIDEO | Getting Started with the What-if Tool | ||
Introductory | GSP076 |
AI Platform: Qwik Start | EN |
Fundamental | GSP710 |
Using the What-If Tool with Image Recognition Models | EN |
VIDEO | Using the What-if Tool Performance & Fairness features | ||
Fundamental | GSP709 |
Identifying Bias in Mortgage Data using Cloud AI Platform and the What-if Tool | EN |
Fundamental | GSP684 |
Compare Cloud AI Platform Models using the What-If Tool to Identify Potential Bias | EN |
Advanced | GSP324 |
Explore Machine Learning Models with Explainable AI: Challenge Lab |
This lab is recommended for students who have enrolled in the Explore Machine Learning Models with Explainable AI quest.
Topics tested:
- Launching an AI Platform Notebook
- Downloading and exploring a sample dataset
- Building and training two different TensorFlow models
- Deploying models to the Cloud AI Platform
- Using the What-If Tool to compare the models
You are a curious coder who wants to explore biases in public datasets using the What-If Tool. You decide to pull some mortgage data to train a couple of machine learning models to predict whether an applicant will be granted a loan. You specifically want to investigate how the two models perform when they are trained on different proportions of males and females in the datasets, and visualize their differences in the What-If Tool.
You are expected to have the skills and knowledge for these tasks, so don’t expect step-by-step guides.
- Click Navigation Menu > AI Platform > Notebooks.
- In the Notebooks Instance page, click NEW INSTANCE.
- In the Customize instance menu, select the version of notebook instance.
- TensorFlow Enterprise 2.3 > Without GPUs.
- Region:
us-central1 (Iowa)
- Click CREATE, the AI Platform console will display your instance name after a few minutes.
- Click OPEN JUPYTERLAB. Your notebook is now set up.
A notification should pop up asking you to migrate your notebook instance to the new Notebooks API. Click Enable Notebooks API to give your notebook additional functionality.
-
In the notebook, click the terminal.
-
Clone the repo with following command:
$ git clone https://github.com/GoogleCloudPlatform/training-data-analyst
-
Open the notebook file
training-data-analyst/quests/dei/what-if-tool-challenge.ipynb
. -
Run the cells to download and import the dataset
hmda_2017_ny_all-records_labels
.
Use TensorFlow to build two models: one trained on the complete dataset, and one trained on the limited dataset. You should use the model Sequential()
. Documentation is here.
IMPORTANT: To accurately check your progress, the first model should be saved in the location saved_complete_model/saved_model.pb and the second in saved_limited_model/saved_model.pb.
-
Navigate to the Create and train your TensorFlow models section in notebook.
-
Train the first model with following code:
# Train the first model on the complete dataset. model = Sequential() model.add(layers.Dense(8, input_dim=input_size)) model.add(layers.Dense(1, activation='sigmoid')) model.compile(optimizer='sgd', loss='mse') model.fit(train_data, train_labels, batch_size=32, epochs=10)
-
Train the second model with following code:
# Train your second model on the limited dataset. limited_model = Sequential() limited_model.add(layers.Dense(8, input_dim=input_size)) limited_model.add(layers.Dense(1, activation='sigmoid')) limited_model.compile(optimizer='sgd', loss='mse') limited_model.fit(limited_train_data, limited_train_labels, batch_size=32, epochs=10)
Now, you'll deploy your models to the AI Platform.
Hint: You need to first create a storage bucket to store your models in.
- Create the Complete AI Platform model
- Model Name =
complete_model
- Version Name = v1
- Python version = 3.7
- Framework = TensorFlow
- Framework version = 2.3.1
- ML Runtime version = 2.3
- Model Name =
- Create the Limited AI Platform model
- Model Name =
limited_model
- Version Name = v1
- Python version = 3.7
- Framework = TensorFlow
- Framework version = 2.3.1
- ML Runtime version = 2.3
- Model Name =
Before deploy our modesl to the AI Platform, we have to create a Cloud Storage bucket for storing the models.
- Click Navigation Menu > Cloud Stroage > Browser.
- Click CREATE BUCKET and fill up the fields.
- Bucket Name: use a globally unique name, e.g.
<Project_ID>
- Bucket Name: use a globally unique name, e.g.
- Click CREATE
After created the bucket, switch back to the JupyterLab page and deploy models. Run the cells after modifing following part:
-
Define the environment variables in Deploy your models to the AI Platform section.
GCP_PROJECT = '<PROJECT_ID>' MODEL_BUCKET = 'gs://<BUCKET_NAME>' # recommend: as the same as the <PROJECT_ID> MODEL_NAME = 'complete_model' # do not modify LIM_MODEL_NAME = 'limited_model' # do not modify VERSION_NAME = 'v1' REGION = 'us-central1'
-
Add cells in Create your first AI Platform model:
saved_complete_model
section.!gcloud ai-platform models create $MODEL_NAME --regions $REGION !gcloud ai-platform versions create $VERSION_NAME \ --model=$MODEL_NAME \ --framework='TensorFlow' \ --runtime-version=2.3 \ --origin=$MODEL_BUCKET/saved_complete_model \ --staging-bucket=$MODEL_BUCKET \ --python-version=3.7 \ --project=$GCP_PROJECT
-
Add cells in Create your second AI Platform model:
saved_limited_model
section.!gcloud ai-platform models create $LIM_MODEL_NAME --regions $REGION !gcloud ai-platform versions create $VERSION_NAME \ --model=$LIM_MODEL_NAME \ --framework='TensorFlow' \ --runtime-version=2.3 \ --origin=$MODEL_BUCKET/saved_limited_model \ --staging-bucket=$MODEL_BUCKET \ --python-version=3.7 \ --project=$GCP_PROJECT
After the models are deployed to the AI Platform, we can use the following code to explore them in the What-If Tool in the notebook.
config_builder = (WitConfigBuilder(
examples_for_wit[:num_datapoints],feature_names=column_names)
.set_custom_predict_fn(limited_custom_predict)
.set_target_feature('loan_granted')
.set_label_vocab(['denied', 'accepted'])
.set_compare_custom_predict_fn(custom_predict)
.set_model_name('limited')
.set_compare_model_name('complete'))
WitWidget(config_builder, height=800)
-
In the Performance and Fairness tab, slice by sex (
applicant_sex_name_Female
). How does the complete model compare to the limited model for females?The complete model has equal performance accross sexes, whereas the limited model is much worse on females.
-
Click on one of the datapoints in the middle of the arc. In the datapoint editor, change
(applicant_sex_name_Female)
to0
, and(applicant_sex_name_Male)
to1
. Now run the inference again. How does the model change?The limited model has a significantly larger delta than the complete model, whereas the complete model has almost no change.
-
In the Performance and Fairness tab, use the fairness buttons to see the thresholds for the sexes for demographic parity between males and females. How does this change the thresholds for the limited model?
The thresholds have to be wildly different for the limited model.