This document has instructions for training Wide and Deep on a large dataset with Intel-optimized TensorFlow.

The Large Kaggle Display Advertising Challenge Dataset will be used for training Wide and Deep. The data is from Criteo and has a field indicating whether an ad was clicked (1) or not (0), along with integer and categorical features.
Download the Large Kaggle Display Advertising Challenge Dataset from Criteo Labs to `$DATASET_DIR`.

If the evaluation/train datasets are not available at the above link, they can be downloaded as follows:
```
export DATASET_DIR=<location where dataset files will be saved>
mkdir $DATASET_DIR && cd $DATASET_DIR
wget https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version/eval.csv
wget https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version/train.csv
```
The `DATASET_DIR` environment variable will be used as the dataset directory when running quickstart scripts.
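Before launching a quickstart script, it can be useful to confirm the download completed. The helper below is a hypothetical sketch (it is not part of the model zoo) that checks that both CSV files exist and are non-empty in the dataset directory:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the model zoo): verify that the two
# Criteo CSV files exist and are non-empty before running a quickstart
# script. Returns non-zero if anything is missing.
check_criteo_dataset() {
  local dir=$1
  for f in train.csv eval.csv; do
    if [ ! -s "$dir/$f" ]; then
      echo "Missing or empty: $dir/$f" >&2
      return 1
    fi
  done
  echo "Dataset looks complete: $dir"
}
```

A run could then be gated with `check_criteo_dataset "$DATASET_DIR" || exit 1`.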
| Script name | Description |
|---|---|
| `training_check_accuracy.sh` | Trains the model for a specified number of steps (default is 500) and then compares the accuracy against the accuracy specified in the `TARGET_ACCURACY` env var (for example, `export TARGET_ACCURACY=0.75`). If the accuracy is not met, the script exits with error code 1. The `CHECKPOINT_DIR` environment variable can optionally be defined to start training from a previous set of checkpoints. |
| `training.sh` | Trains the model for 10 epochs. The `CHECKPOINT_DIR` environment variable can optionally be defined to start training from a previous set of checkpoints. |
| `training_demo.sh` | A short demo run that trains the model for 100 steps. |
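The accuracy gate in `training_check_accuracy.sh` can be illustrated with a small sketch. This is not the actual quickstart script; the measured accuracy here is a stand-in for the value a real training run would report. `awk` does the floating-point comparison, since bash arithmetic only handles integers:

```shell
#!/usr/bin/env bash
# Illustrative sketch of the TARGET_ACCURACY check (not the actual
# quickstart script). Returns 0 when the measured accuracy meets the
# target, non-zero otherwise; awk performs the float comparison.
meets_target() {
  local measured=$1 target=$2
  awk -v a="$measured" -v t="$target" 'BEGIN { exit !(a >= t) }'
}
```

Used as `meets_target "$ACCURACY" "$TARGET_ACCURACY" || exit 1`, this mirrors the script's behavior of exiting with error code 1 when the target is not met.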
Setup your environment using the instructions below, depending on whether you are using AI Kit:

| Setup using AI Kit | Setup without AI Kit |
|---|---|
| To run using AI Kit you will need: | To run without AI Kit you will need: |
After the setup is complete, set environment variables for the path to your
dataset directory and an output directory where logs will be written. You can
optionally provide a directory where checkpoint files will be read and
written. Navigate to your model zoo directory, then select a quickstart
script to run. Note that some quickstart scripts might use other environment
variables in addition to the ones below, like `STEPS` and `TARGET_ACCURACY`
for the `training_check_accuracy.sh` script.
```
# cd to your model zoo directory
cd models

export DATASET_DIR=<path to the dataset directory>
export PRECISION=fp32
export OUTPUT_DIR=<path to the directory where the logs and the saved model will be written>
export CHECKPOINT_DIR=<Optional directory where checkpoint files will be read and written>

# For a custom batch size, set env var `BATCH_SIZE` or it will run with a default value.
export BATCH_SIZE=<customized batch size value>

./quickstart/recommendation/tensorflow/wide_deep_large_ds/training/cpu/<script name>.sh
```
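One way to use the optional `CHECKPOINT_DIR` variable is to resume from checkpoints a previous run produced. The wrapper below is a hypothetical sketch (not from the model zoo), and the `$OUTPUT_DIR/checkpoints` location is an assumption for illustration only:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of the model zoo): if a previous run left
# checkpoints under $OUTPUT_DIR/checkpoints (an assumed location), export
# CHECKPOINT_DIR so the next run resumes; otherwise train from scratch.
OUTPUT_DIR=${OUTPUT_DIR:-$HOME/wide_deep_output}
if [ -d "$OUTPUT_DIR/checkpoints" ]; then
  export CHECKPOINT_DIR="$OUTPUT_DIR/checkpoints"
  echo "Resuming from $CHECKPOINT_DIR"
else
  echo "No previous checkpoints found; training from scratch"
fi
```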
- To run more advanced use cases, see the instructions for the available precision (FP32) for calling the `launch_benchmark.py` script directly.
- To run the model using Docker, please see the Intel® Developer Catalog workload container:
  https://software.intel.com/content/www/us/en/develop/articles/containers/wide-deep-large-dataset-fp32-training-tensorflow-container.html