
Wide and Deep Large Dataset training

This document has instructions for training Wide and Deep with a large dataset using Intel-optimized TensorFlow.

Dataset

The Large Kaggle Display Advertising Challenge Dataset will be used for training Wide and Deep. The data is from Criteo and has a field indicating if an ad was clicked (1) or not (0), along with integer and categorical features.

Download the Large Kaggle Display Advertising Challenge Dataset from Criteo Labs to $DATASET_DIR. If the evaluation/training datasets are not available at the above link, they can be downloaded as follows:

 export DATASET_DIR=<location where dataset files will be saved>
 mkdir $DATASET_DIR && cd $DATASET_DIR
 wget https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version/eval.csv
 wget https://storage.googleapis.com/dataset-uploader/criteo-kaggle/large_version/train.csv

The DATASET_DIR environment variable will be used as the dataset directory when running quickstart scripts.
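Before training, it can help to confirm the download succeeded. Below is a minimal sketch (the `check_criteo_files` helper is hypothetical, not part of the quickstart scripts) that verifies both CSV files exist and are non-empty:

```shell
# Hypothetical helper (not part of the Model Zoo): verify the Criteo CSVs
# exist and are non-empty before starting training.
check_criteo_files() {
  dir="$1"
  for f in train.csv eval.csv; do
    if [ ! -s "${dir}/${f}" ]; then
      echo "missing or empty: ${f}"
      return 1
    fi
  done
  echo "dataset files look OK"
}

# Only run the check if DATASET_DIR has been set.
if [ -n "${DATASET_DIR:-}" ]; then
  check_criteo_files "$DATASET_DIR"
fi
```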

Quick Start Scripts

| Script name | Description |
| ----------- | ----------- |
| training_check_accuracy.sh | Trains the model for a specified number of steps (default is 500) and then compares the accuracy against the accuracy specified in the TARGET_ACCURACY env var (for example, export TARGET_ACCURACY=0.75). If the accuracy is not met, the script exits with error code 1. The CHECKPOINT_DIR environment variable can optionally be defined to resume training from a previous set of checkpoints. |
| training.sh | Trains the model for 10 epochs. The CHECKPOINT_DIR environment variable can optionally be defined to resume training from a previous set of checkpoints. |
| training_demo.sh | A short demo run that trains the model for 100 steps. |
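For example, the accuracy-check script is configured through environment variables. The values below are illustrative, and the `if` guard at the end is only there so the snippet is safe to source outside a Model Zoo checkout:

```shell
# Illustrative values; both variables are read by training_check_accuracy.sh.
export TARGET_ACCURACY=0.75   # script exits with code 1 if accuracy is below this
export STEPS=500              # optional; 500 is the default number of steps

SCRIPT=./quickstart/recommendation/tensorflow/wide_deep_large_ds/training/cpu/training_check_accuracy.sh
if [ -x "$SCRIPT" ]; then
  "$SCRIPT"
fi
```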

Run the model

Set up your environment using the instructions below, depending on whether you are using AI Kit:


To run using AI Kit you will need:

  • numactl
  • wget
  • Activate the `tensorflow` conda environment
    conda activate tensorflow

To run without AI Kit you will need:

  • Python 3
  • intel-tensorflow>=2.5.0
  • numactl
  • git
  • wget
  • A clone of the Model Zoo repo
    git clone https://github.com/IntelAI/models.git
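A quick way to confirm the prerequisites above are present (this check is a sketch, not part of the official setup; the pip install hint is an assumption based on the package list above):

```shell
# Check Python 3 and TensorFlow; report any missing command-line tools.
python3 --version
python3 -c "import tensorflow as tf; print(tf.__version__)" 2>/dev/null \
  || echo "TensorFlow not found; try: pip install 'intel-tensorflow>=2.5.0'"
for tool in numactl git wget; do
  command -v "$tool" >/dev/null || echo "missing: $tool"
done
```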

After the setup is complete, set environment variables for the path to your dataset directory and an output directory where logs will be written. You can optionally provide a directory where checkpoint files will be read and written. Navigate to your Model Zoo directory, then select a quickstart script to run. Note that some quickstart scripts use other environment variables in addition to the ones below, such as STEPS and TARGET_ACCURACY for the training_check_accuracy.sh script.

# cd to your model zoo directory
cd models

export DATASET_DIR=<path to the dataset directory>
export PRECISION=fp32
export OUTPUT_DIR=<path to the directory where the logs and the saved model will be written>
export CHECKPOINT_DIR=<Optional directory where checkpoint files will be read and written>
# For a custom batch size, set env var `BATCH_SIZE` or it will run with a default value.
export BATCH_SIZE=<customized batch size value>

./quickstart/recommendation/tensorflow/wide_deep_large_ds/training/cpu/<script name>.sh
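Putting it together, a full invocation might look like the following. The paths and batch size are placeholders you must adjust, and the `if` guard at the end only keeps the snippet safe to run outside a Model Zoo checkout:

```shell
export DATASET_DIR=$HOME/criteo_large      # placeholder path
export PRECISION=fp32
export OUTPUT_DIR=$HOME/wide_deep_output   # placeholder path
export BATCH_SIZE=512                      # optional; omit to use the default

SCRIPT=./quickstart/recommendation/tensorflow/wide_deep_large_ds/training/cpu/training_demo.sh
if [ -x "$SCRIPT" ]; then
  "$SCRIPT"
fi
```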

Additional Resources