Criteo provides the largest publicly available dataset for recommender systems: 1 TB of uncompressed click logs containing 4 billion examples.
We demonstrate how to scale NVTabular, as well as how to:
- Use multiple GPUs and nodes with NVTabular for feature engineering (see the NVTabular sketch below).
- Train recommender system models with Merlin Models for TensorFlow (see the Merlin Models sketch below).
- Train recommender system models with HugeCTR using multiple GPUs.
- Run inference with the Triton Inference Server and Merlin Models for TensorFlow.
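
The multi-GPU feature engineering notebook builds on Dask-CUDA and `dask.distributed`. The following is a minimal sketch of that pattern, not the notebook itself: the input path, partition size, and output location are assumptions, and it presumes the data has already been converted to Parquet with the Criteo column names (`I1`-`I13` continuous, `C1`-`C26` categorical, `label` target) and that the installed NVTabular release picks up the active Dask client automatically.

```python
# Minimal sketch: multi-GPU feature engineering with NVTabular on a Dask-CUDA cluster.
# Paths, partition size, and output location are illustrative assumptions.
import nvtabular as nvt
from dask.distributed import Client
from dask_cuda import LocalCUDACluster

# One Dask worker per visible GPU; NVTabular executes its graph on the active client.
cluster = LocalCUDACluster()
client = Client(cluster)

# Criteo schema: 13 continuous and 26 categorical features plus the binary label.
cont_names = [f"I{i}" for i in range(1, 14)]
cat_names = [f"C{i}" for i in range(1, 27)]

# Preprocessing graph: impute and normalize continuous columns, encode categoricals.
cont_features = cont_names >> nvt.ops.FillMissing() >> nvt.ops.Normalize()
cat_features = cat_names >> nvt.ops.Categorify()
workflow = nvt.Workflow(cont_features + cat_features + ["label"])

# Partitioned Parquet dataset; partitions are processed in parallel across the GPU workers.
train_ds = nvt.Dataset("/data/criteo/day_*.parquet", engine="parquet", part_size="256MB")
workflow.fit(train_ds)
workflow.transform(train_ds).to_parquet(output_path="/data/criteo/processed/")
```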
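
Training on the processed output with Merlin Models for TensorFlow follows the DLRM pattern from the Merlin Models documentation. The sketch below is a simplified outline: the train/valid Parquet paths, embedding size, and MLP widths are assumptions, and the notebook carries the full, tested configuration.

```python
# Minimal sketch: training a DLRM model with Merlin Models for TensorFlow.
# Dataset paths, embedding size, and MLP widths are illustrative assumptions.
import tensorflow as tf
import merlin.models.tf as mm
from merlin.io import Dataset

# NVTabular writes a schema alongside the processed Parquet files; Dataset picks it up.
train = Dataset("/data/criteo/processed/train/*.parquet")
valid = Dataset("/data/criteo/processed/valid/*.parquet")

model = mm.DLRMModel(
    train.schema,
    embedding_dim=64,
    bottom_block=mm.MLPBlock([128, 64]),
    top_block=mm.MLPBlock([128, 64, 32]),
    prediction_tasks=mm.BinaryClassificationTask("label"),
)

model.compile(optimizer="adam", metrics=[tf.keras.metrics.AUC()])
model.fit(train, validation_data=valid, batch_size=16 * 1024)
model.evaluate(valid, batch_size=16 * 1024)
```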
We recommend using our latest stable Merlin containers for these examples. Each notebook specifies the required container.
Explore the following notebooks:
Training and Deployment with TensorFlow:
- Download and Convert
- Feature Engineering with NVTabular
- Training with TensorFlow
- Deploy the TensorFlow Model with Triton Inference Server
Training with HugeCTR: