
Failing Torchbench Models: tracking issue #5932

@ysiraichi

Summary of Contributions (9th Feb)

  1. Increase the number of TorchBench models that work with Dynamo as a tracer: the passing rates are now comparable to those of torch.compile using Inductor. Some of the fixes also improved the previous (non-Dynamo) tracer that PyTorch/XLA used.

    Mode          Inference    Training
    Inductor      87           63
    Dynamo        60 to 82     41 to 53
    Non-Dynamo    79 to 82     54 to 56
  2. Improve the benchmarking tools used by Google: the initial Google runs benchmarking these models showed a discrepancy of about 15 models against the results reported here. We identified and fixed 10+ issues, which helped reconcile Google's benchmarks with the ones reported here and, in turn, with the PyTorch HUD.

Current State

This post has two lists:

  • Failing inference models
  • Failing training models

Each list breaks down the failing models by tracing mode (see the sketch after the command below):

  • Tracing without Dynamo (Eager-mode)
  • Tracing with Dynamo into openxla (Dynamo+openxla)

These lists were created using the benchmarking scripts that currently live in the upstream PyTorch/XLA repository, by running the following command:

python xla/benchmarks/experiment_runner.py \
       --suite-name torchbench \
       --accelerator cuda \
       --xla PJRT --xla None \
       --dynamo openxla --dynamo inductor --dynamo None \
       --test eval --test train \
       --repeat 30 --iterations-per-run 5 \
       --print-subprocess \
       --no-resume
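
For reference, here is a minimal sketch of how the two tracing modes listed above are typically exercised; the toy model and tensor shapes are assumptions for illustration only, and the actual runs go through experiment_runner.py as shown above.

import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()
model = torch.nn.Linear(128, 10).to(device)
x = torch.randn(32, 128, device=device)

# Non-Dynamo (Eager-mode): torch_xla traces operations lazily and the
# accumulated graph is compiled and executed when mark_step() is called.
out = model(x)
xm.mark_step()

# Dynamo+openxla: TorchDynamo captures the FX graph and hands it to the
# openxla backend registered by torch_xla.
compiled = torch.compile(model, backend="openxla")
out = compiled(x)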

Environment

  • GPU: A100 40GB

Inference

Non-Dynamo. Pass rate: 78/81 - 96% (against inductor)

Dynamo+openxla. Pass rate: 78/81 - 96% (against inductor)

Models also Failing on Inductor

Inference Failing on Inductor CUDA with the Same Error

Benchmarks that raise the same error on inductor:

  • hf_clip
    • 'str' object has no attribute 'shape'
  • mobilenet_v2_quantized_qat
  • resnet50_quantized_qat

Inference Failing on Inductor CUDA with Different Errors

Training

Non-Dynamo. Pass rate: 64/66 - 97% (against inductor)

Dynamo+openxla. Pass rate: 55/66 - 83% (against inductor)

Models also Failing on Inductor

No Training Support on Inductor CUDA

Benchmarks that raise the error "Model's DEFAULT_TRAIN_BSIZE is not implemented.", i.e. models that do not define a default training batch size in TorchBench and therefore do not support the training test (see the sketch after this list):

  • cm3leon_generate
  • detectron2_fcos_r_50_fpn
  • doctr_det_predictor
  • doctr_reco_predictor
  • hf_T5_generate
  • llama
  • phi_1_5
  • pyhpc_equation_of_state
  • pyhpc_isoneutral_mixing
  • pyhpc_turbulent_kinetic_energy
  • sam
  • simple_gpt
  • simple_gpt_tp_manual
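
This error comes from TorchBench itself: the models above do not declare a default training batch size, so the training test cannot be constructed for them. Below is a rough sketch of the mechanism, assuming the TorchBench convention that each model class exposes a DEFAULT_TRAIN_BSIZE attribute (illustrative only, not the actual torchbenchmark source).

class BenchmarkModel:
    # Models that support training override this class attribute.
    DEFAULT_TRAIN_BSIZE = None

    def __init__(self, test, batch_size=None):
        if test == "train" and batch_size is None:
            if self.DEFAULT_TRAIN_BSIZE is None:
                # Produces the message reported for the models listed above.
                raise NotImplementedError(
                    "Model's DEFAULT_TRAIN_BSIZE is not implemented.")
            batch_size = self.DEFAULT_TRAIN_BSIZE
        self.batch_size = batch_size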

Training Failing on Inductor CUDA with the Same Error

Benchmarks that raise the same error on inductor:

Training Failing on Inductor CUDA with Different Errors

cc @JackCaoG @miladm
