Failing TorchBench Models: tracking issue #5932

## Summary of Contributions (9th Feb)

1) **Increased the number of TorchBench models that work with Dynamo as a tracer:** The pass rates are now comparable to those of `torch.compile` with Inductor. Some of the fixes also improved the tracer that PyTorch/XLA used previously.

    |            | Inference | Training |
    |------------|-----------|----------|
    | Inductor   | 87        | 63       |
    | Dynamo     | 60 to 82  | 41 to 53 |
    | Non-Dynamo | 79 to 82  | 54 to 56 |


2) **Improved the benchmarking tools used by Google:** Google's initial benchmark runs of these models disagreed with the reported results on about 15 models. We identified and fixed more than 10 issues, which helped reconcile Google's benchmarks with the reported results and, in turn, with the PyTorch HUD.

## Current State

This post has two lists:
- Failing inference models
- Failing training models

Each list shows the models that fail in two configurations:
- Tracing without Dynamo (eager mode)
- Tracing with Dynamo into `openxla` (Dynamo+`openxla`)

These lists were created with the benchmarking scripts that currently live upstream, using the following command:

```bash
python xla/benchmarks/experiment_runner.py \
       --suite-name torchbench \
       --accelerator cuda \
       --xla PJRT --xla None \
       --dynamo openxla --dynamo inductor --dynamo None \
       --test eval --test train \
       --repeat 30 --iterations-per-run 5 \
       --print-subprocess \
       --no-resume
```
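
The repeated `--xla`, `--dynamo`, and `--test` flags in the command above describe a grid of configurations. The sketch below only enumerates that grid and labels the combinations this issue reports on; it does not call the runner. The mapping from flag values to the names "Non-Dynamo", "Dynamo+openxla", and "Inductor" is an assumption inferred from the section headings, and whether the runner itself skips the remaining combinations is not stated here.

```python
from itertools import product

# Values copied from the command above; None means the option is disabled.
XLA = ["PJRT", None]
DYNAMO = ["openxla", "inductor", None]
TEST = ["eval", "train"]


def label(xla, dynamo):
    """Map an (xla, dynamo) pair to the names used in this issue (assumed mapping)."""
    if xla == "PJRT" and dynamo is None:
        return "Non-Dynamo"
    if xla == "PJRT" and dynamo == "openxla":
        return "Dynamo+openxla"
    if xla is None and dynamo == "inductor":
        return "Inductor"
    return None  # combination not reported in this issue


all_combos = list(product(XLA, DYNAMO, TEST))
configs = [(label(x, d), t) for x, d, t in all_combos if label(x, d) is not None]

for name, test in configs:
    print(f"{name:15s} {test}")
```

Of the 12 raw combinations, 6 carry a label: three configurations, each for `eval` (inference) and `train` (training).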

## Environment

- **GPU:** A100 40GB

## Inference
### Non-Dynamo. Pass rate: 78/81 - 96% (against inductor)

- ~[x] DALLE2_pytorch~
    - Issue: #6010 
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
    - Moved to canary models: https://github.com/pytorch/benchmark/pull/2311
- [ ] cm3leon_generate
    - Issue: #6004
- [x] hf_Longformer
    - Issue: #5835
        - PyTorch/XLA PR: #6624 
- [ ] hf_T5_generate
    - Issue: #6004
- [ ] moco
    - Issue: #6083 
    - Issue: #6011
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
    - Issue: #6010
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
    - Issue: #7636
    - Issue: #7647
- [x] nvidia_deeprecommender
    - Issue: #6006 
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
- [x] pytorch_CycleGAN_and_pix2pix
    - Issue: #6007
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
- ~[ ] simple_gpt~
    - RTX 2060 doesn't support BF16
    - Issue: #6011
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
    - Issue: #6010
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
    - SKIP (only works with multiprocess enabled -- _torchbench.yaml_)
- ~[ ] simple_gpt_tp_manual~
    - RTX 2060 doesn't support BF16
    - Issue: #6011
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
    - Issue: #6010
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
    - SKIP (inside skip list -- _torchbench.yaml_)
- ~[ ] tacotron2~
    - Issue: #6112 
    - Issue: #6010
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
    - SKIP (inside skip list -- _torchbench.yaml_)
- [x] timm_efficientdet
    - Issue: #6011
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
    - Issue: #6010
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
- [x] vision_maskrcnn
    - PyTorch/XLA PR: #5743
    - PyTorch PR: https://github.com/pytorch/pytorch/pull/112202
    - Issue: #6557
        - PyTorch/XLA PR: #7113 

### Dynamo+`openxla`. Pass rate: 78/81 - 96% (against inductor)

- ~[x] DALLE2_pytorch~
    - Issue: #6010 
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
    - Moved to canary models: https://github.com/pytorch/benchmark/pull/2311
- [x] Super_SloMo
    - PyTorch/XLA PR: #5707
    - PyTorch/benchmark PR: https://github.com/pytorch/benchmark/pull/2038
- [ ] cm3leon_generate
    - Issue: #5967
- [x] detectron2_fasterrcnn_r_101_c4
    - Issue: #5966
        - PyTorch/XLA PR: #6170
- [x] detectron2_fasterrcnn_r_101_dc5
    - Issue: #5966
        - PyTorch/XLA PR: #6170
- [x] detectron2_fasterrcnn_r_101_fpn
    - Issue: #5966
        - PyTorch/XLA PR: #6170
- [x] detectron2_fasterrcnn_r_50_c4
    - Issue: #5966
        - PyTorch/XLA PR: #6170
- [x] detectron2_fasterrcnn_r_50_dc5
    - Issue: #5966
        - PyTorch/XLA PR: #6170
- [x] detectron2_fasterrcnn_r_50_fpn
    - Issue: #5966
        - PyTorch/XLA PR: #6170
- [x] detectron2_fcos_r_50_fpn
    - Issue: #5966
        - PyTorch/XLA PR: #6170
- [x] detectron2_maskrcnn
    - Issue: #5966
        - PyTorch/XLA PR: #6170
- [x] detectron2_maskrcnn_r_101_c4
    - Issue: #5966
        - PyTorch/XLA PR: #6170
- [x] detectron2_maskrcnn_r_101_fpn
    - Issue: #5966
        - PyTorch/XLA PR: #6170
- [x] detectron2_maskrcnn_r_50_c4
    - Issue: #5966
        - PyTorch/XLA PR: #6170
- [x] detectron2_maskrcnn_r_50_fpn
    - Issue: #5966
        - PyTorch/XLA PR: #6170
- [x] dlrm
    - PyTorch/XLA PR: #5743
    - PyTorch PR: https://github.com/pytorch/pytorch/pull/112202
- [x] hf_BigBird
    - Issue: #5966
        - PyTorch/XLA PR: #6170
- [x] hf_GPT2
    - PyTorch/XLA PR: #5922
- [x] hf_GPT2_large
    - PyTorch/XLA PR: #5922
- [x] hf_Longformer
    - Issue: #5835
        - PyTorch/XLA PR: #6624 
- [x] hf_Reformer
    - Issue: #5837
        - PyTorch PR: https://github.com/pytorch/pytorch/pull/121007
- [ ] hf_T5_generate
    - Issue: #5967
- [ ] moco
    - Issue: #6083 
    - Issue: #6011
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
    - Issue: #6010
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
    - Issue: #7636
    - Issue: #7647
- [x] nvidia_deeprecommender
    - Issue: #6006 
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
- [x] pyhpc_isoneutral_mixing
    - PyTorch/XLA PR: #5743
    - PyTorch PR: https://github.com/pytorch/pytorch/pull/112202
- [x] pyhpc_turbulent_kinetic_energy
    - PyTorch/XLA PR: #5743
    - PyTorch PR: https://github.com/pytorch/pytorch/pull/112202
- [x] pytorch_CycleGAN_and_pix2pix
    - Issue: #6007
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
- [x] speech_transformer
    - PyTorch/XLA PR: #5823
- [x] timm_efficientdet
    - Issue: #6011
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
    - Issue: #6010
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071

### Models also Failing on Inductor

#### Inference Failing on Inductor CUDA with the Same Error

Benchmarks that raise the same error on inductor:

- [ ] hf_clip
    - 'str' object has no attribute 'shape'
- [ ] mobilenet_v2_quantized_qat
- [ ] resnet50_quantized_qat

#### Inference Failing on Inductor CUDA with Different Errors

- [ ] simple_gpt
    - RTX 2060 doesn't support BF16
    - Issue: #6011
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
    - Issue: #6010
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
    - SKIP (only works with multiprocess enabled -- _torchbench.yaml_)
- [ ] simple_gpt_tp_manual
    - RTX 2060 doesn't support BF16
    - Issue: #6011
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
    - Issue: #6010
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
    - SKIP (inside skip list -- _torchbench.yaml_)
- [ ] tacotron2
    - Issue: #6005
    - Issue: #6010
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
    - SKIP (inside skip list -- _torchbench.yaml_)
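
The checklists above can be tallied mechanically. The sketch below assumes a reading that this issue does not state explicitly: a top-level item that is unchecked (`[ ]`) and not struck through (`~...~`) is still failing, while checked or struck-through items are fixed or excluded. Under that reading, the Non-Dynamo and Dynamo+`openxla` inference lists each have three open items, which agrees with the 78/81 in their headings. The `SAMPLE` text is a shortened excerpt, and `still_failing` is a hypothetical helper name.

```python
import re

# Top-level checklist item: "- [ ] name", "- [x] name", or struck through "- ~[ ] name~".
ITEM = re.compile(r"^- (~?)\[( |x)\] (\S+?)~?\s*$")


def still_failing(markdown: str) -> list[str]:
    """Return names of top-level items that are unchecked and not struck through."""
    names = []
    for line in markdown.splitlines():
        m = ITEM.match(line)
        if m and m.group(1) == "" and m.group(2) == " ":
            names.append(m.group(3))
    return names


SAMPLE = """\
- ~[x] DALLE2_pytorch~
    - Issue: #6010
- [ ] cm3leon_generate
    - Issue: #6004
- [x] hf_Longformer
- [ ] hf_T5_generate
- [ ] moco
- ~[ ] tacotron2~
    - SKIP (inside skip list -- _torchbench.yaml_)
"""

print(still_failing(SAMPLE))
```

Nested lines (issues, PRs, notes) are indented, so the `^- ` anchor skips them and only model names are counted.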

## Training
### Non-Dynamo. Pass rate: 64/66 - 96% (against inductor)

- ~[ ] DALLE2_pytorch~
    - Issue: #6084 
    - Issue: #6010 
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
    - Moved to canary models: https://github.com/pytorch/benchmark/pull/2311
- [x] demucs
    - Issue: #6003
- [x] densenet121
    - Issue: #6003
- [x] detectron2_fasterrcnn_r_101_c4
    - Issue: #6004
- [x] detectron2_fasterrcnn_r_101_dc5
    - Issue: #6004
- [x] detectron2_fasterrcnn_r_101_fpn
    - Issue: #6004
- [x] detectron2_fasterrcnn_r_50_c4
    - Issue: #6004
- [x] detectron2_fasterrcnn_r_50_dc5
    - Issue: #6004
- [x] detectron2_fasterrcnn_r_50_fpn
    - Issue: #6004
- [ ] detectron2_fcos_r_50_fpn
    - Skipped by the benchmarking script
- [x] detectron2_maskrcnn_r_101_c4
    - Issue: #6004
- [x] detectron2_maskrcnn_r_101_fpn
    - Issue: #6004
- [x] detectron2_maskrcnn_r_50_c4
    - Issue: #6004
- [x] detectron2_maskrcnn_r_50_fpn
    - Issue: #6004
- [x] dlrm
    - Issue: #6008 
        - PyTorch/XLA PR: #7584
- [x] hf_GPT2_large
    - Issue: #6003
- [x] hf_Longformer
    - Issue: #5835
        - PyTorch/XLA PR: #6624 
- [x] hf_T5_base
    - Issue: #6003
- ~[ ] llama_v2_7b_16h~
    - Issue: #6003
    - SKIP (training not supported -- _torchbench.yaml_)
- [ ] moco
    - Issue: #6083 
    - Issue: #6011
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
    - Issue: #6010
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
    - Issue: #7636
    - Issue: #7647
- [x] nvidia_deeprecommender
    - RTX 2060 OOM
    - Issue: #6006 
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
- [x] pytorch_CycleGAN_and_pix2pix
    - Issue: #6007
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
- [x] stable_diffusion_unet
    - Issue: #6003
- ~[ ] tacotron2~
    - Issue: #6112 
    - Issue: #6010
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
    - SKIP (inside skip list -- _torchbench.yaml_)
- [x] timm_efficientdet
    - Issue: #6011
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
    - Issue: #6010
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
- [x] timm_nfnet
    - Issue: #6003
- [x] timm_vision_transformer_large
    - Issue: #6003
- [x] yolov3
    - Issue: #6010
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071

### Dynamo+`openxla`. Pass rate: 55/66 - 83% (against inductor)

- [ ] demucs
    - Issue: #6003
- [x] densenet121
    - Issue: #6003
- [x] dlrm
    - Issue: #6008 
        - PyTorch/XLA PR: #7584
- [x] hf_BigBird
    - Issue: #5966
        - PyTorch/XLA PR: #6170
- [x] hf_GPT2
    - PyTorch/XLA PR: #5922
- [x] hf_GPT2_large
    - PyTorch/XLA PR: #5922
- [x] hf_Longformer
    - Issue: #5835
        - PyTorch/XLA PR: #6624 
- [x] hf_Reformer
    - Issue: #6009
        - PyTorch PR: https://github.com/pytorch/pytorch/pull/121007
- [ ] moco
    - Issue: #6083 
    - Issue: #6011
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
    - Issue: #6010
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
    - Issue: #7636
    - Issue: #7647
- [x] nvidia_deeprecommender
    - Issue: #6084 
    - Issue: #6006 
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
- [x] pytorch_CycleGAN_and_pix2pix
    - Issue: #6007
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
- [ ] stable_diffusion_unet
    - Issue: #6003
- [x] timm_efficientdet
    - Issue: #6003 
    - Issue: #6011
        - PyTorch/XLA PR: #6296
        - PyTorch/XLA PR: #6076
    - Issue: #6010
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
- [x] timm_vision_transformer
    - Issue: #6003
- [ ] timm_vision_transformer_large
    - Issue: #6003
- [x] torch_multimodal_clip
    - Issue: #6005 
- [x] yolov3
    - Issue: #6010
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071

### Models also Failing on Inductor

#### No Training Support on Inductor CUDA

Benchmarks that raise the error: `Model's DEFAULT_TRAIN_BSIZE is not implemented`.

- [ ] cm3leon_generate
- [ ] detectron2_fcos_r_50_fpn
- [ ] doctr_det_predictor
- [ ] doctr_reco_predictor
- [ ] hf_T5_generate
- [ ] llama
- [ ] phi_1_5
- [ ] pyhpc_equation_of_state
- [ ] pyhpc_isoneutral_mixing
- [ ] pyhpc_turbulent_kinetic_energy
- [ ] sam
- [ ] simple_gpt
- [ ] simple_gpt_tp_manual

#### Training Failing on Inductor CUDA with the Same Error

Benchmarks that raise the same error on inductor:

- [ ] DALLE2_pytorch
    - Issue: #6084 
    - Issue: #6010 
        - PyTorch/XLA PR: #6060 
        - PyTorch/XLA PR: #6071
    - Moved to canary models: https://github.com/pytorch/benchmark/pull/2311
- [ ] llama_v2_7b_16h
    - Issue: #6003
    - SKIP (training not supported -- _torchbench.yaml_)
- [ ] maml
    - Issue: #6084 
    - SKIP (training not supported -- _torchbench.yaml_)
- [ ] vision_maskrcnn
    - targets should not be none when in training mode
    - Fix https://github.com/pytorch/pytorch/pull/114774

#### Training Failing on Inductor CUDA with Different Errors
- [x] detectron2_fasterrcnn_r_101_c4
    - Issue: #5966 
        - PyTorch/XLA PR: #6170
- [x] detectron2_fasterrcnn_r_101_dc5
    - Issue: #5966 
        - PyTorch/XLA PR: #6170
- [x] detectron2_fasterrcnn_r_101_fpn
    - Issue: #5966 
        - PyTorch/XLA PR: #6170
- [x] detectron2_fasterrcnn_r_50_c4
    - Issue: #5966 
        - PyTorch/XLA PR: #6170
- [x] detectron2_fasterrcnn_r_50_dc5
    - Issue: #5966 
        - PyTorch/XLA PR: #6170
- [x] detectron2_fasterrcnn_r_50_fpn
    - Issue: #5966 
        - PyTorch/XLA PR: #6170
- [x] detectron2_maskrcnn
    - Issue: #5966 
        - PyTorch/XLA PR: #6170
- [x] detectron2_maskrcnn_r_101_c4
    - Issue: #5966 
        - PyTorch/XLA PR: #6170
- [x] detectron2_maskrcnn_r_101_fpn
    - Issue: #5966 
        - PyTorch/XLA PR: #6170
- [x] detectron2_maskrcnn_r_50_c4
    - Issue: #5966 
        - PyTorch/XLA PR: #6170
- [x] detectron2_maskrcnn_r_50_fpn
    - Issue: #5966 
        - PyTorch/XLA PR: #6170
- [ ] opacus_cifar10
    - Issue: #5967
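
The percentages in the section headings can be reproduced from the pass counts. This is a minimal sketch that assumes the percentages are truncated rather than rounded; that assumption is inferred from the numbers (64/66 is about 96.97%, which rounds to 97% but is reported as 96%) and is not stated in this issue. `pass_rate` is a hypothetical helper name.

```python
def pass_rate(passing: int, total: int) -> int:
    """Percentage of passing models, truncated to a whole number."""
    return (100 * passing) // total


# (configuration, passing, total) as quoted in the section headings.
HEADINGS = [
    ("Inference, Non-Dynamo", 78, 81),
    ("Inference, Dynamo+openxla", 78, 81),
    ("Training, Non-Dynamo", 64, 66),
    ("Training, Dynamo+openxla", 55, 66),
]

for name, passing, total in HEADINGS:
    print(f"{name}: {passing}/{total} - {pass_rate(passing, total)}%")
```

The denominators (81 and 66) are the counts used "against inductor" in the headings; how they were derived is not covered by this sketch.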

cc @JackCaoG @miladm
