
Regression build failing with modern CUDA #24

Open
tgrogers opened this issue Jan 23, 2025 · 0 comments

tgrogers commented Jan 23, 2025

Now that regressions are set up, we need to get them to pass with the workloads of importance.
I have prioritized the following workloads to start:

all: GPU_Microbenchmark microbench rodinia_2.0-ft mnist_cudnn cutlass mlperf_inference mlperf_training rodinia-3.1 pannotia proxy-apps heterosync ispass-2009 lonestargpu-2.0 polybench custom_apps

Right now, I think all of the following are failing:

mnist_cudnn cutlass mlperf_inference mlperf_training
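
For reference, a minimal sketch of building just those four in isolation. The target names come from the `all:` rule above; the setup-script path and Makefile location are assumptions from the repo layout, not verified against the CI workflow linked below, which remains the authoritative set of commands:

```bash
# Hedged sketch: assumes src/setup_environment is the environment script and
# that the top-level Makefile in src/ exposes each workload as its own target.
source ./src/setup_environment
make -C ./src mnist_cudnn cutlass mlperf_inference mlperf_training
```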

Docker makes it much easier to reproduce the errors; what the regression system does is listed here:

https://github.com/accel-sim/gpu-app-collection/blob/dev/.github/workflows/test-build.yml

It starts with the Docker image

nvidia/cuda:12.6.3-cudnn-devel-ubuntu24.04

You can test this on any machine that has Docker installed. You don't even need a GPU.
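
A minimal reproduction sketch along those lines. The package list and clone/build steps here are my assumptions; the test-build.yml linked above is what CI actually runs:

```bash
# Pull the same image the regression workflow starts from.
docker pull nvidia/cuda:12.6.3-cudnn-devel-ubuntu24.04

# No GPU is needed to reproduce the build failures, so --gpus is omitted.
docker run --rm -it nvidia/cuda:12.6.3-cudnn-devel-ubuntu24.04 bash

# Inside the container (the package list is a guess at the build prerequisites):
apt-get update && apt-get install -y git make g++ python3
git clone https://github.com/accel-sim/gpu-app-collection.git
cd gpu-app-collection
# ...then run the build steps sketched earlier, or copy the exact commands
# from .github/workflows/test-build.yml.
```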
