Conversation

Member

@kshyatt kshyatt commented Oct 8, 2025

No description provided.

Contributor

@github-actions github-actions bot left a comment

CUDA.jl Benchmarks

Benchmark suite Current: 4812987 Previous: f7deec6 Ratio
latency/precompile 56763592928.5 ns 56924777734.5 ns 1.00
latency/ttfp 8388724229 ns 8417873332.5 ns 1.00
latency/import 4509004428 ns 4531361015 ns 1.00
integration/volumerhs 9628927 ns 9625377 ns 1.00
integration/byval/slices=1 146968 ns 146827 ns 1.00
integration/byval/slices=3 426118 ns 425931 ns 1.00
integration/byval/reference 145043 ns 144949 ns 1.00
integration/byval/slices=2 286546 ns 286317 ns 1.00
integration/cudadevrt 103524.5 ns 103600 ns 1.00
kernel/indexing 14190 ns 14225 ns 1.00
kernel/indexing_checked 14880 ns 15087 ns 0.99
kernel/occupancy 666.8451612903226 ns 679.8954248366013 ns 0.98
kernel/launch 2188.8888888888887 ns 2150.3333333333335 ns 1.02
kernel/rand 15969 ns 14810 ns 1.08
array/reverse/1d 20201.5 ns 20182 ns 1.00
array/reverse/2dL_inplace 66796 ns 66832.5 ns 1.00
array/reverse/1dL 70434.5 ns 70358 ns 1.00
array/reverse/2d 22129 ns 21865 ns 1.01
array/reverse/1d_inplace 9629 ns 11480 ns 0.84
array/reverse/2d_inplace 13283 ns 13272 ns 1.00
array/reverse/2dL 74211 ns 73906 ns 1.00
array/reverse/1dL_inplace 66770.5 ns 66817 ns 1.00
array/copy 21070.5 ns 20949 ns 1.01
array/iteration/findall/int 158872 ns 157295 ns 1.01
array/iteration/findall/bool 140755 ns 139923.5 ns 1.01
array/iteration/findfirst/int 162420 ns 161193 ns 1.01
array/iteration/findfirst/bool 163256 ns 162272 ns 1.01
array/iteration/scalar 72491 ns 73738 ns 0.98
array/iteration/logical 217109 ns 214452.5 ns 1.01
array/iteration/findmin/1d 111541.5 ns 50889.5 ns 2.19
array/iteration/findmin/2d 109281.5 ns 96643 ns 1.13
array/reductions/reduce/Int64/1d 44330 ns 43989 ns 1.01
array/reductions/reduce/Int64/dims=1 45355.5 ns 44879 ns 1.01
array/reductions/reduce/Int64/dims=2 61954 ns 61825 ns 1.00
array/reductions/reduce/Int64/dims=1L 89305 ns 89232 ns 1.00
array/reductions/reduce/Int64/dims=2L 88461.5 ns 88384 ns 1.00
array/reductions/reduce/Float32/1d 37532 ns 37163 ns 1.01
array/reductions/reduce/Float32/dims=1 51358.5 ns 47666 ns 1.08
array/reductions/reduce/Float32/dims=2 60142 ns 59848 ns 1.00
array/reductions/reduce/Float32/dims=1L 52443 ns 52408 ns 1.00
array/reductions/reduce/Float32/dims=2L 72543 ns 72122.5 ns 1.01
array/reductions/mapreduce/Int64/1d 43945.5 ns 43666 ns 1.01
array/reductions/mapreduce/Int64/dims=1 44948 ns 47028 ns 0.96
array/reductions/mapreduce/Int64/dims=2 61902 ns 61661 ns 1.00
array/reductions/mapreduce/Int64/dims=1L 89150 ns 88863 ns 1.00
array/reductions/mapreduce/Int64/dims=2L 88467 ns 88192 ns 1.00
array/reductions/mapreduce/Float32/1d 37797 ns 37065 ns 1.02
array/reductions/mapreduce/Float32/dims=1 52170.5 ns 42446.5 ns 1.23
array/reductions/mapreduce/Float32/dims=2 60091 ns 60229 ns 1.00
array/reductions/mapreduce/Float32/dims=1L 52664 ns 52761 ns 1.00
array/reductions/mapreduce/Float32/dims=2L 72804 ns 72561 ns 1.00
array/broadcast 20148 ns 20011 ns 1.01
array/copyto!/gpu_to_gpu 11348 ns 11355.5 ns 1.00
array/copyto!/cpu_to_gpu 218691 ns 216192 ns 1.01
array/copyto!/gpu_to_cpu 283366 ns 283975.5 ns 1.00
array/accumulate/Int64/1d 125063 ns 125034.5 ns 1.00
array/accumulate/Int64/dims=1 83798 ns 83398 ns 1.00
array/accumulate/Int64/dims=2 158223 ns 157817 ns 1.00
array/accumulate/Int64/dims=1L 1710340 ns 1708490 ns 1.00
array/accumulate/Int64/dims=2L 967011 ns 966251 ns 1.00
array/accumulate/Float32/1d 109553 ns 109114 ns 1.00
array/accumulate/Float32/dims=1 81093.5 ns 80351 ns 1.01
array/accumulate/Float32/dims=2 147756.5 ns 147295.5 ns 1.00
array/accumulate/Float32/dims=1L 1619562.5 ns 1618020.5 ns 1.00
array/accumulate/Float32/dims=2L 698804 ns 698067 ns 1.00
array/construct 1268.8 ns 1296.7 ns 0.98
array/random/randn/Float32 46089 ns 48838.5 ns 0.94
array/random/randn!/Float32 25027 ns 24912 ns 1.00
array/random/rand!/Int64 27258 ns 27275 ns 1.00
array/random/rand!/Float32 8876.333333333334 ns 8805.333333333334 ns 1.01
array/random/rand/Int64 29951.5 ns 30044 ns 1.00
array/random/rand/Float32 13476.5 ns 13354 ns 1.01
array/permutedims/4d 59981 ns 60446 ns 0.99
array/permutedims/2d 54243 ns 54105.5 ns 1.00
array/permutedims/3d 54968 ns 54893 ns 1.00
array/sorting/1d 2758013 ns 2756483.5 ns 1.00
array/sorting/by 3347816 ns 3368977 ns 0.99
array/sorting/2d 1088878 ns 1088064.5 ns 1.00
cuda/synchronization/stream/auto 1008.5833333333334 ns 1030.1 ns 0.98
cuda/synchronization/stream/nonblocking 7512.700000000001 ns 7504.6 ns 1.00
cuda/synchronization/stream/blocking 825.3168316831683 ns 801.2842105263157 ns 1.03
cuda/synchronization/context/auto 1154.6 ns 1179.3 ns 0.98
cuda/synchronization/context/nonblocking 7388.4 ns 7293.5 ns 1.01
cuda/synchronization/context/blocking 913.765306122449 ns 909.9636363636364 ns 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Member

maleadt commented Oct 9, 2025

[61eb1bfa] GPUCompiler v1.1.0

Why did this install an ancient version of GPUCompiler...

Member Author

kshyatt commented Oct 9, 2025

No idea. But now that this is an official release, we should use Resolver anyway. Let me add that.

Member Author

kshyatt commented Oct 9, 2025

Oh wait, but with nightly gone from the version tag, it is using Resolver...

Member Author

kshyatt commented Oct 9, 2025

Oh it's doing it because now it is using Resolver and our compat entry for GPUCompiler says GPUCompiler = "1.1" lol
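
For context, a minimal sketch of the compat entry being described, assuming it lives in the `[compat]` section of the package's Project.toml (the thread quotes only the entry itself). Under Julia's Pkg compat rules a bare `"1.1"` is a caret specifier, so it allows every release from 1.1.0 up to, but not including, 2.0.0. If the resolver in use here picks the lowest allowed version (an assumption; the thread does not say so explicitly), that would explain why v1.1.0 was installed.

```toml
# Sketch of the relevant Project.toml fragment (file location assumed).
[compat]
# Caret semantics: equivalent to "^1.1", i.e. the range [1.1.0, 2.0.0).
# v1.1.0 is therefore the oldest version this entry permits.
GPUCompiler = "1.1"
```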

Member

maleadt commented Oct 9, 2025

I'll check which version of GPUCompiler.jl works on 1.12 and edit the compat bounds in General.
EDIT: JuliaRegistries/General#140014

Member

maleadt commented Oct 14, 2025

Let's try increasing the bound on GPUCompiler.jl (e.g. from v1.6.0 as imposed by General to v1.7?) to see if this is a bug that's been fixed since.
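
A sketch of what raising the bound might look like, assuming the same `[compat]` section of Project.toml; the `1.7` value is only the tentative figure floated in the comment above, not a confirmed fix.

```toml
[compat]
# Hypothetical raised lower bound: allows [1.7.0, 2.0.0) and excludes v1.6.x.
GPUCompiler = "1.7"
```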

Member Author

kshyatt commented Oct 14, 2025

Probably did this wrong but it's still using 1.6...

Member

maleadt commented Oct 14, 2025

Ugh, so this looks like a legitimate bug.

Member

maleadt commented Oct 14, 2025

I can't reproduce the crash easily, but while looking into it I came across a hang (JuliaLang/julia#59834), so I'm looking at that first.
