Enzyme support older versions #537

Merged 4 commits on Oct 4, 2024
wsmoses (Collaborator) commented Oct 4, 2024

No description provided.

@wsmoses wsmoses requested a review from vchuravy October 4, 2024 14:40
@@ -0,0 +1,342 @@
# https://github.com/EnzymeAD/Enzyme.jl/issues/1516
Member commented:

Can we name these files after the EnzymeCore version?

Collaborator Author replied:

done

vchuravy (Member) commented Oct 4, 2024

Can you also fix runic?

diff --git a/EnzymeCore08Ext.jl b/EnzymeCore08Ext.jl
index 265a897..8bdd8ee 100644
--- a/EnzymeCore08Ext.jl
+++ b/EnzymeCore08Ext.jl
@@ -12,7 +12,7 @@ function gpu_fwd(ctx, config, f, args...)
 end
 
 function EnzymeRules.forward(
-	config,
+        config,
         func::Const{<:Kernel{CPU}},
         ::Type{Const{Nothing}},
         args...;
@@ -27,7 +27,7 @@ function EnzymeRules.forward(
 end
 
 function EnzymeRules.forward(
-	config,
+        config,
         func::Const{<:Kernel{<:GPU}},
         ::Type{Const{Nothing}},
         args...;

github-actions bot (Contributor) commented Oct 4, 2024

Benchmark Results

| Benchmark | main | 46d2fcb... | main/46d2fcb10b5713... |
|---|---|---|---|
| saxpy/default/Float16/1024 | 2.81 ± 0.2 μs | 2.79 ± 0.2 μs | 1.01 |
| saxpy/default/Float16/1048576 | 2.08 ± 0.0059 ms | 2.08 ± 0.0091 ms | 0.999 |
| saxpy/default/Float16/16384 | 0.0328 ± 0.00015 ms | 0.0328 ± 0.00015 ms | 1 |
| saxpy/default/Float16/2048 | 5.22 ± 0.047 μs | 5.23 ± 0.092 μs | 0.999 |
| saxpy/default/Float16/256 | 0.967 ± 0.11 μs | 0.979 ± 0.12 μs | 0.988 |
| saxpy/default/Float16/262144 | 0.524 ± 0.0094 ms | 0.525 ± 0.0096 ms | 1 |
| saxpy/default/Float16/32768 | 0.065 ± 0.00018 ms | 0.065 ± 0.00018 ms | 1 |
| saxpy/default/Float16/4096 | 10.1 ± 0.05 μs | 10.1 ± 0.05 μs | 1 |
| saxpy/default/Float16/512 | 1.57 ± 0.16 μs | 1.57 ± 0.16 μs | 0.999 |
| saxpy/default/Float16/64 | 0.631 ± 0.016 μs | 0.62 ± 0.017 μs | 1.02 |
| saxpy/default/Float16/65536 | 0.129 ± 0.00035 ms | 0.129 ± 0.00043 ms | 1 |
| saxpy/default/Float32/1024 | 1.02 ± 0.015 μs | 1.02 ± 0.011 μs | 1.01 |
| saxpy/default/Float32/1048576 | 0.966 ± 0.0084 ms | 0.964 ± 0.0091 ms | 1 |
| saxpy/default/Float32/16384 | 15.5 ± 0.13 μs | 15.4 ± 0.1 μs | 1 |
| saxpy/default/Float32/2048 | 1.72 ± 0.023 μs | 1.71 ± 0.02 μs | 1 |
| saxpy/default/Float32/256 | 0.535 ± 0.12 μs | 0.526 ± 0.13 μs | 1.02 |
| saxpy/default/Float32/262144 | 0.238 ± 0.0096 ms | 0.239 ± 0.0099 ms | 0.998 |
| saxpy/default/Float32/32768 | 30.3 ± 0.17 μs | 30.3 ± 0.16 μs | 1 |
| saxpy/default/Float32/4096 | 3.01 ± 0.025 μs | 3.02 ± 0.024 μs | 0.999 |
| saxpy/default/Float32/512 | 0.691 ± 0.11 μs | 0.691 ± 0.12 μs | 1 |
| saxpy/default/Float32/64 | 0.41 ± 0.0061 μs | 0.408 ± 0.0074 μs | 1 |
| saxpy/default/Float32/65536 | 0.0601 ± 0.00026 ms | 0.0601 ± 0.00043 ms | 1 |
| saxpy/default/Float64/1024 | 1.07 ± 0.022 μs | 1.06 ± 0.018 μs | 1.01 |
| saxpy/default/Float64/1048576 | 1.02 ± 0.032 ms | 1 ± 0.026 ms | 1.02 |
| saxpy/default/Float64/16384 | 15.7 ± 0.13 μs | 15.7 ± 0.12 μs | 1 |
| saxpy/default/Float64/2048 | 1.75 ± 0.028 μs | 1.74 ± 0.019 μs | 1 |
| saxpy/default/Float64/256 | 0.516 ± 0.013 μs | 0.518 ± 0.013 μs | 0.995 |
| saxpy/default/Float64/262144 | 0.243 ± 0.0096 ms | 0.244 ± 0.01 ms | 0.998 |
| saxpy/default/Float64/32768 | 31.1 ± 0.75 μs | 31.1 ± 0.59 μs | 1 |
| saxpy/default/Float64/4096 | 3.06 ± 0.093 μs | 3.04 ± 0.078 μs | 1.01 |
| saxpy/default/Float64/512 | 0.699 ± 0.12 μs | 0.702 ± 0.11 μs | 0.997 |
| saxpy/default/Float64/64 | 0.396 ± 0.011 μs | 0.385 ± 0.0076 μs | 1.03 |
| saxpy/default/Float64/65536 | 0.0614 ± 0.00084 ms | 0.0615 ± 0.00085 ms | 0.998 |
| saxpy/static workgroup=(1024,)/Float16/1024 | 2.1 ± 0.22 μs | 2.11 ± 0.21 μs | 0.994 |
| saxpy/static workgroup=(1024,)/Float16/1048576 | 0.166 ± 0.012 ms | 0.161 ± 0.0098 ms | 1.03 |
| saxpy/static workgroup=(1024,)/Float16/16384 | 4.28 ± 0.21 μs | 4.28 ± 0.21 μs | 1 |
| saxpy/static workgroup=(1024,)/Float16/2048 | 2.13 ± 0.23 μs | 2.14 ± 0.22 μs | 0.993 |
| saxpy/static workgroup=(1024,)/Float16/256 | 2.65 ± 0.039 μs | 2.64 ± 0.038 μs | 1 |
| saxpy/static workgroup=(1024,)/Float16/262144 | 0.0433 ± 0.0024 ms | 0.0431 ± 0.0019 ms | 1 |
| saxpy/static workgroup=(1024,)/Float16/32768 | 6.54 ± 0.17 μs | 6.6 ± 0.21 μs | 0.992 |
| saxpy/static workgroup=(1024,)/Float16/4096 | 2.42 ± 0.033 μs | 2.45 ± 0.043 μs | 0.987 |
| saxpy/static workgroup=(1024,)/Float16/512 | 3.14 ± 0.095 μs | 3.16 ± 0.097 μs | 0.996 |
| saxpy/static workgroup=(1024,)/Float16/64 | 2.27 ± 0.026 μs | 2.24 ± 0.021 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float16/65536 | 12.2 ± 0.34 μs | 12.3 ± 0.29 μs | 0.994 |
| saxpy/static workgroup=(1024,)/Float32/1024 | 1.96 ± 0.026 μs | 1.97 ± 0.031 μs | 0.996 |
| saxpy/static workgroup=(1024,)/Float32/1048576 | 0.253 ± 0.019 ms | 0.251 ± 0.019 ms | 1.01 |
| saxpy/static workgroup=(1024,)/Float32/16384 | 4.11 ± 0.89 μs | 4.1 ± 0.95 μs | 1 |
| saxpy/static workgroup=(1024,)/Float32/2048 | 2.29 ± 0.31 μs | 2.31 ± 0.23 μs | 0.994 |
| saxpy/static workgroup=(1024,)/Float32/256 | 2.8 ± 1.6 μs | 2.84 ± 1 μs | 0.984 |
| saxpy/static workgroup=(1024,)/Float32/262144 | 0.0664 ± 0.0036 ms | 0.066 ± 0.004 ms | 1.01 |
| saxpy/static workgroup=(1024,)/Float32/32768 | 6.98 ± 0.41 μs | 6.95 ± 0.24 μs | 1 |
| saxpy/static workgroup=(1024,)/Float32/4096 | 2.57 ± 0.19 μs | 2.56 ± 0.22 μs | 1 |
| saxpy/static workgroup=(1024,)/Float32/512 | 2.51 ± 0.22 μs | 2.49 ± 0.23 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float32/64 | 2.45 ± 0.051 μs | 2.45 ± 0.056 μs | 0.999 |
| saxpy/static workgroup=(1024,)/Float32/65536 | 17.7 ± 2.1 μs | 17.6 ± 1.7 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float64/1024 | 2.05 ± 0.03 μs | 2.05 ± 0.03 μs | 1 |
| saxpy/static workgroup=(1024,)/Float64/1048576 | 0.552 ± 0.058 ms | 0.529 ± 0.047 ms | 1.04 |
| saxpy/static workgroup=(1024,)/Float64/16384 | 6.91 ± 0.78 μs | 6.99 ± 1.4 μs | 0.988 |
| saxpy/static workgroup=(1024,)/Float64/2048 | 2.54 ± 0.28 μs | 2.54 ± 0.26 μs | 1 |
| saxpy/static workgroup=(1024,)/Float64/256 | 2.4 ± 0.055 μs | 2.41 ± 0.053 μs | 0.997 |
| saxpy/static workgroup=(1024,)/Float64/262144 | 0.123 ± 0.0089 ms | 0.123 ± 0.0087 ms | 0.999 |
| saxpy/static workgroup=(1024,)/Float64/32768 | 17.4 ± 1.7 μs | 17.1 ± 1.9 μs | 1.02 |
| saxpy/static workgroup=(1024,)/Float64/4096 | 3.1 ± 0.34 μs | 3.11 ± 0.35 μs | 0.997 |
| saxpy/static workgroup=(1024,)/Float64/512 | 2.39 ± 0.045 μs | 2.4 ± 0.049 μs | 0.998 |
| saxpy/static workgroup=(1024,)/Float64/64 | 2.38 ± 0.087 μs | 2.37 ± 0.1 μs | 1 |
| saxpy/static workgroup=(1024,)/Float64/65536 | 0.0356 ± 0.003 ms | 0.0352 ± 0.0028 ms | 1.01 |
| time_to_load | 0.313 ± 0.0015 s | 0.311 ± 0.0015 s | 1.01 |

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

wsmoses (Collaborator Author) commented Oct 4, 2024

Is there a clang-format-like command I can run to auto-fix?

vchuravy (Member) commented Oct 4, 2024

Yeah, you can use Runic: https://github.com/fredrikekre/Runic.jl

julia --project=@runic -e 'using Pkg; Pkg.add(url = "https://github.com/fredrikekre/Runic.jl")'
alias runic="julia --project=@runic -e 'using Runic; exit(Runic.main(ARGS))' --"
runic -i ext src
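
To preview what Runic would change before rewriting files in place, a check-only run may be useful. The sketch below assumes Runic is installed in the `@runic` environment as in the commands above, and that the installed version accepts the `--check` and `--diff` flags described in Runic's README. The function name `runic_check` is hypothetical and is not part of Runic or this PR.

```shell
#!/bin/sh
# Hypothetical helper (not part of Runic or this PR): run Runic in check mode.
# Assumes Runic lives in the shared @runic environment and that --check/--diff
# are supported by the installed version.
runic_check() {
    # --check reports badly formatted files without modifying them;
    # --diff prints the changes Runic would make.
    julia --project=@runic -e 'using Runic; exit(Runic.main(ARGS))' -- --check --diff "$@"
}
```

Calling `runic_check ext src` would then inspect the same directories as `runic -i ext src` without rewriting them. A function is used here instead of an alias because aliases are usually not expanded in non-interactive scripts.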

@vchuravy vchuravy merged commit e59ab6f into main Oct 4, 2024
22 of 30 checks passed
@vchuravy vchuravy deleted the oldenz branch October 4, 2024 15:19