Benchmarks with Julia v1.5 #8
Comments
I was a little disappointed to still see that 7% difference there, but I was unable to reproduce it with my 6-core Skylake laptop:
Is that 7% in your demo just noise? I'd be very interested to know if it's real. |
I think so, I'll make a few more in-depth checks with thread-pinning, etc. |
@mbauman, ran it again with numactl -C 0-63 julia:

using Base.Threads, LinearAlgebra
using UnsafeArrays
using BenchmarkTools
function colnorms!(dest::AbstractVector, A::AbstractMatrix)
    @threads for i in axes(A, 2)
        dest[i] = norm(view(A, :, i))
    end
    dest
end
colnorms_with_uviews!(dest, A) = @uviews A colnorms!(dest, A)
A = rand(50, 10^5);
dest = similar(A, size(A, 2));
colnorms!(dest, A)
colnorms_with_uviews!(dest, A)
julia> versioninfo()
Julia Version 1.5.0-beta1.0
Commit 6443f6c95a (2020-05-28 17:42 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: AMD EPYC 7702P 64-Core Processor
julia> nthreads()
64
julia> @benchmark colnorms!($dest, $A)
BenchmarkTools.Trial:
memory estimate: 46.61 KiB
allocs estimate: 321
--------------
minimum time: 95.621 μs (0.00% GC)
median time: 110.751 μs (0.00% GC)
mean time: 121.822 μs (2.68% GC)
maximum time: 4.075 ms (91.13% GC)
--------------
samples: 10000
evals/sample: 1
julia> @benchmark colnorms_with_uviews!($dest, $A)
BenchmarkTools.Trial:
memory estimate: 46.63 KiB
allocs estimate: 321
--------------
minimum time: 89.120 μs (0.00% GC)
median time: 105.310 μs (0.00% GC)
mean time: 116.689 μs (2.70% GC)
maximum time: 4.001 ms (90.93% GC)
--------------
samples: 10000
evals/sample: 1

These numbers seem fairly stable when I run it multiple times. So a very small difference remains, but that can't really be due to memory allocation - on 64 threads, any difference in memory allocation frequency should result in a clear performance difference. |
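For context on why @uviews used to matter in this benchmark: view(A, :, i) wraps A in a SubArray, and because that wrapper carries a reference to the parent array, Julia v1.4 and earlier heap-allocated it in code like colnorms! - once per column, which matches the ~10^5 allocations in the v1.4 numbers below. A quick, illustrative way to check the per-view allocation on a given Julia version (a sketch reusing the A defined above; the col_norm helpers are just for illustration, not part of the original measurements, and exact counts depend on the Julia version and optimizer):

using LinearAlgebra, UnsafeArrays

# Norm of one column via a plain SubArray view, and via @uviews (pointer-based UnsafeArray view).
col_norm(A) = norm(view(A, :, 1))
col_norm_uview(A) = @uviews A norm(view(A, :, 1))

col_norm(A); col_norm_uview(A)   # run once to compile

@allocated col_norm(A)         # nonzero on v1.4; expected to drop to zero on v1.5
@allocated col_norm_uview(A)   # expected to be (close to) zero on both versions

With 64 threads hammering the allocator, that per-view allocation dominated the v1.4 results; on v1.5 both variants end up with the same small allocation count (mostly @threads task setup), so the remaining few-percent gap indeed cannot be an allocation effect.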
And the pure absolute difference between Julia v1.4 and v1.5 (without UnsafeArrays), on 64 threads:

Julia v1.4:

julia> @benchmark colnorms!($dest, $A)
BenchmarkTools.Trial:
memory estimate: 4.62 MiB
allocs estimate: 100323
--------------
minimum time: 257.731 μs (0.00% GC)
median time: 617.504 μs (0.00% GC)
mean time: 9.535 ms (93.55% GC)
maximum time: 3.384 s (99.97% GC)
--------------
samples: 758
evals/sample: 1

Julia v1.5:

julia> @benchmark colnorms!($dest, $A)
BenchmarkTools.Trial:
memory estimate: 46.61 KiB
allocs estimate: 321
--------------
minimum time: 95.621 μs (0.00% GC)
median time: 110.751 μs (0.00% GC)
mean time: 121.822 μs (2.68% GC)
maximum time: 4.075 ms (91.13% GC)
--------------
samples: 10000
evals/sample: 1

A mean time of 122 μs, vs. 9.5 ms before! My deepest thanks to the compiler team for this. I think JuliaLang/julia#34126 will give heavily multi-threaded applications a big boost. After all, the benchmark mean time is usually the number with the strongest influence on application wall-clock time. |
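The mean-vs-minimum point can be made concrete with BenchmarkTools itself: the minimum estimate is the best case, where no GC pause lands in the sample, while the mean folds in the samples that do hit GC - which is what an allocation-heavy threaded application experiences over a long run. A minimal sketch, using the standard BenchmarkTools/Statistics accessors and the benchmark from above:

using BenchmarkTools, Statistics

t = @benchmark colnorms!($dest, $A)

minimum(t)   # best-case estimate; GC pauses rarely show up here
median(t)    # typical sample
mean(t)      # includes samples hit by GC pauses, closer to sustained throughput

In the v1.4 run above the gap is dramatic (minimum 258 μs vs. a mean of 9.5 ms, with over 93% of the mean spent in GC), whereas on v1.5 minimum, median and mean are much closer together.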
Julia v1.5 enables inline allocation of structs with pointers (JuliaLang/julia#34126); this should make UnsafeArrays unnecessary in most cases.

New benchmarks - using the test case:

With Julia v1.4:

With Julia v1.5-beta1:

Very little difference in the mean runtime with and without @uviews, in contrast to v1.4, where we see a strong difference. Also, a very nice gain in speed in general.

Test system: AMD EPYC 7702P 64-core CPU.
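A quick way to double-check that conclusion on another machine is to let BenchmarkTools compare the two variants directly. A minimal sketch, assuming the colnorms!/colnorms_with_uviews! definitions and the dest/A data from the comments above:

using BenchmarkTools, Statistics

b_plain = @benchmark colnorms!($dest, $A)
b_uview = @benchmark colnorms_with_uviews!($dest, $A)

# judge compares two estimates; with the v1.5 numbers above (medians ~105 vs ~111 μs,
# identical allocation counts) the result is expected to be at or near :invariant.
judge(median(b_uview), median(b_plain))

On v1.4 the same comparison shows a clear improvement for the @uviews variant, which is exactly the difference that JuliaLang/julia#34126 removes.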