[Discussion] CuArrays in Julia1.5 is faster! But why? #59

Closed
GiggleLiu opened this issue Jun 7, 2020 · 4 comments

GiggleLiu commented Jun 7, 2020

Julia 1.4.1

julia> using CuYao

(@v1.4) pkg> st CuYao
Status `~/.julia/environments/v1.4/Project.toml`
  [b48ca7a8] CuYao v0.2.2 [`~/.julia/dev/CuYao`]

julia> using CuArrays

julia> using BenchmarkTools

julia> reg = rand_state(25) |> cu
ArrayReg{1, Complex{Float64}, CuArray...}
    active qubits: 25/25

(@v1.4) pkg> st CuArrays
Status `~/.julia/environments/v1.4/Project.toml`
  [3a865a2d] CuArrays v2.2.0

julia> @benchmark @CuArrays.sync $reg |> $(put(25, 5=>X))
BenchmarkTools.Trial: 
  memory estimate:  3.31 KiB
  allocs estimate:  95
  --------------
  minimum time:     3.137 ms (0.00% GC)
  median time:      3.298 ms (0.00% GC)
  mean time:        3.295 ms (0.00% GC)
  maximum time:     3.439 ms (0.00% GC)
  --------------
  samples:          1507
  evals/sample:     1

julia> @benchmark @CuArrays.sync $reg |> $(cnot(25, 3, 9))
BenchmarkTools.Trial: 
  memory estimate:  3.81 KiB
  allocs estimate:  107
  --------------
  minimum time:     1.809 ms (0.00% GC)
  median time:      1.996 ms (0.00% GC)
  mean time:        2.021 ms (0.00% GC)
  maximum time:     2.305 ms (0.00% GC)
  --------------
  samples:          2466
  evals/sample:     1

julia> @benchmark @CuArrays.sync $reg |> $(put(25, 5=>Rx(0.5)))
BenchmarkTools.Trial: 
  memory estimate:  11.34 KiB
  allocs estimate:  181
  --------------
  minimum time:     3.175 ms (0.00% GC)
  median time:      3.409 ms (0.00% GC)
  mean time:        3.408 ms (0.00% GC)
  maximum time:     3.712 ms (0.00% GC)
  --------------
  samples:          1456
  evals/sample:     1

Julia 1.5-beta

julia> using CuYao

julia> reg = rand_state(25) |> cu
ArrayReg{1, Complex{Float64}, CuArray...}
    active qubits: 25/25

julia> @benchmark @CuArrays.sync $reg |> $(put(25, 5=>X))
ERROR: LoadError: UndefVarError: @benchmark not defined
in expression starting at REPL[3]:1

julia> using BenchmarkTools
[ Info: Precompiling BenchmarkTools [6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf]

julia> using CuArrays

julia> using BenchmarkTools

julia> @benchmark @CuArrays.sync $reg |> $(put(25, 5=>X))
BenchmarkTools.Trial: 
  memory estimate:  3.61 KiB
  allocs estimate:  88
  --------------
  minimum time:     1.916 ms (0.00% GC)
  median time:      2.122 ms (0.00% GC)
  mean time:        2.114 ms (0.00% GC)
  maximum time:     7.800 ms (0.00% GC)
  --------------
  samples:          2362
  evals/sample:     1

(@v1.5) pkg> st CuYao
Status `~/.julia/environments/v1.5/Project.toml`
  [b48ca7a8] CuYao v0.2.2 `~/.julia/dev/CuYao`

(@v1.5) pkg> st CuArrays
Status `~/.julia/environments/v1.5/Project.toml`
  [3a865a2d] CuArrays v2.2.0

julia> @benchmark @CuArrays.sync $reg |> $(cnot(25, 3, 9))
BenchmarkTools.Trial: 
  memory estimate:  4.13 KiB
  allocs estimate:  100
  --------------
  minimum time:     1.054 ms (0.00% GC)
  median time:      1.091 ms (0.00% GC)
  mean time:        1.109 ms (0.00% GC)
  maximum time:     9.261 ms (0.00% GC)
  --------------
  samples:          4504
  evals/sample:     1

julia> @benchmark @CuArrays.sync $reg |> $(put(25, 5=>Rx(0.5)))
BenchmarkTools.Trial: 
  memory estimate:  11.35 KiB
  allocs estimate:  169
  --------------
  minimum time:     2.000 ms (0.00% GC)
  median time:      2.200 ms (0.00% GC)
  mean time:        2.195 ms (0.00% GC)
  maximum time:     7.010 ms (0.00% GC)
  --------------
  samples:          2275
  evals/sample:     1
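
To put the two runs side by side, the median times reported above can be turned into speedup ratios. This is only a minimal sketch: the numbers are copied from the benchmark output in this post, and the variable names are made up for illustration.

```julia
# Median times in milliseconds, copied from the benchmark output above.
# The dictionary names and keys are illustrative, not part of CuYao.
medians_1_4 = Dict("put X" => 3.298, "cnot" => 1.996, "put Rx(0.5)" => 3.409)
medians_1_5 = Dict("put X" => 2.122, "cnot" => 1.091, "put Rx(0.5)" => 2.200)

# Speedup of Julia 1.5-beta over Julia 1.4.1 for each gate.
speedup = Dict(k => medians_1_4[k] / medians_1_5[k] for k in keys(medians_1_4))

for k in sort(collect(keys(speedup)))
    println(rpad(k, 12), round(speedup[k], digits=2), "x")
end
```

With these medians, the ratios come out at roughly 1.55x for both `put` benchmarks and about 1.83x for `cnot`.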

maleadt commented Jun 8, 2020

Which version of CUDAnative and GPUCompiler does this use?


GiggleLiu commented Jun 8, 2020

(@v1.5) pkg> st CUDAnative
Status `~/.julia/environments/v1.5/Project.toml`
  [be33ccc6] CUDAnative v3.1.0

The same version is installed under Julia 1.4.1.
CuYao does not depend on GPUCompiler.

Maybe the performance increase is related to the compiler updates in Julia 1.5.
For example, the improved handling of immutable objects that hold references (especially non-allocating views):
https://docs.julialang.org/en/v1.5-dev/NEWS/#Compiler/Runtime-improvements-1
JuliaLang/julia#34126
JuliaLang/julia#14955
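
A quick way to check whether a view allocates on a given Julia version is to measure it directly on the CPU. The snippet below is only a sketch under assumptions: `sum_view` is a hypothetical helper, not something from CuYao or CuArrays, and the number it prints depends on the Julia version and on how much the compiler inlines, so it does not by itself explain the GPU timings above.

```julia
# Hypothetical helper (not part of CuYao): sums every other element through a view.
function sum_view(x::Vector{Float64})
    v = view(x, 1:2:length(x))  # SubArray that references the heap-allocated parent
    s = 0.0
    for a in v
        s += a
    end
    return s
end

x = rand(1000)
sum_view(x)                      # warm-up call so that compilation is not measured
println(@allocated sum_view(x))  # bytes allocated by one call; compare across Julia versions
```

Running the same file under Julia 1.4.1 and 1.5 and comparing the printed byte counts would show whether this particular pattern is affected.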

You may also want to rerun the CUDAnative benchmarks; some of them might have become faster automatically.

@GiggleLiu

Please also see:
JuliaArrays/UnsafeArrays.jl#8


GiggleLiu commented Jun 8, 2020

This is really amazing, isn't it! @maleadt
