Allocations incurred from broadcast expression #1372
Comments
The simplest example is when you broadcast a
FT = Float32
@. x + FT(1)
which is equivalent to
x .+ FT.(1)
Note that this is distinct from JuliaLang/julia#50554 (which is due to the use of
Unfortunately this is a common pattern we use to make functions type-generic.

Some possible solutions:

The first is simply to avoid broadcasting types over scalars. We could either switch away from using
x .+ FT(1)
which forces the
@. x + $(FT(1))
but this is kind of ugly. Finally you can just move the conversion outside
i = FT(1)
@. x + i

If we want to keep using it, one option is via type-piracy on

Perhaps a more elegant solution is
struct SuffixConverter{FT}
end
Base.:*(x::Number, ::SuffixConverter{FT}) where {FT} = convert(FT, x)
Base.Broadcast.broadcasted(::typeof(*), x::Number, ::SuffixConverter{FT}) where {FT} = convert(FT, x)
Then we can write
_FT = SuffixConverter{FT}()
@. x + 1_FT
which has the nice advantage of avoiding more parentheses, and so could make the code cleaner.
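For comparison, here is a minimal CPU-only sketch that puts the variants from this comment side by side and checks that they produce the same result. It uses a plain `Array` rather than a GPU array (an assumption made so it runs anywhere), so it only illustrates the syntax and the equivalence of the results; it does not reproduce or measure the GPU allocations themselves.

```julia
# Suffix converter as proposed above: `1_FT` parses as `1 * _FT`.
struct SuffixConverter{FT} end

# Scalar case: multiplying by the converter performs the conversion.
Base.:*(x::Number, ::SuffixConverter{FT}) where {FT} = convert(FT, x)

# Broadcast case: convert eagerly, so the fused broadcast only sees a scalar.
Base.Broadcast.broadcasted(::typeof(*), x::Number, ::SuffixConverter{FT}) where {FT} =
    convert(FT, x)

FT = Float32
_FT = SuffixConverter{FT}()
x = FT[1, 2, 3]          # plain CPU array, used here only for illustration

a = @. x + FT(1)         # original pattern: `@.` turns this into x .+ FT.(1)
b = x .+ FT(1)           # explicit dots: FT(1) is an ordinary scalar call
c = @. x + $(FT(1))      # `$` keeps `@.` from dotting FT(1)
i = FT(1)
d = @. x + i             # conversion moved outside the broadcast
e = @. x + 1_FT          # suffix converter
```

All five results are `Float32` arrays with the same values; the variants differ only in whether the type `FT` itself ends up as an argument of the broadcast.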
I've posted a patch to CUDA.jl which should address it: JuliaGPU/CUDA.jl#2000
I like the
Okay, how about https://github.com/simonbyrne/SuffixConversion.jl
I think that’s a great solution, and probably better that it lives outside of clima since it’s very widely applicable.
I often end up writing things like
half = convert(NF,0.5)
dt_NF = convert(NF,dt)
Tmin = convert(NF,scheme.Tmin)
vj = convert(NF,v[j])
(SpeedyWeather.jl uses NF for number format) I like I'd even do
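As a rough sketch of that pattern in context: the function below converts its scalar parameters to `NF` once, up front, and then broadcasts over plain `NF` values only. The function name, its arguments, and the formula are hypothetical and chosen only for illustration; they are not taken from SpeedyWeather.jl.

```julia
# Hypothetical example: all names and the formula are illustrative only.
function scale_and_clamp!(out::AbstractArray{NF}, v::AbstractArray{NF},
                          dt::Real, Tmin::Real) where {NF}
    # Convert every scalar once, before the broadcast.
    half    = convert(NF, 0.5)
    dt_NF   = convert(NF, dt)
    Tmin_NF = convert(NF, Tmin)
    # The fused broadcast now only involves values of type NF.
    @. out = max(half * dt_NF * v, Tmin_NF)
    return out
end

v   = Float32[1, 2, -4]
out = similar(v)
scale_and_clamp!(out, v, 0.5, 0.0)   # dt and Tmin are passed in as Float64
```

Converting up front keeps `Float64` inputs such as `dt` from promoting the intermediate arithmetic, at the cost of the extra lines that a suffix-style converter is meant to remove.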
JuliaGPU/CUDA.jl#2000 got merged, so maybe this is fixed?
This is closed for 1.11, but is still an issue on 1.10.
This line is breaking on the GPU, and is resulting in runtime allocations.
I'll try to make an MWE when I have a chance.