Improvements to model performance by reducing allocations #871
base: main
Conversation
Use `view()` directly to shave off a few more allocations, particularly when only one variable is being indexed. Preallocate variables where possible.
Weighted sum now implemented directly in parent function
No real difference in performance and the older code was more readable anyway.
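To illustrate the `view` point above, here is a minimal sketch (not code from the PR): plain `[]` indexing copies the selected elements into a new array, while `view`/`@view` returns a lightweight `SubArray` over the same memory.

```julia
cover = collect(1.0:10.0)

slice_copy = cover[2:3]        # allocates a new 2-element Vector
slice_view = view(cover, 2:3)  # no copy; a SubArray reading through to `cover`

cover[2] = 0.0
slice_view[1]  # 0.0 — the view sees the mutation
slice_copy[1]  # 2.0 — the copy does not
```

Inside a hot loop, avoiding that per-iteration copy is where the allocation savings come from.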
Just two questions
```diff
  for i in length(growth_rate):-1:2
      # Skip size class if nothing is moving up
-     sum(@view(cover[(i - 1):i])) == 0.0 ? continue : false
+     sum(view(cover, (i - 1):i)) == 0.0 ? continue : false
```
Is there any advantage to using `view` instead of the macro `@view` here? (I'm asking because the macro seems easier to read to me.)
Now I'm not sure which is better between `view()` and `@view()` (from what I understand the difference should be minimal?), but they both beat `@views`, at least when slicing a single matrix (size: 79x3806):
| `view()` | `@view()` | `@views` |
|---|---|---|

*(benchmark timings were attached as images; only the column headers survive here)*
https://www.juliabloggers.com/the-view-and-views-macros-are-you-sure-you-know-how-they-work/
https://discourse.julialang.org/t/difference-between-view-and-view/40798
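A rough way to see the difference without a full benchmark harness is to compare bytes allocated by plain indexing versus a view. This is an illustrative sketch, not the benchmark used above; the matrix size mirrors the 79x3806 case, but exact byte counts vary by Julia version.

```julia
sum_view(M)  = sum(view(M, :, 1))
sum_index(M) = sum(M[:, 1])   # `M[:, 1]` copies the whole column first

const M = rand(79, 3806)
sum_view(M); sum_index(M)      # warm up so compilation isn't counted

bytes_view  = @allocated sum_view(M)   # ~0: the SubArray avoids the copy
bytes_index = @allocated sum_index(M)  # pays for the 79-element copy
```

`@allocated` only counts heap bytes, so for timing comparisons like the table above, `BenchmarkTools.@btime` is the usual tool.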
src/ecosystem/corals/growth.jl (Outdated)
```julia
w_per_group = w ./ sum(w; dims=1)
replace!(w_per_group, NaN => 0.0)
# Calculate contribution to cover to determine weights for each functional group
w::Matrix{Float64} = sink_settlers .* view(tp.data, source_locs, sink_loc)
```
Would it make a relevant difference to cache this?
The tricky thing is the number of sources can change.
But maybe there is a way of preallocating and then resizing. Maybe it will avoid excessive triggering of the GC... let me have a think.
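One possible shape for the preallocate-and-resize idea, sketched here with hypothetical names (`compute_weights!` and the cache vector are illustrative, not from the PR): keep one buffer alive across calls and `resize!` it to the current number of sources, so the vector's capacity is reused instead of reallocated.

```julia
# Reusable buffer; `resize!` adjusts length in place and keeps capacity.
function compute_weights!(cache::Vector{Float64}, settlers::AbstractVector{Float64})
    resize!(cache, length(settlers))  # grow/shrink without a fresh allocation
    cache .= settlers                  # in-place copy into the buffer
    cache ./= sum(settlers)            # in-place normalization, no temporaries
    return cache
end

w_cache = Float64[]
compute_weights!(w_cache, [1.0, 3.0])  # buffer now holds [0.25, 0.75]
```

`resize!` only reallocates when the requested length exceeds the buffer's current capacity, so repeated calls with varying source counts mostly avoid the GC.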
Couldn't think of a way to cache the weights matrix because, as I said, the number of source locations changes, and making the cache matrix dynamically resizable involved a few more intermediate steps, which would increase overall allocations (at least the way I was thinking of it).
I did play with the idea of caching the result outside the function, given that the number of source locations is constant, but this only holds in our current case where we use mean connectivity. If/when we move to variable/dynamic connectivity data, we'd need this implementation anyway.
I did think of some other tweaks which helped a tiny bit though:
- The performance difference is effectively null compared to `view()`, so switching for readability.
- Pre-allocate arrays outside the function definition. I'm not sure this actually reduces allocations overall, but defining caches as part of the function signature seems to trip up the profiler.
A swathe of changes to the simulations to improve runtime performance, principally by reducing the volume of allocations.
Includes a purpose-specific weighted-sum function used when determining DHW tolerance thresholds.
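A purpose-specific weighted sum might look like the following sketch (illustrative only; the PR's actual implementation is not shown here). An explicit loop avoids the intermediate array that a naive `sum(w .* x)` would allocate.

```julia
# Allocation-free weighted sum: no temporary from broadcasting `w .* x`.
function weighted_sum(w::AbstractVector{Float64}, x::AbstractVector{Float64})
    s = 0.0
    @inbounds for i in eachindex(w, x)  # errors if lengths mismatch
        s += w[i] * x[i]
    end
    return s
end

weighted_sum([0.5, 0.5], [2.0, 4.0])  # 3.0
```

This is the usual trade for hot paths: a few lines of explicit loop in exchange for zero temporaries.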
Before:
After:
Error between approaches:
The error was calculated by serializing raw results to disk and then comparing the maximum absolute difference:
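The comparison itself reduces to the following sketch (the arrays are built in memory here for illustration; the PR deserialized them from disk, e.g. via the `Serialization` stdlib):

```julia
# Maximum absolute difference between two result arrays.
before = [1.0 2.0; 3.0 4.0]
after  = [1.0 2.0; 3.0 4.0 + 1.0e-7]

err = maximum(abs.(before .- after))  # ≈ 1.0e-7
```

An error near floating-point noise indicates the optimized code paths are numerically equivalent to the originals.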
Further performance optimizations are potentially possible within CoralBlox and elsewhere in ADRIA, but these seem to mostly involve finding strategies to avoid the GC.
We could even see whether key functions can be made completely allocation-free, with the GC disabled while they run...
Update:
After some more tweaks