Improvements to model performance by reducing allocations #871

Status: Open · wants to merge 11 commits into base: main
Conversation

@ConnectedSystems (Collaborator) commented Oct 4, 2024:

A swathe of changes to the simulations to improve runtime performance, principally by reducing the volume of allocations.

Includes a purpose-specific weighted sum function for use when determining DHW tolerance thresholds.
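As a rough illustration of the kind of purpose-specific helper described (a hedged sketch, not ADRIA's actual function), a hand-rolled loop avoids the temporary array that the naive `sum(x .* w)` form would allocate:

```julia
# Sketch of an allocation-free weighted sum (illustrative, not ADRIA's code).
# `sum(x .* w)` allocates a temporary for `x .* w`; the explicit loop does not.
function weighted_sum(x::AbstractVector{Float64}, w::AbstractVector{Float64})
    s = 0.0
    @inbounds for i in eachindex(x, w)
        s += x[i] * w[i]
    end
    return s
end

weighted_sum([1.0, 2.0, 3.0], [0.5, 0.25, 0.25])  # 1.75
```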

Before: [benchmark screenshot]

After: [benchmark screenshot]

Error between approaches: [screenshot of maximum absolute difference]

using Revise, Infiltrator

using ADRIA

using Statistics
using Serialization


RME_DOM = "C:/Users/tiwanaga/development/RME/rme_ml_2024_01_08"
gbr_dom = ADRIA.load_domain(RMEDomain, RME_DOM, "45")

global debug_c = 1
scen = ADRIA.param_table(gbr_dom)
rs = ADRIA.run_model(gbr_dom, scen[1, :]);

cover = dropdims(sum(rs.raw, dims=2), dims=2)
# serialize("cover_optimized.dat", cover)
# serialize("cover_unoptimized.dat", cover)

Error calculation was determined by serializing raw results to disk and then comparing the absolute maximum difference:

# Comparison

cover_opt = deserialize("cover_optimized.dat")
cover_unopt = deserialize("cover_unoptimized.dat")

maximum(abs.(cover_opt .- cover_unopt))

There are further performance optimizations potentially possible within CoralBlox and elsewhere in ADRIA, but these seem to mostly revolve around finding strategies to avoid triggering the GC.

We could even see if it is possible to make key functions completely allocation-free and turn off the GC for those...

Update:

After some more tweaks:

[benchmark screenshot]

Use `view()` directly to shave off a few more allocations, particularly when only one variable is being indexed

Preallocate variables where possible.
Weighted sum now implemented directly in parent function
@ConnectedSystems ConnectedSystems added the enhancement New feature or request label Oct 4, 2024
@ConnectedSystems ConnectedSystems marked this pull request as draft October 4, 2024 01:34
No real difference in performance and the older code was more readable anyway.
@ConnectedSystems ConnectedSystems marked this pull request as ready for review October 4, 2024 02:11
@Zapiano (Collaborator) left a comment:
Just two questions

for i in length(growth_rate):-1:2
    # Skip size class if nothing is moving up
-   sum(@view(cover[(i - 1):i])) == 0.0 ? continue : false
+   sum(view(cover, (i - 1):i)) == 0.0 ? continue : false
@Zapiano commented:
Is there any advantage of using view instead of the macro @view here? (I'm asking because the macro seems easier to read to me)

@ConnectedSystems (Author) replied Oct 4, 2024:
Now I'm not sure which is better between `view()` and `@view()` (from what I understand the difference should be minimal?), but they both beat `@views`, at least when slicing a single matrix (size: 79x3806).

[benchmark screenshots comparing `view()`, `@view()`, and `@views`]

https://www.juliabloggers.com/the-view-and-views-macros-are-you-sure-you-know-how-they-work/

https://discourse.julialang.org/t/difference-between-view-and-view/40798
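For reference, the relationship between the two forms can be checked directly: `@view A[...]` is a macro that rewrites the indexing expression into a `view()` call, so both return the same `SubArray`, whereas plain indexing allocates a copy. A minimal self-contained demonstration (array size chosen arbitrarily):

```julia
# view() vs @view: both produce the same SubArray; plain indexing copies.
A = rand(79, 3806)

v1 = view(A, :, 1)   # function form
v2 = @view A[:, 1]   # macro form; lowers to the same view() call
@assert typeof(v1) == typeof(v2)
@assert v1 == v2

# Plain indexing allocates a copy, so mutating it leaves A untouched
c = A[:, 1]
c[1] = -1.0
@assert A[1, 1] != -1.0

# Mutating a view writes through to the parent array
v1[1] = -1.0
@assert A[1, 1] == -1.0
```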

# Calculate contribution to cover to determine weights for each functional group
w::Matrix{Float64} = sink_settlers .* view(tp.data, source_locs, sink_loc)
w_per_group = w ./ sum(w; dims=1)
replace!(w_per_group, NaN => 0.0)
@Zapiano commented:
Would it make a relevant difference to cache this?

@ConnectedSystems (Author) replied:

The tricky thing is the number of sources can change.

But maybe there is a way of preallocating and then resizing. Maybe it will avoid excessive triggering of the GC... let me have a think.

@ConnectedSystems (Author) replied Oct 5, 2024:

Couldn't think of a way to cache the weights matrix because, as I said, the number of source locations changes, and making the cache matrix dynamically resizable involved a few more intermediate steps, which would increase overall allocations (at least the way I was thinking of it).

I did play with the idea of caching the result outside the function, given that the number of source locations is constant, but this only holds in our current case where we use mean connectivity. If/when we move to variable/dynamic connectivity data, we'd need this implementation anyway.
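One possible shape for the preallocate-then-view idea discussed above (all names here, `w_buf`, `weights!`, `n_max`, are illustrative, not ADRIA's actual API): size a buffer for the maximum number of source locations once, then take a view of the leading columns each step, so the fused broadcast writes in place without allocating a fresh matrix.

```julia
# Hypothetical sketch: reuse one buffer across timesteps with varying
# numbers of source locations, instead of allocating a new weights matrix.
n_groups = 3
n_max = 10                       # assumed upper bound on source locations
w_buf = zeros(n_groups, n_max)   # allocated once, reused every call

function weights!(w_buf, sink_settlers, conn_slice)
    n = size(conn_slice, 2)
    w = view(w_buf, :, 1:n)          # no copy; just the columns in use
    w .= sink_settlers .* conn_slice # fused broadcast, writes in place
    w ./= sum(w; dims=1)             # normalize each column
    replace!(w, NaN => 0.0)          # guard against zero-sum columns
    return w
end

sink_settlers = rand(3, 4)   # stand-in data
conn = rand(4)               # connectivity for 4 source locations
w = weights!(w_buf, sink_settlers, conn')
```

The small `sum(w; dims=1)` result still allocates, but the dominant `n_groups × n` matrix does not.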

I did think of some other tweaks which helped a tiny bit though:

[benchmark screenshot]

The performance difference is effectively null compared to `view()`, so switching for readability.
Pre-allocate arrays outside function definition.
I'm not sure if this actually reduces allocations on the whole, but defining caches as part of the function signature seems to trip up the profiler.
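The pattern the commit messages describe can be sketched as follows (function and variable names are illustrative, not ADRIA's actual API): allocate working arrays once at the call site and pass them into the hot function, so each iteration reuses the same memory rather than allocating.

```julia
# Hypothetical sketch of preallocating caches outside the hot function.
# `out` and `cache` are created once; step! only mutates them in place.
function step!(out, cache, cover, growth_rate)
    cache .= cover .* growth_rate   # reuse cache instead of allocating
    out .= cover .+ cache
    return out
end

cover = rand(5)
growth_rate = fill(0.1, 5)
out = similar(cover)
cache = similar(cover)

for _ in 1:100                      # hot loop: no per-iteration allocations
    step!(out, cache, cover, growth_rate)
end
```

Passing caches as plain arguments (rather than building them in the signature's default values) also keeps the allocation site visible to profiling tools.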
Labels: enhancement (New feature or request)

2 participants