Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazily create storage in StructOfConstraints #1315

Merged
merged 10 commits into from
May 12, 2021
Merged

Lazily create storage in StructOfConstraints #1315

merged 10 commits into from
May 12, 2021

Conversation

odow
Copy link
Member

@odow odow commented May 5, 2021

Part of #1313

Calling MOIU.Model{Float64}() does a lot of work. It has to create 97 different CleverDicts to store all the constraints in! Even though most users only use a small fraction of the possible constraint types.

This PR uses a small Union{Nothing,X} to avoid creating the dictionaries, instead it lazily creates them on the first call to add_constraint.

Before

(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl clp
 21.929698 seconds (54.74 M allocations: 3.082 GiB, 6.57% gc time, 34.33% compilation time)
  0.001764 seconds (6.02 k allocations: 538.508 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl clp --no-bridge
 11.345386 seconds (19.50 M allocations: 1.112 GiB, 4.68% gc time, 98.00% compilation time)
  0.001040 seconds (3.05 k allocations: 309.961 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl glpk           
 16.459349 seconds (34.09 M allocations: 1.955 GiB, 6.23% gc time, 52.33% compilation time)
  0.000730 seconds (3.27 k allocations: 251.531 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl glpk --no-bridge
 12.485400 seconds (20.88 M allocations: 1.194 GiB, 4.43% gc time, 99.97% compilation time)
  0.000660 seconds (2.55 k allocations: 234.750 KiB)

After

Running: clp 
 18.193627 seconds (42.68 M allocations: 2.395 GiB, 7.23% gc time, 19.00% compilation time)
  0.001554 seconds (3.85 k allocations: 396.117 KiB)
Running: clp --no-bridge
  5.388487 seconds (7.22 M allocations: 424.453 MiB, 3.97% gc time, 92.45% compilation time)
  0.000915 seconds (1.90 k allocations: 238.125 KiB)
Running: glpk 
  9.581772 seconds (21.91 M allocations: 1.263 GiB, 5.77% gc time, 34.16% compilation time)
  0.000766 seconds (2.15 k allocations: 180.867 KiB)
Running: glpk --no-bridge
  6.361546 seconds (8.79 M allocations: 520.060 MiB, 3.72% gc time, 99.96% compilation time)
  0.000524 seconds (1.48 k allocations: 164.852 KiB)```

@odow odow added Submodule: Utilities About the Utilities submodule Type: Performance labels May 5, 2021
@blegat
Copy link
Member

blegat commented May 5, 2021

Alternatively, shouldn't CleverDict store dict as a Union{Nothing,...}. This way we could drop the is_dense field. As the union makes the compiler add a if but we already add a if with is_dense so it might be an improvement without compromise

@odow
Copy link
Member Author

odow commented May 5, 2021

We'd still have to create the 97 arrays, but I'll take a look.

@blegat
Copy link
Member

blegat commented May 5, 2021

We'd still have to create the 97 arrays, but I'll take a look.

Indeed but we were already allocating one array per type before MOI v0.9.21. There is a tradeoff there between the added if because of the Union{Nothing, ...} and the allocations of the arrays, it's not easy to see what's best.

@odow odow force-pushed the od/voc branch 2 times, most recently from 69667d2 to fe041f7 Compare May 5, 2021 20:24
@odow
Copy link
Member Author

odow commented May 5, 2021

I took a look. it's actually not about the cost of creating the objects. It's the cost of inferring them. If every function-set could exist, the compiler needs to make sure that every function that could be called exists. Having nothing for most of the VectorOfConstraints

This PR:

(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl clp             
Running: clp 
 16.580958 seconds (42.90 M allocations: 2.410 GiB, 7.70% gc time, 19.80% compilation time)
  0.001340 seconds (3.83 k allocations: 395.164 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl clp --no-bridge
Running: clp --no-bridge
  5.537055 seconds (7.07 M allocations: 416.149 MiB, 4.46% gc time, 92.67% compilation time)
  0.000798 seconds (1.89 k allocations: 237.641 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl glpk           
Running: glpk 
  9.245249 seconds (21.75 M allocations: 1.255 GiB, 5.44% gc time, 34.56% compilation time)
  0.000583 seconds (2.15 k allocations: 180.398 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl glpk --no-bridge
Running: glpk --no-bridge
  6.289336 seconds (8.63 M allocations: 511.208 MiB, 3.73% gc time, 99.97% compilation time)
  0.000546 seconds (1.48 k allocations: 164.383 KiB)

master with changes to CleverDicts only:

(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl clp            
Running: clp 
 22.614560 seconds (53.81 M allocations: 3.019 GiB, 6.76% gc time, 35.18% compilation time)
  0.001422 seconds (5.07 k allocations: 455.883 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl clp --no-bridge
Running: clp --no-bridge
 10.323763 seconds (18.48 M allocations: 1.046 GiB, 5.22% gc time, 96.00% compilation time)
  0.010798 seconds (2.58 k allocations: 269.898 KiB, 90.17% gc time)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl glpk           
Running: glpk 
 14.181238 seconds (33.26 M allocations: 1.899 GiB, 5.86% gc time, 50.91% compilation time)
  0.000874 seconds (2.75 k allocations: 211.203 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl glpk --no-bridge
Running: glpk --no-bridge
 11.368516 seconds (20.30 M allocations: 1.154 GiB, 4.99% gc time, 99.98% compilation time)
  0.000545 seconds (2.09 k allocations: 195.375 KiB)

master

(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl clp             
Running: clp 
 21.721361 seconds (54.08 M allocations: 3.038 GiB, 6.72% gc time, 38.28% compilation time)
  0.001439 seconds (5.95 k allocations: 534.227 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl clp --no-bridge
Running: clp --no-bridge
 11.073884 seconds (18.82 M allocations: 1.070 GiB, 5.20% gc time, 96.30% compilation time)
  0.001350 seconds (3.01 k allocations: 309.070 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl glpk           
Running: glpk 
 14.628575 seconds (33.40 M allocations: 1.913 GiB, 6.63% gc time, 53.36% compilation time)
  0.000687 seconds (3.17 k allocations: 249.656 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl glpk --no-bridge
Running: glpk --no-bridge
 11.674022 seconds (20.39 M allocations: 1.164 GiB, 4.75% gc time, 99.98% compilation time)
  0.000558 seconds (2.52 k allocations: 233.828 KiB)

@blegat
Copy link
Member

blegat commented May 5, 2021

So having nothing avoids the CleverDicts be compiled for that function/set types ? So having nothing in StructOfConstraints could avoid having VectorOfVariables be compiled ?

@odow
Copy link
Member Author

odow commented May 5, 2021

Editing the StructOfConstraints might get a bit messy. I'd prefer to do it in a separate PR. I think its an and instead of an or, and that we would have this change regardless.

The biggest win comes from dropping the 97 combinations to a few. I don't think going from 6 function types to less is going to make as much of a difference.

@odow
Copy link
Member Author

odow commented May 6, 2021

Now we're getting somewhere:

Running: clp 
 11.041472 seconds (27.14 M allocations: 1.580 GiB, 7.37% gc time, 24.26% compilation time)
  0.001072 seconds (1.74 k allocations: 133.008 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl clp --no-bridge
Running: clp --no-bridge
  3.905373 seconds (6.42 M allocations: 377.827 MiB, 7.50% gc time, 87.95% compilation time)
  0.001050 seconds (651 allocations: 70.391 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl glpk           
Running: glpk 
  8.396816 seconds (22.52 M allocations: 1.298 GiB, 6.74% gc time, 33.32% compilation time)
  0.000406 seconds (1.28 k allocations: 80.117 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl glpk --no-bridge
Running: glpk --no-bridge
  4.199532 seconds (7.21 M allocations: 428.397 MiB, 6.26% gc time, 99.96% compilation time)
  0.000232 seconds (564 allocations: 55.727 KiB)

@odow
Copy link
Member Author

odow commented May 6, 2021

Now there are just lots of little type instabilities:
image

@odow
Copy link
Member Author

odow commented May 6, 2021

Interestingly, this PR doesn't affect how many times we enter inference. It's 41 on this PR and on master. So there's still more that can be improved.

using SnoopCompile
tinf = @snoopi_deep example_diet(Clp.Optimizer, true)
itrigs = inference_triggers(tinf)
mtrigs = accumulate_by_source(Method, itrigs)
julia> length(mtrigs)
41

@odow odow requested a review from blegat May 6, 2021 21:27
@odow
Copy link
Member Author

odow commented May 9, 2021

@blegat objections to merging? It would be good to get into master so we can keep checking against new PRs.

@odow odow changed the title Lazily create storage in VectorOfConstraints Lazily create storage in StructOfConstraints May 12, 2021
@odow
Copy link
Member Author

odow commented May 12, 2021

Remove the Union in VectorOfConstraints. There's no real difference. Here's the latest:

(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl clp
Running: clp 
 10.385136 seconds (26.20 M allocations: 1.527 GiB, 5.64% gc time, 22.66% compilation time)
  0.000878 seconds (1.77 k allocations: 127.586 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl clp --no-bridge
Running: clp --no-bridge
  3.665770 seconds (6.37 M allocations: 375.495 MiB, 6.83% gc time, 89.64% compilation time)
  0.000728 seconds (674 allocations: 64.781 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl glpk         
Running: glpk 
  8.007184 seconds (21.50 M allocations: 1.241 GiB, 6.50% gc time, 31.91% compilation time)
  0.000379 seconds (1.26 k allocations: 72.664 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl glpk --no-bridge
Running: glpk --no-bridge
  4.771045 seconds (8.81 M allocations: 520.515 MiB, 5.64% gc time, 99.96% compilation time)
  0.000439 seconds (629 allocations: 56.891 KiB)

Much better than before.

@odow odow merged commit 339482b into master May 12, 2021
@odow odow deleted the od/voc branch May 12, 2021 02:06
@odow odow mentioned this pull request May 13, 2021
@blegat blegat added this to the v0.10 milestone May 22, 2021
@blegat blegat modified the milestones: v0.10, v0.9.22 May 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

2 participants