
Investigate performance slow down #97

Closed · karussell opened this issue Jun 13, 2017 · 4 comments

@karussell (Member)

There seems to be a speed problem since u-turn avoidance was introduced.

@stefanholder (Contributor)

As discussed here, u-turn avoidance should cause at most a 4x slowdown (at most double the number of candidates means at most 4 times the number of transitions). But since some candidates are tower nodes and query results are deduplicated (#91), the slowdown should be less than 4x. In fact, @kodonnell measured only a 2.5x slowdown after u-turn avoidance was implemented (see first link), and that was even before #91.
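To spell out the arithmetic behind the 4x bound (a minimal sketch; the candidate count is an arbitrary example, not a measured value):

```java
// With n candidates per GPS point, the HMM considers n * n transitions
// between each pair of consecutive points. Doubling the candidates for
// u-turn avoidance gives (2n) * (2n) = 4 * n * n, hence "at most 4x".
int n = 10;                    // candidates per point, arbitrary example
int before = n * n;            // 100 transitions per point pair
int after = (2 * n) * (2 * n); // 400 transitions per point pair, i.e. 4x
```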

@michaz (Member) commented Apr 24, 2018

You gave the answer in #80. When I run Measurement and set the initial collection size to 150_000, which is what happens on large graphs, map matching takes forever, because it allocates and deallocates 4 * 600k a zillion times.
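For illustration, the failure mode looks roughly like this (a minimal sketch, not the actual GraphHopper code; class and method names are made up, with HPPC's IntObjectHashMap standing in for the internal node map):

```java
import com.carrotsearch.hppc.IntObjectHashMap;

class AllocationChurnSketch {
    // Map matching runs one routing query per candidate pair, and each
    // query pre-sizes its node map for 150_000 entries although a short
    // query touches only a few hundred nodes, so a multi-MB backing
    // array is allocated and discarded on every iteration.
    static void matchAll(int candidatePairs) {
        for (int i = 0; i < candidatePairs; i++) {
            IntObjectHashMap<Object> shortestPathTree = new IntObjectHashMap<>(150_000);
            // ... run the query against shortestPathTree ...
            // the map becomes garbage immediately afterwards => GC churn
        }
    }
}
```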

@karussell (Member, Author) commented Apr 24, 2018

Oh, ugly. Thanks a lot @michaz - this got lost among the issues :(

Should we make the initial collection size configurable in GH core, or should we use a better heuristic, like a small initial size for a small beeline distance? Also, Dijkstra and A* should have different initializations, and the initialization sizes for unidirectional vs. bidirectional look out of sync (2K for Dijkstra vs. 150K for bidirectional Dijkstra).

Probably the best option would be to introduce a new variable in AlgorithmOptions and make it configurable, something like "expected minimum nodes".
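A hypothetical sketch of such a hint (the key name "expected_visited_nodes" and the plumbing are made up for illustration; only the idea of a configurable expected size comes from the comment above):

```java
import com.carrotsearch.hppc.IntObjectHashMap;
import java.util.Map;

class ExpectedNodesHint {
    // Fall back to a small default when no hint is given, so small
    // queries no longer pay for a 150_000-entry allocation.
    static IntObjectHashMap<Object> createNodeMap(Map<String, Integer> hints) {
        int expected = hints.getOrDefault("expected_visited_nodes", 1_000);
        return new IntObjectHashMap<>(expected);
    }
}
```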

@michaz (Member) commented Apr 24, 2018

I think in library code one should not add a special assumption, an overridden method, or even a configuration parameter, anywhere, for a 10% speed-up. We would probably overfit it to the particular cases we are looking at, and we would make the code worse by increasing its surface for errors and by diverting attention from what it actually does.

Except maybe for the authors of the JVM: when they are successful, half the world gets 10% faster, so the investment in maintenance, testing, and code ugliness pays off.

Look at the formula for the initial collection size (I had to set a breakpoint to understand where the value actually comes from, because it goes back and forth so often between super and subclasses):

Math.min(Math.max(200, graph.getNodes() / 10), 150_000)

It is almost evil: the typical graph size is "the world" or "a country", except in toy scenarios and test cases. So the result of the formula is almost always 150_000, except in toy scenarios and test cases, which means the problem created by 150_000 is never triggered by your toy scenarios and test cases and stays undetected. And the formula looks clever, so nobody suspects anything, whereas if you had just written 150_000, it would have been obvious that this is a huge overhead and the performance test would have picked it up.
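Evaluating the formula for a few graph sizes makes the saturation concrete (node counts are rough illustrative figures, not exact GraphHopper numbers):

```java
class InitialSizeSketch {
    // the formula quoted above, verbatim
    static int initialSize(int nodes) {
        return Math.min(Math.max(200, nodes / 10), 150_000);
    }

    public static void main(String[] args) {
        System.out.println(initialSize(1_000));       // 200     (unit-test graph)
        System.out.println(initialSize(500_000));     // 50_000  (small extract)
        System.out.println(initialSize(10_000_000));  // 150_000 (a country: clamped)
        System.out.println(initialSize(200_000_000)); // 150_000 ("the world": clamped)
    }
}
```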

We'll probably meet in the middle, but you were asking: I would remove all of it.

Or, if necessary, introduce a "hint" (as you say), but the default should always be the default behavior of the collection classes.

Also, let's remove the subclassing of the collection classes (GHIntObjectHashMap). It makes additional assumptions whose origin is unclear. And the name of the parameter changes three times on its long way (size, capacity, expectedElements), which are literally three different things.
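For what it's worth, the three names really do denote different things; here is a minimal illustration with HPPC, the library underneath GHIntObjectHashMap (the example values are arbitrary):

```java
import com.carrotsearch.hppc.IntObjectHashMap;

class NamingSketch {
    public static void main(String[] args) {
        // expectedElements: how many entries we plan to insert; HPPC
        // derives the internal buffer capacity from it and the load factor
        IntObjectHashMap<String> map = new IntObjectHashMap<>(1_000);
        map.put(42, "answer");
        // size: how many entries the map actually holds right now
        System.out.println(map.size()); // prints 1
    }
}
```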

@michaz michaz closed this as completed in fa5d568 May 27, 2018
@karussell karussell added this to the 0.11 milestone Jun 14, 2018