Skip to content
This repository has been archived by the owner on Mar 1, 2021. It is now read-only.

correctly utilise dijkstra one to many #82

Closed
kodonnell opened this issue Nov 24, 2016 · 12 comments
Closed

correctly utilise dijkstra one to many #82

kodonnell opened this issue Nov 24, 2016 · 12 comments

Comments

@kodonnell
Copy link
Contributor

kodonnell commented Nov 24, 2016

It's much faster ... should we warn the user if they're no using the above? I imagine it's easy to not think about it (as I did), and get stuck with much worse performance ...

I was going to suggest updating the doco to reflect this ... but when I checked, it looks like the doco is still for the old API?

If you agree, I'll make the changes ...

Edit: I just realised that we do need to change the computeTransitionProbabilities to actually benefit from dijkstra one-to-many as currently it creates a new algo for every route, and hence doesn't utilise the cache etc. Shouldn't be hard to tweak though. (I didn't notice as I'm working on a bespoke implementation at the moment.)

@kodonnell kodonnell changed the title warn if not using CH + dijkstra one to many + doco update support for dijkstra one to many Nov 24, 2016
@kodonnell kodonnell changed the title support for dijkstra one to many correctly utilise dijkstra one to many Nov 24, 2016
@karussell
Copy link
Member

It's much faster ... should we warn the user if they're no using the above?

What do you mean here? Do you mean the class DijkstraOneToMany? This class is not suited to be used for many scenarios (e.g. does not work with virtual nodes/edges out of the box) or do you mean to use the Dijkstra class with the use cases one-to-many? This is indeed not documented but easy doable

but when I checked, it looks like the doco is still for the old API?

This could be, yes.

@kodonnell
Copy link
Contributor Author

What do you mean here?

I really just use algorithm(Parameters.Algorithms.DIJKSTRA_ONE_TO_MANY) in the algorithm options, and tweaked computeTransitionProbabilities to reuse the algo for all to candidates.

This class is not suited to be used for many scenarios

Hmmm, I see. Everything just worked in the sense that the code ran through without errors - though I haven't compared the actual route with that created from other algorithms (though initial inspection looks OK).

Do you agree that using the DijkstraOneToMany would be valuable? If so, how feasible? Also, I just realised something - am I right that you'd be cautious of helping as this is effectively creating a distance matrix, which (I believe) is still proprietary?

I'll update the doco - what should we pick as the default algo for now?

@karussell
Copy link
Member

I really just use algorithm(Parameters.Algorithms.DIJKSTRA_ONE_TO_MANY) in the algorithm options, and tweaked computeTransitionProbabilities to reuse the algo for all to candidates.

The specific class won't work easily in this scenario. What we could do is to use the Dijkstra class with a set of to points, similar to this code

@karussell
Copy link
Member

I'll update the doco - what should we pick as the default algo for now?

Maybe let us fix this here and keep the documentation using one to many?

@kodonnell
Copy link
Contributor Author

The specific class won't work ...

What do you mean by 'won't work'? It 'worked' for me, and (conceptually) seems to fit.

Maybe let us fix this here and keep the documentation using one to many?

Fixing this first is probably best, yes.

@karussell
Copy link
Member

What do you mean by 'won't work'? It 'worked' for me, and (conceptually) seems to fit.

The DijkstraOneToMany has problems in this scenario which a one-to-many implemented Dijkstra would not have:

  • does not support virtual nodes and edges (yet?)
  • would require to allocate a new algorithm object for every request, which means you allocate arrays with size in the millions for a world wide graph. The RAM usage could explode for many parallel requests.
  • no turn cost support

I.e. DijkstraOneToMany is currently designed to work very fast but for CH preparation only.

@kodonnell
Copy link
Contributor Author

Hmmm, I tried to implement the custom dijkstra you linked above, but it's about twice as slow as DijkstraOneToMany, even when I don't extract any paths.

On the other hand, DijkstraOneToMany still seems to work = ) I don't need turn costs (and nor do most average users, I imagine), so that's fine. It hasn't complained about virtual nodes/edges yet (and seems to be working the same as other algorithms). As for RAM ... I just create one algo, and algo.clear() whenever I want to reset the from node. Either way, I imagine many users (i.e. me) would be happy to use lots of RAM if it's a lot faster.

@karussell
Copy link
Member

Can you count the visited nodes for both? This should be identical.

just create one algo, and algo.clear() whenever I want to reset the from node.

Are you using this from a web service where concurrent requests could come in?

@kodonnell
Copy link
Contributor Author

Can you count the visited nodes for both

Not easily with my current implementation. Want me to submit a rough PR (with a unit test) so we can compare and discuss there?

Are you using this from a web service where concurrent requests could come in?

In production no - it'll be a batch job (running in Spark, which I eventually figured out how to do). If I see where you're going ... DijkstraOneToMany would give faster requests, and possibly even a higher throughput (depending on the situation).

@kodonnell
Copy link
Contributor Author

DijskstraOneToMany seems to pass the tests for non-CH graphs (except for one about num visited nodes which I don't think is a problem) ... see issue82.

@karussell - thoughts? Once #73 is incorporated we can discuss performance differences ...

@karussell
Copy link
Member

If you have a smaller area this might improve performance but otherwise you create many large arrays everytime you create this class (see inside DijskstraOneToMany). And again: virtual nodes could make problems e.g. if 10 virtual nodes are necessary and then you do a clear and then 20 virtual nodes are necessary, then the arrays are too small. (One could create big enough arrays here, not sure)

@kodonnell
Copy link
Contributor Author

I agree with all you said. I think there are possible cases where it could be useful (especially if it's faster - as it is in my case), but they're probably a minority. So for now I'll close this issue.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants