-
Notifications
You must be signed in to change notification settings - Fork 1
ResourceConstraints
While trying to work out #474 I noticed some potentially serious problems with the way transfers and walking are limited. Below are some counterexamples to the correctness of our method, where null traverse results are used to prune paths that exceed the limits.
Example:
Assume that any reasonable path from origin to destination uses the train and bus B. Walking from origin to the nearest train stop takes a long time, so the routing engine takes bus A and transfers to the train. This is both faster and easier than walking to the train station.
The search continues down the train line, through the streets, and eventually reaches a stop on bus line B. You have set max transfers to 1. Therefore, bus B is not boarded by this path.
All vertices on the train's trip pattern (as well as bus B's stop vertices and the streets leading to them) have already been visited. An alternate path exists where you walk from origin to the train, and can still make it to destination with only one transfer, but this alternate path is dominated everywhere by states on the original, pruned path that will never reach the destination.
Generally: When the fastest and easiest way to get from A to B involves more than two transfers, limiting the search to two transfers can make the destination unreachable, even though a solution with two transfers exists. If the destination is not unreachable, some non-optimal solution that just happens to have two or less transfers will be reported.
Example:
There are two transit lines that will take you from the origin to the destination. The slow local bus leaves from stop A (the origin point) in 20 minutes, and arrives at stop B 30 minutes later. The fast express bus will take you from stop C to stop D in only 15 minutes, and is leaving 15 minutes from now, but it's a 10-minute walk away from the local line.
Assume a walk speed of 1 m/sec, walkReluctance and waitReluctance of 2, and a maxWalk of 1500m.
Let the search progress up to stop B. The express bus path with more walking takes 15+15+10=40 minutes, and its weight is 152+15+102=65. The local bus path takes 20+30=50 minutes, and its weight is 20*2+30=70. The express bus branch is dominant in the vicinty of stop B and continues to expand. 300 meters into the final walk leg to the destination, it runs into the 1.5km walk limit and is abandoned, but by this time it has left dominant states at all the on-street nodes in the area, which prevent the local bus branch from reaching the destination.
Discovery of the optimal path respecting the walk limit is prevented by the existence of a lower-cost, faster path that exceeds the weight limit. This might be acceptable if the weight limit would just be raised and the optimal path found. However, it is likely that there are many other completely non-optimal paths that happen to respect the walk limit and happen to arrive at the destination from another direction. These are returned without ever raising the walk limit, and the optimal path is never found.
It turns out that limiting transfers and walking means that we are trying to solve the "resource-constrained shortest path problem", which is NP-hard. To do so correctly you must track the consumption of each constrained resource as a separate criterion. Intuitively, this is because the optimal path may use more of the resource budget toward its beginning or end, and you don't know where until you find the path. For details see:
Ziegelmann, Mark. "Constrained Shortest Paths and Related Problems".
Fortunately, we are not trying to find all pareto-optimal paths, just the N paths with the lowest generalized cost.
An article by Machuca et al. (2009) looks at "extensions of the A* algorithm to the multiobjective case", and evaluates the performance of Tung & Chew's (1992) scalar heuristic, which is similar to OTP's bidirectional remaining weight heuristic.
As demonstrated in their Fig. 4, a multi-objective search without goal-direction will take quite a while to reach the target, and tend to find the entire Pareto-optimal solution set toward the end of the search. Multi-objective search with goal direction takes much longer to find the entire set of optimal solutions, but begins finding results much earlier, and will find the first few results in a small fraction of the time needed to find the entire set. Because these solutions are discovered in an order determined by the heuristic, and the heuristic can be designed to express passenger preferences, we are only interested in the first few solutions found.
This approach requires a high quality goal-direction heuristic to minimize branching. Heuristic values can be calculated by running a secondary single-criterion search backward from the destination, as suggested by Tung and Chew. Machuca et al. state that "the time needed to calculate the heuristic values is not significant compared to total execution time (less than 5% on average, and less than 1% in the largest problems). This is due to the fact that the single objective search runs, used to precompute the heuristic values, have much smaller time requirements than bicriterion search." In our experience with OTP, this single-objective backward search takes the majority of the time, and the first few (lowest generalized cost) paths are found quickly thereafter.
Reference:
Machuca, Mandow, and Pérez de la Cruz. "An Evaluation of Heuristic Functions for Bicriterion Shortest Path Problems", 2009.