-
Notifications
You must be signed in to change notification settings - Fork 272
added find radial method to address #65 #67
Conversation
Ah, I forgot to mention that I had to make a couple of GH methods public/protected instead of private (e.g. |
I'm not sure about the exact purpose of this PR. Can you summarize what it does? And why always adding to queryResults - wouldn't it increase endlessly, shouldn't it be limited to a certain number? |
The method
It also seems to give more reasonable results (based on the test suite examples).
Isn't that also the behaviour of With map matching, if one has a measurementSigmaError of 100m, then one wants to find all edges within 100m of the GPX entry. That's what this does (I hope!) - it will add all edges within 100m. So, if the user e.g. sets Anyway, it would be easy to limit it to the top |
Yes, this should be fixed
When I think about it: it should return all edges, if the starting points are in the reach (first and second iteration) which we can expect for the most map matching use cases. So what it mainly does: it does a location lookup and then adds edges via BreadthFirstSearch until a certain distance, which should be more efficient compared to do just location lookups. As lookups can contain duplicates and a graph traversal not. The only problem at the moment are case where there are no entries found within the first&second iteration, but we could just increase this iteration count and I'm unsure what triggered your other changes. E.g. why is XFirstSearchCheck still necessary if you already have all the points? And something which we have to fix elsewhere: implement a basic lookup and map matching performance measurements to see how changes performs. Identical to what we've done for normal routing already.
Hmmh, not sure. Where does the new method overwrite the entry if it is closer? See the old "// overwrite old edge with current" |
I disagree - I think it should return all edges, if the starting points are within the specified GPS error. This is a superset of your statement, and covers additional cases (i.e. outside first/second iteration). In most map matchin use cases (with small GPS error), it will still only ever test the first/second iteration, so will be equivalent.
I don't know = ) The current method simply increases the size of the rectangle, doing lookups as it goes. However, it only carries out two iterations, at most, so doesn't handle larger GPS error - which was the problem for me. Initially, I could have just increased the iteration count, as you say. All I've done is make this dynamic (i.e. it keeps iterating until it's reached the bounds), and optimised it to be radial - e.g. if you've got bounds of 10km, you don't need to do location lookups on the corner tiles of the 20kmx20km bounding box. Other than that, I just follow the existing example (hence why A graph traversal sounds much better - find the closest node to the query, then iterate out until you're out of bounds. Any example code I can look at for graph traversal from a point?
Ah, right. I just ignored duplicate edges, as I thought an edge was and edge. But I think I see now that while the edge may be the same, the query distance may be closer (depending which tile it came from, etc.). I can (easily) implement this, but I'd prefer to do the graph traversal above. |
it is already there: XFirstSearchCheck But this is also not as easy as it sounds ;) |
Yes, extending the loop should be easy. Probably we need to split this PR and take careful steps for a better location lookup in general. I think this will be important and could lead to better readable code, faster lookup and more flexible approach. E.g. we should limit by radius and by a certain number of 'snaps' and recognize #57. Also the previous lookup always included minimal one snap, even if it was out of the specified radius, although I'm not sure why this was required. All these changes I would schedule after the 0.8 release as we need to get people started with the new map matching algorithm and also #15 and #66 are very important. And before we can change more here, we need more tests (as indicated via 'I'm not sure'). Otherwise I would fear that we can break something. And as 'fear' is never good sign for quality (!), we need to increase the test coverage for the lookup. This is of course not your job, but if you would have time ... :) |
Ok, I'm happy to leave it until after 0.8 release . @stefanholder's refactoring in #66 might actually help - I was also playing with on-the-fly lookups in #60.
OK, I'll do what I can - I'm on holiday this week, but playing round with some unit tests might be relaxing enough (I hope!). Should I create a separate issue to discuss what tests you'd like (in addition to #68)? Shall I close this PR, for now? |
Is now merged
I'm not entirely sure ;)
You can reuse #68 :)
Probably only if you create a new issue summarizing like 'Take careful steps for a better location lookup in general', but we can also keep it open |
@karussell: Regarding your previous comment
What do you mean with certain distance, is this returnAllResultsWithin? I'm not sure if a BreadthFirstSearch really happens until the search distance. Because of this line it seems to me that the graph traversal only checks immediate neighbors of all nodes returned by location lookup. |
Yes
This is indeed confusing. If we speak about the location index of the core then we create cells of a certain area (QuadTree) and feed every cell with the nodes that are within the boundaries. (Keep in mind that one such cell is not rectangular on earth). Now we could avoid feeding nodes that are reachable by other nodes within the cell, that was the reason I used BFS to save storage space but it turned out that the location index is small enough and we can avoid the graph traversal, of course we should make this more explicit in the code. Another complex topic of the location index is when no junction node falls inside a cell but still the edge crosses it and it is explained here graphhopper/graphhopper#232 and here graphhopper/graphhopper#221 |
Oh dear, I just deleted my comment! In short, I've had to reimplement for my specific use case (donuts with e.g. radii of 30km and hence many lookups - and lots of data so they need to be really fast). I noticed both of the above, and hence the approach I take is
This removes the need for XFirstSearchCheck, etc., and seems more straightforward, as well as maybe catching a few more edge cases (hah, good pun) and a bit faster. However, both (I think) still fail to catch some edges. E.g. if we have a 50m search radius and and edge which passes through it, but whose nodes are 10km away (which is possible - and the 10km depends on index configuration), then it may not be found. I tried some approaches to fix this (e.g. increasing initial location index search region, or starting a BFS search at each candidate node, and continuing until we reach a node which is e.g. more than 1km out of bounds), however I think the 'best' may be to create an edge-based index (as opposed to a node-based one). This will not only be simpler (and, in theory, much faster), but will probably benefit other GH development. @karussell - has this been tried before? If so, and you think it a good idea, it's probably for a new thread ... @karussell - shall we close this PR and start a new issue (or discussion) around possibilities for a better location lookup? |
I currently do not see how. We avoid duplicate checks already (?) and so we just check the edges and nodes that we need. The only thing I see is that with the edge based stuff we could get rid of "graph exploration" (set node & iterate) and fetch the EdgeIteratorState objects directly, we'd have to try if this results in something faster.
But then the bresenham should make its work. Do you have a test case reproducing the issue you have (in mind)? |
Looks like I have misunderstood - will a node be added to the a tile in the index if one of its edges is in that tile (but the node isn't)? Is that how bresenham is used? If that is the case, then the current node index is (after graph exploration) effectively an edge index.
At one stage I thought we didn't - I think I checked and found we do (since the BFS only searches edges or current node and goes no further).
I was thinking of some sort of quadtree. To build, we iterate through every edge, and add the edge to each tile it passes through (splitting the tile when it has too many edges). (Maybe this is what's already happening anyway, just saving nodes not edges?) Then the lookup of a given tile can simply return the edge IDs instead of node IDs, and, as you say, we avoid the graph exploration stuff - and the code may be simpler. |
Yes. Sometimes bresenham does not work 100% precise (see one of the linked issues) and so it could be that the edge is in the cell although it is only crossing a neighbour cell so there is optimization potential but nothing bad. |
Closing in favor of graphhopper 994. |
See discussion at #65.
Notes:
minDistance
feature (though I do, for cell network stuff where we only know a radius as opposed to a point). It's easy to keep, so I figure there's no harm.@karussell, am I correct with the last two points?