route cache for performance? #81
Comments
Does LocationIndexMatch.findNClosest return the perpendicular of the passed GPS position on all real edges within gpxAccuracyInMetern? Moreover, if the perpendicular is at the start/end of a real edge does it then return a real node? (Anyway, it would be nice to document this method.)
All real node candidates should be deduped on node ID to avoid redundant computation. Not sure if we also need to dedupe virtual nodes (but we should if there are duplicates before adding direction). So we could just dedupe all candidates using QueryResult.getClosestNode() before adding direction. Otherwise, I think it is very unlikely that the same transition occurs multiple times during map matching, so a route cache should not be needed. |
No idea. I'm using my own implementation, as findNClosest doesn't work for me.
I thought that too, but see my comment above - we do need both edges as candidates (even if they share the same node). That's why we'd need the route cache, so that when these duplicate (by node ID) edges are routed, we only route once, to avoid redundant computations.
It depends on the size and nature of the route. E.g. a taxi based at a single stand (such as the airport) will travel the same roads around the stand many times, and depending on the nature of the roads and the frequency of its GPS measurements, the same transitions may occur often. The same goes for e.g. buses. My use case certainly duplicates them, but that's unlikely to be relevant to most. Either way, the route cache should help performance in the general case, and help greatly if there are many duplicate transitions. So, only positives, no harm? OK, maybe more memory. Though eventually I'd like to not cache the actual paths (likewise in the timesteps), just the distance/time, etc. |
I don't think so, because we only use the closest node for the candidate (as you wrote before). This is because the routing between subsequent candidates only needs a start and an end node. The closest edge of the QueryResult is never used.
But only if we get the exact same virtual nodes by another |
I think it is - when we do the lookup for the virtual node. E.g.
Currently, we get two candidates, one with a virtual node somewhere on v1 and one with a virtual node on v2. If we dedupe on closest node (N) then we'll only get one virtual node (on v1 or v2 depending which is closer to the GPX point), and hence we'll eliminate one (valid) candidate node. It's worth noting that these virtual nodes are directly used in the computation of the transition probabilities. However, that also means that our route cache wouldn't help if e.g. we were using the nodes as the key ...
Fair point. Again, I'm thinking of my use case (cell data), where the GPS locations are quantized and hence are likely to be repeated. In the generic case, I think you're right. |
Can you please show me where this happens? When calling algo.calcPath, only the virtual nodes are used as input to compute the path. Different calls of this method with different candidates but the same virtual nodes should still give the same results. Or are you referring to PR #83? There we potentially need to dedupe the result of findGPXEntriesInGraph on virtual nodes before adding direction. |
I think we agree -
with N/M virtual nodes (GPX points) and v1/v2/u1/u2 virtual edges (well pairs of them). If we dedupe on closest node, we might end up with just these edges:
when there are actually three other possible arrangements. Whether or not it's OK to ignore these three, I'm not sure ... but it feels like we shouldn't (especially if we're going to be doing directional stuff, e.g. #83). I also think I tried it (it's a one-liner replacement) and the tests failed. I'm not 100% sure, but I'll check tomorrow if you haven't confirmed. |
My understanding is that a GPS point is not replaced by a virtual node but that the closest (virtual) point of a GPS point is the perpendicular of the GPS point on the real edge. So different GPS points might have the same closest virtual node and this virtual node is then used in both cases for routing. @karussell, can you please clarify? |
I'm not sure whether the QueryGraph already does the optimization of creating only one virtual node when two GPS points snap to the same location; I would have to try this.
A GPS point is not something we feed to the algorithm, so we cannot replace nodes with it. Instead, a GPS point is looked up with the QueryGraph, which most likely creates a virtual node (and 4 virtual edges) and sets the 'closest node' property of QueryResult.
So X is the measurement via lat,lon ('GPS point') and creates a snapped point on the edge A-B (lat,lon); this snapped point has an associated virtual node V and is inserted with 4 new virtual edges. A and B are real nodes. The GPS point most likely creates no virtual nodes and edges if the measurement is not 'perpendicular' relative to the edge, as with 'endstanding' edges, e.g.
|
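The snapping described above can be illustrated with a small planar sketch. This is not GraphHopper's lookup code: `SnapSketch`, `projectionFactor` and `needsVirtualNode` are hypothetical names, and the sketch assumes flat x/y coordinates rather than lat/lon. It projects a measurement X onto the edge A-B and reports whether the snapped point falls strictly inside the edge (where a virtual node would be needed) or at an end point (where the real node A or B can be used).

```java
// Hypothetical planar sketch; not GraphHopper API and not geodesic.
class SnapSketch {
    /**
     * Returns the position of the snapped point along A-B as a factor in [0, 1]:
     * 0 means the point snaps to A, 1 means it snaps to B.
     */
    static double projectionFactor(double ax, double ay, double bx, double by,
                                   double px, double py) {
        double dx = bx - ax;
        double dy = by - ay;
        double lengthSquared = dx * dx + dy * dy;
        if (lengthSquared == 0.0) {
            return 0.0; // degenerate edge: A and B coincide
        }
        // Scalar projection of (P - A) onto (B - A), normalised by the edge length.
        double t = ((px - ax) * dx + (py - ay) * dy) / lengthSquared;
        // Clamp: if the perpendicular foot lies outside the edge, snap to an end point.
        return Math.max(0.0, Math.min(1.0, t));
    }

    /** A virtual node is only needed when the snapped point is strictly inside the edge. */
    static boolean needsVirtualNode(double factor) {
        return factor > 0.0 && factor < 1.0;
    }
}
```

With A = (0, 0) and B = (10, 0), a measurement at (4, 3) snaps to the interior of the edge, while measurements at (-2, 1) or (12, 5) have no perpendicular on the edge and snap to A or B respectively.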
Thanks @karussell! Does |
Yes, this is true in almost all cases. But it can also return results outside of gpxAccuracyInMetern, e.g. if the set would otherwise be empty. So it always returns the closest matches, but in most cases there will be matches within the limit. |
Correct. Sorry, I first misread your sentence.
If we do the following, I don't think we're missing candidate edges:
|
Maybe we should wait until we see what form #83 takes? |
Yes, this makes sense. |
@karussell: I have a PR ready for deduping all query results by node id after calling |
Nice - looking forward to this!
Will this reuse existing virtual nodes, or in which sense did you mean deduping?
Please do not wait for my action. Currently too many things on the table. |
This will only dedupe multiple occurrences of the same tower node. |
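A minimal sketch of deduping only tower-node candidates, under stated assumptions. `Candidate`, its `tower` flag and `dedupeTowerNodes` are hypothetical stand-ins for illustration; the thread only confirms that QueryResult exposes a closest node. Candidates snapped to a virtual node are all kept, since (as argued earlier in the thread) two of them may belong to different edges and both be valid.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; Candidate is a stand-in, not a GraphHopper class.
class CandidateDedupSketch {
    static final class Candidate {
        final int closestNode; // node id the measurement was snapped to
        final boolean tower;   // true if that node is a real (tower) node

        Candidate(int closestNode, boolean tower) {
            this.closestNode = closestNode;
            this.tower = tower;
        }
    }

    /** Keeps the first candidate per tower node and every virtual-node candidate, in order. */
    static List<Candidate> dedupeTowerNodes(List<Candidate> candidates) {
        Set<Integer> seenTowerNodes = new HashSet<>();
        List<Candidate> result = new ArrayList<>();
        for (Candidate c : candidates) {
            // Set.add returns false if the tower node was already seen.
            if (!c.tower || seenTowerNodes.add(c.closestNode)) {
                result.add(c);
            }
        }
        return result;
    }
}
```

Keeping the first occurrence preserves the order of the lookup results; if candidates arrive sorted by distance, the closest one per tower node survives.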
I'm wondering if we should cache routing requests in computing the transition probabilities.
The first reason is obvious - if it's e.g. a long GPX track covering the same location multiple times, then some of the candidate routes may be repeated, etc.
The second reason is maybe a little more involved ...
It should be pretty easy to implement, though we'd probably want #73 implemented first so we can test performance etc.
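A rough sketch of what such a cache could look like, assuming routes are keyed by an ordered (from, to) pair of node ids. All names here (`RouteCacheSketch`, `RouteSummary`, the `router` callback) are hypothetical and not part of GraphHopper. The key choice is itself an assumption: it only produces hits if the same node ids recur between timesteps, which may not hold for virtual nodes. The sketch stores only distance and time rather than the full path.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical sketch; none of these names are GraphHopper API.
class RouteCacheSketch {
    /** Only the figures needed for transition probabilities, not the path itself. */
    static final class RouteSummary {
        final double distanceMeters;
        final long timeMillis;

        RouteSummary(double distanceMeters, long timeMillis) {
            this.distanceMeters = distanceMeters;
            this.timeMillis = timeMillis;
        }
    }

    private final Map<Long, RouteSummary> cache = new HashMap<>();
    private final BiFunction<Integer, Integer, RouteSummary> router;
    private int routerCalls = 0;

    /** router stands in for whatever actually computes a route between two node ids. */
    RouteCacheSketch(BiFunction<Integer, Integer, RouteSummary> router) {
        this.router = router;
    }

    RouteSummary route(int from, int to) {
        // Pack the ordered pair into one long so (a, b) and (b, a) get distinct keys.
        long key = ((long) from << 32) | (to & 0xffffffffL);
        RouteSummary cached = cache.get(key);
        if (cached == null) {
            routerCalls++;
            cached = router.apply(from, to);
            cache.put(key, cached);
        }
        return cached;
    }

    /** Number of real routing requests made, useful for measuring the hit rate. */
    int routerCalls() {
        return routerCalls;
    }
}
```

Routing the same (from, to) pair twice calls the underlying router once; the reverse direction is treated as a separate request, since routes need not be symmetric.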