-
Notifications
You must be signed in to change notification settings - Fork 3
/
around_performance.html
289 lines (223 loc) · 13.3 KB
/
around_performance.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8"/>
<meta name="viewport" content="width=device-width, initial-scale=1.0"/>
<style>
pre { background-color:yellow; padding:0.5em; }
</style>
<title>Version 0.7.54 Completed</title>
</head>
<body style="font-family:sans-serif; max-width:30em">
<p><a href="../index.html">Overpass API</a> > <a href="index.html">Blog</a> ></p>
<h1>Version 0.7.56 Examples</h1>
<p>Published: 2020-11-18</p>
<h3>Table of content</h3>
<a href="#angle">Measuring Angles</a><br/>
<a href="#memberpos">Only the First Node of a Way</a><br/>
<a href="#type_shortcuts">Type Shortcuts</a><br/>
<p>For a software developer, it is always good to learn how the users actually use the software.
From that point of view, the analysis by StreetComplete has been very instructive.</p>
<p>The entry in the WeeklyOSM even sounds like there were an opportunity to get the OVerpass API overall faster with little effort.</p>
<p>It has shown that there are some misunderstandings still around,
even with very experienced users.
Thus I would like to address them in the following.</p>
<h2 id="remote">Using a Remote Database</h2>
<p>The whole point of server side preprocessing is that you want to get less data, not more.</p>
<p>The given scenario is to filter all OSM data within a hundred meter or so for literally a hundred different criteria.
Effectively, many features are potentially downloaded over and over again,
thus in the end it is as much data transmitted as would have been anyway for a complete download,
just with much more computation server side.</p>
<p>In fact, this approach is inferior already much earlier:
- There is substantial connection overhead: round trip times, handshaking, headers in both HTTP and the JSON response, connection management by the server and so on.
- In this case much worse, the server is reading the same data again and again from the disk. The total processing time makes it unlikely that the whole needed data resides in the cache.
- Data might get inconsistent to each other, because the requests might be routed towards different servers or an update of the data has been applied.
- The broken-down requests deprive the server of optimization opportunities, because the server cannot know which requests will follow.
<p>So what to do else?</p>
<p>If your filter criteria line up more or less well with the Overpass API features
then build one large request or at least few large requests.
With a statement pair like `make marker_42; out;` you can have as many markers to group the response as you want.</p>
<p>If you anyway have your own filters on data within a few hundred meters
then it makes sense to download the full data and filter client-side.
Whether you are better off with the Overpass API or the main database API
depends largely on whether the ultimate purpose is editing -
the main API is reserved for editing purposes only, to avoid or mitigate overload problems.</p>
## Flat Worlds, Ball Shaped Worlds
<p>There is [https://github.com/drolbr/Overpass-API/pulls](amongst the open pull requests) no obvious candidate for a huge performance improvement.
After reading all of the comments from [https://github.com/westnordost/StreetComplete/issues/1514](the issue in StreetComplete) that is about the performance analysis,
it likely refers to [https://github.com/drolbr/Overpass-API/pull/167](this) quite old pull request.
This is not a general improvement, but a heuristic for pre-filtering for only the `around` filter.</p>
<p>The problem with that optimization is that removes valid results, i.e. the computation is invalid.</p>
<p>Let's have a look at 60°N.
It is convenient because cosine is 0.5 for 60°, i.e. the latitude at 60° has a total length that is only half of the equator.
In the bounding box model, one should expect that the eastmost point from 60°N 0°E that is no further away than 11.111111 km is 60°N 0.2°E. In fact, rounded for OSM precision, it is 60.00015°N 0.2000003°E.
In other words: a bounding box needs to extend up to 0.2000003°E instead of just 0.2°E.
While this is already significant with OSM precision,
the effect gets quickly stronger both with approaching the poles and with growing distances.</p>
<p>There are probably solutions for this:
The best would be to
like a blunt plus one percent for up to 60 degrees latitude from equator and turning off the feature beyond 60 degrees.
A more sophisticated approach would rely on the internal block structure of the quadtiles.
The whole thing must take segments into account as well.</p>
<p>So there is simply too much work left for an, all in all, rather corner case.</p>
## Optimizations
<p>Please not that many suggested optimizations are indeed rather tradeoffs.
I.e. if I favour a certain special case then many other cases will be executed slower.
I avoid those changes whenver possible.</p>
<p>Are there other general optimizations in the pipeline?
Any answer is pretty close to vaporware, thus I'm reluctant to tell plans.</p>
<p>Having ids on each and every node makes for a giant data bloat.
Depending on the exact data model after the transition,
the data would shrink by 50% to 80% without any data loss.
Joto and I [...](have presented) this observation on the SotM in Milano.</p>
<p>This is expected to bring a speedup of factor two to five for all operations that involve geomtry for ways.
The effort comes from the bookeeping in all edge cases of changesets, no matter how obscure.
I'm working on this in the branch [...](geometric ways),
together with hardening against updating trouble from occasional bugs in the replication of the main database.</p>
<p>Our data model around relations has a separate flaw that makes working with the hsitory basically impossible.
I have not made any final decision on the data model afterwards.
The aim is to fulfill requests on attic data for relations in a speed similar to the same request on current data.</p>
<p>Both together are already a program for up to mid-2021 at best.</p>
<p>I'm back with providing updates. After more than a year I'm proud to present a new release.
This release provides some smaller new features.</p>
<p>This is not exactly in line with the strategy of: <em>release early, release often</em>
The rationale is that all the substantial planned changes will change the underlying database model.
This ia a lot of work and bandwith load for all people running their our instances.
Thus I hope I can bundle those changes in one large later version.</p>
<p>A first step are the versions 0.7.56.100x
(as opposed to the 0.7.56.x line presented here).
I do not recommend them yet for private installations because there are no clone files for it.
They feature more flexible block sizes.
That way they can accomodate the area description for Antartica which had been too large due to a projection artifact.
At the same time they are faster because on average less data needs to be decompressed per request.
The code currently runs on the <a href="https://z.overpass-api.de">z.overpass-api.de</a> instance.</p>
<p>Back to this version, there are three new features:
You can measure and work with the angles of ways now.
You can recurse on only selected nodes of a way.
And some extra shortcuts make it easier to get some but not all types of objects.</p>
<h2 id="angle">Measuring Angles</h2>
<p>It is now possible to measure angles in ways.
In the example we query for all ways that <a href="https://overpass-turbo.eu/s/SGu">have a very acute angle</a>:</p>
<pre>
way({{bbox}})
(if:lrs_in(1,per_vertex(angle() < -170 || angle() > 170)));
out geom;
</pre>
<p>Why does this look so involved?
First of all, we search for all ways in a given bouding box, contained in the Overpass Turbo link.
The angle related part happens in the filter <em>(if:...)</em>.
This filter is evaluated once for every object, i.e. for each way here,
but angles are different per vertex of the way.</p>
<p>For this reason, the angle related condition is stated within <em>per_vertex(...)</em>.
That evaluator evalutes its expression once for every vertex of the way
and returns the results as a semicolon separated list.
For illustrative purposes, <a href="https://overpass-turbo.eu/s/SGw">an example</a> for an (almost) retangular building and for an almost straight street:</p>
<pre>
way(id:266149365,194926851);
convert info desc=is_tag("highway")?"highway":"building",
angles=per_vertex(angle());
out geom;
</pre>
<p>You can see the built list of angles in tag <em>angle</em> of each of the two created objects.
It is almost zero for all angles for the street,
and it mixes almost zero with almost 90 degrees for the building.</p>
<p>A remark to values: I decided to use turning angle as the mental point of view.
I.e. a straight road is understood as having an angle close to zero at the respective vertex,
a sharply turning road has a higher angle, and a U-turn is having an angle close to 180 degrees.
Consequently, acute angles in buildings are also close to 180 or -180 degrees respectively.</p>
<p>The approach is <a href="https://overpass-turbo.eu/s/SGx">to simplify these numbers</a> to boolean zero or one already in the expression of <em>per_vertex</em>.
We put a boolean expression like <em>angle() < -2</em> in the argument of <em>per_vertex</em>
(refined to catch both negative and positive large angles).
This way we get zero for almost zero angles and one for all other angles:</p>
<pre>
way(id:266149365,194926851);
convert info desc=is_tag("highway")?"highway":"building",
angles=per_vertex(angle() < -2 || angle() > 2);
out geom;
</pre>
<p>We can now use the <a href="https://dev.overpass-api.de/overpass-doc/en/criteria/lrs.html#single">here documented</a> evaluator <em>lrs_in(1, ...)</em>
to check for each way whether there is at least one vertex in a way that fulfills the spike condition.
This way, we keep only those ways that have a very acute angle.</p>
<p>For your convenience, a complete example to <a href="https://overpass-turbo.eu/s/RtB">query for non-rectangular buildings</a>:</p>
<pre>
way({{bbox}})[building]
(if:lrs_in(1,per_vertex(
angle() < -92 || (angle() > -88 && angle() < -2)
|| (angle() > 2 && angle() < 88) || angle() > 92)));
out geom;
</pre>
<p>In this case, it could be interesting to figure out
where the non-rectangular vertices are.
We do this <a href="https://overpass-turbo.eu/s/RtC">for a single way</a>, because where houses are built wall-to-wall,
it is probable that one looks on the wrong building otherwise:</p>
<pre>
way(60538085);
out geom;
</pre>
<p>Now we select <a href="https://overpass-turbo.eu/s/RtE">all nodes for which</a> the way has a non-rectangular and non-zero angle in that node:</p>
<pre>
way(60538085);
node(w)(if:lrs_in(id(),set(per_vertex(
(angle()<-92 || (angle() >-88 && angle() < -2)
|| (angle() > 2 && angle() < 88) || angle() > 92)
? ref() : 0))));
out geom;
</pre>
<h2 id="memberpos">Only the First Node of a Way</h2>
<p>There are more evaluators that can be applied per vertex of a way:
With the evalutators for <em>pos</em> and <em>ref</em> of an element,
one can get out of a recursion only elements on specific positions.
We can ask for only the first and the last member of the way just mentioned before:</p>
<pre>
way(60538085);
node(w)(if:lrs_in(id(),set(
per_member(lrs_in(pos(),"1;"+(count_members()-1))?ref():0))));
out geom;
</pre>
<p>We use the filter to accept only the ids of those objects
that are a member of the way on the desired positions.
They are taken from a list constructed for this purpose with <em>per_member</em>.</p>
<p>Why do we not use <em>per_vertex</em> again?
The evaluator <em>per_vertex</em> is designed for angles:
It skips the first position always and the last position for open ways,
because there is no angle defined there.
By contrast, <em>per_member</em> is always executed once for every member.</p>
<p>Inside <em>per_member</em>, a list evaulator is used
to select the pos from the expected positions <em>1</em> or <em>count_members()-1</em>.
If it matches, then the id is returned,
and that adds the id to the list of accepted ids.
Otherwise zero is returned, and this is never a valid id.</p>
<p>For your convenience, there is a shortcut integrated in the recurse filter,
as shown in <a href="https://overpass-turbo.eu/s/RtF">this example</a>.
Negative values are counted from the tail of the way,
i.e. <em>-2</em> is the last but one node of the way:</p>
<pre>
way(60538085);
node(w:1,-2);
out geom;
</pre>
<p>The shortcut is not available for relations yet,
because I have not figured out so far a compelling syntax for that.</p>
<h2 id="type_shortcuts">Type Shortcuts</h2>
<p>The introduction of the <em>nwr</em> shortcut has improved the readability of queries.
Thus the language goes further down that road:
there are now shortcuts <em>nw</em>, <em>nr</em>, and <em>wr</em> available.
Please feel free to experiment in examples like <a href="https://overpass-turbo.eu/s/RtG">this</a>:
Exchange one shortcut for another and watch how the result changes!</p>
<pre>
nwr({{bbox}})[amenity=place_of_worship];
out center;
</pre>
<p>To make the language more consistent,
the same set of shortcuts <a href="https://overpass-turbo.eu/s/RtH">can be used</a> as argument for the <em>count</em> evaluator:</p>
<pre>
[out:csv(cnt,amenity)];
nwr({{bbox}})[amenity];
for (t["amenity"])
{
make stat cnt=count(nwr),amenity=_.val;
out;
}
</pre>
</body>
</html>