-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathdark_pois.html
349 lines (294 loc) · 12.8 KB
/
dark_pois.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8"/>
<meta name="viewport" content="width=device-width, initial-scale=1.0"/>
<style>
pre { background-color:yellow; padding:0.5em; }
</style>
<title>Finding Dark POIs</title>
</head>
<body style="font-family:sans-serif; max-width:30em">
<p><a href="../index.html">Overpass API</a> > <a href="index.html">Blog</a> ></p>
<h1>Finding Dark POIs</h1>
<p>Published: 2017-03-06</p>
<h3>Table of content</h3>
<a href="#design_considerations">Design considerations</a><br/>
<a href="#hands_on">Hands On</a><br/>
<a href="#the_real_work">The Real Work</a><br/>
<a href="#caveats">Caveats</a><br/>
<a href="#the_shortcut">The Shortcut</a><br/>
<a href="#feedback">Feedback</a><br/>
<p>The mapper who has mapped <a href="https://www.openstreetmap.org/node/4379801321">this node</a> may wonder
why it never has appeared on the map:</p>
<pre>
node 4379801321: 51.5133578, -0.1015195
name = Go Native
</pre>
<p>Well, although it has a name tag, and thus at least some interesting information, we need more.
To get an object rendered, a necessary condition is that the object has a location and a type.
The type is missing here.</p>
<p>I would like to call these objects <em>dark POIs</em>,
in analogue to <a href="https://en.wikipedia.org/wiki/Dark_matter">dark matter</a>.</p>
<h2 id="design_considerations">Design considerations</h2>
<p>Do these elements pose a problem?
We will se later on that they are so sparse that this does not matter.
However, somebody has put effort to add information here.
And we cannot make sense of it.
Thus, we could improve the map if we could re-observe the object or get in touch with the mapper.</p>
<p>The deeper reason why this can happen is that OpenStreetMap allows for free-form taggging.
This is for a reason:
It turned out that there are more features of unexpected types outside than you and me might imagine.</p>
<p>For example, if you would like to create a map of subway lines then you may center you attention about tunnels.
But a subway with a full-fledged service quality running every few minutes can be been constructed as a monorail.
Even a suspended monorail.</p>
<figure>
<img src="schwebebahn.jpg" style="width:20em;"/>
<figcaption>The <a href="https://en.wikipedia.org/wiki/Wuppertal_Suspension_Railway">Wuppertaler Schwebebahn</a>, a subway that is neither sub-surface nor standing on rails.</figcaption>
</figure>
<p>There are also <a href="https://en.wikipedia.org/wiki/Rubber-tyred_metro">subways on rubber tires</a> and <a href="https://en.wikipedia.org/wiki/Downtown_Seattle_Transit_Tunnel">buses calling in subway tunnels</a>.
The free form tagging enables us to model them all as they are in reality.</p>
<p>In fact it is even better:
If you design a tool then OpenStreetMap and Overpass API will help you to adapt your data model to reality:
you can find all the objects that violate your assumptions and thus update your assumptions quickly.
But this is subject of a later blog post.</p>
<h2 id="hands_on">Hands On</h2>
<p>Hence, our aim should be to recover the information that is missing. This requires two steps:</p>
<ol>
<li>Find non-conforming data</li>
<li>Contact the mapper or if not possible then re-survey the data.</li>
</ol>
<p>Please refrain from just deleting such data.
That would destroy the clue that there is data missing.</p>
<p>Now let us get hands on the first step.
The second is out of scope for this blogpost.</p>
<p>Suspicious objects are objects that are not well known.
For ways, a very simplistic apporach could be the assumption
that all ways are streets or buildings:</p>
<pre>
way({{bbox}})
[!building]
[!highway];
out geom;
</pre>
<p>We test this on purpose <a href="https://overpass-turbo.eu/s/nfv">on a small bounding box</a>.</p>
<p>What is this query composed of?
It fact it is a little computer program that consists of two commands:
<ul>
<li>The first command selects all ways (<em>way</em>)
that are in the given bounding box (<em>({{bbox}})</em>)
and do not have a tag <em>building</em> (shrek then <em>building</em> in brackets)
and do not have a tag <em>highway</em> (again shrek then <em>highway</em> in brackets).</li>
<li>Then the command <em>out geom</em> lets the engine send back the data to your browser.
Using <em>geom</em> ensures that the geometry is sent along.
We need it to know where to display each way in the map window.</li>
</ul>
<p>The result has around 400 ways.
Thst is pretty a lot for exceptions.
Clicking on some of the features reveals further commonplace tags:</p>
<pre>
way({{bbox}})
[!building]
[!"building:part"]
[!highway]
[!railway]
[!waterway];
out geom;
</pre>
<p>Note that <em>building:part</em> is in quotation marks.
Basically, all keys should have been in quotation marks to avoid that their special characters are misinterpreted.
But letters are no special characters,
hence Overpass API tolerates strings of pure letters also without quotation marks.</p>
<h2 id="the_real_work">The Real Work</h2>
<p>It turns out that a lot of ways have no tags at all.
Are all these ways mapping errors?
No, they might be just members of relations.
Collecting them needs some extra work:</p>
<pre>
rel({{bbox}})->.foo;
( way({{bbox}})
[!building]
[!"building:part"]
[!highway]
[!railway]
[!waterway];
- way({{bbox}})(r.foo); );
out geom;
</pre>
<p>There are two differences to the previous query.
The first line is new:
The statement <em>rel({{bbox}})</em> selects all relations in the bounding box.
The suffix <em>->foo</em> lets the query engine store the result in the set <em>foo</em>
- <em>foo</em> has the job of a variable here.</p>
<p>The second difference is that <em>way({{bbox}})...</em> is in parentheses and an extra line follows.
The extra line select all ways in the bounding box (the <em>way({{bbox}})</em> part)
that in addition are a member of one or more of the relations stored in <em>foo</em>.
Finally, we have the <em>( ... - ... )</em> parentheses and the minus.
This is the difference statement:
The result of the difference statement are all the objects from the result of its first statement
that are not contained in the result of the second statement.
In our case:
All ways with none of the listed tags (the first statement)
that are not a member of a relation from our bounding box (the second statement).</p>
<p><a href="https://overpass-turbo.eu/s/nfw">Try it yourself!</a></p>
<p>To faciliate finding further commonplace tags,
we can omit all but tags in the result.
For this, we switch from <em>out geom</em> to <em>out tags</em>.</p>
<pre>
rel({{bbox}})->.r;
( way({{bbox}})
[!building]
[!"building:part"]
[!highway]
[!railway]
[!waterway];
- way({{bbox}})(r.r); );
out tags;
</pre>
<p>Choose to get the data shown when the browser prompts you.</p>
<p>From this point we repeat the process until <a href="https://overpass-turbo.eu/s/nfA">we have excluded all the elements</a>
that we can classify based on their most commonplace tag.</p>
<pre>
rel({{bbox}})->.r;
( way({{bbox}})
[!amenity]
[!barrier]
[!building]
[!"building:part"]
[!"building:wall"]
[!highway]
[!landuse]
[!leisure]
[!man_made]
[!natural]
[!power]
[!railway]
[!shop]
[!surface]
[!tourism]
[!"roof:edge"]
[!"roof:ridge"]
[!waterway];
- way({{bbox}})(r.r); );
out geom;
</pre>
<p>Nodes and relations will have their own sets of tags.</p>
<pre>
way({{bbox}})->.w;
( node({{bbox}})
[!"addr:street"]
[!amenity]
[!barrier]
[!highway]
[!man_made]
[!natural]
[!shop]
[!tourism]
- node({{bbox}})(w.w); );
rel({{bbox}})->.r;
( ._ - node(r.r); );
out;
</pre>
<p><a href="https://overpass-turbo.eu/s/ng2">This</a> is one possibility for nodes.
We need to subtract twice other data from our potential result set.
First we remove the nodes referred by ways, because these are by far the most nodes.
Then we remove the nodes referred by relations.</p>
<pre>
rel({{bbox}})
[!amenity]
[!boundary]
[!building]
[!landcover]
[!leisure]
[!network]
[public_transport!=platform]
[public_transport!=stop_area]
[!restriction]
[type!~"^(bridge|multipolygon|route|site|tunnel)$"]
[!waterway];
out geom({{bbox}});
</pre>
<p>And <a href="https://overpass-turbo.eu/s/ng3">this</a> is one possibility for relations.
Currently all classes of relations should have tags.
Thus we do not need to exclude any relations because they are members of other relations.</p>
<p>Please note the line with <em>type</em>:
This would be identical to a list <em>[type!=bridge][type!=multipolygon]...</em>,
but this is faster:
It checks all expressions at once using a so called regular expression.
These come from the underlying operating system standard, POSIX, and are beyond this blog post.</p>
<p>In the last line, we use a <em>({{bbox}})</em> after <em>geom</em>.
This restricts the recieved data to the asked-for bounding box.
You have often long distance relations in your results.
If you had them completely
then you would have to handle orders of magintudes more data in an orders of magnitudes larger bounding box.</p>
<h2 id="caveats">Caveats</h2>
<p>We are after all about a so small set of elements
that we can treat each single way with diligence.
Do not forget that there might be so many mappers to contact as elements are there.</p>
<p>This is the reason why the list of tags will vary with the bounding box.
For example, <em>building:wall</em> is not that commonplace
- you should check for any tag in question the numbers on <a href="https://taginfo.openstreetmap.org/keys/building%3Awall">Taginfo</a>.
But I do accept for the moment that it is intended for 3D mapping and makes sense there.
If you were in 3D mapping you may want to walk through those objects
instead of accepting them as of well-known type.</p>
<p>Please note also that we easily might accept objects whose tagging does not make sense.
For example, the tagging <em>type=multipolygon</em> lets us accept <a href="https://www.openstreetmap.org/relation/5802680">relation 5802680</a>.
But that relation does not have any useful tag for classification.
It has just a name and the property being a multipolygon.</p>
<p>To sum up the caveats:</p>
<ul>
<li>There is no (and cannot be) objective choice of filter criteria.</li>
<li>Any choice will let pass objects that are less-than-perfect tagged, but of lower priority for fixing.</li>
<li>To improve the database you can focus on only such much objects as you can review by hand.</li>
<li>Deleting makes data of which we know it is missing into data of which we do not even know it is missing.
That is by far worse.</li>
</ul>
<h2 id="the_shortcut">The Shortcut</h2>
<p>Version 0.7.54 offers a faster approach to query for objects with insufficient tagging.
You can use the number of tags as filter criterion.
<a href="https://overpass-turbo.eu/s/ngj">Get all nodes</a> which have a <em>name</em> tag but no other tag:</p>
<pre>
node({{bbox}})
[name]
(if:count_tags()==1);
out;
</pre>
<p>The new feature is represented by the line <em>(if:count_tags()==1)</em>.
This consists of three parts:</p>
<ul>
<li>The frame <em>(if:...)</em> is <a href="https://wiki.openstreetmap.org/wiki/Overpass_API/Overpass_QL#Conditional_query_filter_.28if:.29">the frame</a> for all the new evaulators.
It allows to use an evaulator in a query statement.
The only current limitation is that it cannot be the only criterion in a query statement.
This may or may not be improved in the next version.</li>
<li>The function <a href="https://wiki.openstreetmap.org/wiki/Overpass_API/Overpass_QL#Element_properties_count"><em>count_tags()</em></a> returns per candidate element for the result the number of tags of that element.</li>
<li>The <a href="https://wiki.openstreetmap.org/wiki/Overpass_API/Overpass_QL#Equality_and_inequality">equality operator</a> returns whether the values on both sides are equal.</li>
</ul>
<p>Of course this works also with ways.
More interesting is to search <a href="https://overpass-turbo.eu/s/ngE">for ways with no tags at all</a>.
To get a relevant result set, we must exclude ways that are part of relations:</p>
<pre>
rel({{bbox}})->.foo;
( way({{bbox}})
(if:count_tags()==0);
- way({{bbox}})(r.foo); );
out geom;
</pre>
<p>This allows you to find suspect objects without listing explicitly well-known tags.
As it needs anyway time to fix these objects,
it is a good balancy to make efficient use of time for data gardening.
It is straightforward to apply this to other tags that do not classify objects (like <em>area=yes</em>).
How the long way helps to figure out whether your data model fits to the actual OpenStreetMap data
will be subject of a subsequent blog post.</p>
<h2 id="feedback">Feedback</h2>
<p>As nebulon42 <a href="https://lists.openstreetmap.org/pipermail/talk/2017-March/077648.html">has suggested</a>,
you can <a href="https://overpass-turbo.eu/s/ntc">search for a lot of old-style multipolygons</a> this way:</p>
<pre>
relation({{bbox}})
[type=multipolygon]
(if:count_tags()==1);
(._;>;);
out;
</pre>
<p>Thanks a lot.</p>
</body>
</html>