Added a methodology here for identifying whether a dataset is of high value.
A dataset is considered of high value if the bbox it defines covers a whole country.
The methodology is limited: it only takes into account datasets (XML files) that contain a single bbox.
We have defined a threshold: a dataset is considered of high value if its bounding box covers at least 70% of the country. This threshold can be changed based on our needs.
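
To make the threshold rule concrete, here is a minimal sketch of the coverage check. It is not the linked methodology itself: it assumes the country is approximated by its own bounding box and that areas are compared in plain degree units, both of which are simplifications. The names `BBox`, `coverage` and `is_high_value` are hypothetical.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Bounding box in degrees (hypothetical helper type)."""
    west: float
    south: float
    east: float
    north: float


def area(box: BBox) -> float:
    # Planar area in square degrees; an empty or inverted box counts as 0.
    return max(0.0, box.east - box.west) * max(0.0, box.north - box.south)


def intersection(a: BBox, b: BBox) -> BBox:
    # The result may be inverted if the boxes are disjoint; area() maps that to 0.
    return BBox(max(a.west, b.west), max(a.south, b.south),
                min(a.east, b.east), min(a.north, b.north))


def coverage(dataset: BBox, country: BBox) -> float:
    """Fraction of the country's box that is covered by the dataset's box."""
    country_area = area(country)
    if country_area == 0.0:
        return 0.0
    return area(intersection(dataset, country)) / country_area


def is_high_value(dataset: BBox, country: BBox, threshold: float = 0.7) -> bool:
    # Assumption: 0.7 mirrors the 70% threshold mentioned above and is adjustable.
    return coverage(dataset, country) >= threshold


country = BBox(0.0, 0.0, 10.0, 10.0)       # made-up country extent
nearly_national = BBox(0.0, 0.0, 10.0, 8.0)
regional = BBox(5.0, 5.0, 20.0, 20.0)
print(is_high_value(nearly_national, country))
print(is_high_value(regional, country))
```

Swapping the country box for a real country polygon would give a stricter measure, at the cost of needing a geometry library.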
Next Steps
Identifying whether a dataset is of high value gets more complicated when the dataset (XML file) contains multiple bboxes.
We could also take a different approach: instead of detecting whether something is ‘national’, try to find out what level the bbox is at:
1. Take the midpoint of the bbox (possibly expandable to multiple points, should we find this safer).
2. Query OSM with these coordinates (or another database with hierarchical administrative units).
3. Compare the overlapping area with all administrative units found.
4. Save the level with the best overlap, so that for each use case everyone can look at what is most valuable.

Extra for step 4: use decimal numbers to capture extra nuance; e.g. a dataset overlapping 40% of a country gets level 3.6.
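
The steps above could be sketched roughly as follows. This is only an illustration under several assumptions: the administrative units are passed in as an in-memory list of `(level, bbox)` pairs standing in for the OSM query, each unit is approximated by its bounding box, and "best overlap" is interpreted as intersection over union. All names are hypothetical. The decimal-level refinement is left out because the mapping from overlap to a fractional level is not defined here.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (west, south, east, north)


def midpoint(box: Box) -> Tuple[float, float]:
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def contains(box: Box, point: Tuple[float, float]) -> bool:
    return box[0] <= point[0] <= box[2] and box[1] <= point[1] <= box[3]


def box_area(box: Box) -> float:
    # Planar area in square degrees; inverted boxes count as empty.
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def overlap_score(a: Box, b: Box) -> float:
    """Intersection over union: 1.0 for identical boxes, 0.0 for disjoint ones."""
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0.0 else 0.0


def best_level(dataset: Box,
               units: List[Tuple[int, Box]]) -> Optional[Tuple[int, float]]:
    """Return (level, score) of the unit that best matches the dataset bbox."""
    point = midpoint(dataset)                                    # step 1
    candidates = [u for u in units if contains(u[1], point)]    # stand-in for step 2
    if not candidates:
        return None
    level, box = max(candidates,
                     key=lambda u: overlap_score(dataset, u[1]))  # step 3
    return level, overlap_score(dataset, box)                     # step 4


# Made-up hierarchy: a smaller level number means a larger unit.
units = [(2, (0.0, 0.0, 10.0, 10.0)),
         (4, (0.0, 0.0, 5.0, 5.0)),
         (6, (0.0, 0.0, 2.0, 2.0))]
print(best_level((0.0, 0.0, 5.0, 4.0), units))
```

Storing the score next to the level keeps the decision open, so each use case can apply its own cut-off later.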
For the rest, I also agree with Paul that ‘High Value’ should not simply equate to national. I would very much like to see a data density factor added in the future.
But the further approach is probably still to be discussed in an upcoming meeting?
FYI @DajanaSnopkova @Max-at-Vlaanderen @pvgenuchten @Tomas-Pavelka