-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Construction from sf object breaks when circular linestrings occur #59
Comments
Hi! I think that I understood what's the problem here. The solution is not super simple but I will try my best to summarize it. The first part of the following reprex is taken from Robin's example. # download and install packages
remotes::install_github("itsleeds/geofabrik", quiet = TRUE)
remotes::install_github("itsleeds/slopes", quiet = TRUE)
remotes::install_github("luukvdmeer/sfnetworks", upgrade = FALSE, quiet = TRUE)
# load packages
library(geofabrik)
library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
library(slopes)
# download geofabrik data
city_name = "sheffield"
buffer_size = 2000
city_sf = tmaptools::geocode_OSM(city_name, as.sf = TRUE)
city_buffer = stplanr::geo_buffer(city_sf, dist = buffer_size)
osm_data_city = get_geofabrik(name = city_sf)
osm_data_no_na = osm_data_in_buffer[!is.na(osm_data_in_buffer$highway), ]
geo_type <- sf::st_geometry_type(osm_data_no_na)
osm_data_linestring = osm_data_no_na[geo_type == "LINESTRING", ]
# add slopes
osm_data_z = slope_3d(osm_data_linestring) # must be linestrings
# create sfnetwork object
osm_data_no_z <- st_zm(osm_data_z)
#error
sfnetworks::as_sfnetwork(osm_data_no_z)
#> Error: Assigned data `value` must be compatible with existing data.
#> x Existing data has 5057 rows.
#> x Assigned data has 5027 rows.
#> i Only vectors of size 1 are recycled. Using the debugging functions I can see that the problem is created by par(mar = rep(0, 4))
plot(st_geometry(osm_data_no_z[3, ])) indeed st_boundary(st_geometry(osm_data_no_z)[3])
#> Geometry set for 1 feature (with 1 geometry empty)
#> geometry type: MULTIPOINT
#> dimension: XY
#> bbox: xmin: NA ymin: NA xmax: NA ymax: NA
#> geographic CRS: WGS 84
#> MULTIPOINT EMPTY and this is the reason why there are less rows than what it should be. The weird part is that there is no problem with osm_data_no_z_stplanr <- stplanr::SpatialLinesNetwork(osm_data_no_z)
#> Warning in SpatialLinesNetwork.sf(osm_data_no_z): Graph composed of multiple
#> subgraphs, consider cleaning it with sln_clean_graph(). Why? I think the reason is that osm_data_no_z_coordinates <- st_coordinates(osm_data_no_z)
head(osm_data_no_z_coordinates, 10)
#> X Y L1
#> [1,] -1.477983 53.38939 1
#> [2,] -1.477845 53.38926 1
#> [3,] -1.477702 53.38914 1
#> [4,] -1.477549 53.38906 1
#> [5,] -1.477468 53.38901 1
#> [6,] -1.477391 53.38897 1
#> [7,] -1.477221 53.38889 1
#> [8,] -1.484426 53.38227 2
#> [9,] -1.484370 53.38231 2
#> [10,] -1.484317 53.38238 2 then we should indentify the indexes of the first and last occurrences of the pairs of coordinates: first_pair <- !duplicated(osm_data_no_z_coordinates[, "L1"])
last_pair <- !duplicated(osm_data_no_z_coordinates[, "L1"], fromLast = TRUE)
idxs <- first_pair | last_pair I took the code from here: https://stackoverflow.com/questions/11546684/how-can-i-find-the-first-and-last-occurrences-of-an-element-in-a-data-frame osm_data_no_z_pairs <- osm_data_no_z_coordinates[idxs, ]
osm_data_no_z_nodes <- st_cast(st_sfc(st_multipoint(osm_data_no_z_pairs[, c("X", "Y")]), crs = st_crs(osm_data_no_z)), "POINT")
osm_data_no_z_nodes
#> Geometry set for 10114 features
#> geometry type: POINT
#> dimension: XY
#> bbox: xmin: -1.500281 ymin: 53.3627 xmax: -1.440187 ymax: 53.39863
#> geographic CRS: WGS 84
#> First 5 geometries:
#> POINT (-1.477983 53.38939)
#> POINT (-1.477221 53.38889)
#> POINT (-1.484426 53.38227)
#> POINT (-1.479991 53.38661)
#> POINT (-1.483918 53.3819) Now I try to wrap up everything in a function: test_coordinates_pairs <- function(x) {
# 1. extract coordinates
x_coordinates <- unname(st_coordinates(x))
# 2. Find idxs of first and last coordinate (i.e. the boundary points)
first_pair <- !duplicated(x_coordinates[, 3])
last_pair <- !duplicated(x_coordinates[, 3], fromLast = TRUE)
idxs <- first_pair | last_pair
# 3. Extract idxs and rebuild sfc
x_pairs <- x_coordinates[idxs, ]
x_nodes <- st_cast(st_sfc(st_multipoint(x_pairs[, c(1, 2)]), crs = st_crs(x)), "POINT")
x_nodes
} Compare with test_coordinates_pairs(sfnetworks::roxel)
#> Geometry set for 1702 features
#> geometry type: POINT
#> dimension: XY
#> bbox: xmin: 7.522622 ymin: 51.94151 xmax: 7.546705 ymax: 51.9612
#> geographic CRS: WGS 84
#> First 5 geometries:
#> POINT (7.533722 51.95556)
#> POINT (7.533461 51.95576)
#> POINT (7.532442 51.95422)
#> POINT (7.53209 51.95328)
#> POINT (7.532709 51.95209)
sfnetworks:::get_boundary_points(sfnetworks::roxel)
#> Geometry set for 1702 features
#> geometry type: POINT
#> dimension: XY
#> bbox: xmin: 7.522622 ymin: 51.94151 xmax: 7.546705 ymax: 51.9612
#> geographic CRS: WGS 84
#> First 5 geometries:
#> POINT (7.533722 51.95556)
#> POINT (7.533461 51.95576)
#> POINT (7.532442 51.95422)
#> POINT (7.53209 51.95328)
#> POINT (7.532709 51.95209) Check performance bench::mark(
test_coordinates_pairs(sfnetworks::roxel),
sfnetworks:::get_boundary_points(sfnetworks::roxel),
iterations = 20,
check = FALSE
)
#> # A tibble: 2 x 6
#> expression min median `itr/sec`
#> <bch:expr> <bch:> <bch:> <dbl>
#> 1 test_coordinates_pairs(sfnetworks::roxel) 30.2ms 35.7ms 25.1
#> 2 sfnetworks:::get_boundary_points(sfnetworks::roxel) 44.6ms 47.8ms 20.6
#> # ... with 2 more variables: mem_alloc <bch:byt>, `gc/sec` <dbl> and it is slightly faster than the current approach. Why did I set all.equal(test_coordinates_pairs(sfnetworks::roxel), sfnetworks:::get_boundary_points(sfnetworks::roxel))
#> [1] "Attributes: < Component \"crs\": Names: 2 string mismatches >"
#> [2] "Attributes: < Component \"crs\": Component 1: Modes: character, numeric >"
#> [3] "Attributes: < Component \"crs\": Component 1: target is character, current is numeric >"
#> [4] "Attributes: < Component \"crs\": Component 2: 1 string mismatch >"
#> [5] "Attributes: < Component \"ids\": Numeric: lengths (1, 851) differ >" Do you know if the differences in the crs and ids are relevant? Obviously it should be tested a bit more but I just wanted to check your ideas before keep working. Created on 2020-06-17 by the reprex package (v0.3.0) I tested these ideas in another branch of this repo and it seems to be working: # download and install packages
remotes::install_github("luukvdmeer/sfnetworks", ref = "cb42231f1fc50677dc916c0ccaef9cadd323d50d", upgrade = FALSE, quiet = TRUE)
# load packages
library(geofabrik)
library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
library(slopes)
# download geofabrik data
city_name = "sheffield"
buffer_size = 2000
city_sf = tmaptools::geocode_OSM(city_name, as.sf = TRUE)
city_buffer = stplanr::geo_buffer(city_sf, dist = buffer_size)
osm_data_city = get_geofabrik(name = city_sf)
#> although coordinates are longitude/latitude, st_contains assumes that they are planar
#> The place is within these geofabrik zones: Europe, Great Britain, Britain and Ireland, England, South Yorkshire
#> Selecting the smallest: South Yorkshire
#> Data already detected in C:/Users/Utente/Documents/my_osm_data_from_geofabric/South Yorkshire.osm.pbf
#> Old attributes: attributes=name,highway,waterway,aerialway,barrier,man_made
#> New attributes: attributes=name,highway,waterway,aerialway,barrier,man_made,maxspeed,oneway,building,surface,landuse,natural,start_date,wall,service,lanes,layer,tracktype,bridge,foot,bicycle,lit,railway,footway
#> Using ini file that can can be edited with file.edit(C:\Users\Utente\AppData\Local\Temp\RtmpmuBoz5/ini_new.ini)
osm_data_in_buffer = st_intersection(osm_data_city, city_buffer)
#> although coordinates are longitude/latitude, st_intersection assumes that they are planar
#> Warning: attribute variables are assumed to be spatially constant throughout all
#> geometries
osm_data_no_na = osm_data_in_buffer[!is.na(osm_data_in_buffer$highway), ]
geo_type <- sf::st_geometry_type(osm_data_no_na)
osm_data_linestring = osm_data_no_na[geo_type == "LINESTRING", ]
# add slopes
osm_data_z = slope_3d(osm_data_linestring) # must be linestrings
#> Loading required namespace: ceramic
#> Preparing to download: 16 tiles at zoom = 14 from
#> https://api.mapbox.com/v4/mapbox.terrain-rgb/
# create sfnetwork object
osm_data_no_z <- st_zm(osm_data_z)
osm_data_no_z_sfnetwork <- sfnetworks::as_sfnetwork(osm_data_no_z, directed = FALSE)
plot(osm_data_no_z_sfnetwork) Created on 2020-06-17 by the reprex package (v0.3.0) If you think it's a good idea I will keep working in that separate branch and PR to develop. |
Thanks for investigating @agila5 Unfortunately I cannot run the examples without a Mapbox API key. The workflow you describe is exactly the same idea of how I changed to library(sf)
#> Linking to GEOS 3.7.1, GDAL 2.2.2, PROJ 4.9.2
l = st_linestring(c(st_point(c(1,1)), st_point(c(1,1))))
st_boundary(l)
#> MULTIPOINT EMPTY
# Of course, l is not a valid linestring geometry, because in the simple
# feature standard, linestrings cannot have the same start and endpoint.
st_is_valid(l)
#> [1] FALSE Created on 2020-06-17 by the reprex package (v0.3.0) But in road networks, these circular linestrings will occur a lot right, e.g. for roundabouts and squares (that according the sf standard should be modeled as polygons, but that does not fit into the routable network structure). I think we can hardly call the Having said that: I guess the conclusion is clear that I also do not see what exactly the difference in the crs is, but since Note: we should first merge master into develop so that they are equal again. Sorry I did not do that before |
One thing I just noticed in the code @agila5 : If library(sfnetworks)
library(sf)
#> Linking to GEOS 3.7.1, GDAL 2.2.2, PROJ 4.9.2
test_coordinates_pairs = function(x) {
# 1. extract coordinates
x_coordinates <- unname(st_coordinates(x))
# 2. Find idxs of first and last coordinate (i.e. the boundary points)
first_pair <- !duplicated(x_coordinates[, 3])
last_pair <- !duplicated(x_coordinates[, 3], fromLast = TRUE)
idxs <- first_pair | last_pair
# 3. Extract idxs and rebuild sfc
x_pairs <- x_coordinates[idxs, ]
x_nodes <- st_cast(st_sfc(st_multipoint(x_pairs[, c(1, 2)]), crs = st_crs(x)), "POINT")
x_nodes
}
roxel_z = st_zm(roxel, drop = FALSE, what = "Z")
sfnetworks:::get_boundary_points(roxel_z)
#> Geometry set for 1702 features
#> geometry type: POINT
#> dimension: XYZ
#> bbox: xmin: 7.522622 ymin: 51.94151 xmax: 7.546705 ymax: 51.9612
#> CRS: EPSG:4326
#> First 5 geometries:
#> POINT Z (7.533722 51.95556 0)
#> POINT Z (7.533461 51.95576 0)
#> POINT Z (7.532442 51.95422 0)
#> POINT Z (7.53209 51.95328 0)
#> POINT Z (7.532709 51.95209 0)
test_coordinates_pairs(roxel_z)
#> Geometry set for 2 features
#> geometry type: POINT
#> dimension: XY
#> bbox: xmin: 7.530218 ymin: 51.95556 xmax: 7.533722 ymax: 51.95924
#> CRS: EPSG:4326
#> POINT (7.533722 51.95556)
#> POINT (7.530218 51.95924) Created on 2020-06-17 by the reprex package (v0.3.0) |
Hi and thanks for the review!
I totally missed that 😅 Still, it's good that the underlying ideas are the same.
That's not super difficult to fix. For example, I was thinking of: # packages
library(sfnetworks)
library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
test_coordinates_pairs = function(x) {
# 1a. extract coordinates
x_coordinates <- st_coordinates(x)
# 1b. Find index of L1 column
L1_index <- which(colnames(x_coordinates) == "L1")
# 1c. Remove colnames
x_coordinates <- unname(x_coordinates)
# 2. Find idxs of first and last coordinate (i.e. the boundary points)
first_pair <- !duplicated(x_coordinates[, L1_index])
last_pair <- !duplicated(x_coordinates[, L1_index], fromLast = TRUE)
idxs <- first_pair | last_pair
# 3. Extract idxs and rebuild sfc
x_pairs <- x_coordinates[idxs, ]
x_nodes <- st_cast(st_sfc(st_multipoint(x_pairs[, -L1_index]), crs = st_crs(x)), "POINT")
x_nodes
}
roxel_z = st_zm(roxel, drop = FALSE, what = "Z")
test_coordinates_pairs(roxel_z)
#> Geometry set for 1702 features
#> geometry type: POINT
#> dimension: XYZ
#> bbox: xmin: 7.522622 ymin: 51.94151 xmax: 7.546705 ymax: 51.9612
#> geographic CRS: WGS 84
#> First 5 geometries:
#> POINT Z (7.533722 51.95556 0)
#> POINT Z (7.533461 51.95576 0)
#> POINT Z (7.532442 51.95422 0)
#> POINT Z (7.53209 51.95328 0)
#> POINT Z (7.532709 51.95209 0)
sfnetworks:::get_boundary_points(roxel_z)
#> Geometry set for 1702 features
#> geometry type: POINT
#> dimension: XYZ
#> bbox: xmin: 7.522622 ymin: 51.94151 xmax: 7.546705 ymax: 51.9612
#> geographic CRS: WGS 84
#> First 5 geometries:
#> POINT Z (7.533722 51.95556 0)
#> POINT Z (7.533461 51.95576 0)
#> POINT Z (7.532442 51.95422 0)
#> POINT Z (7.53209 51.95328 0)
#> POINT Z (7.532709 51.95209 0) Created on 2020-06-17 by the reprex package (v0.3.0) Other suggestions are welcome. I'm not sure if the |
I don't think it is necessary. The result of
The I changed the name of the issue, now we know where the problem occurs. Also, I updated develop to be equal with master again. Can I assign you for this @agila5 ? And would it be possible to implement before our next milestone on June 30th? - if not, I can implement your function as well - |
This is what I meant. # packages
library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
# data
my_linestring <- st_linestring(rbind(c(0, 0), c(1, 1), c(2, 2)))
# test with the absence of unname
test_coordinates_pairs = function(x) {
# 1a. extract coordinates
x_coordinates <- st_coordinates(x)
# 1b. Find index of L1 column
L1_index <- which(colnames(x_coordinates) == "L1")
# 2. Find idxs of first and last coordinate (i.e. the boundary points)
first_pair <- !duplicated(x_coordinates[, L1_index])
last_pair <- !duplicated(x_coordinates[, L1_index], fromLast = TRUE)
idxs <- first_pair | last_pair
# 3. Extract idxs and rebuild sfc
x_pairs <- x_coordinates[idxs, ]
x_nodes <- st_cast(st_sfc(st_multipoint(x_pairs[, -L1_index]), crs = st_crs(x)), "POINT")
x_nodes
}
test_sfc <- test_coordinates_pairs(my_linestring)
lapply(test_sfc, names)
#> [[1]]
#> [1] "X" "Y"
#>
#> [[2]]
#> [1] "X" "Y" Created on 2020-06-20 by the reprex package (v0.3.0)
👍
Yes and yes. And great work! |
The definition of the function is taken from #59
Fantastic work guys, I can give this a spin. Amazing to hear of the speed-ups also. |
Here are some more tests on a similar dataset. This one does not require a MapBox API key, that was just for testing the Z dimension, which seems to work. Some observations from the reproducible example below:
remotes::install_github("itsleeds/geofabrik", quiet = TRUE)
remotes::install_github("luukvdmeer/sfnetworks", ref = "develop", quiet = TRUE)
library(geofabrik)
library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 7.0.0
library(sfnetworks)
# download geofabrik data
city_name = "sheffield"
buffer_size = 2000
city_sf = tmaptools::geocode_OSM(city_name, as.sf = TRUE)
city_buffer = stplanr::geo_buffer(city_sf, dist = buffer_size)
osm_data_city = get_geofabrik(name = city_sf)
#> although coordinates are longitude/latitude, st_contains assumes that they are planar
#> The place is within these geofabrik zones: Europe, Great Britain, Britain and Ireland, England, South Yorkshire
#> Selecting the smallest: South Yorkshire
#> Data already detected in ~/hd/data/osm/South Yorkshire.osm.pbf
#> Old attributes: attributes=name,highway,waterway,aerialway,barrier,man_made
#> New attributes: attributes=name,highway,waterway,aerialway,barrier,man_made,maxspeed,oneway,building,surface,landuse,natural,start_date,wall,service,lanes,layer,tracktype,bridge,foot,bicycle,lit,railway,footway
#> Using ini file that can can be edited with file.edit(/tmp/RtmpS6Hzi5/ini_new.ini)
osm_data_highway = osm_data_city[!is.na(osm_data_city$highway), ]
osm_data_buffer = osm_data_highway[city_buffer, ]
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
mapview::mapview(st_as_sf(net %>% activate("edges")))
#> Error in eval(lhs, parent, parent): object 'net' not found
net = sfnetworks::as_sfnetwork(osm_data_buffer)
p1 = tmaptools::geocode_OSM("sheffield rail station", as.sf = TRUE)
#> No results found for "sheffield rail station".
p2 = tmaptools::geocode_OSM("west street sheffield", as.sf = TRUE)
route = net %>%
activate("edges") %>%
tidygraph::convert(to_spatial_shortest_paths, p1, p2)
#> although coordinates are longitude/latitude, st_nearest_feature assumes that they are planar
mapview::mapview(sf::st_as_sf(route)) +
mapview::mapview(p1) +
mapview::mapview(p2) p1 # it calculates a route when p1 is NULL!
#> NULL
p1 = tmaptools::geocode_OSM("sheffield tap", as.sf = TRUE)
route = net %>%
activate("edges") %>%
tidygraph::convert(to_spatial_shortest_paths, p1, p2)
#> although coordinates are longitude/latitude, st_nearest_feature assumes that they are planar
#> although coordinates are longitude/latitude, st_nearest_feature assumes that they are planar
mapview::mapview(sf::st_as_sf(route)) +
mapview::mapview(p1) +
mapview::mapview(p2) net = as_sfnetwork(osm_data_buffer, directed = FALSE)
route = net %>%
activate("edges") %>%
tidygraph::convert(to_spatial_shortest_paths, p1, p2)
#> although coordinates are longitude/latitude, st_nearest_feature assumes that they are planar
#> although coordinates are longitude/latitude, st_nearest_feature assumes that they are planar
mapview::mapview(sf::st_as_sf(route)) +
mapview::mapview(p1) +
mapview::mapview(p2) Created on 2020-06-22 by the reprex package (v0.3.0) |
Fixed in 0.3.1 |
For sure? I think there should be a test or at least an example to show the fix. |
Sorry to be unclear. This is the fix from @agila5 that is showcased and tested above, but was until now only present in |
Great. Worth showcasing with |
Describe the bug
The function
as_sfnetworks()
fails on a dataset from Sheffield and we're not sure why.Reproducible example
Created on 2020-06-16 by the reprex package (v0.3.0)
Expected behavior
A clear and concise description of what you expected to happen.
I expected this to work:
R Session Info
Include session info of your session, with the function
sessionInfo()
.library(sfnetworks)
Created on 2020-06-16 by the reprex package (v0.3.0)
Session info
From our work on #46
The text was updated successfully, but these errors were encountered: