Benchmarking #6

Open
Robinlovelace opened this issue Nov 29, 2019 · 17 comments

Comments

@Robinlovelace
Collaborator

For another project I've done some benchmarks and it seems that sfnetworks is already pretty fast. Wonder if we can make it even faster!

# Aim: benchmark the performance of different spatial network packages

library(magrittr)
library(stplanr)
library(sf)
#> Linking to GEOS 3.7.1, GDAL 2.4.2, PROJ 5.2.0
piggyback::pb_download("chapeltown_leeds_key_roads.Rds", repo = "ropensci/stplanr", dest = ".", show_progress = FALSE)
chapeltown_leeds_key_roads <- readRDS("chapeltown_leeds_key_roads.Rds")
x <- chapeltown_leeds_key_roads %>% 
  st_transform(crs = geo_select_aeq(.))
x_sp = as(x, "Spatial")

# spatial network creation ------------------------------------------------

stplanr <- function() stplanr::SpatialLinesNetwork(x)
sfnetworks <- function() sfnetworks::sfn_asnetwork(x)
dodgr <- function() dodgr::weight_streetnet(x)
shp2graph <- function() shp2graph::readshpnw(x_sp)

bench::mark(check = FALSE, stplanr(), sfnetworks(), dodgr(), shp2graph())
#> Warning in SpatialLinesNetwork.sf(x): Graph composed of multiple subgraphs,
#> consider cleaning it with sln_clean_graph().

#> Warning in SpatialLinesNetwork.sf(x): Graph composed of multiple subgraphs,
#> consider cleaning it with sln_clean_graph().

#> Warning in SpatialLinesNetwork.sf(x): Graph composed of multiple subgraphs,
#> consider cleaning it with sln_clean_graph().

#> Warning in SpatialLinesNetwork.sf(x): Graph composed of multiple subgraphs,
#> consider cleaning it with sln_clean_graph().

#> Warning in SpatialLinesNetwork.sf(x): Graph composed of multiple subgraphs,
#> consider cleaning it with sln_clean_graph().
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 4 x 6
#>   expression        min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>   <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 stplanr()     137.2ms 142.87ms     7.03     7.85MB     5.27
#> 2 sfnetworks()  64.69ms  69.22ms    14.3      4.49MB     7.15
#> 3 dodgr()         1.41s    1.41s     0.709   68.29MB     2.84
#> 4 shp2graph()  310.25ms 324.86ms     3.08   473.86MB    27.7

Created on 2019-11-29 by the reprex package (v0.3.0)

@luukvdmeer
Owner

I added this to milestone 3 (our last milestone), so that we can do the benchmarking towards the end of the project, when the core of the code is finished and it is time to fine-tune.

Is it OK if I assign this to you, @Robinlovelace?

@luukvdmeer luukvdmeer added the testing ✅ Related to testing tasks label Mar 29, 2020
@Robinlovelace
Collaborator Author

Sure, I'm up for that. It will be good to generate some consistent benchmarks. I'll start by looking for other open network datasets used by other projects for benchmarking.

@mvl22

mvl22 commented Apr 1, 2020

Benchmarking of routing performance will depend entirely on what you're optimising for:

  • Network size (city vs continent)
  • Dynamic data (e.g. live traffic)
  • Number of routing output types, involving shared data
  • Transport type (bicycle will involve considering more of the graph than car)
  • Whether you want the OpenTripPlanner-style triangle to compute preferences dynamically rather than optimise up-front
  • Speed of returned result
  • CPU
  • RAM footprint

OSRM for instance is fast for massive networks but is less optimal once you want rapidly-changing live traffic data, as the ability to do up-front optimisation is lowered.
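Two of the dimensions listed above, network size and RAM footprint, can be measured with base R alone. The following is only a rough sketch under stated assumptions: `build_toy_network()` and `make_segments()` are hypothetical stand-ins written for this illustration, not functions from sfnetworks, stplanr or any routing engine, and the random segments are not a realistic street network.

```r
# Hypothetical stand-in for a network constructor (not a real package API):
# takes a matrix of segments (x1, y1, x2, y2) and returns unique nodes plus
# an edge list that refers to those nodes by index.
build_toy_network <- function(segments) {
  from_xy <- segments[, 1:2, drop = FALSE]
  to_xy <- segments[, 3:4, drop = FALSE]
  all_xy <- rbind(from_xy, to_xy)
  # Identify unique nodes by their coordinates
  key <- paste(all_xy[, 1], all_xy[, 2])
  node_id <- match(key, unique(key))
  n <- nrow(segments)
  list(
    nodes = all_xy[!duplicated(key), , drop = FALSE],
    edges = data.frame(from = node_id[seq_len(n)], to = node_id[n + seq_len(n)])
  )
}

# Random segments on an integer grid, so that some endpoints coincide
make_segments <- function(n) {
  matrix(sample(0:50, 4 * n, replace = TRUE), ncol = 4)
}

set.seed(1)
sizes <- c(100, 1000, 10000)
results <- do.call(rbind, lapply(sizes, function(n) {
  segs <- make_segments(n)
  # Elapsed construction time for a network with n edges
  elapsed <- system.time(net <- build_toy_network(segs))[["elapsed"]]
  data.frame(
    n_edges = n,
    elapsed_s = elapsed,
    size_bytes = as.numeric(object.size(net))
  )
}))
print(results)
```

Replacing the stand-in with a real constructor and real datasets of different extents would show whether timing and memory scale similarly across packages.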

@Robinlovelace
Collaborator Author

One approach to continuous benchmarking is this: https://github.com/r-lib/bench#continuous-benchmarking

Thoughts @agila5, @loreabad6 and @luukvdmeer? Worth a try, I guess, but it could be overly complex compared with manually reporting benchmarks in the README with each build.

@Robinlovelace
Collaborator Author

Good news so far: sfnetworks seems to be faster than stplanr at creating network objects (0.062 s vs 0.879 s elapsed on roxel), even though the resulting object is larger (807 kB vs 447 kB):

library(sfnetworks)
system.time({
  net = as_sfnetwork(roxel)
})
#>    user  system elapsed 
#>   0.062   0.001   0.062
system.time({
  net2 = stplanr::SpatialLinesNetwork(roxel)
})
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 7.0.0
#> Warning in SpatialLinesNetwork.sf(roxel): Graph composed of multiple subgraphs,
#> consider cleaning it with sln_clean_graph().
#>    user  system elapsed 
#>   0.859   0.020   0.879
pryr::object_size(net)
#> Registered S3 method overwritten by 'pryr':
#>   method      from
#>   print.bytes Rcpp
#> 807 kB
pryr::object_size(net2)
#> 447 kB

res = bench::press(
  n = seq(from = 10, to = nrow(roxel), length.out = 5),
  {
    bench::mark(
      check = FALSE,
      time_unit = "ms",
      stplanr::SpatialLinesNetwork(roxel[1:n, ]),
      sfnetworks::as_sfnetwork(roxel[1:n, ])
    )
  }
)
#> Running with:
#>       n
#> 1   10

ggplot2::autoplot(res)

Created on 2020-06-22 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.3 (2020-02-29)
#>  os       Ubuntu 18.04.4 LTS          
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language en_GB:en                    
#>  collate  en_GB.UTF-8                 
#>  ctype    en_GB.UTF-8                 
#>  tz       Europe/London               
#>  date     2020-06-22                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                                
#>  assertthat    0.2.1      2019-03-21 [2] CRAN (R 3.6.0)                        
#>  backports     1.1.8      2020-06-17 [1] CRAN (R 3.6.3)                        
#>  beeswarm      0.2.3      2016-04-25 [1] CRAN (R 3.6.1)                        
#>  bench         1.1.1      2020-01-13 [2] CRAN (R 3.6.2)                        
#>  callr         3.4.3      2020-03-28 [1] CRAN (R 3.6.3)                        
#>  class         7.3-17     2020-04-26 [2] CRAN (R 3.6.3)                        
#>  classInt      0.4-3      2020-04-06 [1] Github (r-spatial/classInt@d024051)   
#>  cli           2.0.2      2020-02-28 [1] CRAN (R 3.6.2)                        
#>  codetools     0.2-16     2018-12-24 [4] CRAN (R 3.6.3)                        
#>  colorspace    1.4-1      2019-03-18 [1] CRAN (R 3.6.3)                        
#>  crayon        1.3.4      2017-09-16 [2] standard (@1.3.4)                     
#>  curl          4.3        2019-12-02 [2] CRAN (R 3.6.2)                        
#>  DBI           1.1.0      2019-12-15 [2] CRAN (R 3.6.2)                        
#>  desc          1.2.0      2018-05-01 [2] standard (@1.2.0)                     
#>  devtools      2.3.0      2020-04-10 [1] CRAN (R 3.6.3)                        
#>  digest        0.6.25     2020-02-23 [1] CRAN (R 3.6.2)                        
#>  dplyr         1.0.0.9000 2020-06-16 [1] Github (tidyverse/dplyr@fd08fe9)      
#>  e1071         1.7-3      2019-11-26 [2] CRAN (R 3.6.1)                        
#>  ellipsis      0.3.1      2020-05-15 [3] CRAN (R 3.6.3)                        
#>  evaluate      0.14       2019-05-28 [2] CRAN (R 3.6.0)                        
#>  fansi         0.4.1      2020-01-08 [1] CRAN (R 3.6.2)                        
#>  farver        2.0.3      2020-01-16 [1] CRAN (R 3.6.2)                        
#>  foreign       0.8-76     2020-03-03 [2] CRAN (R 3.6.2)                        
#>  fs            1.4.1      2020-04-04 [2] CRAN (R 3.6.3)                        
#>  generics      0.0.2      2018-11-29 [3] CRAN (R 3.5.1)                        
#>  geosphere     1.5-10     2019-05-26 [2] CRAN (R 3.6.0)                        
#>  ggbeeswarm    0.6.0      2017-08-07 [1] CRAN (R 3.6.1)                        
#>  ggplot2       3.3.2      2020-06-19 [1] CRAN (R 3.6.3)                        
#>  glue          1.4.1      2020-05-13 [2] CRAN (R 3.6.3)                        
#>  gtable        0.3.0      2019-03-25 [3] CRAN (R 3.5.3)                        
#>  highr         0.8        2019-03-20 [3] CRAN (R 3.5.3)                        
#>  htmltools     0.5.0.9000 2020-06-18 [1] Github (rstudio/htmltools@a8025f3)    
#>  httr          1.4.1      2019-08-05 [2] CRAN (R 3.6.1)                        
#>  igraph        1.2.5      2020-03-19 [1] CRAN (R 3.6.3)                        
#>  KernSmooth    2.23-17    2020-04-26 [4] CRAN (R 3.6.3)                        
#>  knitr         1.28       2020-02-06 [1] CRAN (R 3.6.2)                        
#>  lattice       0.20-41    2020-04-02 [2] CRAN (R 3.6.3)                        
#>  lifecycle     0.2.0.9000 2020-03-16 [1] Github (r-lib/lifecycle@355dcba)      
#>  lwgeom        0.2-5      2020-06-12 [1] CRAN (R 3.6.3)                        
#>  magrittr      1.5        2014-11-22 [2] CRAN (R 3.5.2)                        
#>  maptools      1.0-1      2020-05-14 [1] CRAN (R 3.6.3)                        
#>  memoise       1.1.0      2017-04-21 [3] CRAN (R 3.5.0)                        
#>  mime          0.9        2020-02-04 [1] CRAN (R 3.6.2)                        
#>  munsell       0.5.0      2018-06-12 [3] CRAN (R 3.5.0)                        
#>  pillar        1.4.4      2020-05-05 [1] CRAN (R 3.6.3)                        
#>  pkgbuild      1.0.8      2020-05-07 [1] CRAN (R 3.6.3)                        
#>  pkgconfig     2.0.3      2019-09-22 [2] CRAN (R 3.6.1)                        
#>  pkgload       1.1.0      2020-05-29 [3] CRAN (R 3.6.3)                        
#>  prettyunits   1.1.1      2020-01-24 [1] CRAN (R 3.6.2)                        
#>  processx      3.4.2      2020-02-09 [1] CRAN (R 3.6.3)                        
#>  profmem       0.5.0      2018-01-30 [2] CRAN (R 3.5.2)                        
#>  pryr          0.1.4      2018-02-18 [1] CRAN (R 3.6.1)                        
#>  ps            1.3.3      2020-05-08 [1] CRAN (R 3.6.3)                        
#>  purrr         0.3.4      2020-04-17 [1] CRAN (R 3.6.3)                        
#>  R6            2.4.1      2019-11-12 [2] CRAN (R 3.6.1)                        
#>  raster        3.3-3      2020-06-18 [1] Github (rspatial/raster@d63b497)      
#>  Rcpp          1.0.4.6    2020-04-09 [1] CRAN (R 3.6.3)                        
#>  remotes       2.1.1      2020-02-15 [1] CRAN (R 3.6.2)                        
#>  rgeos         0.5-3      2020-05-08 [1] CRAN (R 3.6.3)                        
#>  rlang         0.4.6.9000 2020-06-22 [1] Github (r-lib/rlang@64df8e3)          
#>  rmarkdown     2.3        2020-06-18 [1] CRAN (R 3.6.3)                        
#>  rprojroot     1.3-2      2018-01-03 [2] CRAN (R 3.5.3)                        
#>  scales        1.1.1      2020-05-11 [1] CRAN (R 3.6.3)                        
#>  sessioninfo   1.1.1      2018-11-05 [3] CRAN (R 3.5.1)                        
#>  sf          * 0.9-4      2020-06-22 [1] Github (r-spatial/sf@0b08ed5)         
#>  sfnetworks  * 0.3.0      2020-06-22 [1] Github (luukvdmeer/sfnetworks@7baa168)
#>  sp            1.4-2      2020-05-20 [1] CRAN (R 3.6.3)                        
#>  stplanr       0.6.0      2020-06-01 [1] local                                 
#>  stringi       1.4.6      2020-02-17 [1] CRAN (R 3.6.2)                        
#>  stringr       1.4.0      2019-02-10 [2] standard (@1.4.0)                     
#>  testthat      2.3.2      2020-03-02 [1] CRAN (R 3.6.3)                        
#>  tibble        3.0.1      2020-04-20 [1] CRAN (R 3.6.3)                        
#>  tidygraph     1.2.0      2020-05-12 [2] CRAN (R 3.6.3)                        
#>  tidyr         1.1.0      2020-05-20 [3] CRAN (R 3.6.3)                        
#>  tidyselect    1.1.0      2020-05-11 [1] CRAN (R 3.6.3)                        
#>  units         0.6-7      2020-06-13 [1] CRAN (R 3.6.3)                        
#>  usethis       1.6.1      2020-04-29 [1] CRAN (R 3.6.3)                        
#>  utf8          1.1.4      2018-05-24 [2] CRAN (R 3.5.3)                        
#>  vctrs         0.3.1      2020-06-05 [1] CRAN (R 3.6.3)                        
#>  vipor         0.4.5      2017-03-22 [1] CRAN (R 3.6.1)                        
#>  withr         2.2.0      2020-04-20 [2] CRAN (R 3.6.3)                        
#>  xfun          0.15       2020-06-21 [1] CRAN (R 3.6.3)                        
#>  xml2          1.3.2      2020-04-23 [3] CRAN (R 3.6.3)                        
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 3.6.2)                        
#> 
#> [1] /home/robin/R/x86_64-pc-linux-gnu-library/3.6
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library

Robinlovelace added a commit that referenced this issue Jun 22, 2020
@Robinlovelace Robinlovelace mentioned this issue Jun 22, 2020
@Robinlovelace
Collaborator Author

Heads-up, I've added continuous benchmarking in #64 but the build is failing due to credentials issues. That should be an easy fix. See here for details: r-lib/bench#87

Any ideas of what else we should benchmark?

@agila5
Collaborator

agila5 commented Jun 23, 2020

Hi and thanks for your work! IMO, for the moment, it's good enough, since I think we should focus on testing the current functionalities and fixing the bugs, and only then optimize the code and benchmark different implementations, considering also what @mvl22 said. Let's keep this issue open for the time being.

Did you understand why the build is failing? Sorry but I have literally 0 experience with Github Actions and benchmarks.

@Robinlovelace
Collaborator Author

Did you understand why the build is failing?

No, I'm not sure why the benchmarks are failing. One consideration: I wonder if it's worth adding an optional edge_lengths parameter to as_sfnetwork(), which could be FALSE by default.

@agila5
Collaborator

agila5 commented Jun 23, 2020

One consideration: I wonder if it's worth adding an optional edge_lengths parameter to as_sfnetwork(), which could be FALSE by default.

IMO yes if the network is created with explicit edges since I've always used the edge lengths during the analysis after creating the network.
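To make the proposal concrete, here is a minimal sketch of what an opt-in `edge_lengths` argument could look like. It is an assumption-laden toy: `as_toy_network()` is a hypothetical function written for this comment, not the sfnetworks API, and it works on a plain matrix of straight segments in projected coordinates rather than on sf linestrings.

```r
# Hypothetical constructor, NOT the sfnetworks API: edges are given as a
# matrix with columns x1, y1, x2, y2 in projected coordinates.
as_toy_network <- function(segments, edge_lengths = FALSE) {
  edges <- data.frame(id = seq_len(nrow(segments)))
  if (edge_lengths) {
    # Straight-line length of each segment, computed only on request
    edges$length <- sqrt(
      (segments[, 3] - segments[, 1])^2 + (segments[, 4] - segments[, 2])^2
    )
  }
  list(edges = edges, geometry = segments)
}

segments <- matrix(c(0, 0, 3, 4,
                     1, 1, 1, 6), ncol = 4, byrow = TRUE)

# Default: no length column, so construction does no extra work
names(as_toy_network(segments)$edges)

# Opt-in: lengths are stored as an edge attribute
as_toy_network(segments, edge_lengths = TRUE)$edges
```

Keeping the default at FALSE means construction benchmarks are not affected unless the caller asks for the lengths.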

@luukvdmeer
Owner

Heads-up @Robinlovelace: the continuous benchmarking has been failing lately. Whenever you find the time, could you take a look? For me it is a mystery ;-)

@Robinlovelace
Collaborator Author

Hi @luukvdmeer yes will do. Do we want to benchmark any other things?

@Robinlovelace
Collaborator Author

Seems it has benchmarked things historically:

setwd("~/wip/sfnetworks/")
bench::cb_fetch()
d = bench::cb_read()
bench::cb_plot_time(d)
#> Loading required namespace: ggplot2
#> Loading required namespace: tidyr

Created on 2020-11-05 by the reprex package (v0.3.0)

@Robinlovelace
Collaborator Author

Just tried this locally and it worked with no errors:

bench::cb_run()

Robinlovelace added a commit that referenced this issue Nov 5, 2020
@Robinlovelace
Collaborator Author

Not 100% sure how it works either. I have checked here https://github.com/r-lib/bench/actions?query=workflow%3A%22Continuous+Benchmarks%22 and cannot see build logs there either. The examples above show how to read benchmarks saved in the past; it would be useful to have a date for each of them.

TBH I do not fully understand continuous benchmarking. We could add simple benchmarks to a vignette instead. Thoughts @luukvdmeer ?

I have advocated for better documentation on the 'CB' approach in r-lib/bench#87 but while we're waiting for that we could change tack.

luukvdmeer added a commit that referenced this issue Dec 4, 2020
@luukvdmeer
Owner

I agree. Let's disable it for now until it matures. I saved the bench setup as we had it in a branch named bench. We can re-add that content later.

@Robinlovelace
Collaborator Author

Great thinking. I can add something on benchmarking using system.time() - which vignette though?

@luukvdmeer
Owner

How about starting with some basic benchmarking (the currently existing ones) in a "Benchmarks" section in the README? Once we have more coverage of other functionalities (or if you already have them) we could dedicate a new vignette to it, focused only on benchmarking.
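A basic README section along these lines could be driven by system.time() without any extra dependencies. The sketch below is illustrative only: `median_elapsed()` is a hypothetical helper, and the two workloads are placeholders standing in for the real constructor calls that would be compared.

```r
# Call a zero-argument function several times and return the median
# elapsed time in seconds.
median_elapsed <- function(f, times = 5) {
  stopifnot(is.function(f), times >= 1)
  elapsed <- vapply(
    seq_len(times),
    function(i) system.time(f())[["elapsed"]],
    numeric(1)
  )
  median(elapsed)
}

# Placeholder workloads; in the README these would be calls such as
# function() sfnetworks::as_sfnetwork(roxel)
candidates <- list(
  sort_vector = function() sort(runif(1e5)),
  order_vector = function() order(runif(1e5))
)

# One row per candidate, ready to be printed as a table in the README
benchmarks <- data.frame(
  expression = names(candidates),
  median_s = unname(vapply(candidates, median_elapsed, numeric(1)))
)
print(benchmarks)
```

Taking the median of several runs smooths out one-off effects such as garbage collection, although it is still much cruder than what bench::mark() reports.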

@luukvdmeer luukvdmeer removed this from the v0.4.0 milestone Dec 14, 2020