r_to_py() does not successfully convert sf objects to geopandas.GeoDataFrame #1521

Open
philiporlando opened this issue Jan 12, 2024 · 4 comments

philiporlando commented Jan 12, 2024

When passing an sf data.frame to r_to_py(), it is implicitly converted to a non-spatial pandas.DataFrame, and the geometry field is converted to a normal list. This transformation results in the loss of the coordinate reference system (CRS) information, which is crucial for spatial data analysis.

Is there any way for r_to_py() to convert sf objects to a geopandas.GeoDataFrame instead?

```r
library(sf)
library(reticulate)

nc <- sf::st_read(system.file("shape/nc.shp", package="sf"))
class(nc)
# [1] "sf"         "data.frame"
all(sf::st_is_valid(nc$geometry))
# [1] TRUE

nc_py <- reticulate::r_to_py(nc)
class(nc_py)
# [1] "pandas.core.frame.DataFrame"        "pandas.core.generic.NDFrame"       
# [3] "pandas.core.base.PandasObject"      "pandas.core.accessor.DirNamesMixin"
# [5] "pandas.core.indexing.IndexingMixin" "pandas.core.arraylike.OpsMixin"    
# [7] "python.builtin.object" 
all(sf::st_is_valid(nc_py$geometry))
# Error in UseMethod("st_is_valid") : 
#   no applicable method for 'st_is_valid' applied to an object of class "c('pandas.core.series.Series', 'pandas.core.base.IndexOpsMixin', 'pandas.core.arraylike.OpsMixin', 'pandas.core.generic.NDFrame', 'pandas.core.base.PandasObject', 'pandas.core.accessor.DirNamesMixin', 'pandas.core.indexing.IndexingMixin', 'python.builtin.object')"
```

Here are my session details:

```r
reticulate::py_config()
# python:         /home/porlando/Projects/do4ds/.venv/bin/python
# libpython:      /usr/lib/python3.10/config-3.10-x86_64-linux-gnu/libpython3.10.so
# pythonhome:     /home/porlando/Projects/do4ds/.venv:/home/porlando/Projects/do4ds/.venv
# version:        3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
# numpy:          /home/porlando/Projects/do4ds/.venv/lib/python3.10/site-packages/numpy
# numpy_version:  1.24.3
# 
# NOTE: Python version was forced by use_python() function
```

```r
utils::sessionInfo()
# R version 4.2.1 (2022-06-23)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Ubuntu 22.04.3 LTS
# 
# Matrix products: default
# BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
# LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
# 
# locale:
# [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
# [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
# [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
# [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
# 
# attached base packages:
# [1] stats     graphics  grDevices datasets  utils     methods   base     
# 
# other attached packages:
# [1] sf_1.0-14         dplyr_1.1.2       reticulate_1.34.0
# 
# loaded via a namespace (and not attached):
# [1] Rcpp_1.0.11        dbplyr_2.2.1       compiler_4.2.1     pillar_1.9.0       class_7.3-20      
# [6] RPostgres_1.4.5    tools_4.2.1        digest_0.6.29      bit_4.0.4          jsonlite_1.8.0    
# [11] lattice_0.20-45    lubridate_1.8.0    lifecycle_1.0.3    tibble_3.2.1       png_0.1-8         
# [16] pkgconfig_2.0.3    rlang_1.1.1        Matrix_1.5-1       DBI_1.1.3          cli_3.6.1         
# [21] rstudioapi_0.13    e1071_1.7-11       s2_1.1.0           janitor_2.1.0      stringr_1.4.0     
# [26] generics_0.1.3     vctrs_0.6.3        hms_1.1.1          classInt_0.4-7     bit64_4.0.5       
# [31] grid_4.2.1         tidyselect_1.2.0   snakecase_0.11.0   glue_1.6.2         R6_2.5.1          
# [36] fansi_1.0.3        tidyr_1.2.0        tzdb_0.3.0         purrr_0.3.4        readr_2.1.2       
# [41] blob_1.2.3         logger_0.2.2       magrittr_2.0.3     ellipsis_0.3.2     units_0.8-0       
# [46] assertthat_0.2.1   renv_0.16.0        KernSmooth_2.23-20 utf8_1.2.2         stringi_1.7.8     
# [51] proxy_0.4-27       wk_0.6.0     
```

@t-kalinowski
Member

Hi, that seems reasonable to me, though I'm not sure whether taking a "Suggests" dependency on {sf} in reticulate is prudent. @kevinushey Do you have thoughts?

@philiporlando Would you like to submit a PR adding r_to_py() and py_to_r() methods?
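
As a rough illustration of what such a conversion could look like, here is a minimal sketch, not reticulate's actual implementation. The helper name `sf_to_geopandas()` is hypothetical, and the sketch assumes geopandas is installed in the active Python environment. It sends the attribute columns through the existing data.frame conversion, passes the geometries as hex-encoded WKB, and carries the CRS across as a WKT string.

```r
library(sf)
library(reticulate)

# Hypothetical helper (not part of reticulate or sf): build a
# geopandas.GeoDataFrame from an sf object while keeping the CRS.
sf_to_geopandas <- function(x) {
  stopifnot(inherits(x, "sf"))

  # Assumes geopandas is installed in the Python environment in use.
  gpd <- reticulate::import("geopandas", convert = FALSE)

  # Attribute columns go through the existing data.frame -> pandas conversion.
  attrs <- reticulate::r_to_py(sf::st_drop_geometry(x))

  # Encode each geometry as a hex WKB string, which avoids the rounding
  # that a text (WKT) round trip can introduce.
  wkb_hex <- sf::st_as_binary(sf::st_geometry(x), hex = TRUE)

  # Carry the CRS across as WKT; a missing CRS becomes None in Python.
  crs_wkt <- sf::st_crs(x)$wkt
  if (is.null(crs_wkt) || is.na(crs_wkt)) crs_wkt <- NULL

  # Reuse the pandas index so geometries align with the attribute rows.
  geom <- gpd$GeoSeries$from_wkb(
    as.list(wkb_hex),
    index = attrs$index,
    crs = crs_wkt
  )

  # The geometry column is named "geometry" regardless of its name in R.
  gpd$GeoDataFrame(attrs, geometry = geom)
}
```

With the `nc` object from the report, `sf_to_geopandas(nc)` should return a GeoDataFrame whose `crs` attribute is populated. A real `r_to_py()` method for sf would likely need to handle more cases than this sketch does, such as multiple geometry columns.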

@philiporlando
Author

philiporlando commented Jan 12, 2024

@t-kalinowski, if adding {sf} to "Suggests" isn't an issue, then I can start working on a PR, but I may need some additional support along the way.

@kevinushey
Collaborator

I don't feel too strongly either way; I think adding sf as a Suggests is okay.

@gufranpathan

Thank you for working on this. It would be a very useful feature. Looking forward!
