GeoPandas feature parity #130

Open
jorisvandenbossche opened this issue Jan 13, 2022 · 9 comments

@jorisvandenbossche
Member

Quickly exploring which spatial methods defined in GeoPandas are not yet available here:

import pandas as pd
import geopandas
import dask.dataframe as dd
import dask_geopandas

# Public (non-underscore) attribute names of each DataFrame class
methods_pandas = {n for n in dir(pd.DataFrame) if not n.startswith("_")}
methods_geopandas = {n for n in dir(geopandas.GeoDataFrame) if not n.startswith("_")}
methods_dask = {n for n in dir(dd.DataFrame) if not n.startswith("_")}
methods_dask_geopandas = {n for n in dir(dask_geopandas.GeoDataFrame) if not n.startswith("_")}

# Names each geo class adds on top of its non-spatial parent
methods_geopandas_extra = methods_geopandas - methods_pandas
methods_dask_geopandas_extra = methods_dask_geopandas - methods_dask

>>> methods_geopandas_extra - methods_dask_geopandas_extra
{'cascaded_union',
 'covered_by',
 'covers',
 'estimate_utm_crs',
 'explore',
 'from_features',
 'from_file',
 'from_postgis',
 'has_sindex',
 'iterfeatures',
 'overlay',
 'rename_geometry',
 'sjoin',
 'sjoin_nearest',
 'to_file',
 'to_postgis',
 'to_wkb',
 'to_wkt'}

>>> methods_dask_geopandas_extra - methods_geopandas_extra
{'calculate_spatial_partitions',
 'hilbert_distance',
 'interpolate',
 'morton_distance',
 'set_geometry',
 'to_dask_dataframe'}
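
For anyone wanting to rerun this comparison later, the set arithmetic above could be wrapped in two small helpers. This is only a sketch: the names `public_names` and `extra_names` are made up for illustration, and the example uses standard-library classes as stand-ins so that it runs without geopandas or dask installed.

```python
from collections import Counter, OrderedDict


def public_names(cls):
    """Return the public (non-underscore) attribute names of a class."""
    return {name for name in dir(cls) if not name.startswith("_")}


def extra_names(cls, base):
    """Return the public names that `cls` has and `base` does not."""
    return public_names(cls) - public_names(base)


# Stand-in classes: both OrderedDict and Counter extend dict.
ordered_extra = extra_names(OrderedDict, dict)
counter_extra = extra_names(Counter, dict)

# Names OrderedDict adds on top of dict that Counter does not add.
only_in_ordered = ordered_extra - counter_extra
print(sorted(only_in_ordered))
```

With the four libraries imported as in the snippet above, the first list would presumably correspond to `extra_names(geopandas.GeoDataFrame, pd.DataFrame) - extra_names(dask_geopandas.GeoDataFrame, dd.DataFrame)`.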

Some quick first notes:

  • covers and covered_by are 2 missing predicates that should be trivial to add here
  • sjoin was added as a method in geopandas; we should do the same here

@martinfleis
Member

One more aspect of this is API parity: for example, sjoin in geopandas now uses predicate, while here we still only have op.
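
One common way to handle a keyword rename like this is to accept both names for a transition period and warn on the old one. The sketch below is an assumption about how that could look, not the actual geopandas or dask-geopandas code; `resolve_predicate` is a hypothetical helper, and the `"intersects"` default is chosen only for illustration.

```python
import warnings


def resolve_predicate(predicate="intersects", op=None):
    """Return the spatial predicate to use, honouring the legacy `op` keyword.

    Hypothetical helper: if the old `op` keyword is given, emit a
    FutureWarning and use its value; otherwise use `predicate`.
    """
    if op is not None:
        warnings.warn(
            "`op` is deprecated; use `predicate` instead.",
            FutureWarning,
            stacklevel=2,
        )
        return op
    return predicate


# New-style call: no warning is emitted.
new_style = resolve_predicate(predicate="within")

# Old-style call: still works, but warns.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = resolve_predicate(op="within")
```

A stricter variant could raise an error when both keywords are passed with conflicting values; this sketch simply lets `op` win.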

@jorisvandenbossche
Member Author

An updated version:

>>> methods_geopandas_extra - methods_dask_geopandas_extra
{'cascaded_union',
 'clip_by_rect',
 'estimate_utm_crs',
 'explore',
 'from_features',
 'from_file',
 'from_postgis',
 'has_sindex',
 'iterfeatures',
 'overlay',
 'sjoin_nearest',
 'to_file',

I think from this list, overlay and sjoin_nearest are the most useful (but also the most complicated).
clip_by_rect is probably easy to add (since it's an element-wise operation, similar to intersection).
to_file depends on how we want to deal with writing multiple partitions: write multiple files, or append to a single file? The latter produces a serial bottleneck in the graph.
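
To make the to_file trade-off concrete, here is a toy sketch using only the standard library. It is not dask-geopandas code: the partitions are plain lists of GeoJSON-like dicts, and the helper names and the `part.N.geojsonl` naming scheme are made up for illustration. Writing one file per partition keeps the writes independent, while appending to a single file forces them to happen one after another.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Toy stand-ins for partitions: each is a list of GeoJSON-like feature dicts.
partitions = [
    [{"id": 0, "geometry": {"type": "Point", "coordinates": [0.0, 0.0]}}],
    [{"id": 1, "geometry": {"type": "Point", "coordinates": [1.0, 1.0]}}],
    [{"id": 2, "geometry": {"type": "Point", "coordinates": [2.0, 2.0]}}],
]


def write_partition(directory, index, features):
    """Option 1: each partition gets its own file, so writes are independent."""
    path = Path(directory) / f"part.{index}.geojsonl"
    with path.open("w") as f:
        for feature in features:
            f.write(json.dumps(feature) + "\n")
    return path


def append_partition(path, features):
    """Option 2: every partition appends to one shared file, in order."""
    with Path(path).open("a") as f:
        for feature in features:
            f.write(json.dumps(feature) + "\n")


with tempfile.TemporaryDirectory() as tmp:
    # Independent files can be written concurrently.
    with ThreadPoolExecutor() as pool:
        paths = list(
            pool.map(lambda args: write_partition(tmp, *args), enumerate(partitions))
        )
    n_files = len(paths)

    # A single output file is written by a plain sequential loop.
    single = Path(tmp) / "all.geojsonl"
    for features in partitions:
        append_partition(single, features)
    n_lines = len(single.read_text().splitlines())
```

In a task graph, the first option maps to one write task per partition, whereas the second chains every write behind the previous one.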

@martinfleis
Member

explore should probably use datashader, as spatialpandas does. It should certainly not dump the data to leaflet.

@rabernat

Big 👍 to overlay. Is it significantly more complicated than sjoin?

@martinfleis
Member

> Is it significantly more complicated than sjoin?

I am afraid so. See #217 (comment)

@alejohz

alejohz commented Dec 7, 2022

👍 to explore. Is this on the roadmap?

@martinfleis
Member

@alejohz I will tentatively say yes, with the caveat that it will most likely be a datashader-based method using the HoloViz ecosystem, so a bit different from explore in vanilla GeoPandas.

@Geoyi

Geoyi commented Mar 13, 2023

I am wondering whether overlay is being actively worked on here.

@martinfleis
Member

@Geoyi I am not aware of that. It is on the roadmap, but the team's priority currently lies with the main GeoPandas project and adjacent work, so I don't think there is any active development of overlay at the moment. Anyone interested can pick it up.
