WIP: GeoMesa Support and GeoPySpark Refactoring #650
base: master
Signed-off-by: Austin Heyne <[email protected]>
@aheyne Is this ready for review? It's marked as
@jbouffard I'm planning on squashing and putting up a better PR if we decide to move forward on this. Interest is unclear to me and there are some outstanding questions, mainly the four discussion points above. There is also the more fundamental discrepancy in the API level the two projects use (GeoPySpark being more RDD-focused and GeoMesa_PySpark being DataFrame-focused). At the very least I think this is a good starting point to get a Python environment with both GeoMesa and GeoTrellis playing nicely together. If we're okay with that and want to review/merge it so we can continue this line of development, that'd be awesome.
@aheyne Ah, I see. At least on our end, I know there's definite interest in seeing GeoMesa_PySpark integrated into GPS. I've requested some time over the next few weeks to focus on this integration. I think that would be a good time to discuss your 4 points and the API discrepancies. I'm okay with having this PR be the initial starting point for this integration. We could just mark it as experimental until a later point. What are your opinions, @echeipesh @jpolchlo?
This is a WIP PR to drive comments and discussion about the proposed merge of the geomesa_pyspark module of GeoMesa into GeoPySpark, and the subsequent promotion of GeoPySpark from just GeoTrellis/VectorPipe Python bindings to a project that brings general Geo* support to Python/PySpark.
Major things of note and discussion points.
A beta release of this is available here. The GeoMesa pre-release that includes the complementary code is here with my working branch here. The changes in GeoMesa are to bind the `pyUDT` function in `AbstractGeometryUDT` to `geopyspark.GeometryUDT`. This fixes the Non issue we've been seeing.

I've included this in the README but for demonstration and reference purposes I'll include here a code sample of how to pull features from GeoMesa. This uses the YARN packaging mentioned above.