Use geotrellis-contrib to perform windowed reads on remote rasters #689
This gist contains some more mock Python API/logic for the With the Not Tiled:
Tiled:
I think it'd make sense then to combine

The one issue with this from an API perspective is the number of parameters

    class GeoTiffOptions(namedtuple("GeoTiffOptions", 'chunk_size partition_bytes time_tag time_format delimiter s3_client s3_credentials')):
        __slots__ = []

And then pass in those options as a parameter to

    def get(layer_type,
            paths,
            tile_size=MAX_TILE_SIZE,
            target_crs=None,
            geotrellis_options=GeoTiffOptions(),
            read_method=ReaderMethod.GEOTRELLIS)
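To make the options-object idea concrete, here is a minimal, self-contained sketch. It does not import GeoPySpark. `ReaderMethod.RASTERIO`, the value of `MAX_TILE_SIZE`, the `None` defaults, and the dictionary returned by `get` are all assumptions made for illustration, not the library's API. It also gives every field a default, since a bare `GeoTiffOptions()` would otherwise raise a `TypeError`.

```python
from collections import namedtuple
from enum import Enum

MAX_TILE_SIZE = 256  # hypothetical value; the thread does not state it


class ReaderMethod(Enum):
    GEOTRELLIS = "geotrellis"
    RASTERIO = "rasterio"  # assumed member name


_FIELDS = ("chunk_size partition_bytes time_tag time_format "
           "delimiter s3_client s3_credentials")


class GeoTiffOptions(namedtuple("GeoTiffOptions", _FIELDS,
                                defaults=(None,) * 7)):
    """Immutable bag of reader options; every field defaults to None."""
    __slots__ = ()


def get(layer_type, paths, tile_size=MAX_TILE_SIZE, target_crs=None,
        geotrellis_options=GeoTiffOptions(),
        read_method=ReaderMethod.GEOTRELLIS):
    """Mock: describe the read that would be performed, without doing it."""
    if isinstance(paths, str):
        paths = [paths]
    request = {
        "layer_type": layer_type,
        "paths": list(paths),
        "tile_size": tile_size,
        "target_crs": target_crs,
        "read_method": read_method,
    }
    # Only the GeoTrellis reader consumes these options in this sketch.
    if read_method is ReaderMethod.GEOTRELLIS:
        request["options"] = geotrellis_options._asdict()
    return request
```

A caller would then override only the fields it cares about, for example `get("spatial", "s3://bucket/a.tif", geotrellis_options=GeoTiffOptions(chunk_size=512))`, which keeps the signature of `get` short.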
Yes, that's one of the next steps in that project, but it's probably going to take a little while to design that API. I don't think lack of that capability should block this feature. For now I think it would be acceptable to throw
Agreed that it makes sense to group this functionality by tiled and not-tiled use case. But grouping rasterio and GeoTrellis read methods could potentially be more confusing. You're going to end up with one hell of a doc-string on the function trying to explain the differences in behavior. Also, the ways in which either method is going to fail and needs to be configured are going to be significantly different. Maybe another way is to work towards unifying
Also, another difference is that
A typical GPS workflow might start like this:

This is great for various reasons but has the following costs:
- numpy to GeoTrellis Tiles on read (low cost)
- RasterSummary spark job
- BufferedTiles for reproject operation
- target_crs
- BufferedTiles
Prototyped here is an alternative workflow that avoids the high cost of spark shuffles on pixel tiles in this process: https://github.com/geotrellis/geotrellis-contrib/blob/demo/wsel/wse/src/main/scala/Main.scala
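One way to picture a windowed read against a layout is to work out, from raster metadata alone, which layout keys a raster touches, and then read only the window for each key. The sketch below illustrates that idea in plain Python. It is not taken from the linked prototype, and `Extent`, `Layout`, `keys_for_extent`, and `tile_extent` are hypothetical names; it assumes row 0 is the northernmost row of the layout.

```python
import math
from collections import namedtuple

Extent = namedtuple("Extent", "xmin ymin xmax ymax")
Layout = namedtuple("Layout", "extent cols rows")


def keys_for_extent(raster_extent, layout):
    """Return the (col, row) keys of layout tiles the raster overlaps."""
    le = layout.extent
    tile_w = (le.xmax - le.xmin) / layout.cols
    tile_h = (le.ymax - le.ymin) / layout.rows
    # Clip the raster extent to the layout extent.
    xmin = max(raster_extent.xmin, le.xmin)
    xmax = min(raster_extent.xmax, le.xmax)
    ymin = max(raster_extent.ymin, le.ymin)
    ymax = min(raster_extent.ymax, le.ymax)
    if xmin >= xmax or ymin >= ymax:
        return []  # no overlap, nothing to read
    col_min = math.floor((xmin - le.xmin) / tile_w)
    col_max = math.ceil((xmax - le.xmin) / tile_w) - 1
    # Rows are counted from the top (north) edge of the layout.
    row_min = math.floor((le.ymax - ymax) / tile_h)
    row_max = math.ceil((le.ymax - ymin) / tile_h) - 1
    return [(c, r)
            for r in range(row_min, row_max + 1)
            for c in range(col_min, col_max + 1)]


def tile_extent(key, layout):
    """Return the map extent covered by one layout key: the read window."""
    col, row = key
    le = layout.extent
    tile_w = (le.xmax - le.xmin) / layout.cols
    tile_h = (le.ymax - le.ymin) / layout.rows
    xmin = le.xmin + col * tile_w
    ymax = le.ymax - row * tile_h
    return Extent(xmin, ymax - tile_h, xmin + tile_w, ymax)
```

Because the keys are derived from extents only, the records that would need to be grouped by key are small (a path plus a key) rather than pixel tiles; the pixels are read once per window at the end.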
This can be encapsulated in a GeoPySpark operation that combines the read and tile_to_layout (with optional reprojection) steps to produce a TiledRasterLayer.

Ideally the API can be similar to tile_to_layout, perhaps:

There is an outstanding question of what should happen to merge order. I think currently any overlapping raster is going to get merged in arbitrary order as a result of the tile_to_layout operation, and that would be an easy default to start from here.
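To see why merge order matters for overlapping rasters, here is a small pure-Python sketch. It assumes a merge rule where existing data cells are kept and only no-data cells are filled from the next tile; that rule, and the names `NODATA`, `merge`, and `merge_all`, are assumptions for illustration rather than the behavior of any particular GeoTrellis or GeoPySpark call.

```python
NODATA = None  # stand-in for a raster no-data value


def merge(base, other):
    """Cell-wise merge: keep base where it has data, else take other."""
    return [[o if b is NODATA else b for b, o in zip(base_row, other_row)]
            for base_row, other_row in zip(base, other)]


def merge_all(tiles):
    """Fold a sequence of same-sized tiles left to right."""
    result = tiles[0]
    for tile in tiles[1:]:
        result = merge(result, tile)
    return result


a = [[1, NODATA],
     [1, 1]]
b = [[2, 2],
     [NODATA, 2]]

a_first = merge_all([a, b])  # [[1, 2], [1, 1]]
b_first = merge_all([b, a])  # [[2, 2], [1, 2]]
```

The two results differ wherever both tiles have data, so an arbitrary order gives an arbitrary winner in overlaps. If determinism is wanted later, one option would be to sort the inputs by a stable key (for example the source path) before folding.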