Add STAC Summary Endpoint for Global 10m LULC Data #117
base: develop
We want to use an Equal Area (Albers) projection for this. If the AoI is in the Continental United States, then we use ConusAlbers.
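As a rough illustration of that rule, here is a minimal sketch in plain Scala. Everything in it is an assumption made for illustration: the `BBox` type, the approximate CONUS envelope, and the fallback of building an AoI-centered Albers PROJ string (with standard parallels placed by the one-sixth rule of thumb) are hypothetical and not taken from mmw-geoprocessing. The only external fact relied on is that EPSG:5070 is NAD83 / Conus Albers.

```scala
// Hypothetical sketch: the names and the CONUS envelope below are assumptions.
object CrsChoice {
  // Axis-aligned bounding box in lon/lat degrees.
  final case class BBox(west: Double, south: Double, east: Double, north: Double) {
    def contains(other: BBox): Boolean =
      west <= other.west && south <= other.south &&
        east >= other.east && north >= other.north
  }

  // Rough CONUS envelope, an approximation chosen for illustration only.
  val conus: BBox = BBox(-125.0, 24.0, -66.0, 50.0)

  // EPSG:5070 is NAD83 / Conus Albers.
  val conusAlbers: String = "EPSG:5070"

  // Assumed fallback outside CONUS: an Albers Equal Area PROJ string centered
  // on the AoI, with standard parallels at 1/6 and 5/6 of its latitude range.
  def albersFor(aoi: BBox): String = {
    val latRange = aoi.north - aoi.south
    val lat1 = aoi.south + latRange / 6.0
    val lat2 = aoi.north - latRange / 6.0
    val lat0 = (aoi.south + aoi.north) / 2.0
    val lon0 = (aoi.west + aoi.east) / 2.0
    s"+proj=aea +lat_1=$lat1 +lat_2=$lat2 +lat_0=$lat0 +lon_0=$lon0 +datum=WGS84 +units=m +no_defs"
  }

  // Use Conus Albers when the AoI lies entirely inside the CONUS envelope.
  def targetCrs(aoi: BBox): String =
    if (conus.contains(aoi)) conusAlbers else albersFor(aoi)
}
```

A box around the Philadelphia area would resolve to `EPSG:5070` under these assumptions, while a box in Europe would get a custom `+proj=aea` string.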
Thanks! Taking a look at this now. How did you generate this flamegraph? Would be useful to do as I iterate on this.
Looks like we're reprojecting on line 83, but also passing the targetCRS as a parameter in lines 88 and 90:
mmw-geoprocessing/api/src/main/scala/package.scala, lines 83 to 90 in 025bfac
If I remove the reprojection from line 83, I get an empty output (albeit much faster, likely because it's not reading any data 😅)

diff --git a/api/src/main/scala/package.scala b/api/src/main/scala/package.scala
index 199fce0..010a792 100644
--- a/api/src/main/scala/package.scala
+++ b/api/src/main/scala/package.scala
@@ -80,14 +80,14 @@ package object geoprocessing {
       sources match {
         case head :: Nil => head.some
         case head :: tail =>
-          val reprojectedSources = NonEmptyList.of(head, tail: _*).map(_.reproject(targetCRS))
-          val attributes = reprojectedSources.toList.attributesByName
+          val sources = NonEmptyList.of(head, tail: _*)
+          val attributes = sources.toList.attributesByName
           val mosaicRasterSource =
             if (parallelMosaicEnabled)
-              MosaicRasterSourceIO.instance(reprojectedSources, targetCRS, collectionName, attributes)(IORuntime.global)
+              MosaicRasterSourceIO.instance(sources, targetCRS, collectionName, attributes)(IORuntime.global)
             else
-              MosaicRasterSource.instance(reprojectedSources, targetCRS, collectionName, attributes)
+              MosaicRasterSource.instance(sources, targetCRS, collectionName, attributes)
           mosaicRasterSource.some
         case _ => None

time ./scripts/run_geotrellis examples/LowerSchuylkill.geojson
{}
________________________________________________________
Executed in    6.81 secs      fish           external
   usr time   20.76 millis    0.10 millis   20.65 millis
   sys time   21.00 millis    1.04 millis   19.95 millis

I'll see if I can figure out how to do the resampling only once while still selecting the right data. The above may imply that 6.8s is a timing floor below which we cannot go.
Overview
Adds a STAC Summary endpoint that takes a shape, a year, a STAC URI, and a STAC Collection, and returns the histogram of pixels intersecting the AoI.
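To make "histogram of pixels intersecting the AoI" concrete, here is a minimal sketch in plain Scala collections. It is not the GeoTrellis API and not the code in this PR: the `MaskedHistogram` object, the tile as a nested array, and the precomputed boolean mask are all assumptions for illustration. It only shows the mask-then-count idea: keep the cells flagged as inside the AoI, then count occurrences of each value.

```scala
// Hypothetical sketch: a tile is a row-major grid of class values, and the
// mask marks which cells fall inside the AoI.
object MaskedHistogram {
  // Count how often each value occurs among the cells where the mask is true.
  def histogram(tile: Array[Array[Int]], mask: Array[Array[Boolean]]): Map[Int, Long] = {
    val inside = for {
      row <- tile.indices
      col <- tile(row).indices
      if mask(row)(col)
    } yield tile(row)(col)
    inside.groupBy(identity).map { case (value, hits) => value -> hits.size.toLong }
  }
}
```

Every value under the mask is counted as-is, so a NODATA value would show up as its own bin unless it is filtered out afterwards.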
There is also a sister repository, https://github.com/rajadain/mmw-io-10m-lulc-summary, which has helper scripts to exercise this new endpoint and to run a Python-based comparison for two sample shapes.
Currently, the Python implementation is faster, with the GeoTrellis implementation operating at 0.5-0.75x the Python speed.
The resulting numbers are quite comparable, though.
Note that "List(0)" represents NODATA values which are present in the Python output but not in the GeoTrellis one. Ultimately they are ignored so it doesn't matter.
The close percentage values are especially promising, because that is what is used in Model My Watershed.
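For reference, turning raw counts into percentages while ignoring NODATA could look like the sketch below. The `LandCoverShare` object and its signature are hypothetical, and treating `0` as the NODATA value is an assumption based on the "List(0)" note above. It is meant only to show why an extra NODATA bin in one output does not change the percentages.

```scala
// Hypothetical sketch: converts a value -> pixel count histogram into
// percentages of valid (non-NODATA) pixels.
object LandCoverShare {
  def percentages(counts: Map[Int, Long], noData: Int = 0): Map[Int, Double] = {
    // Drop the NODATA bin so it affects neither numerator nor denominator.
    val valid = counts.filter { case (cls, _) => cls != noData }
    val total = valid.values.sum.toDouble
    if (total == 0.0) Map.empty
    else valid.map { case (cls, n) => cls -> 100.0 * n / total }
  }
}
```

With this approach, a histogram that includes a NODATA bin and one that omits it produce the same percentages as long as the remaining counts agree.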
Closes #113
Demo
xh --print HhBb :8090/stac < scratch/request.json
Notes
I'm looking for help in three places:
Rasterizer.foreachCellByMultiPolygon() to count the pixels. Here we're doing a mask and then histogram.binCount(). Is this implementation correct?
Testing Instructions
./scripts/update && ./scripts/server
./scripts/setup
./scripts/run_python examples/LowerSchuylkill.geojson to get a baseline
./scripts/run_geotrellis examples/LowerSchuylkill.geojson to exercise this new endpoint