aggregate_spatial: use 'id' property of input geojson to name output features #74

jdries · 2022-09-07T11:37:10Z

When writing netCDF output for aggregate_spatial, we currently name the features ['feature_0', 'feature_1', ...].
Our users would like us to use the feature id of the input GeoJSON to name these features, so that a timeseries can more easily be linked back to the input.

This relates to support for vector cubes/feature collections, which also requires us to preserve feature properties throughout the processing.

jdries · 2022-09-13T08:16:43Z

Loading geometry through a specific process should trigger usage of the new-style vector cube, which retains these feature ids.

soxofaan · 2022-09-13T09:23:40Z

FYI: see EP-3981 and Open-EO/openeo-python-driver#114 for my initial implementation of VectorCube support (geopandas based)

Basic test that illustrates the workflow: https://github.com/Open-EO/openeo-python-driver/blob/master/tests/test_views_execute.py#L754-L810

1. load new "VectorCube" from GeoJSON with the (experimental) load_uploaded_files process
2. use the vector cube as geometry in aggregate_spatial
3. export the result as GeoJSON again (note the preservation of the original GeoJSON properties and the addition of aggregation values as new properties)
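
The last step of that workflow (original properties preserved, aggregation values added as new properties) can be illustrated without a back-end. This is only a minimal sketch in plain Python, not the geopandas-based implementation referenced above: the helper `attach_aggregates`, the property name `mean_B04` and the numeric values are all hypothetical placeholders.

```python
import copy


def attach_aggregates(feature_collection: dict, aggregates: list) -> dict:
    """Return a copy of the FeatureCollection with aggregation values added as properties.

    `aggregates` holds one dict of new properties per feature, in feature order.
    """
    features = feature_collection["features"]
    if len(features) != len(aggregates):
        raise ValueError("expected one set of aggregation values per feature")

    result = copy.deepcopy(feature_collection)
    for feature, values in zip(result["features"], aggregates):
        properties = feature.get("properties") or {}
        # Original properties are kept; aggregation values are added alongside them
        # (on a name clash the aggregation value wins in this sketch).
        feature["properties"] = {**properties, **values}
    return result


collection = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "apples", "properties": {"crop": "fruit"}, "geometry": None},
        {"type": "Feature", "id": "oranges", "properties": {}, "geometry": None},
    ],
}

# Placeholder values, not real measurements.
enriched = attach_aggregates(collection, [{"mean_B04": 0.1}, {"mean_B04": 0.2}])
print(enriched["features"][0]["properties"])
```

The input collection is deep-copied, so the original features (including their "id") are left untouched.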

bossie · 2022-09-16T10:22:24Z

Test case:

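from shapely.geometry import Polygon, mapping

# `conn` below is an existing openeo connection (e.g. obtained from openeo.connect(...))

polygon_1 = Polygon([(10.4566, 51.3747), (10.4335, 51.3732), (10.4527, 51.3615), (10.4566, 51.3747)])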
polygon_2 = Polygon([(10.4566, 51.3747), (10.4334, 51.3732), (10.4528, 51.3614), (10.4566, 51.3747)])

def as_feature(geometry, id) -> dict:
    return {
        'type': 'Feature',
        'id': id,
        'properties': {},
        'geometry': mapping(geometry)
    }

feature_collection = {
    'type': 'FeatureCollection',
    'properties': {},
    'features': [
        as_feature(polygon_1, id="apples"),
        as_feature(polygon_2, id="oranges")
    ]
}

im = (conn
      .load_collection("SENTINEL2_L2A",
                       bands=["B04", "B03", "B02"],
                       spatial_extent={"west": 10.4005, "south": 51.3371, "east": 10.5152, "north": 51.3856},
                       temporal_extent=["2021-07-08T00:00:00Z", "2021-07-08T00:00:00Z"])
      .aggregate_spatial(feature_collection, "mean"))

im.download("/tmp/test_aggregate_spatial_feature_ids.nc")

bossie@rastapopoulos:~$ ncdump /tmp/test_aggregate_spatial_feature_ids.nc | grep 'feature_names ='
 feature_names = "feature_0", "feature_1" ;
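
The manual ncdump check above could also be turned into an assertion. A minimal sketch, assuming the ncdump line has the shape shown above; `parse_feature_names` is a hypothetical helper, not part of any openEO package:

```python
import re


def parse_feature_names(ncdump_line: str) -> list:
    """Extract the quoted names from an ncdump line like: feature_names = "a", "b" ;"""
    _, _, values = ncdump_line.partition("=")
    return re.findall(r'"([^"]*)"', values)


line = ' feature_names = "feature_0", "feature_1" ;'
names = parse_feature_names(line)

# The ids given to the input features in the test case.
expected = ["apples", "oranges"]
print(names == expected)  # False for the output above: the input ids are not used yet
```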

bossie · 2022-09-20T10:04:46Z

Example client code:

feature_collection = conn.datacube_from_process("load_uploaded_files",
                                                paths=["/data/users/Public/vdboschj/FeatureCollection.geojson"],
                                                format="GeoJSON")

im = (conn
      .load_collection("SENTINEL2_L2A",
                       bands=["B04", "B03", "B02"],
                       spatial_extent={"west": 10.4005, "south": 51.3371, "east": 10.5152, "north": 51.3856},
                       temporal_extent=["2021-07-08T00:00:00Z", "2021-07-08T00:00:00Z"])
      .aggregate_spatial(feature_collection, "mean"))

im.download("means.nc")

@soxofaan : is there a more elegant way to write this?


bossie · 2022-09-20T13:03:06Z

Alternative:

from openeo.processes import load_uploaded_files

feature_collection = load_uploaded_files(paths=["/data/users/Public/vdboschj/FeatureCollection.geojson"],
                                         format="GeoJSON")

# ...


bossie · 2022-09-23T11:19:18Z

The input GeoJSON file has to be accessible from the openEO back-end and its Features should carry an "id", either:

- as a child of the Feature (as in the example below), or
- as part of its "properties": "properties": {"id": "apples"}

{
  "type": "FeatureCollection",
  "properties": {},
  "features": [
    {
      "type": "Feature",
      "id": "apples",
      "properties": {},
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [
              10.4566,
              51.3747
            ],
            [
              10.4335,
              51.3732
            ],
            [
              10.4527,
              51.3615
            ],
            [
              10.4566,
              51.3747
            ]
          ]
        ]
      }
    },
    {
      "type": "Feature",
      "id": "oranges",
      "properties": {},
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [
              10.4566,
              51.3747
            ],
            [
              10.4334,
              51.3732
            ],
            [
              10.4528,
              51.3614
            ],
            [
              10.4566,
              51.3747
            ]
          ]
        ]
      }
    }
  ]
}
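
The two accepted locations for the "id" can be sketched as a small lookup. This is an illustration only, not the driver's actual code: the helper names are hypothetical, and both the precedence of a top-level "id" over "properties.id" and the fallback to the old 'feature_N' naming are assumptions.

```python
def feature_name(feature: dict, index: int) -> str:
    """Pick a name for a GeoJSON Feature: top-level "id", then properties.id, then a default."""
    # Assumption: a top-level "id" takes precedence over "properties.id".
    if feature.get("id") is not None:
        return str(feature["id"])
    properties = feature.get("properties") or {}
    if properties.get("id") is not None:
        return str(properties["id"])
    # Assumption: features without any id keep the positional name.
    return f"feature_{index}"


def feature_names(feature_collection: dict) -> list:
    """Return one name per feature, in feature order."""
    return [feature_name(f, i) for i, f in enumerate(feature_collection["features"])]


example = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "apples", "properties": {}, "geometry": None},
        {"type": "Feature", "properties": {"id": "oranges"}, "geometry": None},
        {"type": "Feature", "properties": {}, "geometry": None},
    ],
}

print(feature_names(example))  # ['apples', 'oranges', 'feature_2']
```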


bossie · 2022-09-26T08:15:15Z

The @lru_cache-decorated load_collection can now cope with a DriverVectorCube being passed as an argument and used as part of the cache key. Added some tests to make sure that this caching actually works (I don't think it did).
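
For context: functools.lru_cache builds its cache key from the call arguments, so every argument has to be hashable, and two equal arguments must hash the same for a cache hit. How DriverVectorCube satisfies this is not shown in this thread; the sketch below only illustrates the mechanism with a hypothetical stand-in class (`VectorCubeKey`) and a dummy `load_collection`.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class VectorCubeKey:
    """Hypothetical hashable stand-in for a vector cube; compared and hashed by value."""
    feature_ids: tuple


@lru_cache(maxsize=None)
def load_collection(collection_id: str, geometries: VectorCubeKey) -> str:
    # Stand-in for an expensive load; only runs on a cache miss.
    return f"{collection_id} for {len(geometries.feature_ids)} features"


# Two distinct but equal key objects: the second call should be served from the cache.
first = load_collection("SENTINEL2_L2A", VectorCubeKey(("apples", "oranges")))
second = load_collection("SENTINEL2_L2A", VectorCubeKey(("apples", "oranges")))

info = load_collection.cache_info()
print(info.hits, info.misses)  # 1 1
```

Checking `cache_info()` as above is one way a test can assert that the cache is really hit, rather than assuming it.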

jdries assigned bossie Sep 7, 2022

jdries mentioned this issue Sep 16, 2022

SRD-07 data retrieval from openEO: raw bands, based on samples Open-EO/FuseTS#63

Closed

bossie added a commit to Open-EO/openeo-python-driver that referenced this issue Sep 20, 2022

assert that both "id" and "properties.id" are considered

2d0b27e

Open-EO/openeo-geotrellis-extensions#74

bossie added a commit to Open-EO/openeo-python-driver that referenced this issue Sep 20, 2022

incorporate feature IDs in netCDF output for aggregate_spatial

037790f

Open-EO/openeo-geotrellis-extensions#74

bossie added a commit to Open-EO/openeo-python-driver that referenced this issue Sep 22, 2022

use input feature's "id" in aggregate_spatial netCDF output

69b98f0

Open-EO/openeo-geotrellis-extensions#74

bossie added a commit to Open-EO/openeo-geopyspark-driver that referenced this issue Sep 23, 2022

support DriverVectorCube in aggregate_spatial

7d3c35a

Open-EO/openeo-geotrellis-extensions#74

bossie added a commit to Open-EO/openeo-python-driver that referenced this issue Sep 23, 2022

support @lru_cache on load_collection with DriverVectorCube in LoadPa…

dd05a21

…rameters Open-EO/openeo-geotrellis-extensions#74

bossie added a commit to Open-EO/openeo-geopyspark-driver that referenced this issue Sep 23, 2022

assert load_collection is cached

cd9747a

Open-EO/openeo-geotrellis-extensions#74

bossie closed this as completed Sep 26, 2022

bossie mentioned this issue Sep 26, 2022

make LoadParameters immutable Open-EO/openeo-python-driver#140

Open
