feat: add load_stac
#127
Codecov Report
@@ Coverage Diff @@
## main #127 +/- ##
==========================================
- Coverage 76.63% 75.70% -0.93%
==========================================
Files 25 26 +1
Lines 1070 1169 +99
==========================================
+ Hits 820 885 +65
- Misses 250 284 +34
... and 1 file with indirect coverage changes
@soxofaan maybe you could also have a look at this PR, since it can be related to how we handle STAC in the Python client too.
this is a bit unknown territory for me, so just some superficial notes
Thanks @soxofaan, your feedback is always appreciated. I will fix the addressed points and then wait for @LukeWeidenwalker when he comes back next week.
@clausmichele Thanks a lot for this - bunch of comments!
On the question of stackstac vs odc-stac:
- I'm generally rather indifferent to which library we end up using, but would be good to be deliberate about the choice!
- You mention metadata not being parsed in "clarification on difference between this library and stackstac?" (opendatacube/odc-stac#54 (comment)). Just for my understanding, which specific metadata are you referring to here?
- In anticipation of UC8, we should consider how we'll limit this query to certain geometries. odc-stac can do this with the geopolygon parameter here, are you aware of any equivalent functionality in stackstac?
I'll do some more testing with EODC collections specifically today, so there might be more feedback coming up!
Co-authored-by: Lukas Weidenholzer <[email protected]>
You can test the difference using this sample script:

import odc.stac
import planetary_computer as pc
import pystac_client
import stackstac

URL = "https://planetarycomputer.microsoft.com/api/stac/v1"
# Sign asset URLs in place so the Planetary Computer assets can be read
catalog = pystac_client.Client.open(URL, modifier=pc.sign_inplace)

spatial_extent = {"west": 11.259613, "east": 11.406212, "south": 46.461019, "north": 46.522237}
# STAC bbox order is [west, south, east, north]
bbox = [spatial_extent["west"], spatial_extent["south"], spatial_extent["east"], spatial_extent["north"]]

items = catalog.search(
    bbox=bbox,
    collections=["landsat-8-c2-l2"],
    datetime=["2021-01-01T00:00:00.000Z", "2023-12-01T00:00:00.000Z"],
).get_all_items()
print(len(items))

print("+" * 80)
print("OUTPUT OF ODC-STAC")
print("+" * 80)
# chunks={} requests a lazy, dask-backed result
odc_data = odc.stac.load(items, chunks={})
print(odc_data.SR_B1)

print("+" * 80)
print("OUTPUT OF STACKSTAC")
print("+" * 80)
stackstac_data = stackstac.stack(items)
print(stackstac_data.loc[dict(band="SR_B1")])

Output:

79
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
OUTPUT OF ODC-STAC
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
<xarray.DataArray 'SR_B1' (time: 79, y: 13132, x: 11832)>
dask.array<SR_B1, shape=(79, 13132, 11832), dtype=float32, chunksize=(1, 13132, 11832), chunktype=numpy.ndarray>
Coordinates:
* y (y) float64 5.374e+06 5.374e+06 5.374e+06 ... 4.98e+06 4.98e+06
* x (x) float64 4.98e+05 4.98e+05 4.98e+05 ... 8.529e+05 8.529e+05
spatial_ref int32 32632
* time (time) datetime64[ns] 2021-01-04T10:04:24.413826 ... 2022-03...
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
OUTPUT OF STACKSTAC
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
C:\Users\mclaus\Documents\GitHub\stackstac\stackstac\prepare.py:369: UserWarning: The argument 'infer_datetime_format' is deprecated and will be removed in a future version. A strict version of it is now the default, see https://pandas.pydata.org/pdeps/0004-consistent-to-datetime-parsing.html. You can safely remove this argument.
times = pd.to_datetime(
<xarray.DataArray 'stackstac-4a7f4a6696aa69b99ddd17edd9f4fb28' (time: 79,
y: 13132,
x: 11832)>
dask.array<getitem, shape=(79, 13132, 11832), dtype=float64, chunksize=(1, 1024, 1024), chunktype=numpy.ndarray>
Coordinates: (12/27)
* time (time) datetime64[ns] 2021-01-04T10:04:24.41...
id (time) <U31 'LC08_L2SP_193027_20210104_02_T1...
band <U13 'SR_B1'
* x (x) float64 4.98e+05 4.98e+05 ... 8.529e+05
* y (y) float64 5.374e+06 5.374e+06 ... 4.98e+06
instruments object {'tirs', 'oli'}
... ...
title <U46 'Coastal/Aerosol Band (B1)'
gsd float64 30.0
common_name object 'coastal'
center_wavelength object 0.44
full_width_half_max object 0.02
epsg int32 32632
Attributes:
spec: RasterSpec(epsg=32632, bounds=(497970.0, 4980270.0, 852930.0...
crs: epsg:32632
transform: | 30.00, 0.00, 497970.00|\n| 0.00,-30.00, 5374230.00|\n| 0.0...
resolution: 30.0

Anyway, we can also easily support both libraries! But at the moment I need to release this version based on stackstac; I will open a new PR later showing how to use odc-stac as well.
From what I see in the odc-stac docs, it is just using the bbox of the polygon, so nothing particularly difficult to implement for stackstac as well.
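For illustration, a minimal sketch of that bbox-of-the-polygon approach, assuming the geometry arrives as a GeoJSON Polygon or MultiPolygon in lon/lat. The helper name polygon_bbox and the sample area of interest are hypothetical and not part of this PR. stackstac.stack does accept a bounds_latlon argument, but that call is only shown as a comment because it needs real STAC items.

```python
from typing import Tuple


def polygon_bbox(geometry: dict) -> Tuple[float, float, float, float]:
    """Return (west, south, east, north) of a GeoJSON Polygon or MultiPolygon.

    Hypothetical helper: it does not handle geometries crossing the antimeridian.
    """
    if geometry["type"] == "Polygon":
        polygons = [geometry["coordinates"]]
    elif geometry["type"] == "MultiPolygon":
        polygons = geometry["coordinates"]
    else:
        raise ValueError(f"Unsupported geometry type: {geometry['type']}")

    xs, ys = [], []
    for rings in polygons:
        # The first ring is the exterior; holes cannot extend the bounding box.
        for x, y, *_ in rings[0]:
            xs.append(x)
            ys.append(y)
    return min(xs), min(ys), max(xs), max(ys)


# Made-up triangular area of interest inside the extent used in the sample script.
aoi = {
    "type": "Polygon",
    "coordinates": [[[11.26, 46.46], [11.41, 46.46], [11.33, 46.52], [11.26, 46.46]]],
}
bounds = polygon_bbox(aoi)

# With real items, the bounds could then be handed to stackstac, e.g.:
# stackstac_data = stackstac.stack(items, bounds_latlon=bounds)
```

This only crops to the enclosing rectangle; pixels outside the polygon but inside the bbox would still need masking in a later step.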
Ah, gotcha, thanks for the example!
@LukeWeidenwalker PR for adding the spec is here: eodcgmbh/openeo-processes#9
Closes #120
First version of load_stac, using stackstac to load items generated by queries using pystac-client. Another possibility would be using odc-stac, but it currently has some limitations, which I explained here: opendatacube/odc-stac#54 (comment)
Anyway, we could even consider supporting both, depending on the user requirements.
Currently supports only STAC Collections provided by catalogs with the /query endpoint for filtering.
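As a rough sketch of the query-building step described above (not the code in this PR), the following shows how a spatial extent given as west/south/east/north plus a temporal interval could be turned into keyword arguments for a pystac-client search. The function name build_search_kwargs and its signature are assumptions made for illustration; the search call itself is left as a comment because it needs network access to a catalog.

```python
from typing import Any, Dict, List, Sequence


def build_search_kwargs(
    spatial_extent: Dict[str, float],
    temporal_extent: Sequence[str],
    collections: Sequence[str],
) -> Dict[str, Any]:
    """Build keyword arguments for a STAC item search (hypothetical helper)."""
    # STAC expects the bbox as [west, south, east, north].
    bbox: List[float] = [
        spatial_extent["west"],
        spatial_extent["south"],
        spatial_extent["east"],
        spatial_extent["north"],
    ]
    if bbox[0] > bbox[2] or bbox[1] > bbox[3]:
        raise ValueError("spatial_extent has west > east or south > north")

    if len(temporal_extent) != 2:
        raise ValueError("temporal_extent must be a [start, end] pair")

    return {
        "collections": list(collections),
        "bbox": bbox,
        "datetime": list(temporal_extent),
    }


kwargs = build_search_kwargs(
    spatial_extent={"west": 11.259613, "east": 11.406212, "south": 46.461019, "north": 46.522237},
    temporal_extent=["2021-01-01T00:00:00.000Z", "2023-12-01T00:00:00.000Z"],
    collections=["landsat-8-c2-l2"],
)

# With an opened catalog, the items could then be fetched and stacked, e.g.:
# items = catalog.search(**kwargs).get_all_items()
# data = stackstac.stack(items)
```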