Issues with latest satsearch #6
Comments
Thanks for the detailed issue, @TomAugspurger. There have been a lot of changes to sat-search (>0.3) and intake-stac (>0.3) in the last couple of months, and a new version of intake-stac was released just yesterday. Long story short, the notebook needs some updating once those new versions are in the environment. |
Hi @scottyhq and @TomAugspurger, I was just running the landsat8 notebook on a newly installed Dask cluster on Azure K8s. I used the exact same version of satsearch (==0.2.3), but still cannot get it to run. Here are the details of the error:
Any idea? |
Mmm, I'm not sure. That looks like a bug in satsearch. I believe that development focus is shifting from satsearch to https://github.com/stac-utils/pystac-api-client, but I'm not sure how mature pystac-api-client is yet. |
I think this would be a simple fix by changing line 28 to:
But I cannot access and edit this file, /srv/conda/envs/notebook/lib/python3.8/site-packages/satsearch/search.py, as I am using a dummy user on the Dask JupyterLab. |
@TomAugspurger @ZihengSun, yes. Taking a step back, this example really needs to be updated to use a different L8 dataset; see #8 for some alternatives, including accessing Harmonized Landsat Sentinel-2 (HLS) via NASA's CMR STAC endpoint. It would be awesome if all public datasets in AWS, Azure, and Google had up-to-date STAC metadata and search endpoints, but that is still very much a work in progress... |
Yes, it's all a work in progress: the magic Stack of STACs. |
Hi everyone! I was trying to reproduce this notebook, and on the third cell the search throws an error that seems to be related to the search call:
To me, it looks like an issue with the endpoint being called. Can anyone give a hint on how to solve this? I have already installed satsearch from its GitHub repo, but the error persists. |
I'd recommend trying pystac-client (and updating the notebook if that works). Something like catalog = pystac_client.Client.open("https://earth-search.aws.element84.com/v0") Docs are at https://pystac-client.readthedocs.io/. |
Thanks @TomAugspurger ! I will try, and if it works, I will fix and submit a PR, ok? |
I was hoping to use this in a demo and found the same issue described in #6 (comment). I was able to run catalog = pystac_client.Client.open("https://earth-search.aws.element84.com/v0") but could not figure out how to refactor the rest of the search in cell 3 to use the new API. This gallery is a very important demonstration of Pangeo's capabilities in geospatial analysis. Let's get it working again! |
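For anyone refactoring cell 3, here is a rough sketch of what the search might look like with pystac-client. It is not the notebook's code: `build_search_kwargs` and `run_search` are hypothetical helpers, the bounding box and dates are placeholders, and the collection ID is the one quoted in the issue text, which may not match what the endpoint currently serves. The network call is left commented out so the snippet runs without touching the API.

```python
from typing import Any, Dict, List, Sequence

# Endpoint quoted earlier in this thread.
STAC_URL = "https://earth-search.aws.element84.com/v0"


def build_search_kwargs(
    collection: str,
    bbox: Sequence[float],
    start: str,
    end: str,
    max_items: int = 100,
) -> Dict[str, Any]:
    """Assemble keyword arguments for Client.search (hypothetical helper)."""
    west, south, east, north = bbox  # raises if bbox does not have 4 values
    return {
        "collections": [collection],
        "bbox": [west, south, east, north],
        "datetime": f"{start}/{end}",  # date range as "start/end"
        "max_items": max_items,
    }


def run_search(kwargs: Dict[str, Any], url: str = STAC_URL) -> List[Any]:
    """Query the STAC API; pystac_client is imported lazily, only when called."""
    import pystac_client

    client = pystac_client.Client.open(url)
    search = client.search(**kwargs)
    # get_all_items() is the call used later in this thread; newer
    # pystac-client releases may prefer a different method.
    return list(search.get_all_items())


# Placeholder area and dates, NOT the notebook's values.
kwargs = build_search_kwargs(
    collection="landsat-8-l1",  # ID from the issue text; may differ on the live endpoint
    bbox=(-124.71, 45.47, -116.78, 48.93),
    start="2019-01-01",
    end="2019-12-31",
)
# items = run_search(kwargs)  # uncomment to hit the network
```

Keeping the parameter assembly separate from the network call makes it easier to check the query before debugging endpoint problems.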
I got dragged into lab affairs these last few days, but I will go over this later this week. @rabernat, would the refactoring basically be due to the method change, or are there new "errors"? |
I'll take a look quick. |
Gotta move on, but I have a start at https://gist.github.com/6051aa1705dc6797beccc9ac6e321ef3.
I'll pick things up later if I have a chance. |
The Landsat example here may also be of interest, it uses Here's the rendered version |
Hmm, now I'm having issues with accessing the data (e.g. the link at https://landsat-pds.s3.us-west-2.amazonaws.com/c1/L8/047/027/LC08_L1TP_047027_20210630_20210708_01_T1/LC08_L1TP_047027_20210630_20210708_01_T1_thumb_large.jpg is giving a 40x error). Did that bucket recently change to requester-pays? For some reason I thought it had been requester pays for a while, but now I'm not sure. I could change the source to the Planetary Computer's landsat collection in Azure, but would want an OK from @scottyhq before doing that. Or I could put that in a separate notebook. |
Thanks for taking time to try and fix the now-dated example @TomAugspurger
Yes, I think that s3://landsat-pds has finally been retired! See pydata/xarray#6363 (comment). There are now at least 4 options for cloud-hosted Landsat data (for better or worse)!
@rabernat @TomAugspurger Happy to merge a PR for an updated notebook... But in my mind, Pangeo Gallery exists primarily to 1. illustrate large-scale examples of moving compute to the data and 2. actually execute large examples as big integration tests and see when data or software problems come up (as they have here!). So my reluctance to continue maintaining this example with AWS datasets is due to the following:
|
Hello, I have also been browsing a bunch of online materials to get started with intake for Landsat data and have not yet been able to reproduce a single notebook. I have signed up for Microsoft's Planetary Computer (since, correct me if I am mistaken, it seems to be more aligned with open source principles than Google's Earth Engine), and in the meantime I have been trying to use the AWS datasets with an AWS account. However, when I try to convert to dask/xarray:

import boto3
import intake
import pystac_client
import rasterio as rio
from rasterio import session

# Requester-pays access through a named AWS profile
aws_session = session.AWSSession(boto3.Session(profile_name="aws"), requester_pays=True)

stac_uri = "https://landsatlook.usgs.gov/stac-server"
collections = ["landsat-c2l1"]
client = pystac_client.Client.open(stac_uri)
results = client.search(
    collections=collections,
    bbox=...,
    datetime=...)
items = results.get_all_items()
catalog = intake.open_stac_item_collection(items)

# Load the "blue" asset of the first item lazily
with rio.Env(aws_session):
    ds = catalog[list(catalog)[0]]["blue"].to_dask()

I obtain:
Since the error seems to come from rasterio, I tried:

import boto3
import rasterio as rio
from rasterio import session

aws_session = session.AWSSession(boto3.Session(profile_name="aws"), requester_pays=True)
uri = "https://landsatlook.usgs.gov/data/collection02/level-1/standard/oli-tirs/2020/165/062/LC08_L1TP_165062_20201231_20210308_02_T1/LC08_L1TP_165062_20201231_20210308_02_T1_B2.TIF"
with rio.Env(aws_session):
    with rio.open(uri) as src:
        print(src.profile)

but obtain the same error. However, if I change the https uri for its s3 version, i.e., Also note that I tried setting the Is this normal? How can I make the
For info: rasterio version is 1.3.0, intake version is 0.6.5, and pystac_client version is 0.4.0.
Thank you in advance. Best, |
This is suggestive of reading an HTML form rather than the actual file. I ran into this issue recently and found a solution here: https://gis.stackexchange.com/questions/430026/gdalinfo-authenticate-for-remote-file. As a consequence of the authentication required for HTTP links, it seems best to stick with the s3:// URLs, as you've discovered. Here is a more up-to-date example I'd suggest following for |
Thanks @scottyhq for your response. I have managed to stream landsat data from AWS into xarray thanks to the example that you sent. If I am understanding the situation correctly, this means that intake should be updated to read s3 urls rather than https, right? |
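Until the tooling handles this itself, one workaround is to rewrite the hrefs in user code before opening them. The sketch below is only that: `s3_href` is a hypothetical helper, and both the `alternate` key layout and the `usgs-landsat` bucket name are assumptions that should be checked against the actual item metadata.

```python
from typing import Any, Dict

HTTPS_PREFIX = "https://landsatlook.usgs.gov/data/"
S3_PREFIX = "s3://usgs-landsat/"  # assumed bucket name; verify against the item metadata


def s3_href(asset: Dict[str, Any]) -> str:
    """Return an s3:// href for a STAC asset dict (hypothetical helper)."""
    # Prefer an explicit alternate link if the item advertises one
    # (assumed layout: asset["alternate"]["s3"]["href"]).
    alternate = asset.get("alternate", {}).get("s3", {}).get("href")
    if alternate:
        return alternate
    href = asset["href"]
    if href.startswith(HTTPS_PREFIX):
        # Fall back to swapping the prefix, assuming the object key
        # mirrors the path of the HTTPS URL.
        return S3_PREFIX + href[len(HTTPS_PREFIX):]
    return href  # leave anything else untouched


# The HTTPS URL from the comment above, wrapped in a minimal asset dict.
asset = {
    "href": (
        "https://landsatlook.usgs.gov/data/collection02/level-1/standard/"
        "oli-tirs/2020/165/062/LC08_L1TP_165062_20201231_20210308_02_T1/"
        "LC08_L1TP_165062_20201231_20210308_02_T1_B2.TIF"
    )
}
converted = s3_href(asset)
```

The converted string can then be passed to `rio.open` inside the same `rio.Env(aws_session)` block used above.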
A collection of issues with latest versions of satsearch / intake-stac.
https://earth-search.aws.element84.com/v0 is the recommended URL, and you pass it like results = Search.search(url=api_url, collection='landsat-8-l1', ...).
eo:bands isn't present in the GeoDataFrame: band_info = pd.DataFrame(ast.literal_eval(gf.iloc[0]['eo:bands'])). I think that it's available in the Item assets, so repeat that for each one?
The hardcoded scenid isn't present in the catalog.
ValueError from intake-stac: setup and error on aws search example, intake/intake-stac#64.
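On the eo:bands point, a minimal sketch of collecting band metadata from each asset is below. It assumes the items carry `eo:bands` at the asset level; `band_table` is a hypothetical helper and the example item is made up for illustration.

```python
from typing import Any, Dict, List


def band_table(item: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect eo:bands entries from every asset of a STAC item dict."""
    rows = []
    for asset_key, asset in item.get("assets", {}).items():
        # Assets without band metadata (thumbnails, etc.) are skipped.
        for band in asset.get("eo:bands", []):
            rows.append({"asset": asset_key, **band})
    return rows


# Made-up item for illustration only.
example_item = {
    "id": "hypothetical-scene",
    "assets": {
        "B4": {"href": "B4.TIF", "eo:bands": [{"name": "B4", "common_name": "red"}]},
        "B5": {"href": "B5.TIF", "eo:bands": [{"name": "B5", "common_name": "nir"}]},
        "thumbnail": {"href": "thumb.jpg"},
    },
}
rows = band_table(example_item)
```

Passing `rows` to `pd.DataFrame` should give a table along the lines of the old `band_info`, with one row per band.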