from netrc import netrc
from pathlib import Path
from time import sleep

from eodms_dds import dds
from eodms_rapi import EODMSRAPI

this_file = Path(__file__)
eodms_user, _, eodms_pwd = netrc(
    Path('~/.netrc').expanduser()
).hosts['data.eodms-sgdot.nrcan-rncan.gc.ca']
rapi = EODMSRAPI(eodms_user, eodms_pwd)

# search options
### the nested tuples/lists/dicts is hard for me to understand but I get that they're necessary
### for multi-select filters. There HAS to be a more user-friendly way, like if a list of product types is
### provided, the search-api must be smart enough to use the right operator. Probably needs logic to account
### for range-type filters like incidence angle too
collection = "RCMImageProducts"
filters = {
    'Product Type': ('=', 'GRD'),
    'LUT Applied': ('=', 'Ice'),
}
features = [
    ('intersects', str(this_file.parent / 'assets' / 'lancaster_gate_30km_buffer_clip.geojson')),
]
dates = [
    {
        "start": "20241101_000000",
        "end": "20241102_000000"
    }
]

### the hit-count kwarg is nice to have for sanity-checks prior to "real" search queries!
rapi.search(collection=collection, filters=filters, features=features, dates=dates)
results = rapi.get_results(form='full')

### right here is where I as a user would want an easy way to either convert results dict
### to geodataframe or dump to geojson/shp/gpkg in order to narrow down the suitability of images
### since despite intersecting with the AOI it might be a tiny fraction. Using contains/within
### probably won't help either in the initial query.
# ddsapi needs the uuids which are stored in a couple of spots but this one seems easiest to manipulate
uuids = [r['metadataFullName'].split('/')[-1] for r in results]

# download results
out_dir = Path('~/Downloads/eodms-beta-test').expanduser()
out_dir.mkdir(exist_ok=True)
dds_api = dds.DDS_API(eodms_user, eodms_pwd, environment="prod")  # testing in prod! right on!

for item_id in uuids:
    print(item_id)
    item_info = dds_api.get_item(collection=collection, item_uuid=item_id)
    # wait for the download_url to appear in dict keys
    ### this polling is better than polling for EODMS order fulfillment - would be nice to be able to queue up N granules (N decided by account type?)
    while 'download_url' not in item_info.keys():  # could also just check dds_api.img_info? why bother returning item_info then?
        sleep(10)
        item_info = dds_api.get_item(collection=collection, item_uuid=item_id)
    ### download_item() is curious because it doesn't take an item_id but get_item() does...
    ### I guess because the DDS_API class has an img_info attribute that stores the result of get_item()
    ### but then why does get_item() return the json too?
    dds_api.download_item(out_dir)
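The polling loop above has no upper bound, so a granule that never becomes ready would block the script forever. Here is a minimal sketch of a bounded version. The helper name `wait_for_key` and its parameters are hypothetical and not part of `eodms_dds`; it only wraps whatever callable you give it, and it treats a `None` response as "not ready yet", which is an assumption on my part.

```python
import time
from typing import Any, Callable, Dict, Optional


def wait_for_key(
    fetch: Callable[[], Optional[Dict[str, Any]]],
    key: str = "download_url",
    interval: float = 10.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch() until its result contains `key`, or raise TimeoutError.

    Hypothetical helper: `fetch` is any zero-argument callable that returns
    a dict (or None). Nothing here depends on the EODMS packages.
    """
    deadline = time.monotonic() + timeout
    while True:
        info = fetch()
        # Assumption: a None response means "not ready yet", not a hard error.
        if info is not None and key in info:
            return info
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{key!r} did not appear within {timeout} s")
        sleep(interval)
```

In the script above it could replace the `while` loop with something like `item_info = wait_for_key(lambda: dds_api.get_item(collection=collection, item_uuid=item_id))`, followed by the same `dds_api.download_item(out_dir)` call.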
test_03.py
from netrc import netrc
from pathlib import Path
from time import sleep
from concurrent.futures import ThreadPoolExecutor

from eodms_dds import dds
from eodms_rapi import EODMSRAPI

this_file = Path(__file__)
eodms_user, _, eodms_pwd = netrc(
    Path('~/.netrc').expanduser()
).hosts['data.eodms-sgdot.nrcan-rncan.gc.ca']
rapi = EODMSRAPI(eodms_user, eodms_pwd)

# search options
### the nested tuples/lists/dicts is hard for me to understand but I get that they're necessary
### for multi-select filters. There HAS to be a more user-friendly way, like if a list of product types is
### provided, the search-api must be smart enough to use the right operator. Probably needs logic to account
### for range-type filters like incidence angle too
collection = "RCMImageProducts"
filters = {
    'Product Type': ('=', 'GRD'),
    'LUT Applied': ('=', 'Ice'),
}
features = [
    ('intersects', str(this_file.parent / 'assets' / 'lancaster_gate_30km_buffer_clip.geojson')),
]
dates = [
    {
        "start": "20241105_000000",
        "end": "20241106_000000"
    }
]

out_dir = Path('~/Downloads/eodms-beta-test').expanduser()
out_dir.mkdir(exist_ok=True)


### quick-n-dirty function for concurrent use later
def order_and_download(api_obj, item_ids):
    for item in item_ids:
        item_info = api_obj.get_item(collection=collection, item_uuid=item)
        while 'download_url' not in item_info.keys():
            sleep(10)
            item_info = api_obj.get_item(collection=collection, item_uuid=item)
        api_obj.download_item(out_dir)
    return


### the hit-count is nice to have for sanity-checks prior to "real" search queries
rapi.search(collection=collection, filters=filters, features=features, dates=dates)
results = rapi.get_results(form='full')

### note how if the query params are adjusted (or even just the search is repeated with same params), the number
### of results just goes up (due to how dds_api just appends results rather than replaces)
# ddsapi needs the uuids which are stored in a couple of spots but this one seems easiest to manipulate
### need to check for Nones because dds_api will just return None in a lot of cases?
uuids = list(set([r['metadataFullName'].split('/')[-1] for r in results if r is not None]))

# download results
# really filthy concurrent method
n_workers = 4
batches = [uuids[i::n_workers] for i in range(n_workers)]
apis = [dds.DDS_API(eodms_user, eodms_pwd, environment='prod') for _ in range(n_workers)]
with ThreadPoolExecutor(max_workers=n_workers) as executor:
    futures = [executor.submit(order_and_download, api, batch) for api, batch in zip(apis, batches)]
    results = [future.result() for future in futures]
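For anyone reading along, the `uuids[i::n_workers]` slicing deals the list out round-robin, so batch sizes differ by at most one. When a search returns fewer granules than there are workers, some batches come out empty. A small sketch of that behaviour follows; `split_into_batches` is a hypothetical helper name, and dropping the empty batches is my own choice rather than something the script does.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_batches(items: Sequence[T], n_workers: int) -> List[List[T]]:
    """Deal items round-robin into at most n_workers non-empty batches."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    # Stride slicing: batch i gets items i, i + n_workers, i + 2 * n_workers, ...
    batches = [list(items[i::n_workers]) for i in range(n_workers)]
    # Drop empty batches so no worker (or API object) is created for nothing.
    return [batch for batch in batches if batch]


if __name__ == "__main__":
    print(split_into_batches(list("abcdefg"), 3))
    # [['a', 'd', 'g'], ['b', 'e'], ['c', 'f']]
    print(split_into_batches(["a", "b"], 4))
    # [['a'], ['b']]
```

With this, the list of API objects could be sized with `len(batches)` instead of `n_workers`, which avoids logging in more sessions than there is work for.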
test_04.py
from netrc import netrc
from pathlib import Path
from time import sleep
from concurrent.futures import ThreadPoolExecutor

from eodms_dds import dds
from eodms_rapi import EODMSRAPI

this_file = Path(__file__)
eodms_user, _, eodms_pwd = netrc(
    Path('~/.netrc').expanduser()
).hosts['data.eodms-sgdot.nrcan-rncan.gc.ca']
rapi = EODMSRAPI(eodms_user, eodms_pwd)

# search options
### the nested tuples/lists/dicts is hard for me to understand but I get that they're necessary
### for multi-select filters. There HAS to be a more user-friendly way, like if a list of product types is
### provided, the search-api must be smart enough to use the right operator. Probably needs logic to account
### for range-type filters like incidence angle too
collection = "RCMImageProducts"
# these filters are a common use-case for me
filters = {
    'Product Type': ('=', 'GRD'),
    'LUT Applied': ('=', 'Ice'),
}
# this geojson is provided too
features = [
    ('intersects', str(this_file.parent / 'assets' / 'lancaster_gate_30km_buffer_clip.geojson')),
]
# these dates produce results of just over 100 granules
dates = [
    {
        "start": "20241105_000000",
        "end": "20241118_000000"
    }
]

out_dir = Path('~/Downloads/eodms-beta-test').expanduser()
out_dir.mkdir(exist_ok=True)


### quick-n-dirty download function for concurrent use later
def order_and_download(api_obj, item_ids):
    for item in item_ids:
        item_info = api_obj.get_item(collection=collection, item_uuid=item)
        while 'download_url' not in item_info.keys():
            sleep(10)
            item_info = api_obj.get_item(collection=collection, item_uuid=item)
        api_obj.download_item(out_dir)
    return


### the hit-count is nice to have for sanity-checks prior to "real" search queries
rapi.search(collection=collection, filters=filters, features=features, dates=dates)
results = rapi.get_results(form='full')  # need to use full form to get uuids

### note how if the query params are adjusted (or even just the search is repeated with same params), the number
### of results just goes up (due to how rapi appends results rather than replaces)
### https://github.com/eodms-sgdot/py-eodms-rapi/blob/20d249f5660398b7201ae8e9c73ee65b5714a676/eodms_rapi/eodms.py#L2751
### ddsapi needs the uuids which are stored in a couple of spots but this one seems easiest to manipulate
### need to check for Nones because rapi returns None for some reason?
uuids = list(set([r['metadataFullName'].split('/')[-1] for r in results if r is not None]))

# download results
# really filthy concurrent method
n_workers = 8
# split uuids into roughly-equivalent batches
batches = [uuids[i::n_workers] for i in range(n_workers)]
# create api object for each worker
apis = [dds.DDS_API(eodms_user, eodms_pwd, environment='prod') for _ in range(n_workers)]
with ThreadPoolExecutor(max_workers=n_workers) as executor:
    futures = [executor.submit(order_and_download, api, batch) for api, batch in zip(apis, batches)]
    results = [future.result() for future in futures]
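Since the results list can contain `None` entries and repeats (repeated searches append rather than replace, as noted in the comments), the UUID extraction is worth pulling into a small function. The sketch below assumes only what the scripts already rely on: each record is a dict whose `metadataFullName` value ends in the UUID. The function name `extract_uuids` and the sample records are made up for illustration. Unlike `list(set(...))`, it keeps the order in which the search returned the granules.

```python
from typing import Any, Dict, Iterable, List, Optional


def extract_uuids(results: Iterable[Optional[Dict[str, Any]]]) -> List[str]:
    """Return unique UUIDs from search results, preserving first-seen order."""
    uuids = []
    for record in results:
        # Skip None entries and records without the expected key.
        if not record or not record.get("metadataFullName"):
            continue
        # The UUID is taken to be the last path segment, as in the scripts.
        uuids.append(record["metadataFullName"].split("/")[-1])
    # dict.fromkeys de-duplicates while keeping insertion order.
    return list(dict.fromkeys(uuids))


if __name__ == "__main__":
    # Made-up records, shaped like the ones the scripts index into.
    sample = [
        {"metadataFullName": "a/b/uuid-1"},
        None,
        {"metadataFullName": "a/b/uuid-2"},
        {"metadataFullName": "a/b/uuid-1"},
    ]
    print(extract_uuids(sample))
    # ['uuid-1', 'uuid-2']
```

A stable order also makes the round-robin batches reproducible between runs, which helps when comparing what each worker downloaded.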
Sharing my not-very-polished test suite. Hope it is helpful! Feel free to close at any time.
General comments:
order_id
method.
test_01.py
test_02.py
test_03.py
test_04.py
package versions