Merge branch 'master' into update-models
fmigneault authored Nov 17, 2023
2 parents 467fc51 + 668c80a commit ad0b022
Showing 23 changed files with 2,921 additions and 74 deletions.
4 changes: 4 additions & 0 deletions .github/CODEOWNERS
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
# see documentation
# https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners

* @fmigneault @huard @dchandan
8 changes: 1 addition & 7 deletions .github/workflows/release.yml
@@ -38,16 +38,10 @@ jobs:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
# - name: Build and push image using tag
# uses: docker/build-push-action@v3
# with:
# context: .
# push: true
# tags: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ steps.version.outputs.TAG_VERSION }}
- name: Build and push image using branch name
uses: docker/build-push-action@v3
with:
context: .
file: docker/Dockerfile
push: true
tags: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ steps.extract_branch.outputs.branch }}
tags: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ steps.version.outputs.TAG_VERSION }}
13 changes: 13 additions & 0 deletions CHANGES.md
@@ -4,6 +4,19 @@

<!-- insert list items of new changes here -->

## [0.3.0](https://github.com/crim-ca/stac-populator/tree/0.3.0) (2023-11-16)


* Add request ``session`` keyword to all request-related functions and populator methods to allow sharing a common set
of settings (`auth`, SSL `verify`, `cert`) across requests toward the STAC Catalog.
* Add `DirectoryLoader` that allows populating a STAC Catalog with Collections and Items loaded from a crawled directory
hierarchy that contains `collection.json` files and other `.json`/`.geojson` items.
* Add a generic CLI `stac-populator` that can be called to run populator implementations directly
using command `stac-populator run <implementation> [impl-args]`.
* Remove hardcoded `verify=False` from requests calls.
If needed for testing purposes, users should use a custom `requests.sessions.Session` with `verify=False` passed to
the populator, or alternatively, employ the CLI argument `--no-verify`, which accomplishes the same behavior.
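
The `session` keyword and the removal of the hardcoded `verify=False` can be illustrated with a short sketch. It assumes the `requests` library is installed. The helper name `make_insecure_session` is hypothetical and not part of STACpopulator; the attributes it sets (`verify`, `auth`, `cert`) are standard `requests.Session` attributes.

```python
import requests


def make_insecure_session(auth=None, cert=None):
    """Build a session that skips SSL verification (for testing only).

    Hypothetical helper for illustration, not part of STACpopulator.
    """
    session = requests.Session()
    session.verify = False  # opt in explicitly instead of a hardcoded verify=False
    session.auth = auth     # e.g. ("user", "password") for basic authentication
    session.cert = cert     # e.g. path to a client certificate file
    return session


session = make_insecure_session(auth=("user", "password"))
# The session could then be shared across request helpers, for example:
# stac_host_reachable("https://example.com/stac", session=session)
```

Sharing one session keeps the `auth`, `verify`, and `cert` settings in a single place rather than repeating them on every call.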

## [0.2.0](https://github.com/crim-ca/stac-populator/tree/0.2.0) (2023-11-10)


9 changes: 7 additions & 2 deletions Makefile
@@ -3,7 +3,7 @@ MAKEFILE_NAME := $(word $(words $(MAKEFILE_LIST)),$(MAKEFILE_LIST))
-include Makefile.config
APP_ROOT := $(abspath $(lastword $(MAKEFILE_NAME))/..)
APP_NAME := STACpopulator
APP_VERSION ?= 0.2.0
APP_VERSION ?= 0.3.0

DOCKER_COMPOSE_FILES := -f "$(APP_ROOT)/docker/docker-compose.yml"
DOCKER_TAG := ghcr.io/crim-ca/stac-populator:$(APP_VERSION)
@@ -15,10 +15,15 @@ CATALOG = https://daccs.cs.toronto.edu/twitcher/ows/proxy/thredds/catalog/datase
# CATALOG = https://daccs.cs.toronto.edu/twitcher/ows/proxy/thredds/catalog/datasets/CMIP6/CMIP/NOAA-GFDL/catalog.html
# CATALOG = https://daccs.cs.toronto.edu/twitcher/ows/proxy/thredds/catalog/datasets/CMIP6/CMIP/AS-RCEC/catalog.html

PYESSV_ARCHIVE_DIR ?= ~/.esdoc/pyessv-archive
PYESSV_ARCHIVE_REF ?= https://github.com/ES-DOC/pyessv-archive

## -- Testing targets -------------------------------------------------------------------------------------------- ##

setup-pyessv-archive:
git clone "https://github.com/ES-DOC/pyessv-archive" ~/.esdoc/pyessv-archive
@echo "Updating pyessv archive [$(shell realpath $(PYESSV_ARCHIVE_DIR))]..."
@[ -d $(PYESSV_ARCHIVE_DIR) ] || git clone "$(PYESSV_ARCHIVE_REF)" $(PYESSV_ARCHIVE_DIR)
@cd $(PYESSV_ARCHIVE_DIR) && git pull

test-cmip6:
python $(IMP_DIR)/CMIP6_UofT/add_CMIP6.py $(STAC_HOST) $(CATALOG)
47 changes: 41 additions & 6 deletions README.md
@@ -1,7 +1,8 @@
# STAC Catalog Populator

![Latest Version](https://img.shields.io/badge/latest%20version-0.2.0-blue?logo=github)
![Commits Since Latest](https://img.shields.io/github/commits-since/crim-ca/stac-populator/0.2.0.svg?logo=github)
![Latest Version](https://img.shields.io/badge/latest%20version-0.3.0-blue?logo=github)
![Commits Since Latest](https://img.shields.io/github/commits-since/crim-ca/stac-populator/0.3.0.svg?logo=github)
![GitHub License](https://img.shields.io/github/license/crim-ca/stac-populator)

This repository contains a framework [STACpopulator](STACpopulator)
that can be used to implement concrete populators (see [implementations](STACpopulator/implementations))
@@ -40,16 +41,44 @@ pip install .[dev]
make install-dev
```

You can also employ the pre-built Docker:
You should then be able to call the STAC populator CLI with the following commands:

```shell
docker run -ti ghcr.io/crim-ca/stac-populator:0.2.0 [command]
# obtain the installed version of the STAC populator
stac-populator --version

# obtain general help about available commands
stac-populator --help

# obtain general help about available STAC populator implementations
stac-populator run --help

# obtain help specifically for the execution of a STAC populator implementation
stac-populator run [implementation] --help
```

You can also employ the pre-built Docker image, which can be called as follows,
where `[command]` corresponds to any of the above example operations.

```shell
docker run -ti ghcr.io/crim-ca/stac-populator:0.3.0 [command]
```

*Note*: <br>
If files need to be provided as input or obtained as output when using a command with `docker`, you will need to either
mount the files individually or mount a workspace directory using `-v {local-path}:{docker-path}` to make them
accessible to the command inside the Docker container.

## Testing

The provided [`docker-compose`](docker/docker-compose.yml) configuration file can be used to launch a test STAC server.
For example, the [CMIP6_UofT][CMIP6_UofT] script can be run as:
Consider using `make docker-start` to start this server, and `make docker-stop` to stop it.
Alternatively, you can also use your own STAC server accessible from any remote location.

To run the STAC populator, follow the steps from [Installation and Execution](#installation-and-execution).

Alternatively, you can call the relevant populator Python scripts individually.
For example, using the [CMIP6_UofT][CMIP6_UofT] implementation, the script can be run as:

```shell
python STACpopulator/implementations/CMIP6_UofT/add_CMIP6.py \
@@ -58,5 +87,11 @@ python STACpopulator/implementations/CMIP6_UofT/add_CMIP6.py \
"STACpopulator/implementations/CMIP6_UofT/collection_config.yml"
```

*Note*:
*Note*: <br>
In the script above, a sample THREDDS catalog URL is employed and not one relevant to the global scale CMIP6 data.

For further validation, you can also run the test suite with coverage analysis.

```shell
make test-cov
```
2 changes: 1 addition & 1 deletion STACpopulator/__init__.py
@@ -1 +1 @@
__version__ = "0.2.0"
__version__ = "0.3.0"
46 changes: 31 additions & 15 deletions STACpopulator/api_requests.py
@@ -3,6 +3,7 @@
from typing import Any, Optional

import requests
from requests import Session
from colorlog import ColoredFormatter

LOGGER = logging.getLogger(__name__)
@@ -15,27 +16,36 @@
LOGGER.propagate = False


def stac_host_reachable(url: str) -> bool:
def stac_host_reachable(url: str, session: Optional[Session] = None) -> bool:
try:
registry = requests.get(url)
registry.raise_for_status()
return True
except (requests.exceptions.RequestException, requests.exceptions.ConnectionError):
return False
session = session or requests
response = session.get(url, headers={"Accept": "application/json"})
response.raise_for_status()
body = response.json()
return body["type"] == "Catalog" and "stac_version" in body

except (requests.exceptions.RequestException, requests.exceptions.ConnectionError) as exc:
LOGGER.error("Could not validate STAC host. Not reachable [%s] due to [%s]", url, exc, exc_info=exc)
return False

def stac_collection_exists(stac_host: str, collection_id: str) -> bool:

def stac_collection_exists(stac_host: str, collection_id: str, session: Optional[Session] = None) -> bool:
"""
Get a STAC collection
Returns the collection JSON.
"""
r = requests.get(os.path.join(stac_host, "collections", collection_id), verify=False)

session = session or requests
r = session.get(os.path.join(stac_host, "collections", collection_id), verify=False)
return r.status_code == 200


def post_stac_collection(stac_host: str, json_data: dict[str, Any], update: Optional[bool] = True) -> None:
def post_stac_collection(
stac_host: str,
json_data: dict[str, Any],
update: Optional[bool] = True,
session: Optional[Session] = None,
) -> None:
"""Post/create a collection on the STAC host
:param stac_host: address of the STAC host
@@ -44,16 +54,19 @@ def post_stac_collection(stac_host: str, json_data: dict[str, Any], update: Opti
:type json_data: dict[str, Any]
:param update: if True, update the collection on the host server if it is already present, defaults to True
:type update: Optional[bool], optional
:param session: Session with additional configuration to perform requests.
"""
session = session or requests
collection_id = json_data["id"]
r = requests.post(os.path.join(stac_host, "collections"), json=json_data, verify=False)
collection_url = os.path.join(stac_host, "collections")
r = session.post(collection_url, json=json_data)

if r.status_code == 200:
LOGGER.info(f"Collection {collection_id} successfully created")
elif r.status_code == 409:
if update:
LOGGER.info(f"Collection {collection_id} already exists. Updating.")
r = requests.put(os.path.join(stac_host, "collections"), json=json_data, verify=False)
r = session.put(os.path.join(stac_host, "collections"), json=json_data)
r.raise_for_status()
else:
LOGGER.info(f"Collection {collection_id} already exists.")
@@ -67,6 +80,7 @@ def post_stac_item(
item_name: str,
json_data: dict[str, dict],
update: Optional[bool] = True,
session: Optional[Session] = None,
) -> None:
"""Post a STAC item to the host server.
@@ -80,17 +94,19 @@ def post_stac_item(
:type json_data: dict[str, dict]
:param update: if True, update the item on the host server if it is already present, defaults to True
:type update: Optional[bool], optional
:param session: Session with additional configuration to perform requests.
"""
session = session or requests
item_id = json_data["id"]

r = requests.post(os.path.join(stac_host, f"collections/{collection_id}/items"), json=json_data)
item_url = os.path.join(stac_host, f"collections/{collection_id}/items")
r = session.post(item_url, json=json_data)

if r.status_code == 200:
LOGGER.info(f"Item {item_name} successfully added")
elif r.status_code == 409:
if update:
LOGGER.info(f"Item {item_id} already exists. Updating.")
r = requests.put(os.path.join(stac_host, f"collections/{collection_id}/items/{item_id}"), json=json_data)
r = session.put(os.path.join(stac_host, f"collections/{collection_id}/items/{item_id}"), json=json_data)
r.raise_for_status()
else:
LOGGER.info(f"Item {item_id} already exists.")
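
The create-or-update flow used by `post_stac_collection` and `post_stac_item` (POST first, then PUT when the server answers 409 and `update` is true) can be sketched without a live server. In this sketch, `FakeResponse`, `FakeSession`, and `upsert_item` are hypothetical names introduced for illustration. They are not part of STACpopulator, and the stub only mimics the `post` and `put` calls of a `requests.Session`.

```python
from typing import Any


class FakeResponse:
    """Minimal stand-in for a requests response (hypothetical)."""

    def __init__(self, status_code: int) -> None:
        self.status_code = status_code

    def raise_for_status(self) -> None:
        if self.status_code >= 400:
            raise RuntimeError(f"HTTP error {self.status_code}")


class FakeSession:
    """Records calls and pretends the item already exists (409 on POST)."""

    def __init__(self) -> None:
        self.calls = []

    def post(self, url: str, json: Any = None) -> FakeResponse:
        self.calls.append(("POST", url))
        return FakeResponse(409)

    def put(self, url: str, json: Any = None) -> FakeResponse:
        self.calls.append(("PUT", url))
        return FakeResponse(200)


def upsert_item(session, stac_host: str, collection_id: str, item: dict, update: bool = True) -> str:
    """POST the item; on 409 (conflict), PUT it instead when update is true."""
    items_url = f"{stac_host.rstrip('/')}/collections/{collection_id}/items"
    r = session.post(items_url, json=item)
    if r.status_code == 200:
        return "created"
    if r.status_code == 409:
        if not update:
            return "exists"
        r = session.put(f"{items_url}/{item['id']}", json=item)
        r.raise_for_status()
        return "updated"
    r.raise_for_status()
    return "unknown"


session = FakeSession()
outcome = upsert_item(session, "https://example.com/stac", "c1", {"id": "item-1"})
```

Because the session is passed in, the same function works with a real `requests.Session` or with a stub like this one in tests.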