feat: enable collection id to be a parameter TDE-730 #163

Merged 5 commits on Aug 31, 2023
28 changes: 16 additions & 12 deletions workflows/imagery/README.md
@@ -24,6 +24,7 @@ In addition, a Basemaps link is produced enabling visual QA.
| group | int | 50 | The number of files to group into the pods (testing has recommended using 50 for large datasets). |
| compression | enum | webp | Standardised file format |
| cutline | str | | (Optional) location of a cutline file to cut the imagery to `.fgb` or `.geojson` (leave blank if no cutline) |
+| collection-id | str | | (Optional) Provide a Collection ID if re-processing an existing published survey; otherwise a ULID will be generated for the collection.json ID field. |
| title | str | \*Region/District/City\* \*GSD\* \*Urban/Rural\* Aerial Photos (\*Year-Year\*) | Collection title |
| description | str | Orthophotography within the \*Region Name\* region captured in the \*Year\*-\*Year\* flying season. | Collection description |
| producer | enum | Unknown | Imagery producer |
@@ -50,6 +51,7 @@ In addition, a Basemaps link is produced enabling visual QA.
| group | 50 |
| compression | webp |
| cutline | s3://linz-imagery-staging/cutline/bay-of-plenty_2021-2022.fgb |
+| collection-id | 01FP371BHWDSREECKQAH9E8XQ |
| title | Bay of Plenty 0.2m Rural Aerial Photos (2021-2022) |
| description | Orthophotography within the Bay of Plenty region captured in the 2021-2022 flying season. |
| producer | Aerial Surveys |
@@ -98,7 +100,7 @@ uri: https://basemaps.linz.govt.nz?config=...

```mermaid
graph TD;
-generate-ulid-->standardise-validate;
+collection-id-setup-->standardise-validate;
get-location-->standardise-validate;
tileindex-validate-->standardise-validate;
standardise-validate-->create-collection;
@@ -107,9 +109,10 @@ graph TD;
create-overview-->create-config;
```

-### [generate-ulid](./standardising.yaml)
+### [collection-id-setup](./standardising.yaml)

-Generates a ULID which is used as the collection id for the standardised dataset.
+Sets the workflow's collection ID from the `collection-id` input parameter.
+If no collection ID is provided, a ULID is generated and used as the collection ID for the standardised dataset.
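
The fallback described above can be sketched in plain Python. This is a minimal illustration, not the workflow's script: `resolve_collection_id` and the `generate` callback are hypothetical names, and the stand-in generator below replaces the `ulid.ULID()` call that the workflow's own template uses.

```python
from typing import Callable


def resolve_collection_id(supplied: str, generate: Callable[[], str]) -> str:
    """Return the supplied collection ID, or a newly generated one if it is empty."""
    # An empty string is the parameter's default value, i.e. "not provided".
    if not supplied:
        return generate()
    return supplied


if __name__ == "__main__":
    # Stand-in generator for illustration only (hypothetical); the workflow
    # template generates a ULID instead.
    def fake_generate() -> str:
        return "GENERATED-ID"

    print(resolve_collection_id("", fake_generate))
    print(resolve_collection_id("01FP371BHWDSREECKQAH9E8XQ", fake_generate))
```

Passing the generator in as a callback keeps the sketch free of third-party imports and makes both branches easy to exercise.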

### [tileindex-validate](https://github.com/linz/argo-tasks/blob/master/src/commands/tileindex-validate/)

@@ -249,15 +252,16 @@ This workflow carries out the steps in the [Standardising](#Standardising) workf

### Standardising-Publish Optional Parameters - can be specified on the command line to override default value

-| Parameter | Type | Default | Description |
-| ----------- | ----- | ------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| cutline | str | | (Optional) location of a cutline file to cut the imagery to `.fgb` or `.geojson` (do not include if no cutline) |
-| compression | str | webp | Standardised file format. Must be `webp` or `lzw` |
-| group | int | 50 | Applies to the standardising workflow. The number of files to group into the pods (testing has recommended using 50 for large datasets). |
-| include | regex | .tiff?$ | Applies to the standardising workflow. A regular expression to match object path(s) or name(s) from within the source path to include in standardising\*. |
-| copy-option | str | --no-clobber | Applies to the standardising and publishing workflows and should not need to be changed. `--no-clobber` Skip overwriting existing files. `--force` Overwrite all files. `--force-no-clobber` Overwrite only changed files, skip unchanged files. |
-| source-epsg | str | 2193 | The EPSG code of the source imagery. |
-| target-epsg | str | 2193 | The Target EPSG code, if different to source-epsg the imagery will be reprojected. |
+| Parameter | Type | Default | Description |
+| ------------- | ----- | ------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| cutline | str | | (Optional) location of a cutline file to cut the imagery to `.fgb` or `.geojson` (do not include if no cutline) |
+| collection-id | str | | (Optional) Provide a Collection ID if re-processing an existing published survey; otherwise a ULID will be generated for the collection.json ID field. |
+| compression | str | webp | Standardised file format. Must be `webp` or `lzw` |
+| group | int | 50 | Applies to the standardising workflow. The number of files to group into the pods (testing has recommended using 50 for large datasets). |
+| include | regex | .tiff?$ | Applies to the standardising workflow. A regular expression to match object path(s) or name(s) from within the source path to include in standardising\*. |
+| copy-option | str | --no-clobber | Applies to the standardising and publishing workflows and should not need to be changed. `--no-clobber` Skip overwriting existing files. `--force` Overwrite all files. `--force-no-clobber` Overwrite only changed files, skip unchanged files. |
+| source-epsg | str | 2193 | The EPSG code of the source imagery. |
+| target-epsg | str | 2193 | The Target EPSG code, if different to source-epsg the imagery will be reprojected. |

\* This regex can also be used to exclude paths, e.g. if there are RGB and RGBI directories, the following regex will only include TIFF files in the RGB directory: `RGB(?!I).*.tiff?$`.
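
The effect of the negative lookahead in that pattern can be checked with a short sketch. It uses Python's `re` module for illustration only; the regex engine the workflow itself uses is not shown here, and the bucket and file names below are invented.

```python
import re

# Pattern quoted from the footnote above. The "." before "tiff" is unescaped,
# so it matches any single character, not only a literal dot.
INCLUDE_RGB_ONLY = re.compile(r"RGB(?!I).*.tiff?$")

# Hypothetical object paths, invented for illustration.
paths = [
    "s3://example-bucket/survey/RGB/tile_001.tiff",
    "s3://example-bucket/survey/RGB/tile_002.tif",
    "s3://example-bucket/survey/RGBI/tile_001.tiff",
    "s3://example-bucket/survey/RGB/readme.txt",
]


def included(path: str) -> bool:
    """True if the pattern is found anywhere in the path."""
    return INCLUDE_RGB_ONLY.search(path) is not None


selected = [p for p in paths if included(p)]
print(selected)
```

`(?!I)` rejects any `RGB` that is immediately followed by `I`, so paths under `RGBI` are skipped, while `tiff?$` accepts both `.tif` and `.tiff` endings.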

2 changes: 2 additions & 0 deletions workflows/imagery/standardising-publish-import.yaml
@@ -18,6 +18,8 @@ spec:
parameters:
- name: cutline # optional standardising cutline
value: ""
+- name: collection-id # optional
+value: ""
- name: compression
value: "webp"
- name: source-epsg
28 changes: 18 additions & 10 deletions workflows/imagery/standardising.yaml
@@ -62,6 +62,9 @@ spec:
- name: cutline
description: "(Optional) location of a cutline file to cut the imagery to .fgb or .geojson"
value: ""
+- name: collection-id
+description: "(Optional) If an existing dataset add collection ID here, else a new one will be generated."
+value: ""
- name: title
value: "*Region/District/City* *GSD* *Urban/Rural* Aerial Photos (*Year-Year*)"
- name: description
@@ -210,8 +213,9 @@ spec:
- name: main
dag:
tasks:
-- name: generate-ulid
-template: generate-ulid
+- name: collection-id-setup
+template: collection-id-setup
+
- name: tile-index-validate
templateRef:
name: tpl-at-tile-index-validate
@@ -255,21 +259,21 @@ spec:
- name: group_id
value: "{{item}}"
- name: collection-id
-value: "{{tasks.generate-ulid.outputs.parameters.ulid}}"
+value: "{{tasks.collection-id-setup.outputs.parameters.collection-id}}"
- name: target
value: "{{tasks.get-location.outputs.parameters.location}}flat/"
artifacts:
- name: group_data
from: "{{ tasks.group.outputs.artifacts.output }}"
-depends: "group && generate-ulid && get-location"
+depends: "group && collection-id-setup && get-location"
withParam: "{{ tasks.group.outputs.parameters.output }}"

- name: create-collection
template: create-collection
arguments:
parameters:
- name: collection-id
-value: "{{tasks.generate-ulid.outputs.parameters.ulid}}"
+value: "{{tasks.collection-id-setup.outputs.parameters.collection-id}}"
- name: location
value: "{{tasks.get-location.outputs.parameters.location}}"
depends: "standardise-validate"
@@ -314,19 +318,23 @@ spec:
parameter: "{{tasks.get-location.outputs.parameters.location}}"
# END TEMPLATE `main`

-- name: generate-ulid
+- name: collection-id-setup
script:
image: "019359803926.dkr.ecr.ap-southeast-2.amazonaws.com/eks:topo-imagery-{{=sprig.trim(workflow.parameters['version-topo-imagery'])}}"
command: [python]
source: |
import ulid
-with open("/tmp/ulid", "w") as f:
-    f.write(str(ulid.ULID()))
+collection_id = "{{workflow.parameters.collection-id}}"
+with open("/tmp/collection-id", "w") as f:
+    if not collection_id:
+        f.write(str(ulid.ULID()))
+    else:
+        f.write(collection_id)
outputs:
parameters:
-- name: ulid
+- name: collection-id
valueFrom:
-path: "/tmp/ulid"
+path: "/tmp/collection-id"

- name: standardise-validate
retryStrategy: