feat: group using artifact files #143

Merged: 4 commits, Jul 25, 2023
79 changes: 79 additions & 0 deletions templates/argo-tasks/README.md
@@ -0,0 +1,79 @@
# ArgoTasks Templates

## argo-tasks/group - `tpl-at-group`

Groups a JSON array of inputs into fixed-size groups. The resulting group ids can be passed to `withParam` to run one task per group.

### Template usage

Group the output of `tile-index-validate` into groups of size 5:

```yaml
- name: group
  templateRef:
    name: tpl-at-group
    template: main

  arguments:
    artifacts:
      - name: input
        from: "{{ tasks.tile-index-validate.outputs.artifacts.files }}"

    parameters:
      - name: size
        value: 5

  depends: "tile-index-validate"
```
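
To make the input and output shapes concrete, the following is a minimal Python sketch of the grouping behaviour described in the template comments. It is not the `argo-tasks` implementation. The function names `group_items` and `write_groups` are hypothetical, and both the three-digit zero padding of group ids and the contents of each group file (a JSON array of that group's members) are assumptions inferred from the `"000"`, `"001"` examples.

```python
import json
from pathlib import Path


def group_items(items, size):
    """Split items into consecutive groups of at most `size` entries.

    Group ids are zero-padded to three digits (assumed from the "000" examples).
    """
    if size < 1:
        raise ValueError("size must be >= 1")
    return {
        f"{index:03d}": items[start:start + size]
        for index, start in enumerate(range(0, len(items), size))
    }


def write_groups(groups, out_dir):
    """Write output/<groupId>.json per group and output.json listing the ids."""
    out_dir = Path(out_dir)
    (out_dir / "output").mkdir(parents=True, exist_ok=True)
    for group_id, members in groups.items():
        # Assumption: each group file holds a JSON array of its members.
        (out_dir / "output" / f"{group_id}.json").write_text(json.dumps(members))
    # The list of ids is what a consumer would iterate over with `withParam`.
    (out_dir / "output.json").write_text(json.dumps(list(groups)))
```

Under these assumptions, grouping `["a.json", "b.json", "c.json"]` with a size of 2 gives group `000` with two entries and group `001` with one.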

### Consumer usage

Use `withParam` to spin up one task per group produced by the grouper:

```yaml
# ...

steps:
  # consume the output of a grouper
  - - name: consume
      template: consume
      arguments:
        parameters:
          - name: group_id
            value: "{{item}}" # groupId e.g. "000", "001", "002", etc.

        # All the grouped data as a folder
        artifacts:
          - name: group_data
            from: "{{ steps.group.outputs.artifacts.output }}"

      withParam: "{{ steps.group.outputs.parameters.output }}"

# ...

- name: consume
  inputs:
    # Id of the grouping file to consume
    # to be used with `group_data`
    parameters:
      - name: group_id # "000", "001" ... etc

    # grouped input data for the consumer, this will be a folder full of JSON files
    # one file per groupId
    artifacts:
      - name: group_data
        path: /tmp/input/

  script:
    image: "019359803926.dkr.ecr.ap-southeast-2.amazonaws.com/eks:argo-tasks-latest"
    command: [bash]
    source: |
      echo {{ inputs.parameters.group_id }}
      ls -alh /tmp/input/

      # for example using with a --from-file
      # ./test-cli --from-file=/tmp/input/{{ inputs.parameters.group_id }}.json
```
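
On the consumer side, each task receives the whole `group_data` folder plus a single `group_id`, so it only needs to open the one file matching its id. A minimal Python sketch of that lookup follows. The helper name `load_group` is hypothetical, and the assumption that each group file contains a JSON array is not confirmed by the source.

```python
import json
from pathlib import Path


def load_group(group_dir, group_id):
    """Read the JSON file for one group id from the mounted artifact folder."""
    path = Path(group_dir) / f"{group_id}.json"
    with path.open() as handle:
        # Assumption: the file holds a JSON document such as a list of inputs.
        return json.load(handle)
```

In the consumer template above this would be called as `load_group("/tmp/input/", group_id)`, with `group_id` taken from the `group_id` input parameter.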
33 changes: 25 additions & 8 deletions templates/argo-tasks/group.yml
@@ -10,31 +10,48 @@ spec:
- name: main
inputs:
artifacts:
#
# JSON array of things to group
#
# @example
# ```json
# ["a.json", "b.json", "c.json"]
# ```
- name: input
path: /tmp/group/input.json
optional: true

parameters:
- name: size
description: group into this number of records per group
description: Group into this number of records per group

- name: version
description: container version to use
description: argo-task Container version to use
default: "v2"

outputs:
parameters:
# Grouped output of the input
#
# JSON array of all the group ids, each of which corresponds to an output artifact file
#
# @example
# ```json
# ["000", "001", "002", "003"]
# ```
#
# Workflows should use the "000" string to access the data from the output artifact folder
- name: output
valueFrom:
path: /tmp/group/output.json

artifacts:
# Grouped output of the input
# Grouped output of the input as one file per output groupId
#
# - /output/000.json
# - /output/001.json
# - ...
#
- name: output
path: /tmp/group/output.json
archive:
none: {}
path: /tmp/group/output/

container:
image: "019359803926.dkr.ecr.ap-southeast-2.amazonaws.com/eks:argo-tasks-{{= inputs.parameters.version }}"