-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build & publish images through GHA #105
Changes from 5 commits
27c529f
f7c702a
6469620
23e9fd9
200bfc1
d1141c1
2534e61
4a0f0f3
1e7f552
6376f91
37a765f
909c17a
1c7f528
fe60490
b9493d5
1b43c7f
127da2a
5a31da0
bbf65f4
8cbe7da
3a044dd
fec0a76
f9dc7b4
0af7f24
ec564f3
f5e395a
2067ca9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,91 @@ | ||
name: Docker build | ||
|
||
on: | ||
push: | ||
branches: | ||
- main | ||
pull_request: | ||
schedule: | ||
- cron: "0 1 * * *" # Daily “At 01:00” UTC | ||
workflow_dispatch: | ||
|
||
jobs: | ||
docker: | ||
runs-on: ubuntu-latest | ||
|
||
strategy: | ||
fail-fast: false | ||
matrix: | ||
cuda: ["11.8.0"] | ||
python: ["3.10", "3.11", "3.12"] | ||
linux: ["ubuntu20.04"] | ||
rapids: ["24.10", "24.12"] | ||
image: | ||
- tag: "gpuci/dask" | ||
context: "./dask" | ||
- tag: "gpuci/distributed" | ||
context: "./distributed" | ||
- tag: "gpuci/dask_image" | ||
context: "./dask_image" | ||
# dask-image gpuCI isn't dependent on RAPIDS | ||
exclude: | ||
- image: "gpuci/dask_image" | ||
rapids: "24.10" | ||
|
||
steps: | ||
- name: Checkout | ||
uses: actions/checkout@v3 | ||
charlesbluca marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
- name: Set up Docker Buildx | ||
uses: docker/setup-buildx-action@v3 | ||
|
||
- name: Login to DockerHub | ||
uses: docker/login-action@v3 | ||
if: github.repository == 'rapidsai/dask-build-environment' && (github.event_name == 'push' || github.event_name == 'workflow_dispatch') | ||
with: | ||
username: ${{ secrets.DOCKERHUB_USERNAME }} | ||
password: ${{ secrets.DOCKERHUB_TOKEN }} | ||
charlesbluca marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
- name: Compute ucx-py version | ||
id: ucx_py | ||
env: | ||
rapids: ${{ matrix.rapids }} | ||
run: | | ||
echo "version=$(curl -sL https://version.gpuci.io/rapids/${rapids})" >> $GITHUB_ENV | ||
|
||
- name: Generate tag | ||
id: tag | ||
env: | ||
cuda: ${{ matrix.cuda }} | ||
python: ${{ matrix.python }} | ||
linux: ${{ matrix.linux }} | ||
rapids: ${{ matrix.rapids }} | ||
image: ${{ matrix.image.tag }} | ||
run: | | ||
case ${image} in | ||
"gpuci/dask_image") # doesn't depend on RAPIDS / ucx-py for gpuCI | ||
tag="${image}:cuda${cuda}-devel-${linux}-py${python}" | ||
;; | ||
*) | ||
tag="${image}:${rapids}-cuda${cuda}-devel-${linux}-py${python}" | ||
;; | ||
esac | ||
|
||
echo "tag=${tag}" >> $GITHUB_OUTPUT | ||
|
||
- name: Build and push | ||
uses: docker/build-push-action@v6 | ||
with: | ||
context: ${{ matrix.image.context }} | ||
push: ${{ github.repository == 'rapidsai/dask-build-environment' && (github.event_name == 'push' || github.event_name == 'workflow_dispatch') }} | ||
platforms: linux/amd64 | ||
charlesbluca marked this conversation as resolved.
Show resolved
Hide resolved
|
||
tags: ${{ steps.tag.outputs.tag }} | ||
build-args: | | ||
RAPIDS_VER=${{ matrix.rapids }} | ||
UCX_PY_VER=${{ steps.ucx_py.outputs.version }} | ||
CUDA_VER=${{ matrix.cuda }} | ||
LINUX_VER=${{ matrix.linux }} | ||
PYTHON_VER=${{ matrix.python }} | ||
|
||
- name: Report | ||
run: echo Built ${{ steps.tag.outputs.tag }} | ||
charlesbluca marked this conversation as resolved.
Show resolved
Hide resolved
|
Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
@@ -13,19 +13,23 @@ jobs: | |||||||||||||||||||||||||||||||||||||||||
steps: | ||||||||||||||||||||||||||||||||||||||||||
- uses: actions/checkout@v4 | ||||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||||
- name: Parse current axis YAML | ||||||||||||||||||||||||||||||||||||||||||
- name: Fetch current RAPIDS versions from build matrix | ||||||||||||||||||||||||||||||||||||||||||
id: parse_yaml | ||||||||||||||||||||||||||||||||||||||||||
uses: the-coding-turtle/[email protected] | ||||||||||||||||||||||||||||||||||||||||||
with: | ||||||||||||||||||||||||||||||||||||||||||
file: ci/axis/dask.yaml | ||||||||||||||||||||||||||||||||||||||||||
export_to_envs: true | ||||||||||||||||||||||||||||||||||||||||||
return_to_outputs: false | ||||||||||||||||||||||||||||||||||||||||||
file: .github/workflows/build.yml | ||||||||||||||||||||||||||||||||||||||||||
export_to_envs: false | ||||||||||||||||||||||||||||||||||||||||||
return_to_outputs: true | ||||||||||||||||||||||||||||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This was the only use of git grep axis That file should be deleted as part of this PR. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Probably should've provided more clarity around this in the PR description, but this file is actually getting pointed to by some of our Jenkins infra, which I'm assuming would start failing if we deleted this file: My thinking was that we would:
Does that seem like a reasonable approach? Happy to do things a different way, and/or add more writing here to clarify the other ops work that should be considered a follow up to this PR. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ooooo thank you for that explanation! Yes I totally support that plan, I'd forgotten about the Jenkins half of this. |
||||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||||
- name: Get old RAPIDS / UCX-Py versions | ||||||||||||||||||||||||||||||||||||||||||
env: | ||||||||||||||||||||||||||||||||||||||||||
OLD_RAPIDS_VER: ${{ steps.parse_yaml.outputs.jobs_docker_strategy_matrix_rapids_0 }} | ||||||||||||||||||||||||||||||||||||||||||
RAPIDS_VER: ${{ steps.parse_yaml.outputs.jobs_docker_strategy_matrix_rapids_1 }} | ||||||||||||||||||||||||||||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm uncomfortable with implicitly depending on the ordering of the I also think it's wasteful (and a source of unnecessary risk of failed builds due to network issues) to have all these calls to Would you consider something like the following over in env:
OLD_RAPIDS_VER: "24.10"
OLD_UCX_PY_VER: "0.40"
RAPIDS_VER: "24.12"
UCX_PY_VER: "0.41"
jobs:
docker:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
cuda: ["11.8.0"]
python: ["3.10", "3.11", "3.12"]
linux: ["ubuntu20.04"]
rapids: [${{ env.OLD_RAPIDS_VER }}, ${{ env.NEW_RAPIDS_VER }}] And then passing those values as outputs through to whatever needs it in this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that'd also allow you to get rid of all this other machinery: dask-build-environment/.github/workflows/update-gpuci.yml Lines 38 to 44 in 8977069
There's exactly 1 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah I definitely think the ordering concern alone is sufficient to hardcode these values somewhere other than the build matrix itself, shouldn't be too difficult to refactor the builds to rely on an internal version mapping instead of
Were you thinking of replacing this step with a request to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Another minor annoyance here I didn't foresee - it looks like the https://github.com/rapidsai/dask-build-environment/actions/runs/11114625627/workflow There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Was able to get around this by using some hardcoded representative values in the build matrix, that we then later manually map to the hardcoded RAPIDS/ucx-py versions in 0af7f24 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I looked back through this workflow more closely to be sure I understood what it was doing. I didn't understand that it's job was to trigger new PRs here, so some of my earlier comments about it might not make sense. Am I right that this dask-build-environment/.github/workflows/update-gpuci.yml Lines 86 to 98 in 8977069
Like #104? If so... I'd just remove this entire There's a new RAPIDS release once every 2 months, and every RAPIDS release corresponds to exactly one Specifically, I'm proposing:
If you want to keep this automated, then ignore my other comments about There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
blegh ok, sorry for misleading you there. I like the solution you came up with! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Pretty much! Only thing I'd add is that it's specifically when a new cuDF release and ucx-py release are detected, not if it's just or the other.
Definitely a strong +1 on adding something like this to a checklist to increase the visibility of the ops work around Dask GPU CI, worth noting that there are similar workflows in the respective Dask repos to automate this version bump too:
Which reflect other steps that could probably be done manually as part of a checklist - maybe we should file an issue here to track this switch? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure! Up to you and the other Dask folks how you want to handle it. Also fine to tell me "nah we'll keep this as-is", I don't feel that strongly about it. Just right now, in this moment, this feels like a lot of complexity to automate something that changes very infrequently. |
||||||||||||||||||||||||||||||||||||||||||
run: | | ||||||||||||||||||||||||||||||||||||||||||
echo OLD_RAPIDS_VER=$RAPIDS_VER_0 >> $GITHUB_ENV | ||||||||||||||||||||||||||||||||||||||||||
echo OLD_UCX_PY_VER=$(curl -sL https://version.gpuci.io/rapids/$RAPIDS_VER_0) >> $GITHUB_ENV | ||||||||||||||||||||||||||||||||||||||||||
echo RAPIDS_VER=$RAPIDS_VER_1 >> $GITHUB_ENV | ||||||||||||||||||||||||||||||||||||||||||
echo UCX_PY_VER=$(curl -sL https://version.gpuci.io/rapids/$RAPIDS_VER_1) >> $GITHUB_ENV | ||||||||||||||||||||||||||||||||||||||||||
echo OLD_RAPIDS_VER=$OLD_RAPIDS_VER >> $GITHUB_ENV | ||||||||||||||||||||||||||||||||||||||||||
echo OLD_UCX_PY_VER=$(curl -sL https://version.gpuci.io/rapids/$OLD_RAPIDS_VER) >> $GITHUB_ENV | ||||||||||||||||||||||||||||||||||||||||||
echo RAPIDS_VER=$RAPIDS_VER >> $GITHUB_ENV | ||||||||||||||||||||||||||||||||||||||||||
echo UCX_PY_VER=$(curl -sL https://version.gpuci.io/rapids/$RAPIDS_VER) >> $GITHUB_ENV | ||||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||||
- name: Get latest cuDF nightly version | ||||||||||||||||||||||||||||||||||||||||||
id: cudf_latest | ||||||||||||||||||||||||||||||||||||||||||
|
@@ -54,31 +58,27 @@ jobs: | |||||||||||||||||||||||||||||||||||||||||
- name: Update new RAPIDS versions | ||||||||||||||||||||||||||||||||||||||||||
uses: jacobtomlinson/gha-find-replace@v2 | ||||||||||||||||||||||||||||||||||||||||||
with: | ||||||||||||||||||||||||||||||||||||||||||
include: 'ci/**' | ||||||||||||||||||||||||||||||||||||||||||
charlesbluca marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||||||||||||||||||||||||||||||||||
find: "${{ env.RAPIDS_VER }}" | ||||||||||||||||||||||||||||||||||||||||||
replace: "${{ env.NEW_RAPIDS_VER }}" | ||||||||||||||||||||||||||||||||||||||||||
regex: false | ||||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||||
- name: Update old RAPIDS versions | ||||||||||||||||||||||||||||||||||||||||||
uses: jacobtomlinson/gha-find-replace@v3 | ||||||||||||||||||||||||||||||||||||||||||
with: | ||||||||||||||||||||||||||||||||||||||||||
include: 'ci/**' | ||||||||||||||||||||||||||||||||||||||||||
find: "${{ env.OLD_RAPIDS_VER }}" | ||||||||||||||||||||||||||||||||||||||||||
replace: "${{ env.RAPIDS_VER }}" | ||||||||||||||||||||||||||||||||||||||||||
regex: false | ||||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||||
- name: Update new UCX-Py versions | ||||||||||||||||||||||||||||||||||||||||||
uses: jacobtomlinson/gha-find-replace@v3 | ||||||||||||||||||||||||||||||||||||||||||
with: | ||||||||||||||||||||||||||||||||||||||||||
include: 'ci/**' | ||||||||||||||||||||||||||||||||||||||||||
find: "${{ env.UCX_PY_VER }}" | ||||||||||||||||||||||||||||||||||||||||||
replace: "${{ env.NEW_UCX_PY_VER }}" | ||||||||||||||||||||||||||||||||||||||||||
regex: false | ||||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||||
- name: Update old UCX-Py versions | ||||||||||||||||||||||||||||||||||||||||||
uses: jacobtomlinson/gha-find-replace@v3 | ||||||||||||||||||||||||||||||||||||||||||
with: | ||||||||||||||||||||||||||||||||||||||||||
include: 'ci/**' | ||||||||||||||||||||||||||||||||||||||||||
find: "${{ env.OLD_UCX_PY_VER }}" | ||||||||||||||||||||||||||||||||||||||||||
replace: "${{ env.UCX_PY_VER }}" | ||||||||||||||||||||||||||||||||||||||||||
regex: false | ||||||||||||||||||||||||||||||||||||||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if there's a preference on how something like this would get triggered on a schedule? Ideally would like to have the status checks broadcast on Slack to improve visibility, so happy to move in whatever direction makes that easiest with our current devops stack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks ok to me. cc @raydouglass for additional review
FWIW this post might be relevant for your slack notification needs: https://github.blog/changelog/2022-12-06-github-actions-workflow-notifications-in-slack-and-microsoft-teams/