WIP: Tag jobs by required runner capabilities #1394

Closed
wants to merge 34 commits into from

upsj
Member

@upsj upsj commented Aug 20, 2023

Right now, we use tags like amdci and horeka to determine where to run our tests. It would be more scalable if instead we just listed which capabilities a runner has in its tags (nvidia-gpu, nvidia-multi-gpu, amd-gpu, amd-multi-gpu and cpu come to mind). That way, we can move to other platforms much more seamlessly if necessary.
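To illustrate the idea, here is a minimal sketch of capability-based matching: a job lists the capabilities it needs, and any runner whose tag set contains all of them is eligible. The tag names come from the list above; the function name `eligible_runners` and the capabilities assigned to each runner are hypothetical and only serve the example, they do not describe the actual CI setup.

```python
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical inventory: runner name -> capability tags it advertises.
# The assignments below are assumptions for illustration only.
RUNNERS: Dict[str, FrozenSet[str]] = {
    "horeka": frozenset({"cpu", "nvidia-gpu", "nvidia-multi-gpu"}),
    "amdci": frozenset({"cpu", "amd-gpu"}),
}


def eligible_runners(
    job_tags: Iterable[str], runners: Dict[str, FrozenSet[str]]
) -> List[str]:
    """Return the sorted names of runners that provide every required tag."""
    required = frozenset(job_tags)
    # A runner qualifies when the required tags are a subset of its tags.
    return sorted(name for name, caps in runners.items() if required <= caps)


if __name__ == "__main__":
    for tags in (["cpu"], ["nvidia-gpu"], ["amd-multi-gpu"]):
        print(tags, "->", eligible_runners(tags, RUNNERS))
```

With this scheme, adding a new platform only means registering a runner with the right capability tags; the job definitions stay untouched.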

Also, it seems that none of our AMD tests are running on nla1? We should change that.

The branch is a bit messy since it is based on a rebased ctest-resources branch; only the last commit is interesting.

This PR is not final yet; I want to use it to collect the changes needed to enable running on apptainer FTP/Horeka.

@upsj upsj added the 1:ST:WIP This PR is a work in progress. Not ready for review. label Aug 20, 2023
@upsj upsj self-assigned this Aug 20, 2023
@ginkgo-bot ginkgo-bot added reg:build This is related to the build system. reg:testing This is related to testing. reg:ci-cd This is related to the continuous integration system. reg:documentation This is related to documentation. reg:example This is related to the examples. reg:benchmarking This is related to benchmarking. type:solver This is related to the solvers type:preconditioner This is related to the preconditioners type:matrix-format This is related to the Matrix formats type:reordering This is related to the matrix(LinOp) reordering type:multigrid This is related to multigrid mod:all This touches all Ginkgo modules. labels Aug 20, 2023
otherwise, the default stream is used in some places, e.g. when initializing cublas. But it is not clear which device the default stream is associated with. Thus, this now sets the device id correctly for the new stream
this is necessary, since some tests call the kernels directly and not through the executor. In this case, the setting of the device id by the executor is skipped, which leads to these kernels not being run.
@upsj upsj force-pushed the ctest-resources branch 2 times, most recently from 0799a12 to 4c2defa Compare September 1, 2023 14:47
Base automatically changed from ctest-resources to develop September 21, 2023 12:28
@upsj
Member Author

upsj commented Jan 9, 2024

I'll redo this properly in the future

@upsj upsj closed this Jan 9, 2024
3 participants