Find a way to test models with GPU #684
Comments
Not sure if it's of interest or relevant, but GitLab already has (paid) SaaS GPU runners (https://docs.gitlab.com/ee/ci/runners/saas/gpu_saas_runner.html)... though I'm not sure it would make any sense to call a GitLab runner from GitHub.
We could of course switch to GitLab altogether... with the planned "S3 first" approach it is unclear what part of the collection stays on GitHub anyway.
Hi, could you clarify the problem a bit more? Is it that a PyTorch state dict won't run, or that it doesn't reproduce the result? Can't we simply provide guidelines on how to properly export a PyTorch model so that we can test it? Can we just say we only support TorchScript (amazingly, it can also train / finetune)? Having to run on GPU will come with additional costs and a maintenance burden. Potentially we could just use Triton on the BioEngine to test with CPU and GPU; however, we don't really support PyTorch state dicts there. From a security point of view, it's not safe to assign a paid GPU node to run arbitrary code on unaccepted PRs.
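For reference, a minimal sketch of what a TorchScript-based contribution could look like (the `TinyModel` class is just a hypothetical stand-in for a contributed model, not anything in the collection): the scripted module is self-contained, so the test infrastructure can load and run it without importing the contributor's Python code.

```python
import io

import torch
import torch.nn as nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for a contributed model."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 4, kernel_size=3, padding=1)

    def forward(self, x):
        return torch.relu(self.conv(x))


model = TinyModel().eval()

# Export as TorchScript instead of a raw state dict: the scripted
# module bundles code and weights into one artifact.
scripted = torch.jit.script(model)
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)

# Loading back does not require TinyModel to be importable, and
# map_location keeps the weights on CPU regardless of where the
# module was saved from.
loaded = torch.jit.load(buffer, map_location="cpu")
out = loaded(torch.zeros(1, 1, 8, 8))
```

The same `map_location` argument is what a CPU-only CI runner would rely on to test a model that was exported on a GPU machine.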
I noticed one particular test case I would like us to cover: hardcoded device handling, e.g. `tensor.cuda()`, `model.to("cpu")`, etc.
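A cheap way to catch at least the obvious cases of this, without needing any GPU at all, would be a static check over the contributed source. A rough sketch (the helper name and the flagged call list are assumptions, not an existing tool):

```python
import ast

# Method calls that pin a tensor/model to a specific device.
HARDCODED_DEVICE_CALLS = {"cuda", "cpu"}


def find_hardcoded_devices(source: str):
    """Return (lineno, description) for each suspicious call."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        name = node.func.attr
        if name in HARDCODED_DEVICE_CALLS:
            findings.append((node.lineno, f".{name}() call"))
        elif name == "to" and any(
            isinstance(a, ast.Constant) and a.value in ("cpu", "cuda")
            for a in node.args
        ):
            findings.append((node.lineno, "hardcoded device in .to()"))
    return findings


example = """
x = tensor.cuda()
model.to("cpu")
y = model.to(device)  # fine: device is configurable
"""
print(find_hardcoded_devices(example))
# flags lines 2 and 3, but not the configurable .to(device) on line 4
```

This would only catch literal calls, not devices smuggled in through variables, so it complements rather than replaces actually running the model on both kinds of hardware.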
Yes, exactly, I opened this issue just now to write exactly that! It should be a test instance, maybe with some stricter settings. We could keep the CPU-based tests in GH Actions...
The security question was always a reason for Zenodo...
We really have to test contributed models on CPU and GPU. Especially for PyTorch state dict models, it is not guaranteed that a model tested on one kind of hardware works on the other (e.g. hardcoded `.cuda()` method calls, etc.). I believe @oeway has mentioned that there is a way to set up self-hosted GitHub runners... and as usual he seems to be right: https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/about-self-hosted-runners
GH itself seems to have GPU workflows on its radar as well: github/roadmap#505
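On such a self-hosted GPU runner, the CPU/GPU matrix could be expressed as an ordinary parametrized test that skips the `cuda` case on CPU-only machines (like the default GitHub-hosted runners). A sketch, where `build_model` is a hypothetical stand-in for loading a contributed model:

```python
import pytest
import torch
import torch.nn as nn


def build_model() -> nn.Module:
    """Hypothetical stand-in for loading a contributed model."""
    return nn.Linear(4, 2)


# Only include "cuda" when the runner actually has a GPU.
DEVICES = ["cpu"] + (["cuda"] if torch.cuda.is_available() else [])


@pytest.mark.parametrize("device", DEVICES)
def test_forward_runs_on_device(device):
    model = build_model().to(device)
    x = torch.randn(3, 4, device=device)
    y = model(x)
    # The output must stay on the requested device; a hardcoded
    # `.cuda()` or `.to("cpu")` inside the model would break this.
    assert y.device.type == device
    assert y.shape == (3, 2)
```

The same test file then runs unchanged in the existing CPU-based GH Actions workflow and on a GPU runner, with the GPU case simply absent from the matrix on CPU-only machines.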