PytatoPyOpenCLArrayContext: don't trust the arg limit reported by the GPU #198
Apparently, CUDA doesn't like it when argument sizes get close to the reported limit.
Avoids PTX/JIT errors of the type:
CUDA_ERROR_INVALID_IMAGE: device kernel image is invalid
CUDA_ERROR_FILE_NOT_FOUND: file not found
Interesting. That's a weird error message for "too many arguments".
LMK when you think this is ready.
I have tested this and it seems to return us to the previous state described in illinois-ceesd/mirgecom#679. This was a regression from my gist which set a fixed limit of 1024 on all devices. @MTCam also experimented with this PR. This is ready for review.
Thanks. LGTM with two minor tweaks.
LGTM, thanks!
Partially addresses illinois-ceesd/mirgecom#679
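
For readers skimming this thread, the general idea (treat the device-reported argument size as an upper bound rather than a safe budget on GPUs) might be sketched roughly as below. This is only an illustration, not the code in this PR: the helper name `conservative_arg_limit`, the `is_gpu` flag, and the `safety_fraction` default of 0.5 are all hypothetical and chosen for the example.

```python
def conservative_arg_limit(reported_nbytes, is_gpu, safety_fraction=0.5):
    """Return an argument-size budget (in bytes) for kernel launches.

    Hypothetical helper for illustration only; the names and the default
    safety fraction are assumptions, not values taken from the PR.
    """
    if reported_nbytes <= 0:
        raise ValueError("reported_nbytes must be positive")
    if not 0 < safety_fraction <= 1:
        raise ValueError("safety_fraction must be in (0, 1]")

    if not is_gpu:
        # Non-GPU devices: take the reported limit at face value.
        return reported_nbytes

    # GPU devices: stay well below the reported limit, since getting
    # close to it was observed to trigger PTX/JIT errors.
    return int(reported_nbytes * safety_fraction)


# Example with a made-up reported limit of 4352 bytes.
print(conservative_arg_limit(4352, is_gpu=True))
print(conservative_arg_limit(4352, is_gpu=False))
```

With PyOpenCL, the reported value would presumably come from `device.max_parameter_size` and the GPU check from `device.type & pyopencl.device_type.GPU`; how much headroom is actually needed is an empirical question that this sketch does not answer.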