Fix launch configuration error with ScatterGatherTest.TorchGatherAllRankAllSelectedDim #3364

Merged
merged 1 commit into from
Nov 7, 2024

Conversation

naoyam
Collaborator

@naoyam naoyam commented Nov 7, 2024

This test generates its index tensors from pseudo-random numbers, which can end up requiring grid dimensions that are too large. For instance, this error was reported today:

C++ exception with description " INTERNAL ASSERT FAILED at "/opt/pytorch/nvfuser/csrc/runtime/executor_params.cpp":41, please report a bug with repro script to NVFuser at https://github.com/NVIDIA/Fuser/issues. Invalid number of blocks in y direction: 69923
Exception raised from assertValid at /opt/pytorch/nvfuser/csrc/runtime/executor_params.cpp:41 (most recent call first):

The true fix would be to make sure the scheduler uses a proper launch configuration, but since these index operations exist only as experimental ops, I think this fix should be good enough for now.
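
For context, the 69923 in the error above exceeds the CUDA limit of 65535 blocks in the y and z grid dimensions (the x dimension allows up to 2^31 - 1 on compute capability 3.0 and later). Below is a minimal C++ sketch of that kind of bounds check. `GridDims` and `checkGridDims` are hypothetical names used only for illustration; this is not nvFuser's actual `assertValid` code, it does not reproduce the change made in this PR, and the limits are hard-coded assumptions rather than values queried from the device.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// CUDA grid-size limits for compute capability 3.0 and later.
// Hard-coded here as an assumption; real code could query the device instead.
constexpr int64_t kMaxGridDimX = 2147483647; // 2^31 - 1
constexpr int64_t kMaxGridDimY = 65535;
constexpr int64_t kMaxGridDimZ = 65535;

// Hypothetical container for the number of blocks in each grid direction.
struct GridDims {
  int64_t x;
  int64_t y;
  int64_t z;
};

// Hypothetical helper: returns an empty string when the grid is launchable,
// otherwise a message naming the first offending direction.
std::string checkGridDims(const GridDims& grid) {
  if (grid.x < 1 || grid.x > kMaxGridDimX) {
    return "Invalid number of blocks in x direction: " + std::to_string(grid.x);
  }
  if (grid.y < 1 || grid.y > kMaxGridDimY) {
    return "Invalid number of blocks in y direction: " + std::to_string(grid.y);
  }
  if (grid.z < 1 || grid.z > kMaxGridDimZ) {
    return "Invalid number of blocks in z direction: " + std::to_string(grid.z);
  }
  return "";
}
```

With this sketch, `checkGridDims({1, 69923, 1})` reports the y direction as invalid, while `checkGridDims({1, 65535, 1})` returns an empty string. A test that draws its sizes from a random generator can land on either side of that boundary, which is why the failure appeared only intermittently.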

Collaborator Author

naoyam commented Nov 7, 2024

!build

@naoyam naoyam merged commit 30e7bff into main Nov 7, 2024
16 checks passed
@naoyam naoyam deleted the fix_launch_param_error branch November 7, 2024 08:13