Issues: NVIDIA/Fuser
Issues list
Indexing error with circular buffering and unswitch
bug
Something isn't working
#4189
opened Apr 3, 2025 by
naoyam
InsertReshardingsPass failed to decompose a set with both shard additions and deletions.
Multi-GPU
#4188
opened Apr 3, 2025 by
wujingyue
Error scheduling Hopper matmul with use_smem_epilogue and splitk
Matmuls
#4159
opened Mar 31, 2025 by
jacobhinkle
getUnrollFactor incorrectly assumes that all vectorized inputs are using the same vectorized width
#4149
opened Mar 26, 2025 by
liqiangxl
InsertReshardingsPass decomposes matmul/linear+allreduce.
Multi-GPU
#4133
opened Mar 24, 2025 by
wujingyue
max_persistent_buffer_size may be smaller than total_reduction_numel
bug
Something isn't working
#4075
opened Mar 13, 2025 by
naoyam
Refactor IndexLowering::handle(const LoadStoreOp* ldst)
enhancement
New feature or request
#4058
opened Mar 11, 2025 by
rdspring1
RFE: Take contiguity caching into nvFuser
enhancement
New feature or request
#4043
opened Mar 7, 2025 by
csarofeen
inplace update done via aliased outputs should have more strict checks
#4036
opened Mar 7, 2025 by
jjsjann123
benchmarking suite should initialize cuda graphs / profiler interaction
Python Benchmarks
#4008
opened Mar 4, 2025 by
tfogal
take_along_axis validation error (race condition?)
bug
Something isn't working
#4003
opened Mar 3, 2025 by
naoyam
checking for compatible allocation domain on Fusion::replaceOutput
#3994
opened Feb 28, 2025 by
jjsjann123
cudaErrorMisalignedAddress when sweeping matmul problems with NN, TN, and TT layouts.
Matmuls
#3966
opened Feb 25, 2025 by
rdspring1
Optimize TMA Store logic to handle pipelining and aliasing.
Matmuls
#3961
opened Feb 25, 2025 by
rdspring1
Missing block sync after epilogue compute but before stmatrix (Correctness)
Matmuls
#3960
opened Feb 25, 2025 by
rdspring1
Swizzle tiles in matmul without introducing larger grid due to nondivisible splits
Matmuls
#3942
opened Feb 21, 2025 by
jacobhinkle
Allow separate sub-DAG for load and compute warp groups with warp-specialized circular buffering.
Matmuls
TMA
#3941
opened Feb 21, 2025 by
rdspring1