[KVCache] TIR attention kernel support for MLA (#17618)
This PR introduces the MLA attention kernels written in TIR.
It also implements the KV cache MLA computation logic.
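To illustrate the computation the cache supports, here is a minimal NumPy sketch of MLA-style attention over a compressed latent KV cache. All names, shapes, and projection matrices are hypothetical illustrations of the general MLA idea (low-rank KV compression with the key up-projection absorbed into the query), not the actual TIR kernels in this PR.

```python
import numpy as np

# Hypothetical sketch of MLA attention over a latent KV cache.
rng = np.random.default_rng(0)
d_model, d_latent, seq = 16, 4, 8

W_dkv = rng.standard_normal((d_model, d_latent))  # down-projection (compress KV)
W_uk = rng.standard_normal((d_latent, d_model))   # up-projection for keys
W_uv = rng.standard_normal((d_latent, d_model))   # up-projection for values

h = rng.standard_normal((seq, d_model))           # token hidden states
c_kv = h @ W_dkv  # latent KV cache: only (seq, d_latent) needs to be stored

q = rng.standard_normal((1, d_model))             # single query vector

# Naive path: materialize full keys/values from the latent cache.
k = c_kv @ W_uk
v = c_kv @ W_uv
scores = (q @ k.T) / np.sqrt(d_model)
p = np.exp(scores - scores.max())
p /= p.sum()
out_naive = p @ v

# Absorbed path: fold W_uk into the query so attention runs
# directly against the latent cache, never materializing full keys.
q_lat = q @ W_uk.T                                # (1, d_latent)
scores_lat = (q_lat @ c_kv.T) / np.sqrt(d_model)
p_lat = np.exp(scores_lat - scores_lat.max())
p_lat /= p_lat.sum()
out_absorbed = (p_lat @ c_kv) @ W_uv

# Both paths agree; the absorbed form is what makes the small latent
# cache usable directly inside the attention kernel.
assert np.allclose(out_naive, out_absorbed)
```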
A new unit test file is added to ensure the correctness of the
TIR kernels.
This PR also fixes a few tile-size initialization issues in the TIR prefill kernels.