[Kernel][Attention] Separate Attention.kv_scale into k_scale and v_scale #5657
Job | Run time |
---|---|
6s | |
6s |