This commit adds a check that src[1] has exactly 2 dimensions when
determining whether a tensor supports repacking.
The motivation for this change is to ensure that both source tensors
are strictly 2D before using repack. The repack implementation does
not support broadcasting in dimensions 2 and 3, which occurs when
src1 has more dimensions than src0 (like when nr != [1,1] in
test-backend-ops.cpp).
Without this check, operations with broadcasting would use repack and
produce incorrect results because repack assumes that no broadcasting
takes place in dimensions 2 and 3.
With this check, broadcasting operations fall back to the standard
CPU implementation which correctly handles the index mapping
(i02 = i12/r2, i03 = i13/r3).
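
As an illustration of the index mapping above, here is a minimal sketch
in C++. It is not the ggml code: `map_broadcast_index` and
`can_use_repack` are hypothetical helpers, and the ratios are assumed to
be r2 = ne12/ne02 and r3 = ne13/ne03, where ne02/ne03 and ne12/ne13 are
the extents of src0 and src1 in dimensions 2 and 3.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: the src0 batch indices that a given
// src1 batch position (i12, i13) reads from.
struct BroadcastIndex {
    int64_t i02;
    int64_t i03;
};

// Hypothetical helper (not a ggml function). Assumes the src1 extents
// are whole multiples of the src0 extents in dimensions 2 and 3.
BroadcastIndex map_broadcast_index(int64_t ne02, int64_t ne03,
                                   int64_t ne12, int64_t ne13,
                                   int64_t i12,  int64_t i13) {
    assert(ne02 > 0 && ne03 > 0);
    assert(ne12 % ne02 == 0 && ne13 % ne03 == 0);

    const int64_t r2 = ne12 / ne02; // broadcast ratio in dimension 2
    const int64_t r3 = ne13 / ne03; // broadcast ratio in dimension 3

    // i02 = i12/r2, i03 = i13/r3 (integer division)
    return { i12 / r2, i13 / r3 };
}

// Illustrative stand-in for the kind of check this commit describes:
// repack is only considered when both sources are strictly 2D.
bool can_use_repack(int n_dims_src0, int n_dims_src1) {
    return n_dims_src0 == 2 && n_dims_src1 == 2;
}
```

With ne02 = 1 and ne12 = 2 (as in nr=[2,1]), both src1 batches map to
i02 = 0, so the single src0 matrix is reused for each of them. A code
path that assumes a one-to-one batch correspondence would not do this.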
This fixes test failures like:
```console
MUL_MAT(type_a=q4_0,type_b=f32,m=16,n=1,k=256,bs=[1,1],nr=[2,1])
MUL_MAT(type_a=q4_K,type_b=f32,m=16,n=16,k=256,bs=[1,1],nr=[1,2])
```
which were consistently failing across all architectures (x86, ARM,
macOS) with high NMSE values (~0.4-0.7).