0.10.1
Implemented the fallback implementation for the matmulQ40vQ80 operation. Distributed Llama now supports all CPU architectures, with optimizations specifically for ARM and AVX2 CPUs.
Implemented the fallback implementation for the matmulQ40vQ80 operation. Distributed Llama now supports all CPU architectures, with optimizations specifically for ARM and AVX2 CPUs.