Mllama kv scale fix #335

gshtras · 2024-12-18T16:41:34Z

mllama.py calls the caching function directly, so the KV scales passed at those call sites must also be tensors instead of floats.
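To illustrate the kind of change described above, here is a minimal sketch. `write_to_cache` is a hypothetical stand-in, not the repository's actual caching function, and the tensor shapes and scale values are assumptions chosen only for the demonstration.

```python
import torch


def write_to_cache(key, value, k_scale, v_scale):
    """Hypothetical stand-in for a cache-write op that expects tensor scales."""
    if not (isinstance(k_scale, torch.Tensor) and isinstance(v_scale, torch.Tensor)):
        raise TypeError("k_scale and v_scale must be torch.Tensor, not float")
    # Apply the scales; a real kernel would also store the result in the cache.
    return key / k_scale, value / v_scale


key = torch.ones(2, 4)
value = torch.full((2, 4), 2.0)

# Before: a direct call site that passes plain floats is rejected.
try:
    write_to_cache(key, value, 1.0, 1.0)
    float_call_ok = True
except TypeError:
    float_call_ok = False

# After: wrap each scale in a tensor on the same device as the data.
k_scale = torch.tensor(1.0, dtype=torch.float32, device=key.device)
v_scale = torch.tensor(1.0, dtype=torch.float32, device=key.device)
cached_key, cached_value = write_to_cache(key, value, k_scale, v_scale)
```

Creating the scale with an explicit `dtype` and `device` keeps it compatible with the tensors it is applied to, which is one plausible reading of the follow-up commit "Properly creating the tensor".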

gshtras added 2 commits December 18, 2024 16:36

Using tensors in the explicit cache function calls from mllama implementation

216e382

Properly creating the tensor

04e6424

shajrawi approved these changes Dec 18, 2024

gshtras merged commit fa1ff83 into main Dec 18, 2024
8 of 9 checks passed

gshtras deleted the mllama_kv_Scale_fix branch December 18, 2024 16:53
