* [SW-204341] explicit scale format for ops
Added wrapper around fp8 functions
Wrapper decides which flavor of the function to call,
according to scale format
Helper modules call the wrapper
Decide which cast flavor to call,
according to scale format
* [SW-204341] Adjust softmax API, remove commented-out code
* [SW-204341] Fixes from CR 1
* [SW-204341] Fixed CR 2
* [SW-204341] add missing arg is fsdpa
Signed-off-by: Uri Livne <[email protected]>
* [SW-204341] Enhance SDPA for measure and quant
* [SW-204341] remove sdpa quantized ops
* reland per op class with more enhancements
* [SW-204341] reland specific arguments, rename class to wrapper
* added call with self in patched lm head
rebased on top of master next
force push
* fix mistake in conflict resolution
restore MethodType fix
* another fix
* modified fp8 matmul test to test quantized matmul func
* another fix of rebase mistake
* hopefully last rebase mistake fix
* restore backward-compatible import protection
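
Illustration of the wrapper idea from the first entry (a wrapper picks the function flavor from the scale format, and helper modules call the wrapper). This is only a minimal sketch in plain Python, not the code from this change: `ScaleFormat`, its members, `QuantizedFuncWrapper`, and both cast functions are hypothetical names, and lists of floats stand in for tensors.

```python
from enum import Enum


class ScaleFormat(Enum):
    # Hypothetical members; the enum in the actual code base may differ.
    CONST = "const"    # scale arrives wrapped in a container (tensor-like)
    SCALAR = "scalar"  # scale arrives as a plain Python float


def cast_with_const_scale(x, scale):
    # Stand-in for the op flavor that expects a container-held scale.
    return [v * scale[0] for v in x]


def cast_with_scalar_scale(x, scale):
    # Stand-in for the op flavor that expects a bare float scale.
    return [v * scale for v in x]


class QuantizedFuncWrapper:
    """Selects one flavor at construction time, based on the scale format."""

    _FLAVORS = {
        ScaleFormat.CONST: cast_with_const_scale,
        ScaleFormat.SCALAR: cast_with_scalar_scale,
    }

    def __init__(self, scale_format):
        if scale_format not in self._FLAVORS:
            raise ValueError(f"unsupported scale format: {scale_format!r}")
        # Stored on the instance, so it is called as a plain function.
        self._func = self._FLAVORS[scale_format]

    def __call__(self, *args, **kwargs):
        # Helper modules call the wrapper and never choose a flavor themselves.
        return self._func(*args, **kwargs)


scalar_cast = QuantizedFuncWrapper(ScaleFormat.SCALAR)
const_cast = QuantizedFuncWrapper(ScaleFormat.CONST)
```

Resolving the flavor once in the constructor keeps the per-call path free of format checks, which is one plausible reason to dispatch in a wrapper rather than inside each helper module.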
---------
Signed-off-by: Uri Livne <[email protected]>