Commit
Update llama tests for block size 32 (#696)
The default block_seq_stride is changing from 16 to 32, so this PR updates the tests to pass the block_seq_stride flag and to use the new numpy inputs for block size 32, so that the benchmarks run correctly. This PR also removes the decomposed fp16 tests, which are no longer needed.

Signed-off-by: aviator19941 <[email protected]>