This repository has been archived by the owner on Aug 30, 2024. It is now read-only.

disable MHA_AVX2 #173

Merged (2 commits) on Mar 18, 2024
Changes from 1 commit
4 changes: 3 additions & 1 deletion neural_speed/core/layers/mha_dense.cpp
@@ -72,7 +72,9 @@ bool bestla_reordered_attn_fp32_support(const attn_shape_t* params) {
// TODO(Yi): check K V's layout
if (_cd->AMX_BF16()) return true;
#endif
- return !_cd->AVX512F() || _cd->AVX2(); // use avx2 and f16c on avx2 platforms
+ // use avx2 and f16c on avx2 platforms
+ // todo: check avx2 mha on sever
+ return !_cd->AVX512F() && _cd->AVX2();
}
// kv cache sizes in bytes per layer per batch per beam for;
void bestla_reordered_attn_fp32_batch_kv_info(const kv_shape_t* params, kv_cache_info_t* out) {
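
The effect of swapping `||` for `&&` in the changed return statement is easiest to see as a truth table. The sketch below is illustrative only: `CpuFlags`, `old_support`, and `new_support` are hypothetical names that do not exist in the repository, and the sketch models only the final return expression (it ignores the earlier `AMX_BF16` early return and anything else in the function).

```cpp
// Hypothetical stand-in for the CPU-feature object behind `_cd`; only the two
// flags used by the changed return statement are modeled.
struct CpuFlags {
  bool avx512f;
  bool avx2;
};

// Expression before this change: !AVX512F || AVX2
constexpr bool old_support(CpuFlags c) { return !c.avx512f || c.avx2; }

// Expression after this change: !AVX512F && AVX2
constexpr bool new_support(CpuFlags c) { return !c.avx512f && c.avx2; }

// AVX2 without AVX512F: both expressions report support.
static_assert(old_support({false, true}) && new_support({false, true}),
              "AVX2-only platforms are accepted before and after");

// AVX512F together with AVX2: accepted before, rejected after.
static_assert(old_support({true, true}) && !new_support({true, true}),
              "AVX512F platforms no longer take the AVX2 path");

// Neither AVX512F nor AVX2: the old expression still returned true
// (!false || false), while the new one returns false.
static_assert(old_support({false, false}) && !new_support({false, false}),
              "platforms without AVX2 are no longer accepted");
```

Under these assumptions, the only flag combination still accepted by the final return is AVX2 without AVX512F, which is consistent with the "use avx2 and f16c on avx2 platforms" comment and the todo about checking the AVX2 MHA path on server CPUs.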