Replies: 1 comment
The default head size is
Usually, the attention head size is `head_dim = hidden_dim // num_attention_heads` in many model architectures, including Llama. Some models use more flexible `head_dim` sizes. For Llama models, here is one pending PR for HF.
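For concreteness, here is a minimal sketch of that default relationship versus an explicitly configured head size. The names and numbers are illustrative only, not taken from any particular model config:

```cpp
#include <cstdint>
#include <cstdio>

// Default convention: derive the per-head size from the embedding width
// and the number of attention heads (integer division).
static uint32_t default_head_dim(uint32_t hidden_dim, uint32_t num_attention_heads) {
    return hidden_dim / num_attention_heads;
}

int main() {
    // Llama-2-7B-style shapes: hidden_dim 4096, 32 heads -> head_dim 128.
    std::printf("derived head_dim = %u\n", default_head_dim(4096, 32));

    // A model with a decoupled head size ships an explicit head_dim in its
    // config instead (hypothetical numbers): hidden_dim 3072 with 12 heads
    // would derive 256, but the config could say 128.
    const uint32_t explicit_head_dim = 128;
    std::printf("explicit head_dim = %u (derived would be %u)\n",
                explicit_head_dim, default_head_dim(3072, 12));
    return 0;
}
```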
Looking at `src/llama.cpp`, I feel like the information is handled around here, but I'm not sure. Could anybody help me understand how this information is loaded into `hparams` and how it can be used in `build_*()`? Thank you!
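I am not certain of the exact loading path, so here is only a minimal sketch of the general pattern (not llama.cpp's actual code): a loader fills an `hparams`-like struct from per-architecture metadata keys, falls back to `n_embd / n_head` when no explicit head size is stored, and the graph-building code then reads the resolved value from `hparams`. The key strings, field names, and helpers below are assumptions for illustration.

```cpp
#include <cstdint>
#include <cstdio>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for GGUF-style metadata: key -> integer value.
using kv_store = std::map<std::string, uint32_t>;

// Simplified hparams; the real llama.cpp struct has many more fields.
struct hparams_t {
    uint32_t n_embd      = 0; // hidden_dim
    uint32_t n_head      = 0; // num_attention_heads
    uint32_t n_embd_head = 0; // per-head size used when building the graph
};

// Hypothetical helper: read a key, with an optional default for missing keys.
static uint32_t get_key(const kv_store & kv, const std::string & key,
                        bool required, uint32_t def = 0) {
    auto it = kv.find(key);
    if (it != kv.end()) return it->second;
    if (required) throw std::runtime_error("missing key: " + key);
    return def;
}

// Load hparams: use an explicit head-size key if present, otherwise derive it.
// The key strings here are illustrative, not guaranteed to match the real ones.
static hparams_t load_hparams(const kv_store & kv) {
    hparams_t hp;
    hp.n_embd = get_key(kv, "llama.embedding_length",     /*required=*/true);
    hp.n_head = get_key(kv, "llama.attention.head_count", /*required=*/true);
    hp.n_embd_head = get_key(kv, "llama.attention.key_length", /*required=*/false,
                             hp.n_embd / hp.n_head); // default fallback
    return hp;
}

// A build_*()-style consumer only reads the already-resolved value.
static void build_attention(const hparams_t & hp) {
    std::printf("per-head size: %u (heads: %u)\n", hp.n_embd_head, hp.n_head);
}

int main() {
    kv_store kv = {
        {"llama.embedding_length",     4096},
        {"llama.attention.head_count", 32},
        // no explicit head-size key -> falls back to 4096 / 32 = 128
    };
    build_attention(load_hparams(kv));
    return 0;
}
```

This only models the fallback idea; if I read the code correctly, the real `hparams` keeps separate key/value head sizes rather than a single field, but the resolve-once-then-consume pattern is the same.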