Hi, I noticed that the last step of the self-attention calculation doesn't seem quite right. The softmax probability is computed along dim -1, i.e. across the columns, so each row of weights sums to 1. But the weighted sum is then taken along the row direction according to this line:

y = torch.matmul(V.transpose(1,0), weights).transpose(1,0)

I think we should instead do something like this:

y = torch.matmul(weights, V)

What do you think? I hope I'm the one who needs to be corrected.
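For what it's worth, here is a minimal sketch of the discrepancy. The shapes and the scaled-dot-product setup are my own illustration, not taken from the repo:

import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical setup: one sequence of 4 tokens, 8-dim queries/keys/values.
Q = torch.randn(4, 8)
K = torch.randn(4, 8)
V = torch.randn(4, 8)

# Softmax along dim -1: each row of `weights` sums to 1.
weights = F.softmax(torch.matmul(Q, K.transpose(1, 0)) / 8 ** 0.5, dim=-1)

# Original line: algebraically (V^T W)^T, which equals W^T V.
y_orig = torch.matmul(V.transpose(1, 0), weights).transpose(1, 0)

# Proposed fix: row i of y is the weighted sum of V's rows using row i of W.
y_fix = torch.matmul(weights, V)

print(torch.allclose(y_orig, torch.matmul(weights.transpose(1, 0), V)))  # True
print(torch.allclose(y_orig, y_fix))  # False: weights is not symmetric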
pangzss changed the title from "Question regarding the self attention calculation." to "Issue regarding the last step of self attention (weighted sum step)" on Jun 2, 2021.
The original line y = torch.matmul(V.transpose(1,0), weights).transpose(1,0) would be correct only if the weights matrix were symmetric, but this isn't the case. Oddly enough, the produced results don't change much when the corrected formula is used.
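A quick numerical check of the symmetry point (the shapes here are arbitrary, chosen just for illustration):

import torch

torch.manual_seed(0)
V = torch.randn(4, 8)
W = torch.rand(4, 4)
W_sym = (W + W.transpose(1, 0)) / 2  # force a symmetric weights matrix

# With symmetric W, (V^T W)^T == W^T V == W V, so both expressions agree.
lhs = torch.matmul(V.transpose(1, 0), W_sym).transpose(1, 0)
rhs = torch.matmul(W_sym, V)
print(torch.allclose(lhs, rhs))  # True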