I followed your instructions and replaced the `MultiheadAttention` class with `LinearMultiheadAttention`, keeping `seq_len=512` and `proj_k=128`. My configuration was: hidden dim 512, max text length 512.
After more debugging I found that `attn_mask` is `torch.Size([1, 512, 512])`, while `attn_output_weights` is `torch.Size([64, 512, 128])`.
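For context, here is a minimal sketch (not the repo's actual code) of why these two shapes can no longer be combined. It assumes a batch of 8 with 8 heads (so 64 batched attention matrices) and a Linformer-style projection matrix `E` that compresses the key sequence from 512 down to `proj_k=128`; the variable names are illustrative only:

```python
import torch

batch, heads, seq_len, head_dim, proj_k = 8, 8, 512, 64, 128  # assumed values; hidden dim 512 / 8 heads = 64 per head

q = torch.randn(batch * heads, seq_len, head_dim)
k = torch.randn(batch * heads, seq_len, head_dim)

# Linformer-style low-rank projection applied to K along the sequence axis
E = torch.randn(proj_k, seq_len)
k_proj = torch.einsum('ks,bsd->bkd', E, k)  # (batch*heads, proj_k, head_dim)

# Attention scores now span 512 queries x 128 projected keys
attn_output_weights = torch.bmm(q, k_proj.transpose(1, 2))  # torch.Size([64, 512, 128])

# A mask built for full 512x512 attention no longer broadcasts against it
attn_mask = torch.zeros(1, seq_len, seq_len)  # torch.Size([1, 512, 512])

print(attn_output_weights.shape)  # torch.Size([64, 512, 128])
print(attn_mask.shape)            # torch.Size([1, 512, 512])
# attn_output_weights + attn_mask  # would fail: last dim 128 vs 512
```

So the mismatch reported above seems to come from the key dimension being projected to 128 while the mask is still shaped for the original 512-length key axis.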