TL;DR: the current implementation of `build_cls_mask()` produces a `cls_mask` as if [CLS] were the first token, but in CoCa, [CLS] is the last token.

In Issue 312, `build_cls_mask()` was introduced by @gpucce in `TextTransformer` in CoCa for "preventing the CLS token at the end of the sequence from attending to padded tokens".

Taking `text = torch.tensor([[1,2,3,4,0,0,0]])` as an example, the output of `build_cls_mask()` corresponds to [CLS] being the first token.
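For illustration, here is a minimal sketch of my own (not the actual `open_clip` or @lucidrains code; it assumes `pad_id = 0` and a single appended [CLS] slot) showing the two ways the extra [CLS] position can be added to the key-padding mask for this `text`:

```python
import torch
import torch.nn.functional as F

text = torch.tensor([[1, 2, 3, 4, 0, 0, 0]])    # 4 real tokens, 3 pads
pad_id = 0                                      # assumed padding id

non_pad = text != pad_id                        # [[T, T, T, T, F, F, F]]

# Variant 1: add the extra always-visible slot at the FRONT of the key axis.
# This only makes sense if [CLS] were prepended as the first token.
cls_first = F.pad(non_pad, (1, 0), value=True)

# Variant 2: add the extra slot at the END of the key axis.
# This matches CoCa, where [CLS] is appended after the last text token.
cls_last = F.pad(non_pad, (0, 1), value=True)

print(cls_first)  # tensor([[ True,  True,  True,  True,  True, False, False, False]])
print(cls_last)   # tensor([[ True,  True,  True,  True, False, False, False,  True]])
```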
In @lucidrains' implementation, taking the same `text` as the example, the output corresponds to [CLS] being the last token, which I believe is the desired outcome.
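As a further sketch (again toy code, assuming the boolean mask is turned into an additive bias and combined with a causal mask before the attention softmax; not verbatim library code), this is what the CLS-at-the-end variant looks like from the [CLS] query's point of view:

```python
import torch

# CLS-at-the-end boolean mask for text = [[1, 2, 3, 4, 0, 0, 0]] plus one appended [CLS] slot.
cls_last = torch.tensor([[True, True, True, True, False, False, False, True]])
seq_len = cls_last.shape[-1]  # 8

# Convert the boolean mask into an additive bias: blocked keys get -inf.
additive = torch.zeros(cls_last.shape, dtype=torch.float32)
additive.masked_fill_(~cls_last, float("-inf"))

# Combine with a standard causal mask, broadcasting over query positions.
causal = torch.full((seq_len, seq_len), float("-inf")).triu_(1)
attn_bias = causal + additive.unsqueeze(1)      # shape (batch, query, key)

# The [CLS] query (last row) may attend to tokens 0-3 and to itself,
# but not to the padded positions 4-6.
print(attn_bias[0, -1])  # tensor([0., 0., 0., 0., -inf, -inf, -inf, 0.])
```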
Since the [CLS] token is appended at the end of a sequence, I feel that the current implementation in `open_clip` is wrong. Am I missing anything?