I think the image_newline here is the implementation of the Row-ended tokens in the paper.
LLaVA/llava/model/llava_arch.py
Lines 82 to 86 in c121f04
However, for single-image input, the token is not appended to each row as described in the paper. Specifically, only one token is appended to the flattened patch tokens of the image.
LLaVA/llava/model/llava_arch.py
Lines 191 to 196 in c121f04
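To make the difference concrete, here is a minimal sketch of the two layouts as I understand them. It is not the repository code: plain Python lists stand in for feature tensors, the string "<nl>" stands in for the image_newline embedding, and the helper names append_per_row and append_once are hypothetical.

```python
# Illustrative only: plain Python lists stand in for tensors, and the helper
# names below are hypothetical, not functions from the LLaVA repository.

NEWLINE = "<nl>"  # stand-in for the learned image_newline embedding


def append_per_row(grid):
    """Row-ended layout: one newline token after every row of patches."""
    tokens = []
    for row in grid:
        tokens.extend(row)
        tokens.append(NEWLINE)
    return tokens


def append_once(grid):
    """Single-image behaviour described above: flatten, then add one token."""
    tokens = [patch for row in grid for patch in row]
    tokens.append(NEWLINE)
    return tokens


# A 2 x 3 grid of patch tokens, labelled by (row, column).
grid = [[f"p{r}{c}" for c in range(3)] for r in range(2)]

per_row = append_per_row(grid)
once = append_once(grid)

print(per_row)
print(once)
print(len(per_row), len(once))
```

For an H x W grid of patches, the row-ended layout yields H * (W + 1) tokens, while appending once yields H * W + 1. With the 2 x 3 grid above that is 8 versus 7 tokens, and in the second case the sequence carries no marker of where each row ends.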