System Info
transformers version: 4.40.0

Who can help?
@ArthurZucker
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
1. Add a breakpoint at transformers/src/transformers/models/encoder_decoder/modeling_encoder_decoder.py, line 625 (commit e655029).
2. Run the example script of EncoderDecoderModel at transformers/src/transformers/models/encoder_decoder/modeling_encoder_decoder.py, lines 562 to 577 (commit e655029).
3. Examine the content of decoder_input_ids. You will see that the bos_token has been added twice (a self-contained sketch of these steps follows below).
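For convenience, here is a minimal sketch of these steps. The bert-base-uncased checkpoint and the example sentence are just placeholders, and it assumes shift_tokens_right from modeling_encoder_decoder.py, i.e. the helper the forward pass uses to build decoder_input_ids from labels:

```python
from transformers import BertTokenizer, EncoderDecoderModel
from transformers.models.encoder_decoder.modeling_encoder_decoder import shift_tokens_right

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-uncased", "bert-base-uncased")
model.config.decoder_start_token_id = tokenizer.cls_token_id  # BERT's [CLS] plays the role of bos_token
model.config.pad_token_id = tokenizer.pad_token_id

# BertTokenizer already wraps the labels in [CLS] ... [SEP]
labels = tokenizer("This is the corresponding summary", return_tensors="pt").input_ids
print(tokenizer.convert_ids_to_tokens(labels[0].tolist()))

# Same construction the forward pass performs when only `labels` is passed:
decoder_input_ids = shift_tokens_right(
    labels, model.config.pad_token_id, model.config.decoder_start_token_id
)
print(tokenizer.convert_ids_to_tokens(decoder_input_ids[0].tolist()))
# -> ['[CLS]', '[CLS]', 'this', ...]: the bos_token appears twice
```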
Expected behavior
The bos_token should only be prepended once to the input to the decoder.
The reason this bug occurs is that BertTokenizer already wraps the text in a bos_token and an eos_token. Then, during the EncoderDecoderModel forward pass (specifically during construction of the decoder_input_ids from the labels), another bos_token is added.

This means that, in order to use EncoderDecoderModel correctly, one has to tokenize the decoder inputs so that an eos_token is added at the end of the text but a bos_token is NOT added at the beginning. This is awkward and clearly not the intended behavior; a sketch of the workaround follows below.
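A rough sketch of that workaround (checkpoint name and sentence are placeholders): tokenize the decoder targets without special tokens and append only the [SEP]/eos token, so the forward pass prepends the single decoder_start_token_id itself.

```python
import torch
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

text = "This is the corresponding summary"
# No [CLS]/bos at the start, only a [SEP]/eos appended at the end.
label_ids = tokenizer(text, add_special_tokens=False).input_ids + [tokenizer.sep_token_id]
labels = torch.tensor([label_ids])
# Passing these labels to the EncoderDecoderModel forward lets its shift-right logic
# prepend the single decoder_start_token_id, avoiding the duplicated bos_token.
```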
The decoder part of an EncoderDecoderModel is rarely initialized from an encoder-only Transformer. Since the tokenizers of decoder-only Transformers add neither bos_tokens nor eos_tokens, I think the correct fix would be for the EncoderDecoderModel forward pass to add both the bos_token and the eos_token itself. To clear up the confusion, we should also change the official example to construct an EncoderDecoderModel from an encoder-only Transformer (encoder) and a decoder-only Transformer (decoder).
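To make that concrete, here is a minimal sketch of the proposed behavior; build_decoder_tensors is a hypothetical helper, not part of transformers, and only illustrates where the special tokens would be added.

```python
import torch

def build_decoder_tensors(label_ids, bos_token_id, eos_token_id):
    """Illustrative only: wrap raw label ids (tokenized WITHOUT special tokens).

    The decoder input starts with bos and the target ends with eos, so the two
    stay aligned for teacher forcing and no tokenizer has to add special tokens.
    """
    decoder_input_ids = torch.tensor([[bos_token_id] + label_ids])
    labels = torch.tensor([label_ids + [eos_token_id]])
    return decoder_input_ids, labels
```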