For an explanation of Qwen's pretrained and instruction-tuned models, please see our documentation. Both the base models and the instruction-tuned models use <|endoftext|> to mark the end of a single document's sequence; the instruction-tuned models additionally use the ChatML template, in which <|im_end|> can be regarded as the terminator of each message. When running inference with the publicly released Qwen models, the stop condition should follow the eos_token_id in generation_config.json rather than the one in tokenizer_config.json, consistent with the design of transformers. The choice for fine-tuning follows the same logic: it depends on the template you train with.
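To illustrate the point about stop conditions, here is a minimal sketch using the Hugging Face transformers library. The checkpoint name Qwen/Qwen2.5-14B-Instruct is only an example, and access to the Hugging Face Hub is assumed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint; any public Qwen model works the same way.
model_name = "Qwen/Qwen2.5-14B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# generation_config.json is loaded into model.generation_config.
# Its eos_token_id is what generate() actually stops on, regardless of
# the eos_token recorded in tokenizer_config.json.
print(model.generation_config.eos_token_id)
print(tokenizer.eos_token)  # may differ from the id(s) above

messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
# No explicit eos_token_id is passed, so generate() falls back to
# model.generation_config.eos_token_id, as recommended above.
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```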
Why does the eos_token in tokenizer_config.json differ between versions? Which one should be used for downstream SFT training, and which one should be used as the stop token at inference time?
qwen2.5-14B:
  "eos_token": "<|endoftext|>",
  "errors": "replace",
  "model_max_length": 131072,
  "pad_token": "<|endoftext|>",
qwen2-72b-instruct:
  "eos_token": "<|im_end|>",
qwen1.5-14b-chat:
  "eos_token": "<|endoftext|>"