A question about eos_token_id #25
Comments
I'm not entirely sure, but I think that in practice it makes no difference. Maybe @soldni could chime in here? For reference, here is the tokenizer vocab: https://huggingface.co/allenai/OLMoE-1B-7B-0125-Instruct/raw/main/tokenizer.json
I think I found the problem. Different model versions need different settings in the configuration file, and for version 0924 the yml configuration file needs to be modified.
Hey @Harry-Miral, I think you are right!
Should we change something in the README or in any of the uploaded model configs?
Hi, OLMoE Team,
Thank you for your work, which makes it possible for more people to get involved.
In the README I saw that tokenizer.eos_token_id is set to 50279, but in the .yml configuration file I found that eos_token_id is 0. How should I understand this difference? Maybe it's because of the vocab_size of 50280? Thank you!
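For anyone comparing the two settings, here is a minimal sketch of the check, using only the numbers quoted in this issue. `check_eos` is a hypothetical helper written for illustration; it is not part of OLMoE or of any library. It shows that both 0 and 50279 are valid ids for a vocab_size of 50280 (50279 is simply the last index), so the vocabulary size by itself does not say which value is intended; the sketch can only flag that the two settings disagree.

```python
def check_eos(tokenizer_eos_id: int, config_eos_id: int, vocab_size: int) -> list[str]:
    """Return human-readable findings about two eos_token_id settings.

    Hypothetical helper: it only compares numbers and loads no model files.
    """
    findings = []
    # Each id must be a valid index into the vocabulary: 0 <= id < vocab_size.
    for name, value in (("tokenizer", tokenizer_eos_id), ("config", config_eos_id)):
        if not 0 <= value < vocab_size:
            findings.append(f"{name} eos_token_id {value} is outside [0, {vocab_size})")
    # Both ids can be in range and still point at different tokens.
    if tokenizer_eos_id != config_eos_id:
        findings.append(
            f"mismatch: tokenizer uses {tokenizer_eos_id}, config uses {config_eos_id}"
        )
    return findings


# Values quoted in this issue: README (50279), .yml config (0), vocab_size (50280).
findings = check_eos(tokenizer_eos_id=50279, config_eos_id=0, vocab_size=50280)
for line in findings:
    print(line)
```

With the values above, the only finding is the mismatch itself, which is consistent with the config needing a per-version setting rather than with a vocabulary-size issue.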