
Improve model chat template handling by parsing out the end of output token automatically #6

Open
crimson-knight opened this issue Dec 19, 2024 · 0 comments

@crimson-knight (Owner)

Models use special tokens to designate boundaries within prompts. Most models have a special "end of generation" token that marks the end of their generated output.

In llama models this is <|eot_id|>, and it must be specified on the model class as follows:

model = Llamero::BaseModel.new(model_name: "meta-llama-3-8b-instruct-Q6_K.gguf", chat_template_end_of_generation_token: "<|eot_id|>")

Without this specification, the structured response will fail to parse. It's also inconvenient to have to look up this token for each model and configure it by hand.
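A minimal sketch of what automatic detection could look like, assuming the chat template text is available when the model is loaded. The candidate list and the detect_end_of_generation_token helper below are hypothetical names for illustration only, not part of Llamero's current API:

# Hypothetical sketch: scan the chat template for a known end-of-generation
# token instead of requiring the caller to pass it explicitly.
TOKEN_CANDIDATES = ["<|eot_id|>", "<|im_end|>", "<|end|>", "<|endoftext|>", "</s>"]

def detect_end_of_generation_token(chat_template : String) : String?
  # Return the first known token that appears in the template, or nil if none match.
  TOKEN_CANDIDATES.find { |token| chat_template.includes?(token) }
end

# Example with a llama-3 style template: "<|eot_id|>" would be detected and
# could serve as the default when chat_template_end_of_generation_token is omitted.
template = "<|start_header_id|>assistant<|end_header_id|>\n\n{response}<|eot_id|>"
puts detect_end_of_generation_token(template) # => "<|eot_id|>"

If no candidate matches, the model could fall back to requiring the explicit chat_template_end_of_generation_token argument, so existing behavior is preserved.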
