Expose ignore_eos_token on /generate and/or /v1/completions endpoints #337
Comments
Hey @martinbomio, thanks for the feature request! This should be a pretty simple one to add. I can definitely take a look soon if no one else picks it up first.
@tgaddair @jeffreyftang thanks for the quick addition! When can we expect this to be released? (We are using the Docker image right now.)
Hey @martinbomio, this should be available now. Let me know if you need a new version of the Python client to be pushed to PyPI.
@tgaddair that's great! I tested passing `ignore_eos_token`, but it doesn't seem to take effect on the OpenAI-compatible endpoints.
Hey @martinbomio, I think the current implementation didn't update the OpenAI endpoints, as this param isn't native to their spec. @jeffreyftang, what do you think about adding this param to the OpenAI endpoints as well?
Yep, that was my reasoning. However, I'm happy to add it to the OpenAI endpoints as well, so long as we're not concerned with going beyond the OpenAI spec.
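For illustration, a hedged sketch of what a `/v1/completions` request carrying the extra field might look like. The field name `ignore_eos_token` comes from this thread; the model name is a placeholder, and whether the server accepts the field depends on the implementation discussed above.

```python
import json

# Hypothetical OpenAI-style /v1/completions body with the non-spec field.
# "my-base-model" is a placeholder; ignore_eos_token is the param name
# proposed in this issue, not part of the official OpenAI spec.
payload = {
    "model": "my-base-model",
    "prompt": "What is deep learning?",
    "max_tokens": 128,
    "ignore_eos_token": True,  # extension beyond the OpenAI spec
}

print(json.dumps(payload, indent=2))
```

With the official `openai` Python client (v1.x), non-spec fields like this can be forwarded via the `extra_body` argument to `client.completions.create(...)`.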
@jeffreyftang thanks for the quick PR! Would you mind sharing which base model you used for the example you posted in PR #340? I am trying the latest version, and on certain prompts I can't seem to get the max token count even when `ignore_eos_token` is set.
nvm, I can confirm it is working as expected!
Glad it's working! For posterity, I was using
Feature request
Hi! It seems the lorax gRPC server internally allows setting `ignore_eos_token`, but there's no way to set that param through the router's `/generate` endpoint.
Motivation
This parameter is useful for benchmarking: it makes it possible to consistently and correctly compare runs.
Your contribution
I would need some guidance, but I could contribute it.
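As a sketch of what the request might look like once exposed: the lorax router accepts TGI-style `{"inputs", "parameters"}` bodies on `/generate`, so presumably the flag would live under `parameters`. The exact name and placement are assumptions based on this thread, not a confirmed API.

```python
import json

# Hypothetical /generate request body, assuming the router surfaces the
# gRPC server's ignore_eos_token flag under "parameters" (name per this issue).
payload = {
    "inputs": "What is deep learning?",
    "parameters": {
        "max_new_tokens": 128,
        "ignore_eos_token": True,  # keep generating past EOS; useful for benchmarking
    },
}

body = json.dumps(payload)
print(body)
# With a running router, this body would be POSTed to e.g.
# http://localhost:8080/generate with Content-Type: application/json.
```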