
Expose ignore_eos_token on /generate and/or /v1/completions endpoints #337

Closed
martinbomio opened this issue Mar 17, 2024 · 9 comments · Fixed by #340
Labels: enhancement (New feature or request), good first issue (Good for newcomers)

Comments
@martinbomio

Feature request

Hi! It seems that, internally, the lorax gRPC server allows setting ignore_eos_token, but there is no way to set that parameter through the /generate endpoint of the router.

Motivation

This parameter is useful for benchmarking, since it allows runs to be compared consistently and correctly.

Your contribution

I would need some guidance, but I could contribute it.
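
For illustration, a rough sketch of what setting the parameter on the router's /generate endpoint might look like once exposed. The parameter placement under "parameters" is an assumption based on the existing request schema, not the confirmed API:

```python
# Sketch (assumption): ignore_eos_token nested under "parameters" alongside
# the existing generation options on the router's /generate route.
import requests

resp = requests.post(
    "http://localhost:8080/generate",
    json={
        "inputs": "Write a short story about a robot.",
        "parameters": {
            "max_new_tokens": 256,
            "ignore_eos_token": True,  # keep generating past EOS, up to max_new_tokens
        },
    },
    timeout=60,
)
print(resp.json())
```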

@tgaddair added the enhancement (New feature or request) and good first issue (Good for newcomers) labels on Mar 17, 2024
@tgaddair
Contributor

Hey @martinbomio, thanks for the feature request! This should be a pretty simple one to add. I can definitely take a look soon if no one else picks it up first.

@martinbomio
Author

@tgaddair @jeffreyftang thanks for the quick addition! When can we expect this to be released? (We are using the Docker image right now.)

@tgaddair
Contributor

Hey @martinbomio, this should be available in the latest docker image to try out now :)

Let me know if you need a new version of the Python client to be pushed to PyPI.
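
For anyone following along, a minimal sketch of exercising the new parameter via the lorax-client Python package. That the installed client version already accepts an ignore_eos_token keyword is an assumption; if it does not, the raw /generate endpoint can be used instead:

```python
# Minimal sketch, assuming the lorax-client release exposes ignore_eos_token
# as a generate() keyword; older client versions may not accept it yet.
from lorax import Client

client = Client("http://localhost:8080")

response = client.generate(
    "Write a short story about a robot.",
    max_new_tokens=256,
    ignore_eos_token=True,  # assumed keyword argument, mirroring the REST parameter
)
print(response.generated_text)
```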

@martinbomio
Author

@tgaddair that's great!

I tested passing ignore_eos_token to the OpenAI-compatible /v1/completions API, and I still do not get the expected number of output tokens on some prompts. Is this expected? I can always switch to the /generate endpoint and try.

@tgaddair
Contributor

Hey @martinbomio, I think the current implementation didn't update the OpenAI endpoints, as this param isn't native to their spec.

@jeffreyftang what do you think about adding this param to the OpenAI endpoints as well?

@jeffreyftang
Contributor

I think the current implementation didn't update the OpenAI endpoints as this param isn't native to their spec.

Yep, that was my reasoning. However, I'm happy to add it to the OpenAI endpoints as well, so long as we're not concerned about going beyond the OpenAI spec.
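
One way a client could pass such a beyond-spec parameter is through the OpenAI Python SDK's extra_body argument. Whether the /v1/completions route actually reads ignore_eos_token from the request body depends on the change discussed here; the base_url and model name below are placeholders:

```python
# Hedged sketch: forwarding a non-spec field via extra_body. Whether the
# server honors ignore_eos_token on /v1/completions depends on the deployed
# version; base_url, api_key, and model are placeholder values.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

completion = client.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.1",
    prompt="Write a short story about a robot.",
    max_tokens=256,
    extra_body={"ignore_eos_token": True},  # merged into the JSON request body
)
print(completion.choices[0].text)
```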

@martinbomio
Author

@jeffreyftang thanks for the quick PR! Would you mind sharing which base model you used for the example you posted in PR #340? I am trying the latest version and I can't seem to get the maximum number of tokens on certain prompts, even when ignore_eos_token is set.

@martinbomio
Author

Never mind, I can confirm it is working as expected!

@jeffreyftang
Contributor

Glad it's working! For posterity, I was using mistralai/Mistral-7B-Instruct-v0.1.
