
Expose ignore_eos_token on /generate and/or /v1/completions endpoints #337

Closed
martinbomio opened this issue Mar 17, 2024 · 9 comments · Fixed by #340
Labels: enhancement (New feature or request), good first issue (Good for newcomers)

Comments
@martinbomio

Feature request

Hi! It seems that, internally, the lorax gRPC server allows setting ignore_eos_token, but there is no way to set that parameter through the /generate endpoint of the router.

Motivation

This parameter is useful for benchmarking, since it allows runs to be compared consistently and correctly.

Your contribution

I would need some guidance, but I could contribute it.
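
For illustration, a rough sketch of what setting the parameter on the router's /generate endpoint might look like once exposed. The parameter placement under "parameters" is an assumption based on the existing request schema, not the confirmed API:

```python
# Sketch (assumption): ignore_eos_token nested under "parameters" alongside
# the existing generation options on the router's /generate route.
import requests

resp = requests.post(
    "http://localhost:8080/generate",
    json={
        "inputs": "Write a short story about a robot.",
        "parameters": {
            "max_new_tokens": 256,
            "ignore_eos_token": True,  # keep generating past EOS, up to max_new_tokens
        },
    },
    timeout=60,
)
print(resp.json())
```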

@tgaddair added the enhancement (New feature or request) and good first issue (Good for newcomers) labels on Mar 17, 2024
@tgaddair
Contributor

Hey @martinbomio, thanks for the feature request! This should be a pretty simple one to add. I can definitely take a look soon if no one else picks it up first.

@martinbomio
Author

@tgaddair @jeffreyftang thanks for the quick addition! When can we expect this to be released? (We are using the Docker image right now.)

@tgaddair
Contributor

Hey @martinbomio, this should be available in the latest docker image to try out now :)

Let me know if you need a new version of the Python client to be pushed to PyPI.
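
For anyone following along, a minimal sketch of exercising the new parameter via the lorax-client Python package. That the installed client version already accepts an ignore_eos_token keyword is an assumption; if it does not, the raw /generate endpoint can be used instead:

```python
# Minimal sketch, assuming the lorax-client release exposes ignore_eos_token
# as a generate() keyword; older client versions may not accept it yet.
from lorax import Client

client = Client("http://localhost:8080")

response = client.generate(
    "Write a short story about a robot.",
    max_new_tokens=256,
    ignore_eos_token=True,  # assumed keyword argument, mirroring the REST parameter
)
print(response.generated_text)
```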

@martinbomio
Author

@tgaddair that's great!

I tested passing ignore_eos_token to the OpenAI-compatible /v1/completions API, and I still do not get the expected number of output tokens on some prompts. Is this expected? I can always switch to the /generate endpoint and try.

@tgaddair
Contributor

Hey @martinbomio, I think the current implementation didn't update the OpenAI endpoints, as this param isn't native to their spec.

@jeffreyftang what do you think about adding this param to the OpenAI endpoints as well?

@jeffreyftang
Contributor

I think the current implementation didn't update the OpenAI endpoints as this param isn't native to their spec.

Yep, that was my reasoning. However, I'm happy to add it to the OpenAI endpoints as well, so long as we're not concerned about going beyond the OpenAI spec.
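
One way a client could pass such a beyond-spec parameter is through the OpenAI Python SDK's extra_body argument. Whether the /v1/completions route actually reads ignore_eos_token from the request body depends on the change discussed here; the base_url and model name below are placeholders:

```python
# Hedged sketch: forwarding a non-spec field via extra_body. Whether the
# server honors ignore_eos_token on /v1/completions depends on the deployed
# version; base_url, api_key, and model are placeholder values.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

completion = client.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.1",
    prompt="Write a short story about a robot.",
    max_tokens=256,
    extra_body={"ignore_eos_token": True},  # merged into the JSON request body
)
print(completion.choices[0].text)
```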

@martinbomio
Author

@jeffreyftang thanks for the quick PR! Would you mind sharing which base model you used for the example you posted in PR #340? I am trying the latest version and I can't seem to get the maximum number of tokens on certain prompts, even when ignore_eos_token is set.

@martinbomio
Author

Never mind, I can confirm it is working as expected!

@jeffreyftang
Contributor

Glad it's working! For posterity, I was using mistralai/Mistral-7B-Instruct-v0.1.
