[Bug] Support for stop words #350

Closed
2 tasks done
bryanhpchiang opened this issue Sep 1, 2023 · 9 comments
@bryanhpchiang

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.

Describe the bug

There does not seem to be a way for the user to specify stop_words. I see that the TurboMind instance has a stop_words attribute, but it doesn't seem to be customizable.

Reproduction

.

Error traceback

No response

@lvhan028
Collaborator

lvhan028 commented Sep 1, 2023

stop_words is hard-coded in model.py
Let's expose it like other fields of the chat template. @AllentDan

@AllentDan
Collaborator

Added in #352

@bryanhpchiang
Author

Awesome, thanks for the fast update! Would it be possible to pass in a string or a sequence of strings and have those converted to tokens under the hood (similar to the OpenAI API)?

Or possibly provide documentation on how stop_words should be created from a list of stop sequence strings.
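
A minimal sketch of the conversion being asked about, assuming a tokenizer object with an encode(text) method that returns a list of token ids. ToyTokenizer and stop_strings_to_token_ids are hypothetical names used only for illustration; the thread does not confirm the format that TurboMind's stop_words expects, and a real subword tokenizer may encode the same string differently depending on surrounding text.

```python
from typing import Dict, List, Sequence


class ToyTokenizer:
    """Hypothetical stand-in for a real tokenizer: one id per character."""

    def __init__(self, alphabet: str) -> None:
        # Build a tiny character-level vocabulary, for illustration only.
        self.vocab: Dict[str, int] = {
            ch: i for i, ch in enumerate(sorted(set(alphabet)))
        }

    def encode(self, text: str) -> List[int]:
        return [self.vocab[ch] for ch in text]


def stop_strings_to_token_ids(tokenizer, stop_strings: Sequence[str]) -> List[List[int]]:
    """Encode each stop string into its own list of token ids."""
    encoded: List[List[int]] = []
    for s in stop_strings:
        ids = tokenizer.encode(s)
        if ids:  # skip empty strings, which carry no stop condition
            encoded.append(ids)
    return encoded


tokenizer = ToyTokenizer("abc<|>")
stop_ids = stop_strings_to_token_ids(tokenizer, ["<|a|>", "", "cab"])
print(stop_ids)
```

With a real tokenizer, each stop string may map to several tokens, so the result is a list of id sequences rather than a flat list of ids.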

@lvhan028
Collaborator

lvhan028 commented Sep 1, 2023

I am afraid accepting a string or a sequence of strings might not be a good idea, for two reasons:

  1. Strings can come in many different encodings.
  2. It would cause a lot of trouble for serve/turbomind/chatbot.py.

@lvhan028
Collaborator

lvhan028 commented Sep 1, 2023

@AllentDan Can you make a document, presenting chat templates and their usage?

@AllentDan
Collaborator

Yeah, I will add documentation later, but in a separate pull request.

@bryanhpchiang
Author

> I am afraid using string or sequence might not be a good idea for two reasons:

That's understandable. So how does every other library/API offer this? vLLM, OpenAI, etc.
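
One common way to support string stop sequences is to match them against the decoded output text instead of against token ids, which sidesteps tokenization differences. The sketch below illustrates that general technique only; it is not taken from vLLM, OpenAI, or this project, and truncate_at_stop and generate_until_stop are hypothetical names. It assumes the generator yields already-decoded text pieces.

```python
from typing import Iterable, Sequence, Tuple


def truncate_at_stop(text: str, stop_strings: Sequence[str]) -> Tuple[str, bool]:
    """Cut text at the earliest stop string; report whether one was found."""
    cut = len(text)
    found = False
    for s in stop_strings:
        if not s:
            continue  # an empty stop string would match everywhere
        idx = text.find(s)
        if idx != -1 and idx < cut:
            cut = idx
            found = True
    return text[:cut], found


def generate_until_stop(pieces: Iterable[str], stop_strings: Sequence[str]) -> str:
    """Accumulate decoded pieces and stop once any stop string appears."""
    text = ""
    for piece in pieces:
        text += piece
        text, found = truncate_at_stop(text, stop_strings)
        if found:
            break
    return text


# The stop string "\nUser:" is split across two pieces here.
result = generate_until_stop(["Hello", " wor", "ld\nUs", "er: hi"], ["\nUser:"])
print(repr(result))
```

Because matching runs on the accumulated text, a stop string that spans several generated pieces is still detected. A streaming server would also need to hold back any trailing text that could be the start of a stop string, which this sketch does not do.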

@lvhan028
Collaborator

lvhan028 commented Sep 1, 2023

My mistake. We'll consider that carefully.

@lvhan028
Collaborator

Merged to main
