Hey there, I'm using this for an API endpoint and have come up against some issues I can't solve
With Vicuna and Vicuna 1.1 the stop token changed from ### to </s>, but there appears to be no way to tell pyllamacpp what the stop token is. With the v0 model, it continues generating non-stop, outputting prompts for the human. (Probably a separate issue: with 1.1 it appears broken altogether and throws tensor errors, outputting gibberish to the console.)
I thought of implementing stop token detection inside the new_text_callback fn and just calling exit(), which actually does what I want, but before I got that far I was trying to get the model to stop parroting the prompt back as part of the output, which, oddly, seems to take computational power token by token. Any idea why this is happening? Can I get it to stop parroting the prompt?
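For what it's worth, the stop token hack I was describing looks roughly like this (a minimal sketch only; the Model constructor arguments and model path are from memory / placeholders and may differ between pyllamacpp versions):

```python
from pyllamacpp.model import Model

STOP_WORD = "### Human:"  # v0 stop sequence; Vicuna 1.1 uses </s>
generated = []

def new_text_callback(text: str):
    generated.append(text)
    # crude stop-word detection on the accumulated output
    if STOP_WORD in "".join(generated):
        # there is no clean way to abort the C++ loop from the callback,
        # so bail out of the whole process (ugly, but it stops generation)
        raise SystemExit(0)

# constructor keyword names are an assumption and may vary by version
model = Model(ggml_model="./models/ggml-vicuna-7b-q4_0.bin", n_ctx=512)
prompt = "### Human: Hello\n### Assistant:"
model.generate(prompt, n_predict=256, new_text_callback=new_text_callback)
```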
That leads to my third question - should I supply conversation history / system prompts as part of prompt or is there a different way to pass that to the generate() call?
I realise I could partially solve / hack around these limitations by not attempting to stream back to the front end (and stripping the prompt from the output with some string manipulation) but that is kind of a bummer, I'm hoping the functionality of this could be improved to allow for that?
I tried interactive mode but that appears to be a blocking process which talks directly to the console, meaning it won't be usable for a web endpoint implementation
If any of this is user error please let me know; I've just done the best I could with the documentation available.
The old generate function ran on the C++ side, which is why it was blocking the thread. You could have just used a separate thread for the model generation to solve the issue.
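For anyone stuck on the old API, a minimal sketch of that workaround (assuming model and prompt are set up as in the snippet above; send_to_client is a placeholder for whatever your endpoint uses to stream):

```python
import threading
import queue

token_queue: queue.Queue = queue.Queue()

def new_text_callback(text: str):
    # called from the worker thread for every generated chunk
    token_queue.put(text)

def run_generation(model, prompt):
    # the blocking C++ call lives entirely in this thread
    model.generate(prompt, n_predict=256, new_text_callback=new_text_callback)
    token_queue.put(None)  # sentinel: generation is finished

worker = threading.Thread(target=run_generation, args=(model, prompt), daemon=True)
worker.start()

# the main thread (e.g. the web handler) streams tokens as they arrive
while (chunk := token_queue.get()) is not None:
    send_to_client(chunk)  # placeholder for your streaming response
```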
I tried to implement a generator function to overcome those limitations anyway.
I also added a prompt_context, prefix and suffix to condition the generation.
You can now get the tokens one by one and do whatever you want with them, or stop whenever you want. The stop word (aka antiprompt) is supported as well.
You can take a look at how I did it here.
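Roughly, the new usage looks like this (a minimal sketch; the parameter names follow the description above, so double-check them against the linked code, and the model path is a placeholder):

```python
from pyllamacpp.model import Model

model = Model(
    model_path="./models/ggml-vicuna-7b-q4_0.bin",  # placeholder path
    prompt_context="A chat between a curious user and a helpful assistant.",
    prompt_prefix="\nUSER:",
    prompt_suffix="\nASSISTANT:",
)

# generate() now yields text piece by piece instead of blocking in C++
for token in model.generate(
    "What is the capital of France?",
    antiprompt="USER:",  # stop word: generation halts when this appears
    n_predict=256,
):
    print(token, end="", flush=True)
    # since this is a plain Python generator, you can also break out
    # at any point to stop generation early
```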