Issue Description:
I have OpenLLM installed on my system, which is equipped with an RTX 3090 GPU. I am running argilla/CapybaraHermes-2.5-Mistral-7B and have several applications and projects running mostly smoothly. However, I've encountered an unusual issue where, intermittently, I receive responses like:
"The meaning of life is <|im_start|> assistant
A question that has been asked many times"
It seems the system is intermittently inserting the chat-template marker <|im_start|> assistant into responses, which is unexpected behavior.
Steps to Reproduce:
Install OpenLLM on a system with an RTX 3090 GPU.
Run argilla/CapybaraHermes-2.5-Mistral-7B.
Engage with the system and observe responses over time.
Expected Behavior:
Responses from OpenLLM should be relevant to the input provided and should not contain extraneous template markers such as "<|im_start|> assistant".
Actual Behavior:
Intermittently, responses from OpenLLM contain the marker "<|im_start|> assistant", as in the example above.
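As a client-side stopgap, the leaked marker could be cut off before the text reaches the application. The following is a minimal sketch in plain Python, not an OpenLLM feature: the function name truncate_at_chatml_marker is hypothetical, and it assumes the stray tokens are the ChatML markers <|im_start|> and <|im_end|>. It hides the symptom rather than fixing the cause, and it discards whatever the model generated after the marker.

```python
import re

# ChatML control markers. Assumption: these are the markers used by the
# model's chat template; adjust the pattern if the template differs.
_CHATML_MARKER = re.compile(r"<\|im_(?:start|end)\|>")


def truncate_at_chatml_marker(text: str) -> str:
    """Return the text before the first ChatML marker, or the text unchanged.

    Hypothetical helper for illustration only; not part of OpenLLM.
    """
    match = _CHATML_MARKER.search(text)
    if match is None:
        # No marker leaked into the response, so leave it untouched.
        return text
    # Keep only what precedes the marker and drop the trailing whitespace.
    return text[: match.start()].rstrip()


if __name__ == "__main__":
    raw = (
        "The meaning of life is <|im_start|> assistant\n"
        "A question that has been asked many times"
    )
    print(truncate_at_chatml_marker(raw))
```

Applied to the example response above, this keeps only the text before the marker.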
It has been suggested that the issue might be related to the model template being used. However, it's worth noting that I am using the default template that comes with the argilla/CapybaraHermes-2.5-Mistral-7B model, and I have not made any modifications to it.
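For reference, here is a minimal sketch of what a ChatML-style prompt looks like, assuming this model follows the ChatML convention that the <|im_start|> token suggests. The helper build_chatml_prompt and the STOP_SEQUENCES list are hypothetical names for illustration, not OpenLLM APIs. If the serving stack does not stop generation at <|im_end|>, the model may go on to open a new turn itself, which would look like the output described above.

```python
from typing import Dict, List

# Assumption: strings a client could pass as stop sequences so generation
# ends at the turn boundary. How (or whether) a given server accepts stop
# sequences is not covered here.
STOP_SEQUENCES = ["<|im_end|>", "<|im_start|>"]


def build_chatml_prompt(
    messages: List[Dict[str, str]], add_generation_prompt: bool = True
) -> str:
    """Render chat messages in ChatML form.

    Hypothetical helper for illustration only; the model's real template
    may differ in whitespace or in how system prompts are handled.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>role\ncontent<|im_end|>\n
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_chatml_prompt([{"role": "user", "content": "What is the meaning of life?"}]))
```

Comparing a prompt rendered this way with what the server actually sends to the model may help confirm or rule out the template as the cause.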
Does anyone else have this issue, or can anyone recommend a model for chat and instruct use?
Additional Information:
Operating System: Ubuntu 22.04.4 LTS
Python Version: Python 3.10.12
OpenLLM Version: openllm, 0.4.44 (compiled: False)
RTX 3090 Driver Version: 545.23.08