Lots of hallucinations? #2
Comments
Hi @joshmouch, thanks for this feedback! Going through it point by point:
Across the board (especially language switching and spelling), if you could share the session_ids we can use to debug, that would be super helpful!
@joshmouch Was it switching to German? The instructions in |
If I am running it in the development mode, |
Oh strange, sorry about that. I haven't extensively tested the reload
behavior, and sometimes I do get dupe sessions or strange behavior. I'd
just lean on refreshing for now and I'll try to look at it when I can!
Best,
Noah
On Fri, Jan 31, 2025 at 6:13 AM Lance Chatwell ***@***.***> wrote:
If I am running it in development mode (npm run dev) and I make
changes to the files, it suddenly starts speaking Spanish. It seems to me
that it starts a new thread, or a new call to the Realtime API, and then
answers in Spanish and English in parallel.
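One way to avoid two sessions running in parallel after a hot reload is to close any existing connection before opening a new one. The sketch below assumes a hypothetical `Closable` connection interface and a hypothetical `SessionGuard` class. It is not taken from the demo, and whether duplicate sessions actually cause the language switching is unconfirmed.

```typescript
// Hypothetical minimal interface for anything that holds a live session
// (for example an RTCPeerConnection or a WebSocket wrapper).
interface Closable {
  close(): void;
}

// Hypothetical guard: keeps at most one live connection at a time.
// Opening a new connection first closes the previous one.
class SessionGuard<T extends Closable> {
  private current: T | null = null;

  open(connect: () => T): T {
    this.close(); // drop any session left over from a previous mount
    const next = connect();
    this.current = next;
    return next;
  }

  close(): void {
    if (this.current !== null) {
      this.current.close();
      this.current = null;
    }
  }

  get active(): T | null {
    return this.current;
  }
}
```

In a React component, `close()` would typically be called from the `useEffect` cleanup function, so that a remount during development does not leave the old session running.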
Generally, it seems less stable than the AVM that is available from OpenAI. I get too many hallucinations and misunderstandings, which are not common in AVM. Is this using a smaller model? Even with gpt-4o (not mini) I've been having issues. Also, strangely, using push-to-talk doesn't override or truncate the current AI message (neither does sending a text message). I've read online that the conversation.item.truncate event doesn't work. Is this a known issue that OpenAI is working on resolving?
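For reference, a minimal sketch of the client events involved in interrupting the assistant, assuming the Realtime API's documented `response.cancel` and `conversation.item.truncate` events. The helper name `buildInterruptEvents`, the item ID, and the playback offset are hypothetical, and this is not the demo's actual code.

```typescript
// Client events a Realtime API client can send to interrupt the assistant.
type ResponseCancelEvent = { type: "response.cancel" };

type ItemTruncateEvent = {
  type: "conversation.item.truncate";
  item_id: string;       // ID of the assistant message being cut off
  content_index: number; // index of the audio content part (normally 0)
  audio_end_ms: number;  // how much of the audio was actually played
};

type InterruptEvent = ResponseCancelEvent | ItemTruncateEvent;

// Hypothetical helper: builds the events to send when the user starts
// talking (or typing) while the assistant is still speaking.
function buildInterruptEvents(
  assistantItemId: string | null,
  playedMs: number,
): InterruptEvent[] {
  // Always stop the in-progress response first.
  const events: InterruptEvent[] = [{ type: "response.cancel" }];
  // Only truncate when we know which assistant item is playing.
  if (assistantItemId !== null) {
    events.push({
      type: "conversation.item.truncate",
      item_id: assistantItemId,
      content_index: 0,
      audio_end_ms: Math.max(0, Math.floor(playedMs)),
    });
  }
  return events;
}
```

Each event would then be serialized with `JSON.stringify` and sent over whatever transport the session uses (a WebRTC data channel or a WebSocket). Tracking `playedMs` accurately is the client's responsibility, which may be one reason truncation appears not to work.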
I know this is a simple demo, but I was expecting fewer hallucinations and something I could hypothetically use in a non demo app.
Random switching to other languages. Thinking I spelled "Jose" when I said J-o-s-h, or "Mr. Butte" when I spelled b-u-t-t. ;) Phone numbers incorrectly formatted. And most tool calls never get used: I asked to be signed up for a promotion because I saw there was a tool for it, but I never got it to call the tool.
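As an illustration of what the model needs in order to call a tool, here is a minimal sketch of registering a function tool through a `session.update` event, assuming the Realtime API's flat tool format. The tool name `signUpForPromotion`, its parameters, and the helper `buildSessionUpdate` are hypothetical and are not the demo's actual definitions.

```typescript
// Shape of a function tool in a Realtime API session
// (flat, with no nested "function" key).
type RealtimeTool = {
  type: "function";
  name: string;
  description: string;
  parameters: Record<string, unknown>; // JSON Schema for the arguments
};

// Hypothetical tool, loosely modeled on the promotion sign-up mentioned above.
const signUpForPromotion: RealtimeTool = {
  type: "function",
  name: "signUpForPromotion",
  description:
    "Sign the current user up for a promotion. Call this whenever the user asks to join or enroll in a promotion.",
  parameters: {
    type: "object",
    properties: {
      promotion_name: { type: "string", description: "Name of the promotion" },
    },
    required: ["promotion_name"],
  },
};

// Hypothetical helper: builds the session.update payload that registers the tools.
function buildSessionUpdate(tools: RealtimeTool[]) {
  return {
    type: "session.update" as const,
    session: {
      tools,
      tool_choice: "auto" as const, // let the model decide when to call a tool
    },
  };
}
```

A description that says explicitly when the tool should be called tends to matter here, since the model only sees the name, description, and schema. Whether that explains the missed calls in this report is an assumption.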
Is this maybe holding out for o1 or o3 before it would work in a production scenario? Is it expected that the models should be fine-tuned in a real usage scenario?
I'm wondering if, in practice, some fine-tuning needs to occur before this is used. If that's the case, then I think the training instructions and expectations should be included with the demo. Maybe also a way to evaluate how well the agents are working?