Problems with transcription into a specific language #9
Comments
Can you provide a full example of the script?
I investigated the problem with Xdebug. Everything is probably fine on the PHP side; the encoding problem occurs at the FFI stage and beyond. Here is my code; it runs inside a Laravel command.
I found the cause of the problem. Most likely the problem is in the garbage collector. I set the parameter … It doesn't solve my problem, but at least I found the cause.
Hi @DimaSmirnoff27, thanks so much for taking the time to dig into this issue and sharing the details. You've saved me tons of work, for real. I’ve been trying to replicate the error on my end but haven’t had any luck so far. I tested on my Mac, a Linux aarch64 VM, an Intel Linux VM, and even a PHP container (serversideup), but everything seems to work fine in those environments. It’s been a bit frustrating since not being able to reproduce the issue makes it tough to track down and fix. Would you mind sharing the base Docker image you’re using? It’d help me recreate your exact setup and hopefully uncover what’s going wrong. Thanks again for your patience and for flagging this.
Also, the workaround you found with making the cdata owned does hint at this being a GC issue. But even so, the error message in the logs doesn’t seem to reflect the correct string passed in, so it feels like there might be more going on here. One thing to keep in mind is that making it owned means you’ll have to manually free the memory allocated by that function call. Since the params are used until the transcription process ends, one possible workaround could be maintaining a list of references in the class and then manually disposing of them in … That said, without being able to replicate the issue myself, it’s hard to pin this down completely or test a proper fix.
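The idea of keeping a list of references and disposing of them manually can be sketched with PHP's built-in FFI API. This is only a minimal sketch, not the package's actual code: `CStringPool` and its methods are hypothetical names introduced for illustration, and it assumes the FFI extension is enabled. It relies on `FFI::new()` with `$owned = false`, which returns unmanaged memory that PHP's garbage collector will not release, so each buffer has to be freed explicitly with `FFI::free()`.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (not part of the package): keeps C strings alive
// until dispose() is called, so native code can safely hold pointers to them.
final class CStringPool
{
    /** @var list<FFI\CData> unmanaged buffers that we must free ourselves */
    private array $buffers = [];

    public function __construct(private FFI $ffi)
    {
    }

    /** Copies a PHP string into an unmanaged, NUL-terminated char array. */
    public function add(string $value): FFI\CData
    {
        $size = strlen($value) + 1; // room for the trailing NUL byte

        // owned = false: the buffer is NOT freed when the CData object
        // goes out of scope; it stays valid until FFI::free() is called.
        $buffer = $this->ffi->new(sprintf('char[%d]', $size), false);
        FFI::memcpy($buffer, $value . "\0", $size);

        $this->buffers[] = $buffer;

        return $buffer;
    }

    /** Frees every buffer handed out by add(). */
    public function dispose(): void
    {
        foreach ($this->buffers as $buffer) {
            FFI::free($buffer);
        }
        $this->buffers = [];
    }
}

$pool = new CStringPool(FFI::cdef());
$language = $pool->add('uk');

// Read the buffer back as a NUL-terminated C string.
echo FFI::string($language), PHP_EOL;

// Only call this once the native side no longer uses the pointers.
$pool->dispose();
```

With this shape, the buffers outlive the PHP variables that created them, and the caller decides when they are released. Whether this addresses the garbled language string in the log is untested here.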
Here is a copy of my project. It contains only a test command for transcription. In the README I described how to run Docker and the command.
Hi!
I have installed the package and run the Low-Level sample code, but I encountered a problem with transcription into a specific language when explicitly specifying the language in the parameters with `->withLanguage('uk')`. But if I remove `->withLanguage('uk')` from the parameters, the transcription works fine with automatic language detection. I also tried running the High-Level sample code; the problem did not go away when the language was explicitly specified.
The log for whisper shows this:
[2024-12-21 12:45:37] whisper.error: whisper_lang_id: unknown language ;h���' []
My environment:
Laravel 11.31 inside docker container, image php:8.3.3-fpm
Inside the container libffi-dev is installed, also the ffi extension is installed
I ran the code with the medium and small models, the result is the same