
Unable to Load Spanish Model in Custom Whisper Environment #12

pabloeorsi opened this issue Apr 19, 2024 · 5 comments

@pabloeorsi
Greetings!

I hope this message finds you well. I'm reaching out from Argentina, currently attempting to load a Spanish model into Whisper. I've successfully installed custom_whisper in Docker using the suggested docker-compose.yml, and everything is running smoothly. However, when I try to load a model.bin from another Spanish model, I encounter the following error:

RuntimeError: Unsupported model binary version. This executable supports models with binary version v6 or below, but the model has binary version v1868833084. This usually means that the model was generated by a later version of CTranslate2. (Forward compatibility is not guaranteed.)

I've tested several models, all yielding the same result. Are there any Spanish models compatible with this setup? I'm pleased with how easy it was to set up, but the Hungarian language isn't suitable for my needs.

Thank you for your assistance!

@cociweb
Owner

cociweb commented Apr 19, 2024

So you mean that bumping ctranslate2 to 4.1 will solve your problem? Can you please link your model?

@pabloeorsi
Author

Thank you for your prompt reply. Unfortunately, I'm not certain if bumping ctranslate2 to version 4.1 would solve the issue. Regarding the model, the one I'm interested in is located at this link: https://huggingface.co/guillaumekln/faster-whisper-medium

Thank you very much!!

@cociweb
Owner

cociweb commented Apr 19, 2024

This is the standard faster-whisper model, which is more than a year old, so bumping ctranslate2 to the latest version will not solve your problem; the current one should handle it seamlessly.
This specific RuntimeError can also occur when the downloaded model file is corrupt. That can happen if the file was not fully downloaded (e.g. a timeout on a slow internet connection) or if it was otherwise damaged or tampered with. You can check the md5 hash with md5sum.
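A quick sketch of such an integrity check (the file path and hash below are placeholders; the expected value has to come from a trusted source):

```shell
# check_model: compare a downloaded file against an expected md5 hash.
# Usage: check_model <file> <expected_md5>
check_model() {
  actual=$(md5sum "$1" | awk '{print $1}')
  if [ "$actual" = "$2" ]; then
    echo "OK"
  else
    echo "MISMATCH: got $actual"
  fi
}

# Example (placeholder hash, not the real model.bin checksum):
# check_model /data/custom/model.bin 0123456789abcdef0123456789abcdef
```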

Since you are trying to reach the standard faster-whisper model, you can select the default medium model instead of the custom link (they are exactly the same).

Unfortunately, my custom_whisper addon uses the same model-folder purge mechanism at startup as the official whisper addon, so you won't achieve any better result with it.

Nowadays, HF is very slow. As a workaround, I would suggest downloading the desired model (together with its vocabulary and config files) and republishing it on a private web server (or even under your HA instance); then you can use it as a custom model.
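A minimal sketch of that workaround (the port and paths are illustrative; note that the raw file bytes live behind Hugging Face's `/resolve/` URLs, while the `/blob/` URLs return an HTML page instead of the file):

```shell
# serve_dir: serve a local directory over HTTP in the background and print
# the server's PID so it can be stopped later. Port and paths are examples.
serve_dir() {
  dir="$1"; port="$2"
  python3 -m http.server "$port" --directory "$dir" >/dev/null 2>&1 &
  echo $!
}

# First mirror the files (note the /resolve/ form of the URLs):
#   wget -P ./custom https://huggingface.co/guillaumekln/faster-whisper-medium/resolve/main/model.bin
#   wget -P ./custom https://huggingface.co/guillaumekln/faster-whisper-medium/resolve/main/vocabulary.txt
#   wget -P ./custom https://huggingface.co/guillaumekln/faster-whisper-medium/resolve/main/config.json
# Then serve them and point the addon's custom-model URL at this host:
#   serve_dir ./custom 8080
```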

I'm planning to release a new, more complex, totally rewritten addon where you can use pre-downloaded models as well. Please stay tuned until then.

@pabloeorsi
Author

Hello!!

I'm sharing the complete error in case it helps. I understand that the models don't have the same structure, which is why it's not finding the files. It really works very well with the Hungarian models, and I would love to have it in Spanish.

I'll keep an eye on the repository! Thank you very much for sharing!!!

C:\Users\Home Assistant\Desktop>docker-compose up
[+] Running 1/1
 ✔ Container custom_whisper  Recreated                                                                             0.2s
Attaching to custom_whisper
custom_whisper  | INFO:__main__:Downloading custom model My customized medium Whisper model to /data
custom_whisper  | INFO:wyoming_faster_whisper.download:model.bin is downloaded into /data/custom from url: https://huggingface.co/guillaumekln/faster-whisper-medium/blob/main/model.bin
custom_whisper  | INFO:wyoming_faster_whisper.download:vocabulary.txt is downloaded into /data/custom from url: https://huggingface.co/guillaumekln/faster-whisper-medium/blob/main/vocabulary.txt
custom_whisper  | INFO:wyoming_faster_whisper.download:config.json is downloaded into /data/custom from url: https://huggingface.co/guillaumekln/faster-whisper-medium/blob/main/config.json
custom_whisper  | WARNING:wyoming_faster_whisper.download:Download failed on  hash.json from https://huggingface.co/guillaumekln/faster-whisper-medium/blob/main/hash.json! Info: HTTP Error 404: Not Found
custom_whisper  | WARNING:wyoming_faster_whisper.download:Retreive of hash failed on custom model: [Errno 2] No such file or directory: '/data/custom/hash.json'
custom_whisper  | INFO:__main__:Succesfully downloaded the custom model: custom
custom_whisper  | Traceback (most recent call last):
custom_whisper  |   File "<frozen runpy>", line 198, in _run_module_as_main
custom_whisper  |   File "<frozen runpy>", line 88, in _run_code
custom_whisper  |   File "/usr/local/lib/python3.11/site-packages/wyoming_faster_whisper/__main__.py", line 202, in <module>
custom_whisper  |     run()
custom_whisper  |   File "/usr/local/lib/python3.11/site-packages/wyoming_faster_whisper/__main__.py", line 197, in run
custom_whisper  |     asyncio.run(main())
custom_whisper  |   File "/usr/local/lib/python3.11/asyncio/runners.py", line 190, in run
custom_whisper  |     return runner.run(main)
custom_whisper  |            ^^^^^^^^^^^^^^^^
custom_whisper  |   File "/usr/local/lib/python3.11/asyncio/runners.py", line 118, in run
custom_whisper  |     return self._loop.run_until_complete(task)
custom_whisper  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
custom_whisper  |   File "/usr/local/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
custom_whisper  |     return future.result()
custom_whisper  |            ^^^^^^^^^^^^^^^
custom_whisper  |   File "/usr/local/lib/python3.11/site-packages/wyoming_faster_whisper/__main__.py", line 174, in main
custom_whisper  |     whisper_model = WhisperModel(
custom_whisper  |                     ^^^^^^^^^^^^^
custom_whisper  |   File "/usr/local/lib/python3.11/site-packages/wyoming_faster_whisper/faster_whisper/transcribe.py", line 58, in __init__
custom_whisper  |     self.model = ctranslate2.models.Whisper(
custom_whisper  |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
custom_whisper  | RuntimeError: Unsupported model binary version. This executable supports models with binary version v6 or below, but the model has binary version v1868833084. This usually means that the model was generated by a later version of CTranslate2. (Forward compatibility is not guaranteed.)
custom_whisper exited with code 0

@cociweb
Owner

cociweb commented Apr 20, 2024

Well, maybe it is connected to the missing hash file in the repo. (I do not plan to introduce any more bugfixes in this repo.) I would like to release the new addon/solution within a few weeks (as soon as I have some free time).
I have two workaround suggestions:

  1. Create a temporary fork of the above-mentioned repo and add a hash.json file with the content of md5sum(model.bin), etc.
  2. Try to use the pre-compressed files, so the command line in your docker-compose file should look like:
    command: --model medium-int8 --language es --beam-size 1 --compute-type default
    You can substitute the following values for the --model argument for multilingual models:
  • tiny
  • tiny-int8
  • base
  • base-int8
  • small
  • small-int8
  • medium-int8
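For workaround 2, the relevant docker-compose fragment might look like the following (the service name and the commented-out keys are assumptions for illustration, not taken from the repo's suggested file; only the `command` line is from the suggestion above):

```yaml
services:
  custom_whisper:
    container_name: custom_whisper
    # image: and volumes: as in the repo's suggested docker-compose.yml
    command: --model medium-int8 --language es --beam-size 1 --compute-type default
```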
