
Install fails on Windows // Deepspeed fails to install, No module named 'TTS' , piper-phonemize~=1.0.0 not available #24

Open · ryanull24 opened this issue Jun 27, 2024 · 23 comments
Labels: help wanted (Extra attention is needed)

@ryanull24 (Author)

When trying to run speech.py I am getting this error:

speech.py", line 333, in <module>
    from TTS.tts.configs.xtts_config import XttsConfig
ModuleNotFoundError: No module named 'TTS'

I am on Windows 11 with python 3.11.9

I really don't have a lot of experience with Python and running programs, but what I did was:

git clone repo
create a virtual environment .venv
activate said virtual environment .venv\Scripts\Activate
pip install -r requirements.txt
run speech.py - getting TTS module error
go back to virtual environment and install TTS
get above error

Also an error when installing deepspeed with pip install -r requirements.txt

Collecting deepspeed (from -r requirements.txt (line 6))
  Using cached deepspeed-0.14.4.tar.gz (1.3 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [23 lines of output]
      [WARNING] Unable to import torch, pre-compiling ops will be disabled. Please visit https://pytorch.org/ to see how to properly install torch on your system.
       [WARNING]  unable to import torch, please install it if you want to pre-compile any deepspeed ops.
      DS_BUILD_OPS=1
      Traceback (most recent call last):
        File "F:\Project\LLM\openedai-speech\.venv\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 353, in <module>
          main()
        File "F:\Project\LLM\openedai-speech\.venv\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "F:\Project\LLM\openedai-speech\.venv\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\USER\AppData\Local\Temp\pip-build-env-3vt8r0ws\overlay\Lib\site-packages\setuptools\build_meta.py", line 327, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=[])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\USER\AppData\Local\Temp\pip-build-env-3vt8r0ws\overlay\Lib\site-packages\setuptools\build_meta.py", line 297, in _get_build_requires
          self.run_setup()
        File "C:\Users\USER\AppData\Local\Temp\pip-build-env-3vt8r0ws\overlay\Lib\site-packages\setuptools\build_meta.py", line 497, in run_setup
          super().run_setup(setup_script=setup_script)
        File "C:\Users\USER\AppData\Local\Temp\pip-build-env-3vt8r0ws\overlay\Lib\site-packages\setuptools\build_meta.py", line 313, in run_setup
          exec(code, locals())
        File "<string>", line 149, in <module>
      AssertionError: Unable to pre-compile ops without torch installed. Please install torch before attempting to pre-compile ops.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

I made sure torch was installed under the same virtual environment.

@matatonic (Owner) commented Jun 27, 2024

deepspeed will probably be removed from the default installation in the next release; you can comment it out of requirements.txt and reinstall.

thanks for the report!

Update: pip uninstall deepspeed
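The missing TTS module is most likely because the pip install -r requirements.txt run aborted on the deepspeed build error before the remaining requirements (including the Coqui TTS package that provides TTS.tts.configs.xtts_config) were installed. A quick check you can run with the .venv activated (just a sketch, not part of the repo):

# Sanity-check sketch: confirm which interpreter is active and whether the
# Coqui TTS package ("TTS" on PyPI) is importable from this environment.
import importlib.util
import sys

print("interpreter:", sys.executable)  # should point inside .venv
print("TTS installed:", importlib.util.find_spec("TTS") is not None)

If it reports False, re-run pip install -r requirements.txt with deepspeed commented out and the missing packages should be picked up.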

@matatonic (Owner)

If you need deepspeed you also need to install the CUDA toolkit for your OS, which is perhaps more complex than installing the rest of the software...

@ryanull24 (Author) commented Jun 27, 2024

I do not really need deepspeed. I commented it out, and I am still getting the same "No module named 'TTS'" error when trying to run speech.py.

Something else to note: on Windows, piper-tts does not seem to be installable because of piper-phonemize:

ERROR: Cannot install -r requirements.txt (line 4) because these package versions have conflicting dependencies.

The conflict is caused by:
    piper-tts 1.2.0 depends on piper-phonemize~=1.1.0
    piper-tts 1.1.0 depends on piper-phonemize~=1.0.0

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip to attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

But this is an issue with piper-phonemize on Windows, judging by their GitHub.

@matatonic (Owner)

Is this install in WSL2? I'm not much help with Windows, sorry, but if you can use them, the Docker images should work for you with Docker Desktop + WSL2.

matatonic reopened this Jun 27, 2024
matatonic changed the title from "Cannot run - No module named 'TTS' on Windows // Deepspeed fails to install" to "Install fails on Windows // Deepspeed fails to install, No module named 'TTS', piper-phonemize~=1.0.0 not available" Jun 27, 2024
@ryanull24 (Author)

No, I'm trying to use it standalone on Windows, no Docker, nothing like that, for two reasons:

  1. I am too dumb to understand Docker properly
  2. WSL2 would keep Vmmem.exe alive, using up memory and ending up crashing software like Adobe Lightroom. I don't know if this was fixed in the last year or so since I last used Docker.

I'll see if I can find a solution. It might be a me issue.

@ryanull24 (Author)

I managed to get it to start (thanks, ChatGPT) by adding:

try:
    from TTS.tts.configs.xtts_config import XttsConfig
    from TTS.tts.models.xtts import Xtts
    from TTS.utils.manage import ModelManager
except ModuleNotFoundError as e:
    print(f"ModuleNotFoundError: {e}")

Now it starts without any arguments, but then, after connecting to Open WebUI and trying TTS, I get some more errors:

Traceback (most recent call last):
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\uvicorn\protocols\http\httptools_impl.py", line 435, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\uvicorn\middleware\proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\fastapi\applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\middleware\errors.py", line 186, in __call__
    raise exc
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\middleware\errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\middleware\base.py", line 189, in __call__
    with collapse_excgroups():
  File "C:\Program Files\Python311\Lib\contextlib.py", line 158, in __exit__
    self.gen.throw(typ, value, traceback)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\_utils.py", line 93, in collapse_excgroups
    raise exc
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\middleware\base.py", line 191, in __call__
    response = await self.dispatch_func(request, call_next)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Project\LLM\openedai-speech\openedai.py", line 126, in log_requests
    response = await call_next(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\middleware\base.py", line 165, in call_next
    raise app_exc
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\middleware\base.py", line 151, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\middleware\cors.py", line 85, in __call__
    await self.app(scope, receive, send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\middleware\exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\routing.py", line 756, in __call__
    await self.middleware_stack(scope, receive, send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\routing.py", line 776, in app
    await route.handle(scope, receive, send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\routing.py", line 297, in handle
    await self.app(scope, receive, send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\starlette\routing.py", line 72, in app
    response = await func(request)
               ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\fastapi\routing.py", line 278, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\USER\AppData\Roaming\Python\Python311\site-packages\fastapi\routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Project\LLM\openedai-speech\speech.py", line 225, in generate_speech
    tts_proc = subprocess.Popen(tts_args, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Program Files\Python311\Lib\subprocess.py", line 1538, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [WinError 2] The system cannot find the file specified

Now, trying to run speech.py with arguments, for example --xtts_device cpu --preload xtts, I'm getting:

Traceback (most recent call last):
  File "F:\Project\LLM\openedai-speech\speech.py", line 339, in <module>
    xtts = xtts_wrapper(args.preload, device=args.xtts_device, unload_timer=args.unload_timer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\Project\LLM\openedai-speech\speech.py", line 61, in __init__
    model_path = ModelManager().download_model(model_name)[0]
                 ^^^^^^^^^^^^
NameError: name 'ModelManager' is not defined

At this point, I'm pretty sure I'm doing something wrong.
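For what it's worth, the NameError follows directly from the try/except workaround above: the except branch prints the error and continues, so ModelManager is never bound. A variant that makes the missing dependency explicit (a sketch, not the project's code):

# Sketch: keep the script importable without the TTS package, but record that
# xtts support is unavailable instead of failing later with a NameError.
try:
    from TTS.tts.configs.xtts_config import XttsConfig
    from TTS.tts.models.xtts import Xtts
    from TTS.utils.manage import ModelManager
except ModuleNotFoundError as e:
    XttsConfig = Xtts = ModelManager = None  # xtts voices disabled
    print(f"xtts support disabled, install the TTS package to enable it: {e}")

Either way, the real fix is getting the TTS package installed into the virtual environment.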

@matatonic (Owner)

It may need to be fixed to run piper.exe instead of piper, but honestly I think you're the first to try it directly on Windows.
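A portable way to handle this in speech.py would be to resolve the binary on PATH with shutil.which, which on Windows also matches piper.exe via PATHEXT. A hypothetical sketch, not the repo's code (the model path below is a placeholder for what speech.py takes from its voice config):

# Sketch: locate the piper binary and fail with a readable error instead of
# "FileNotFoundError: [WinError 2] The system cannot find the file specified".
import shutil
import subprocess

piper_bin = shutil.which("piper")  # finds piper.exe on Windows as well
if piper_bin is None:
    raise RuntimeError("piper executable not found on PATH")

piper_model = "voices/your-model.onnx"  # placeholder; normally looked up from the voice config
tts_args = [piper_bin, "--model", str(piper_model), "--data-dir", "voices",
            "--download-dir", "voices", "--output-raw"]
tts_proc = subprocess.Popen(tts_args, stdin=subprocess.PIPE, stdout=subprocess.PIPE)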

@matatonic (Owner) commented Jun 29, 2024

Try making sure you're using Python 3.11; piper-phonemize does not work with Python 3.12 yet.

@matatonic (Owner)

Any update here? I'll close the issue if you don't have any more to add. I still highly recommend you try the Docker setup.

matatonic added the help wanted label Aug 27, 2024
@luobendewugong commented Sep 9, 2024

> It may need to be fixed to run piper.exe instead of piper, but honestly I think you're the first to try it directly on Windows.

Hello, there is strong demand for installing this directly on Windows. Since you are very familiar with the overall architecture, could you provide some guidance for a direct Windows install?

Thank you very much!

@piovis2023

@ryanull24 did you manage to get it working on Windows without Docker?

Thanks

@ryanull24 (Author)

> @ryanull24 did you manage to get it working on Windows without Docker?

I gave up; see my comment above. I apparently got it to start, but it would throw errors in Open WebUI. I am by no means familiar with coding, and I did not know where to start looking into things.

@piovis2023

Thanks @ryanull24, yes, I saw your comment.
I get how you feel about not knowing where to start. Hopefully the awesome devs here (@matatonic, I'm looking at you, mate :) ) have made some progress.
I too don't want to use Docker on Windows 11, but for different reasons: I want to streamline my tech stack.

@matatonic (Owner)

Well, I didn't try this, but you may be able to install a piper.exe binary directly without using pip. Not sure if this will work either, but it might get farther.

@piovis2023

@matatonic thanks for the quick reply, really appreciate it. Where can I get my hands on a piper.exe file? Do you have one around? I'd be happy to report the feedback.

@matatonic (Owner)

@piovis2023 try this one: https://github.com/rhasspy/piper/releases/download/2023.11.14-2/piper_windows_amd64.zip
from: https://github.com/rhasspy/piper/releases

@piovis2023

Hi @matatonic, thanks again for the file. Unfortunately, it didn't work. I tried the piper, piper-phonemize, and piper-phonemize-crossover repos to go back to the roots. Aside from compiling the whole thing from scratch, there are absolutely no solutions for Windows 11.

It sucks that this amazing project doesn't work on Windows 11 :(

@seancheung commented Nov 13, 2024

This is because piper-phonemize sadly does not support Windows. But you can use the latest binary release of piper, which does support Windows:

  1. Download the release file and extract it to a local folder
  2. Comment out piper-tts in requirements.txt
  3. Install deps as usual
  4. There are syntax errors in startup.bat; replace it with the following. (There are also errors in download_voices_tts-1.bat and download_voices_tts-1-hd.bat; I just removed the calls to them. The env file reading syntax is also wrong. I guess these were translated from a bash script by GPT or something.)

startup.bat

@echo off

@REM set /p < speech.env
set TTS_HOME=voices
set HF_HOME=voices

@REM call download_voices_tts-1.bat
@REM call download_voices_tts-1-hd.bat %PRELOAD_MODEL%

if defined PRELOAD_MODEL (
    set "preload=--preload"
)
python speech.py %preload% %PRELOAD_MODEL% %EXTRA_ARGS%
  5. Update speech.py

speech.py

# line 226, remove the first arg which is "piper"
tts_args = ["--model", str(piper_model), "--data-dir", "voices", "--download-dir", "voices", "--output-raw"]
# line 232, add executable parameter
tts_proc = subprocess.Popen(tts_args, stdin=subprocess.PIPE, stdout=subprocess.PIPE, executable="absolute path to piper.exe")

Now it should work when running startup.bat.

@piovis2023

You are a legend, @seancheung! I'll let you know how it goes after I've given it a go!

@seancheung commented Nov 14, 2024

> You are a legend, @seancheung! I'll let you know how it goes after I've given it a go!

I have written a one-click-run Windows batch script:

run.bat

@echo off
setlocal enabledelayedexpansion

cd /D "%~dp0"

@rem test the conda binary
echo Miniconda version:
call conda --version || ( echo. && echo Miniconda not found. && goto end )

@rem deactivate existing conda envs as needed to avoid conflicts
(call conda deactivate && call conda deactivate && call conda deactivate) 2>nul

set CONDA_DIR=%cd%\.conda
set PYTHON_VER=3.11

@rem create the installer env
if not exist "%CONDA_DIR%" (
	echo Packages to install: %PACKAGES_TO_INSTALL%
	call conda create --no-shortcuts -y -k --prefix "%CONDA_DIR%" python=%PYTHON_VER% || ( echo. && echo Conda environment creation failed. && goto end )
)

@rem check if conda environment was actually created
if not exist "%CONDA_DIR%\python.exe" ( echo. && echo Conda environment is empty. && goto end )

@rem environment isolation
set PYTHONNOUSERSITE=1
set PYTHONPATH=
set PYTHONHOME=
set "CUDA_PATH=%CONDA_DIR%"
set "CUDA_HOME=%CUDA_PATH%"

set TTS_DIR=openedai-speech
if not exist "%cd%\%TTS_DIR%" (
	echo Cloning repo...
	where git || ( echo. && echo git not found. && goto end )
	git clone "https://github.com/matatonic/openedai-speech.git" openedai-speech
)

@rem activate installer env
call conda activate "%CONDA_DIR%" || ( echo. && echo Miniconda env not found. && goto end )
echo Conda env set to: %CONDA_PREFIX%

set "PIPER_URL=https://github.com/rhasspy/piper/releases/download/2023.11.14-2/piper_windows_amd64.zip"
set "PIPER_TMP_FILE=%cd%\piper_windows_amd64.zip"
if not exist "%cd%\piper" (
	echo Downloading piper...
	where curl || ( echo. && echo curl not found. && goto end )
	where tar || ( echo. && echo tar not found. && goto end )
	curl -L -o "%PIPER_TMP_FILE%" "%PIPER_URL%"
	tar -xf "%PIPER_TMP_FILE%"
	if exist "%PIPER_TMP_FILE%" del "%PIPER_TMP_FILE%"
)

set INIT_LOCKFILE=%cd%\.lock
if not exist "%INIT_LOCKFILE%" (
	echo Installing dependencies...
	@REM Remove piper-tts from requirements.txt
	findstr /v /c:"piper-tts" "%cd%\%TTS_DIR%\requirements.txt" > "%cd%\%TTS_DIR%\requirements-win.txt"
	@REM Install torch with cuda
	call pip install torch==2.5.1+cu121 torchaudio==2.5.1+cu121 --extra-index-url https://download.pytorch.org/whl/cu121
	call pip install -r "%cd%\%TTS_DIR%\requirements-win.txt"
	echo. 2>"%INIT_LOCKFILE%"
)

@rem launch
set TTS_HOME=voices
set HF_HOME=voices
set TTS_PORT=8001
set "PATH=%PATH%;%cd%\piper;"
cd /D "%~dp0\%TTS_DIR%"
call python speech.py --port %TTS_PORT% --log-level DEBUG

:end
pause

Just copy this file to an empty folder and double-click it. It will clone this repo, download piper, and make the required modifications.

The folder will look like this after a successful initialization: [screenshot of the resulting folder layout]

Notice:

  1. You need conda/miniconda preinstalled. If you do not want conda (recommended for Python env isolation) or prefer venv, remove all the conda-related code from the script yourself.
  2. You might need to remove the torch CUDA installation step, depending on your needs.
  3. The speech.env file is ignored. You need to change env variables (like TTS_HOME) directly in the script.
  4. Any time you want to reinstall the dependencies, just delete the .lock file.

Edit:
After initialization, there are no voice models (for tts-1) or audio files (for tts-1-hd) because I did not call download_voices_tts-1.bat or download_voices_tts-1-hd.bat.
You can download voice models from piper-tts's voices page and add them to config/voice_to_speaker.yaml. This is also covered on this repo's homepage.
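For reference, the download step can be scripted as well; this is just a sketch with placeholder URLs, and the real .onnx and .onnx.json links should be copied from the voices page:

# Sketch: fetch a piper voice into the voices folder. Each voice needs both the
# .onnx model and its .onnx.json config. The URLs below are placeholders; copy
# the actual download links from piper's VOICES.md page.
import urllib.request
from pathlib import Path

voices_dir = Path("voices")
voices_dir.mkdir(exist_ok=True)

urls = [
    "https://example.invalid/en_US-somevoice-medium.onnx",       # placeholder
    "https://example.invalid/en_US-somevoice-medium.onnx.json",  # placeholder
]
for url in urls:
    target = voices_dir / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, target)
    print("downloaded", target)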

@matatonic (Owner)

@seancheung Awesome, thank you!

@piovis2023

Thanks so much @seancheung. However, it didn't work for me.

My setup: Windows 11, Python 3.10 and 3.11 installed, CUDA 11.8, CUDA 12.4, cuDNN 9, Conda, PyTorch.

I did exactly what you mentioned above. Here were the errors:
2024-11-14 18:42:39.574 | INFO | main:default_exists:119 - config/pre_process_map.yaml does not exist, setting defaults from pre_process_map.default.yaml
2024-11-14 18:42:39.574 | INFO | main:default_exists:119 - config/voice_to_speaker.yaml does not exist, setting defaults from voice_to_speaker.default.yaml
INFO: Started server process [27572]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8001 (Press CTRL+C to quit)
2024-11-14 18:44:42.686 | DEBUG | openedai:log_requests:120 - Request path: /
2024-11-14 18:44:42.686 | DEBUG | openedai:log_requests:121 - Request method: GET
2024-11-14 18:44:42.686 | DEBUG | openedai:log_requests:122 - Request headers: Headers({'host': 'localhost:8001', 'connection': 'keep-alive', 'sec-ch-ua': '"Not(A:Brand";v="99", "Google Chrome";v="133", "Chromium";v="133"', 'sec-ch-ua-mobile': '?0', 'sec-ch-ua-platform': '"Windows"', 'upgrade-insecure-requests': '1', 'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36', 'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,/;q=0.8,application/signed-exchange;v=b3;q=0.7', 'sec-fetch-site': 'none', 'sec-fetch-mode': 'navigate', 'sec-fetch-user': '?1', 'sec-fetch-dest': 'document', 'accept-encoding': 'gzip, deflate, br, zstd', 'accept-language': 'en-GB,en-US;q=0.9,en;q=0.8'})
2024-11-14 18:44:42.686 | DEBUG | openedai:log_requests:123 - Request query params:
2024-11-14 18:44:42.686 | DEBUG | openedai:log_requests:124 - Request body: b''
2024-11-14 18:44:42.686 | DEBUG | openedai:log_requests:128 - Response status code: 200
2024-11-14 18:44:42.686 | DEBUG | openedai:log_requests:129 - Response headers: MutableHeaders({'content-length': '0', 'content-type': 'text/plain; charset=utf-8'})
INFO: 127.0.0.1:62857 - "GET / HTTP/1.1" 200 OK
2024-11-14 18:44:42.725 | DEBUG | openedai:log_requests:120 - Request path: /favicon.ico
2024-11-14 18:44:42.725 | DEBUG | openedai:log_requests:121 - Request method: GET
2024-11-14 18:44:42.725 | DEBUG | openedai:log_requests:122 - Request headers: Headers({'host': 'localhost:8001', 'connection': 'keep-alive', 'sec-ch-ua-platform': '"Windows"', 'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36', 'sec-ch-ua': '"Not(A:Brand";v="99", "Google Chrome";v="133", "Chromium";v="133"', 'sec-ch-ua-mobile': '?0', 'accept': 'image/avif,image/webp,image/apng,image/svg+xml,image/,/;q=0.8', 'sec-fetch-site': 'same-origin', 'sec-fetch-mode': 'no-cors', 'sec-fetch-dest': 'image', 'referer': 'http://localhost:8001/', 'accept-encoding': 'gzip, deflate, br, zstd', 'accept-language': 'en-GB,en-US;q=0.9,en;q=0.8'})
2024-11-14 18:44:42.725 | DEBUG | openedai:log_requests:123 - Request query params:
2024-11-14 18:44:42.725 | DEBUG | openedai:log_requests:124 - Request body: b''
2024-11-14 18:44:42.725 | DEBUG | openedai:log_requests:128 - Response status code: 404
2024-11-14 18:44:42.725 | DEBUG | openedai:log_requests:129 - Response headers: MutableHeaders({'content-length': '22', 'content-type': 'application/json'})
INFO: 127.0.0.1:62857 - "GET /favicon.ico HTTP/1.1" 404 Not Found
2024-11-14 18:45:30.973 | DEBUG | openedai:log_requests:120 - Request path: /v1/audio/tts
2024-11-14 18:45:30.975 | DEBUG | openedai:log_requests:121 - Request method: GET
2024-11-14 18:45:30.975 | DEBUG | openedai:log_requests:122 - Request headers: Headers({'host': 'localhost:8001', 'connection': 'keep-alive', 'sec-ch-ua': '"Not(A:Brand";v="99", "Google Chrome";v="133", "Chromium";v="133"', 'sec-ch-ua-mobile': '?0', 'sec-ch-ua-platform': '"Windows"', 'upgrade-insecure-requests': '1', 'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36', 'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,
/*;q=0.8,application/signed-exchange;v=b3;q=0.7', 'sec-fetch-site': 'none', 'sec-fetch-mode': 'navigate', 'sec-fetch-user': '?1', 'sec-fetch-dest': 'document', 'accept-encoding': 'gzip, deflate, br, zstd', 'accept-language': 'en-GB,en-US;q=0.9,en;q=0.8'})
2024-11-14 18:45:30.976 | DEBUG | openedai:log_requests:123 - Request query params:
2024-11-14 18:45:30.976 | DEBUG | openedai:log_requests:124 - Request body: b''
2024-11-14 18:45:30.976 | DEBUG | openedai:log_requests:128 - Response status code: 404
2024-11-14 18:45:30.977 | DEBUG | openedai:log_requests:129 - Response headers: MutableHeaders({'content-length': '22', 'content-type': 'application/json'})
INFO: 127.0.0.1:62858 - "GET /v1/audio/tts HTTP/1.1" 404 Not Found

Can I give you any other info or screenshots to provide some guidance?

Thanks again

@seancheung commented Nov 15, 2024

> Thanks so much @seancheung. However, it didn't work for me.
>
> My setup: Windows 11, Python 3.10 and 3.11 installed, CUDA 11.8, CUDA 12.4, cuDNN 9, Conda, PyTorch.
>
> I did exactly what you mentioned above. [full log output quoted in the previous comment]
>
> Can I give you any other info or screenshots to provide some guidance?
>
> Thanks again

You were requesting the wrong URL ("GET /v1/audio/tts HTTP/1.1"), which gave you the 404 error. The correct request is "POST /v1/audio/speech", as described on the repo's homepage.

You also need to download the voice models you are interested in from https://github.com/rhasspy/piper/blob/master/VOICES.md and put them in the voices folder (the default location).

Then copy pre_process_map.default.yaml and voice_to_speaker.default.yaml into the config folder and rename them to pre_process_map.yaml and voice_to_speaker.yaml.

When using the tts-1 model, map each voice to an .onnx model file in voice_to_speaker.yaml.

To do a test:

 curl http://localhost:8001/v1/audio/speech -H "Content-Type: application/json" -d '{"input": "The quick brown fox jumped over the lazy dog.","voice":"alloy","model":"tts-1","response_format":"mp3"}' -o speech.mp3

You should see successful info logs in the console.
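Note that the single-quoted JSON in that curl command may not survive cmd.exe quoting. If that bites, the same test can be run from Python with only the standard library (a minimal sketch, same endpoint and payload):

# Sketch: POST the same test request to the local openedai-speech server and
# save the returned audio to speech.mp3.
import json
import urllib.request

payload = {
    "input": "The quick brown fox jumped over the lazy dog.",
    "voice": "alloy",
    "model": "tts-1",
    "response_format": "mp3",
}
req = urllib.request.Request(
    "http://localhost:8001/v1/audio/speech",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp, open("speech.mp3", "wb") as out:
    out.write(resp.read())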

As for voice cloning, use the tts-1-hd model and map each voice to a wav sample file (which you should download or create yourself).
