
Include MeloTTS or Openvoice #16

Open
djdookie opened this issue Jun 20, 2024 · 7 comments
Labels: enhancement (New feature or request)

Comments

@djdookie

Is there a way to include and serve MeloTTS and/or OpenVoice?
They're state-of-the-art TTS (and voice cloning) models, and pretty fast even running CPU-only.

https://github.com/myshell-ai/MeloTTS
https://github.com/myshell-ai/OpenVoice
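
For reference, a minimal sketch of what direct MeloTTS inference looks like, following the usage shown in the MeloTTS README (speaker keys and exact API details may differ between versions):

```python
# Minimal MeloTTS synthesis sketch, following the myshell-ai/MeloTTS README.
# Speaker keys vary by language pack (e.g. 'EN-US', 'EN-BR', 'EN-Default').
from melo.api import TTS

text = "Hello! This is a quick MeloTTS test."
model = TTS(language='EN', device='cpu')  # runs acceptably on CPU
speaker_ids = model.hps.data.spk2id       # map of available speaker names
model.tts_to_file(text, speaker_ids['EN-US'], 'output.wav', speed=1.0)
```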

@matatonic (Owner)

Not yet, but I want to have the best options available, so I'll take a look at these when I get some more time.

@ground-creative

While we wait:

https://github.com/ground-creative/openvoice-api-python

matatonic added the enhancement (New feature or request) label on Aug 27, 2024
@bi1101

bi1101 commented Sep 19, 2024

In my opinion, Tortoise TTS currently offers the best balance between quality and speed. It can achieve up to 7x real-time generation, surpassing xtts, which is capped at 3x. In this video demonstration, the model generated a 20-second audio clip in just 3 seconds with optimization. It seems that performance improves even further with longer text inputs. In terms of audio quality, Tortoise TTS is on par with xtts. Additionally, the Tortoise repository is actively maintained and regularly updated, whereas Coqui has already shut down.

Another promising option is Parler TTS, which is backed by Hugging Face and has improvements planned for the future. One major advantage of Parler TTS is its support for batching, which lets it handle high traffic more efficiently than queuing requests and generating them one sample at a time.
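
To illustrate, here is a sketch of batched Parler generation adapted from the parler-tts examples; the checkpoint name and padding details are assumptions, and real serving code would also need to trim padding from each output:

```python
# Batched Parler TTS generation sketch, adapted from the parler-tts examples.
# Left padding lets several concurrent prompts share one forward pass.
from parler_tts import ParlerTTSForConditionalGeneration
from transformers import AutoTokenizer

repo = "parler-tts/parler-tts-mini-v1"  # assumed checkpoint name
model = ParlerTTSForConditionalGeneration.from_pretrained(repo)
tokenizer = AutoTokenizer.from_pretrained(repo, padding_side="left")

prompts = ["Hey, how are you doing today?", "The weather is lovely."]
descriptions = ["A calm female voice, close-up recording."] * len(prompts)

desc = tokenizer(descriptions, return_tensors="pt", padding=True)
prompt = tokenizer(prompts, return_tensors="pt", padding=True)

# One batched call instead of generating each request sequentially.
audio = model.generate(
    input_ids=desc.input_ids,
    attention_mask=desc.attention_mask,
    prompt_input_ids=prompt.input_ids,
    prompt_attention_mask=prompt.attention_mask,
)
# audio[i] holds the waveform for prompts[i] (padding still attached).
```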

@matatonic (Owner)

An older version had Parler TTS support (the original Parler release), but I removed it because the voices just seemed random, which doesn't fit this project. The new Parler version with stable voice identities is back on my radar, but I haven't tested it yet for quality or speed.

Re: Tortoise, it's news to me that it's faster; it has always been slower. I'll give it another look.

@matatonic (Owner)

The OpenAI speech API doesn't support batching according to the API reference, so I don't plan to include batch support.

For cases outside API compatibility, especially batching, I recommend implementing inference with the model directly in your code rather than via a network API. It would be much more efficient.

@bi1101

bi1101 commented Sep 19, 2024

> The OpenAI speech API doesn't support batching according to the API reference, so I don't plan to include batch support.

I think there's been a misunderstanding. When I mentioned batching, I was referring to the server intelligently switching to batching mode when it receives concurrent requests, allowing it to process them in parallel. From reviewing your code, I can see that parallelism is implemented, but it isn't fully optimized using Parler's native batching code, which offers a significant performance boost in such cases.

@matatonic (Owner)

>> The OpenAI speech API doesn't support batching according to the API reference, so I don't plan to include batch support.
>
> I think there's been a misunderstanding. When I mentioned batching, I was referring to the server intelligently switching to batching mode when it receives concurrent requests, allowing it to process them in parallel. From reviewing your code, I can see that parallelism is implemented, but it isn't fully optimized using Parler's native batching code, which offers a significant performance boost in such cases.

I think I get you now: implement continuous batching to process parallel requests, rather than batch processing of a single batched request.

I hadn't considered that yet, but it is a much better solution to parallel processing than the current setup. Thanks for the suggestion.
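
For anyone following along, a hypothetical sketch of that micro-batching idea (illustration only, not this repo's code; `generate_batch` stands in for any model's batched inference call):

```python
# Hypothetical asyncio micro-batching loop: concurrent speech requests queue
# up and are flushed to the model as one batch instead of one at a time.
import asyncio

MAX_BATCH = 8    # cap on requests per forward pass
MAX_WAIT = 0.02  # seconds to linger for more requests before flushing

request_queue: asyncio.Queue = asyncio.Queue()

async def batch_worker(generate_batch):
    while True:
        batch = [await request_queue.get()]  # block until the first request
        deadline = asyncio.get_running_loop().time() + MAX_WAIT
        while len(batch) < MAX_BATCH:
            remaining = deadline - asyncio.get_running_loop().time()
            if remaining <= 0:
                break
            try:
                batch.append(await asyncio.wait_for(request_queue.get(), remaining))
            except asyncio.TimeoutError:
                break
        texts = [text for text, _ in batch]
        # One batched forward pass, off the event loop.
        audios = await asyncio.to_thread(generate_batch, texts)
        for (_, fut), audio in zip(batch, audios):
            fut.set_result(audio)

async def synthesize(text: str):
    # Each request handler awaits its slot in the next batch.
    fut = asyncio.get_running_loop().create_future()
    await request_queue.put((text, fut))
    return await fut
```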
