Author suggested OpenAI API key could be added to AwesomeTTS too? Thanks #314

Open
ccchan234 opened this issue Feb 21, 2024 · 7 comments

Comments

@ccchan234

I badly need this.
Please.

Thank you.

P.S. I do not like HyperTTS's UI:
the sound's source is not added to the resulting audio,
so I cannot tell which sound came from which voice.

Thanks

@MarcusXavierr

Yeah, I also prefer awesomeTTS.
Sadly, it doesn't support OpenAI, which is the service I have an account with.

@luc-vocab what would need to be done to add support for the OpenAI service?

Would it be enough to implement a file that respects some interface? I could open a PR.
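
For context on what "a file that respects some interface" might involve, here is a minimal sketch of a service class that targets OpenAI's text-to-speech endpoint. The class name `OpenAIService`, its `NAME` attribute, and the `build_request`/`run` methods are hypothetical and are not AwesomeTTS's actual service interface, which this thread does not describe. The endpoint URL, the `tts-1` model, and the `alloy` voice come from OpenAI's public API and may change.

```python
import json
import urllib.request

OPENAI_SPEECH_URL = "https://api.openai.com/v1/audio/speech"


class OpenAIService:
    """Hypothetical service class; NOT AwesomeTTS's real base interface."""

    NAME = "openai"  # assumed identifier, e.g. usable as a file name prefix

    def __init__(self, api_key, model="tts-1", voice="alloy"):
        self.api_key = api_key
        self.model = model
        self.voice = voice

    def build_request(self, text):
        """Build (but do not send) the HTTP request for one phrase."""
        payload = {
            "model": self.model,
            "voice": self.voice,
            "input": text,
            "response_format": "mp3",
        }
        return urllib.request.Request(
            OPENAI_SPEECH_URL,
            data=json.dumps(payload).encode("utf-8"),
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
            },
            method="POST",
        )

    def run(self, text, path):
        """Send the request and write the returned audio bytes to `path`."""
        with urllib.request.urlopen(self.build_request(text)) as response:
            audio = response.read()
        with open(path, "wb") as handle:
            handle.write(audio)
```

Keeping `build_request` separate from `run` means the request can be inspected without a network call or a real API key. An actual PR would still need to follow whatever base class and option handling the add-on really uses.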

@luc-vocab
Collaborator

I will add OpenAI to AwesomeTTS. But please answer this question first: what feature, if implemented in HyperTTS, would convince you to migrate to HyperTTS? Over the coming years it will be difficult for me to maintain both, and HyperTTS is much easier to evolve because it's more modern.

@MarcusXavierr

@luc-vocab I'm using HyperTTS right now. My only issue with it is that there's a bug when you generate audio for the flashcard.
When I generate the audio for the text 'Hello, my name is Doug,' it only plays 'Doug' on the first attempt. I need to play the sound twice to hear the entire phrase.

@luc-vocab
Collaborator

@MarcusXavierr are you using Bluetooth headphones?

@ccchan234
Author

ccchan234 commented Aug 26, 2024

> I will add OpenAI to AwesomeTTS. But please answer this question first: what feature, if implemented in HyperTTS, would convince you to migrate to HyperTTS? Over the coming years it will be difficult for me to maintain both, and HyperTTS is much easier to evolve because it's more modern.

Hi,
for me, the sound file should state which model generated the sound. E.g. in
AwesomeTTS, the sound files are:
[sound:azure-886d8fe5-affa78d6-e7f29489-fed9aca8-fcf7c88f.mp3]
[sound:googletts-47ced8f2-ed11e2af-67948cb1-da641227-01ce51e8.mp3]
[sound:watson-6b76525a-ed448fb7-09f5f79c-723c4f0d-7f66ccc2.mp3]

That way I can listen to a sound days later and still know which model produced it,
and I can RATE the models.
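
As a small illustration of why the prefix matters: with file names like the ones above, the service can be recovered from a note field later with a short script. This is only a sketch. The pattern is inferred from the three example tags in this comment, not from AwesomeTTS's source, and the function name `services_in_field` is made up for the example.

```python
import re

# Pattern inferred from the example tags above: service name, a dash,
# dash-separated hex groups, then ".mp3". This is an assumption, not a spec.
SOUND_TAG = re.compile(
    r"\[sound:(?P<service>[a-z0-9]+)-(?P<hash>[0-9a-f-]+)\.mp3\]"
)


def services_in_field(field_text):
    """Return the service prefix of every matching sound tag, in order."""
    return [match.group("service") for match in SOUND_TAG.finditer(field_text)]
```

Calling `services_in_field` on a field holding the three tags above should yield the three service names in order, which could then be paired with a rating for each recording.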

With HyperTTS,
as far as I remember, it didn't state the model (e.g. azure/googletts/watson),
so I can't rate the models LATER.

thank you.

I always think that a new version should be good enough to replace the old version.
Otherwise it's called a downgrade, not an upgrade.

@MarcusXavierr

@luc-vocab Yeah, I'm using Bluetooth headphones. But this issue doesn't happen with AwesomeTTS.

Another issue is that the add-on erases the back text when it generates audio. If you press a shortcut to generate audio for the front of the card and type something into the back while the audio is being generated, then when it finishes it writes the audio tag on the front and erases whatever you typed on the back.

@luc-vocab
Collaborator

@MarcusXavierr can you confirm you're using batch audio in both cases, and the exact same service in both cases? The problem with Bluetooth is the audio fade-in that many headphones do, which suppresses the beginning of a word. One way to fix that is by introducing a pause at the beginning of the word: https://www.vocab.ai/tutorials/hypertts-tips-and-tricks#add-pauses

For your second issue, if you're simultaneously generating audio and touching the text in the target field, that could indeed lead to undesired behavior. Just curious: why do you do this? And what does your note type look like?
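
To make the pause workaround for the Bluetooth fade-in concrete, here is a rough sketch that wraps the text in SSML with a leading `<break>`. It assumes the chosen service accepts SSML input, which not every service does, and it is not necessarily how the linked tutorial does it. The function name `with_leading_pause` and the 500 ms default are illustrative only.

```python
from xml.sax.saxutils import escape


def with_leading_pause(text, pause_ms=500):
    """Wrap `text` in SSML with a pause before the first word.

    Assumes the TTS service accepts SSML; `pause_ms` is an arbitrary default.
    """
    # escape() protects characters such as "&" and "<" inside the SSML body.
    return f'<speak><break time="{pause_ms}ms"/>{escape(text)}</speak>'
```

The idea is that the headphones' fade-in is spent on the silence, so the first spoken word is no longer clipped.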
