This codebase and all models are released under the CC-BY-NC-SA-4.0 License. Please refer to LICENSE for more details.
This is Fish Speech for SillyTavern. It is my crude implementation.
First, pre-process your WAVs with their tool (https://github.com/fishaudio/audio-preprocess). Then edit fishtts.js to set your speaker names and your server IP. The files are in SillyTavern/public/scripts/extensions/tts/
I have included StyleTTS2 support too, because I don't feel like editing the index to remove it: https://github.com/longtimegone/StyleTTS2-Sillytavern-api
An example of how to start the server:
    CUDA_VISIBLE_DEVICES=3 python api_json.py \
        --listen 0.0.0.0:8000 \
        --llama-checkpoint-path "checkpoints/fish-speech-1.4" \
        --decoder-checkpoint-path "checkpoints/fish-speech-1.4/firefly-gan-vq-fsq-8x1024-21hz-generator.pth" \
        --decoder-config-name firefly_gan_vq \
        --half \
        --compile
You can run this from the tools folder. There is no need to install it; just have all the dependencies in your conda environment. You may, however, have to install the audio pre-processor.
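If you start the server often, a small wrapper script can save retyping the command. This is only a sketch and not part of this repo: the variable names GPU, LISTEN, CKPT and DRY_RUN are made up for illustration, and the flags are copied unchanged from the example above. By default it only prints the command it would run.

```shell
#!/bin/sh
# Hypothetical launch wrapper; GPU, LISTEN, CKPT and DRY_RUN are
# illustrative names, not options defined by this repo.
set -eu

GPU="${GPU:-0}"                              # GPU index exposed to the server
LISTEN="${LISTEN:-0.0.0.0:8000}"             # address:port, as in the example
CKPT="${CKPT:-checkpoints/fish-speech-1.4}"  # checkpoint directory from the example
DECODER="$CKPT/firefly-gan-vq-fsq-8x1024-21hz-generator.pth"

# Store the command in the positional parameters so that quoting survives.
set -- python api_json.py \
    --listen "$LISTEN" \
    --llama-checkpoint-path "$CKPT" \
    --decoder-checkpoint-path "$DECODER" \
    --decoder-config-name firefly_gan_vq \
    --half \
    --compile

if [ "${DRY_RUN:-1}" = "1" ]; then
    # Default: print one argument per line instead of launching anything.
    printf '%s\n' "$@"
else
    CUDA_VISIBLE_DEVICES="$GPU" "$@"
fi
```

Run it from the same folder you would run the command from, with your conda environment active. Something like GPU=3 DRY_RUN=0 sh start_fish.sh would launch the server on the fourth GPU (start_fish.sh being whatever you name the file).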
Windows peeps, I'm sorry. Can't help you.