- Long-Form Speech Generation with Spoken Language Models, arXiv 2412.18603
  Se Jin Park, Julian Salazar, Aren Jansen, ..., Yong Man Ro, RJ Skerry-Ryan · (google.github)
- TouchTTS: An Embarrassingly Simple TTS Framework that Everyone Can Touch, arXiv 2412.08237
  Xingchen Song, Mengtao Xing, Changwei Ma, ..., Zhendong Peng, Zhiyong Wu
- Debatts: Zero-Shot Debating Text-to-Speech Synthesis, arXiv 2411.06540
  Yiqiao Huang, Yuancheng Wang, Jiaqi Li, ..., Shunsi Zhang, Zhizheng Wu
- Very Attentive Tacotron: Robust and Unbounded Length Generalization in Autoregressive Transformer-Based Text-to-Speech, arXiv 2410.22179
  Eric Battenberg, RJ Skerry-Ryan, Daisy Stanton, ..., Julian Salazar, David Kao · (sequence-layers - google) · (x)
- EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector, arXiv 2411.02625
  Deok-Hyeon Cho, Hyung-Seok Oh, Seung-Bin Kim, ..., Seong-Whan Lee
- tts-generation-webui - rsxdalv
- alltalk_tts - erew123