Text To Speech

Survey

TTS

  • Long-Form Speech Generation with Spoken Language Models, arXiv, 2412.18603, arxiv, pdf, citation: -1

    Se Jin Park, Julian Salazar, Aren Jansen, ..., Yong Man Ro, RJ Skerry-Ryan · (google.github)

  • TouchTTS: An Embarrassingly Simple TTS Framework that Everyone Can Touch, arXiv, 2412.08237, arxiv, pdf, citation: -1

    Xingchen Song, Mengtao Xing, Changwei Ma, ..., Zhendong Peng, Zhiyong Wu

  • Debatts: Zero-Shot Debating Text-to-Speech Synthesis, arXiv, 2411.06540, arxiv, pdf, citation: -1

    Yiqiao Huang, Yuancheng Wang, Jiaqi Li, ..., Shunsi Zhang, Zhizheng Wu

  • Very Attentive Tacotron: Robust and Unbounded Length Generalization in Autoregressive Transformer-Based Text-to-Speech, arXiv, 2410.22179, arxiv, pdf, citation: -1

    Eric Battenberg, RJ Skerry-Ryan, Daisy Stanton, ..., Julian Salazar, David Kao · (sequence-layers - google) · (x)

Voco

Emotion

  • EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector, arXiv, 2411.02625, arxiv, pdf, citation: -1

    Deok-Hyeon Cho, Hyung-Seok Oh, Seung-Bin Kim, ..., Seong-Whan Lee

VITS

Efficient

Projects

Multilingual

Evaluation

Misc