A TTS model capable of generating ultra-realistic dialogue in one pass.

Python 12,598 870 Updated Apr 28, 2025

EraX Text to Speech based on F5-TTS Base V1

Python 41 9 Updated Apr 24, 2025
Python 13 5 Updated Oct 27, 2024

PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions

Python 75 5 Updated Oct 11, 2024

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python 2,875 239 Updated Dec 5, 2024

Concrete: TFHE Compiler that converts Python programs into their FHE equivalents

C++ 1,343 162 Updated Apr 28, 2025

Distributed Training Over-The-Internet

906 32 Updated Dec 3, 2024

Automatically Update Text-to-speech (TTS) Papers Daily using GitHub Actions (Update Every 12 Hours)

Python 416 24 Updated Apr 28, 2025

StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion

175 12 Updated Sep 27, 2024

The official implementation of GTCRN, an ultra-lightweight SE model.

Python 323 54 Updated Mar 13, 2025

Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement

Python 379 60 Updated Mar 21, 2025

ACM MM 2024 FlashSpeech: Efficient Zero-Shot Speech Synthesis

Python 136 8 Updated Sep 20, 2024

Evaluation Protocol for Large-Scale Zero-Shot TTS Literature

Python 78 10 Updated Mar 12, 2025

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,903 197 Updated Apr 19, 2025

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 11,549 1,616 Updated Apr 25, 2025

Official repository of Wavehax vocoder

Python 46 3 Updated Nov 30, 2024

Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in Pytorch

Python 345 9 Updated Jan 12, 2025

Official repository of NeXt-TDNN for speaker verification

Python 70 7 Updated Oct 10, 2024

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 45,568 5,032 Updated Apr 25, 2025

Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch

Python 462 44 Updated Mar 12, 2025

The official Implementation of PeriodWave and PeriodWave-Turbo

Python 187 11 Updated Apr 14, 2025

Official inference repo for FLUX.1 models

Python 21,493 1,520 Updated Feb 6, 2025

Code for "Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction" (ACL24)

Python 44 2 Updated Aug 6, 2024

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 7,213 642 Updated May 31, 2024

Simple text to phones converter for multiple languages

Python 1,369 184 Updated Sep 26, 2024

Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (arXiv:2401.01498)

Python 61 4 Updated Apr 4, 2024

Spectral Analysis in Python

Python 354 91 Updated Jan 27, 2025

vits2 backbone with multilingual-bert (Korean supported)

Python 26 1 Updated Apr 6, 2024