Singing Synthesis with NEUTRINO #260

Closed
Patchethium opened this issue Jan 1, 2022 · 2 comments

Comments

@Patchethium
Contributor

Description

While I was playing with #252, I found that guided synthesis with a clip from a song also produces a decent result:

neutrino.mp4

The ground truth is from NEUTRINO, which also exports f0 data to a binary file. Since we can also get the phoneme alignment from its musicxml, we may be able to build a kind of singing synthesis feature on the more precise data produced by NEUTRINO and get rid of the buggy Julius.

I plan to create a WIP PR referring to this issue, but anyone kind enough to take this on before I do is welcome.

Pros

I don't know... NEUTRINO's accessibility is already terrible enough, and I doubt anyone will make use of this feature when 3 of the 5 libraries already have a UTAU. Maybe it's just to see how far this idea can reach.

Cons

Long notes tend to produce more artifacts (see the last phoneme in the video above).

The decoder_forwarder may not be able to handle synthesis for audio the length of a song (3-5 minutes), so we may need to divide it into batches.
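As a rough illustration of the batching idea, the sketch below splits a per-frame f0 contour into ranges no longer than a given limit and, where possible, moves each cut back to an unvoiced frame so that a seam does not land in the middle of a sung note. The function name `split_frames`, the `max_frames` limit, and the `search` window are hypothetical; the actual limit of decoder_forwarder is not known here, and treating f0 == 0 as unvoiced is an assumption.

```python
from typing import List, Tuple


def split_frames(
    f0: List[float], max_frames: int, search: int = 50
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) frame ranges, each at most max_frames long.

    A cut is moved back to an unvoiced frame (f0 == 0, an assumption) when
    one lies within `search` frames before the hard limit.
    """
    if max_frames <= 0:
        raise ValueError("max_frames must be positive")
    ranges: List[Tuple[int, int]] = []
    n = len(f0)
    start = 0
    while start < n:
        end = min(start + max_frames, n)
        if end < n:
            # Scan backwards from the hard limit for an unvoiced frame.
            lowest = max(start + 1, end - search)
            for cut in range(end, lowest - 1, -1):
                if f0[cut] == 0.0:
                    end = cut
                    break
        ranges.append((start, end))
        start = end
    return ranges
```

Each range would then be synthesized separately and the resulting waveforms concatenated in order. Whether seams at unvoiced frames are actually inaudible would need to be checked by ear.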

Implementation approach

Read the f0 and musicxml files from NEUTRINO, then resample the data and send it to decoder_forwarder.
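The reading and resampling steps could look roughly like the sketch below. It is only a sketch under stated assumptions: the f0 file is assumed to be a headerless dump of little-endian float64 values, one per frame, with 0 marking unvoiced frames, and both frame periods are passed in by the caller rather than hard-coded. `read_f0` and `resample_f0` are hypothetical helper names. Parsing the musicxml and the hand-off to decoder_forwarder are left out, since neither interface is specified in this issue.

```python
import struct
from typing import List


def read_f0(raw: bytes) -> List[float]:
    """Decode a raw f0 dump (ASSUMED: little-endian float64, one value per frame)."""
    count = len(raw) // 8
    return list(struct.unpack(f"<{count}d", raw[: count * 8]))


def resample_f0(f0: List[float], src_period: float, dst_period: float) -> List[float]:
    """Linearly resample an f0 contour from src_period to dst_period (seconds per frame)."""
    if not f0:
        return []
    n_out = max(1, round(len(f0) * src_period / dst_period))
    last = len(f0) - 1
    out: List[float] = []
    for i in range(n_out):
        pos = i * dst_period / src_period  # position in source frames
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        a, b = f0[lo], f0[hi]
        if a == 0.0 or b == 0.0:
            # Do not interpolate across a voiced/unvoiced boundary:
            # take the nearer neighbour instead of a meaningless average.
            out.append(a if frac < 0.5 else b)
        else:
            out.append(a + (b - a) * frac)
    return out
```

A caller would read the file with something like `read_f0(open(path, "rb").read())` and pass NEUTRINO's frame period and the decoder's frame period as the two period arguments. Nearest-neighbour handling at voiced/unvoiced boundaries is a design choice here, made so that the resampled contour never contains implausibly low pitch values between 0 and a real note.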

@qwerty2501
Contributor

qwerty2501 commented Jan 2, 2022

I don't know... NEUTRINO's accessibility is already terrible enough, and I doubt anyone will make use of this feature when 3 of the 5 libraries already have a UTAU. Maybe it's just to see how far this idea can reach.

The only thing I can say is that users who are fans of 春日部つむぎ would probably be happy if this feature were implemented.
See this video.
Currently, she cannot sing well.

However, I don't know whether this feature should be implemented in text-to-speech software.

@Patchethium
Contributor Author

However, I don't know whether this feature should be implemented in text-to-speech software.

Right, maybe it should be made into an assistance tool, like Kotonosync is for VOICEROID. I had better create another repository and name it something like Zundamonsync.
