What does mel, lf0,loss mean? #661

feipinxiang · 2023-05-18T03:40:28Z

What do mel, lf0 and loss mean? How should I interpret this log page? When is the right time to stop training?
Just an amateur, non-professional AI developer😆

Answered by Grownz

Grownz · 2023-05-27T22:13:48Z

I'm no professional either, but I've read some of the machine learning papers.

Mel, LF0, G and loss are all terms used in the context of deep learning models, here in the field of speech and audio processing rather than natural language processing (NLP). The mel and LF0 features describe an audio signal (its spectral content and its pitch) in a way that is more perceptually meaningful than raw frequency values.

A brief explanation of each term:

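Mel: The mel scale is a nonlinear frequency scale on which equal steps sound roughly equally far apart in pitch to a human listener. A mel spectrogram is a spectrogram whose frequency axis has been warped onto that scale. In models of this kind, the mel loss typically compares the mel spectrogram of the generated audio with that of the target audio.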

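LF0: LF0 stands for "log F0", the logarithm of the fundamental frequency (F0), i.e. the pitch of the voice. The LF0 loss measures how far the predicted pitch curve is from the pitch extracted from the target audio.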

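G: G is not an audio feature. It stands for the generator, the network that produces the audio. Its counterpart, D, is the discriminator, which tries to tell generated audio from real audio. The "g" in graph names such as "loss/g/mel" marks the generator's loss terms.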

Loss: A loss is a measure of how far the model's output is from the target for a given input; lower is better. Common losses include L1 (mean absolute error), mean squared error (MSE) and cross-entropy.
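
To make the terms above a bit more concrete, here is a minimal sketch in plain Python. It assumes the widely used HTK-style mel formula, a natural-log LF0, an L1 distance for the mel loss and MSE for the LF0 loss. The trainer behind these graphs may use different transforms or weights, and all function names and numbers here are made up for illustration.

```python
import math

def hz_to_mel(freq_hz):
    """HTK-style mel formula (an assumption; other variants exist)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def log_f0(f0_hz):
    """LF0 for a voiced frame: natural log of the fundamental frequency.
    The exact log transform differs between implementations."""
    return math.log(f0_hz)

def l1_loss(pred, target):
    """Mean absolute error, a common choice for comparing mel spectrograms."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def mse_loss(pred, target):
    """Mean squared error, a common choice for comparing pitch curves."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

# Toy data: one mel-spectrogram frame (3 bins) and three frames of pitch in Hz.
target_mel_frame = [0.5, 1.0, 0.2]
generated_mel_frame = [0.4, 1.1, 0.2]
target_f0 = [220.0, 230.0, 240.0]
predicted_f0 = [222.0, 228.0, 245.0]

mel_loss = l1_loss(generated_mel_frame, target_mel_frame)
lf0_loss = mse_loss([log_f0(f) for f in predicted_f0],
                    [log_f0(f) for f in target_f0])

print(f"1000 Hz is about {hz_to_mel(1000.0):.0f} mel")
print(f"mel loss: {mel_loss:.4f}, LF0 loss: {lf0_loss:.6f}")
```

Both losses shrink towards zero as the generated audio gets closer to the target, which is why falling curves are what you want to see.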

The most important graphs are "loss/g/LF0" and "loss/g/mel". Ideally, they should decrease and then level off smoothly over time. If the curves still show extreme outliers, the model is probably not trained enough. But in the end you have to test your model to find out if it meets your quality requirements. Often it fails because of bad source material, but you will quickly find this out in early tests.

I hope that answers some of the question :)

4 replies

wyhauyeung Jun 5, 2023

I am learning all these metrics too. But do you know why loss/g/fm is consistently increasing (while loss/g/total is consistently decreasing)? From what Bing/ChatGPT told me, I think fm is feature matching, and when it is increasing, that might not be a good sign.

feipinxiang Jul 2, 2023
Author

Thank you for your reply. I also read some papers and found that training a model is very delicate work, not suitable for amateurs like me. At the moment I can only make it work; there is still a lot of mechanical feeling in the tone.

nikich340 Jul 28, 2023

Fm decreases at first (because the model moves from the pre-trained variant to your dataset's voice, as I understand it) and then increases (as quality gets better), so it's fine.
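
For context, in HiFi-GAN-style training the feature matching term compares the discriminator's intermediate activations for real and for generated audio. Below is a minimal sketch of that idea, assuming plain Python lists in place of tensors and no extra weighting; the function name and the toy numbers are hypothetical, and the actual trainer may compute it differently.

```python
def feature_matching_loss(real_feature_maps, generated_feature_maps):
    """Sum over discriminator layers of the mean absolute difference
    between the activations for real and for generated audio."""
    total = 0.0
    for real_layer, fake_layer in zip(real_feature_maps, generated_feature_maps):
        diffs = [abs(r - g) for r, g in zip(real_layer, fake_layer)]
        total += sum(diffs) / len(diffs)
    return total

# Toy activations from two hypothetical discriminator layers.
real = [[0.2, 0.8, 0.5], [1.0, 0.0]]
generated = [[0.1, 0.9, 0.5], [0.5, 0.5]]

print(f"fm loss: {feature_matching_loss(real, generated):.4f}")
```

Because the activations come from the discriminator, which is itself being trained, the scale of this value can drift over time. That makes the fm curve harder to read on its own than the mel or LF0 curves.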

nikich340 Jul 28, 2023

G stands for Generator, while D stands for Discriminator (terms from adversarial learning, i.e. GANs).

nikich340 · 2023-07-28T09:47:33Z

I suggest previewing the audio samples in the "Audio" tab and watching the "Scalars" tab with ~0.98 smoothing.
It usually takes 500-1000 epochs to get good results (you can try more even if the losses stagnate, but always keep backups of the D and G models).
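
If you want to see what a smoothing value like 0.98 does to a noisy loss curve, here is a rough sketch. It assumes the smoothing is an exponential moving average with bias correction, which is my understanding of how the TensorBoard slider behaves; treat it as an approximation, and note that the function name and sample values are made up.

```python
def smooth(values, weight=0.98):
    """Exponential moving average with bias correction.
    A weight closer to 1 gives a smoother curve."""
    if not 0.0 <= weight < 1.0:
        raise ValueError("weight must be in [0, 1)")
    smoothed = []
    running = 0.0
    for step, value in enumerate(values, start=1):
        running = weight * running + (1.0 - weight) * value
        # Divide by (1 - weight^step) so early points are not pulled towards 0.
        smoothed.append(running / (1.0 - weight ** step))
    return smoothed

# A toy loss curve with one outlier at the third step.
raw_losses = [1.0, 0.9, 5.0, 0.8, 0.7]
for raw, avg in zip(raw_losses, smooth(raw_losses)):
    print(f"raw={raw:.2f} smoothed={avg:.2f}")
```

The single spike is strongly damped in the smoothed curve, so the overall trend is easier to judge than from the raw points.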

0 replies
