
Use dynamic learning rate decay for convergence #39

Open · wants to merge 2 commits into master

Conversation

@begeekmyfriend

The evaluation samples sound better than those trained with a fixed learning rate.

Signed-off-by: begeekmyfriend [email protected]

@seungwonpark (Owner)

Hi, your code looks great, and thanks for kindly sending a PR!
Could you please share the audio samples you got (with the number of epochs) for comparison?

@begeekmyfriend (Author)

melgan_eval_mandarin.zip
I have synthesized voices from 4 anchors (1 male and 3 females). The checkpoint is only at epoch 375 and still under training. I think the decay helps convergence.

Signed-off-by: begeekmyfriend <[email protected]>
@@ -13,18 +12,35 @@
from .validation import validate


def cosine_decay(init_val, final_val, step, decay_steps):
    alpha = final_val / init_val

Shouldn't this be init_val / final_val?

@begeekmyfriend (Author)

The learning rate decays rather than grows. You can write a demo to test it, for example with:

init_val = 1e-4
final_val = 1e-5
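
For instance, a minimal self-contained sketch of such a demo, assuming the omitted part of the function follows TensorFlow's tf.train.cosine_decay formula (the diff above shows only the first line of the body):

```python
import math

def cosine_decay(init_val, final_val, step, decay_steps):
    # alpha is the final LR expressed as a fraction of the initial LR,
    # matching the "alpha" argument of tf.train.cosine_decay.
    alpha = final_val / init_val
    step = min(step, decay_steps)
    cos_factor = 0.5 * (1 + math.cos(math.pi * step / decay_steps))
    decayed = (1 - alpha) * cos_factor + alpha
    return init_val * decayed

init_val = 1e-4
final_val = 1e-5
decay_steps = 1000  # hypothetical horizon, only for this demo

for step in (0, 250, 500, 750, 1000):
    print(step, cosine_decay(init_val, final_val, step, decay_steps))
# prints a smooth decay from 1e-4 at step 0 down to 1e-5 at step 1000
```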

@casper-hansen commented Jan 26, 2020

According to the following source, alpha is the "Minimum learning rate value as a fraction of learning_rate."
https://docs.w3cub.com/tensorflow~python/tf/train/cosine_decay/

Given the values, it looks correct; only the naming is misleading. The smallest value belongs in the numerator and the largest in the denominator.
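
Plugging in the boundary steps confirms it: with decayed = (1 - alpha) * cos_factor + alpha, step = 0 gives cos_factor = 1 and lr = init_val = 1e-4, while step = decay_steps gives cos_factor = 0 and lr = alpha * init_val = final_val = 1e-5.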

@bob80333

Is this different from PyTorch's built-in CosineAnnealingLR?
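
For reference, PyTorch's built-in scheduler produces the same half-cosine curve when used without restarts. A minimal sketch (not part of this PR; the Adam optimizer and the decay_steps value here are illustrative assumptions):

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

model = torch.nn.Linear(1, 1)  # stand-in for the generator
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# T_max plays the role of decay_steps, eta_min the role of final_val.
decay_steps = 1000  # hypothetical horizon
scheduler = CosineAnnealingLR(optimizer, T_max=decay_steps, eta_min=1e-5)

for step in range(decay_steps):
    optimizer.step()   # the real training step would go here
    scheduler.step()   # lr follows eta_min + (lr0 - eta_min) * (1 + cos(pi * t / T_max)) / 2
```

That closed form matches the hand-rolled cosine_decay above, so the two should behave identically over a single decay cycle.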

@Liujingxiu23

@begeekmyfriend
You tried different learning rate schedules and found that cosine decay works best? Why is it suitable for MelGAN?
I am confused about when we should use a constant learning rate, when a declining one (for example, the exponential decay in Tacotron), and when cosine decay.

@begeekmyfriend (Author)

It is just a preference. Pick this one or another schedule as you like.

@Liujingxiu23 commented Jan 28, 2021

@begeekmyfriend Thank you for your quick reply. I used your branch of Tacotron and found it is one of the best among many code branches. I will try the cosine learning rate as well as apex in tfgan.
