How do I know when to stop Stage 2 training? #29
Comments
Hi, if you want a strict indicator, you can save a checkpoint every ~2k epochs and run the evaluation code on each saved model to get its 1-NNA score. When the 1-NNA score stops improving, the model has been trained enough. Note that the 1-NNA score is noisy: you will see it fluctuate even after convergence (this is normal). Another (non-strict) approach is to visually inspect the sampled images and check whether the samples stop improving. In my experience the model usually starts converging at around 6k-10k epochs, and I usually train it up to 20k epochs to make sure it is fully converged.
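The checkpoint-and-evaluate procedure above can be turned into a simple, noise-tolerant stopping rule. The sketch below is only an illustration, not code from this repository: the function name `find_stop_epoch`, the `patience` and `min_delta` parameters, and the assumption that 1-NNA scores are stored as fractions in [0, 1] are all hypothetical. Since an ideal 1-NNA is 50%, it treats the distance to 0.5 as the quantity to minimize, and it ignores changes smaller than `min_delta` so that ordinary fluctuation is not mistaken for progress.

```python
from typing import List, Optional, Tuple


def find_stop_epoch(
    history: List[Tuple[int, float]],
    patience: int = 3,
    min_delta: float = 0.01,
) -> Optional[int]:
    """Return the epoch of the best checkpoint once 1-NNA has plateaued.

    history:   (epoch, one_nna) pairs sorted by epoch, with one_nna in [0, 1]
               (assumed format; adapt if your scores are percentages).
    patience:  number of consecutive evaluations without a meaningful
               improvement before training is considered converged.
    min_delta: improvements smaller than this are treated as noise.
    Returns None if the score is still improving.
    """
    best_gap = float("inf")
    best_epoch: Optional[int] = None
    stale = 0
    for epoch, score in history:
        gap = abs(score - 0.5)  # ideal 1-NNA is 50%
        if gap < best_gap - min_delta:
            best_gap = gap
            best_epoch = epoch
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return best_epoch
    return None


if __name__ == "__main__":
    # Made-up scores for illustration only, evaluated every 2k epochs.
    scores = [(2000, 0.90), (4000, 0.75), (6000, 0.62), (8000, 0.58),
              (10000, 0.59), (12000, 0.585), (14000, 0.592)]
    print(find_stop_epoch(scores))
```

With checkpoints saved every ~2k epochs, `patience=3` means waiting roughly 6k epochs without a meaningful gain before stopping; a larger `patience` or `min_delta` makes the rule more conservative against noisy scores.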
Would you recommend running the 1-NNA evaluation during training? And in general, how long does the 1-NNA evaluation take for you? [This applies only to the diffusion part, i.e. the 2nd step of the training.]
Running the full 1-NNA evaluation on ~600 samples takes ~1 hour on a single GPU. Most of the time is spent computing the EMD score, followed by sampling. If you want to evaluate the 1-NNA score during training, it would be better to
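On the cost breakdown above: once the pairwise distances between samples are available, the 1-NNA statistic itself is cheap to compute, which is consistent with the distance computation (EMD) dominating the runtime. The following is a minimal sketch of the standard leave-one-out 1-NN accuracy, assuming a precomputed symmetric distance matrix over the concatenated generated and reference sets; the function name `one_nna` and its argument layout are hypothetical and do not reflect the repository's evaluation code.

```python
from typing import Sequence


def one_nna(dist: Sequence[Sequence[float]], n_gen: int) -> float:
    """Leave-one-out 1-nearest-neighbour accuracy.

    dist:  N x N pairwise distance matrix over [generated; reference]
           samples (e.g. EMD values, computed elsewhere).
    n_gen: number of generated samples; they occupy the first n_gen rows.
    Returns a fraction in [0, 1]; about 0.5 means the two sets are
    hard to tell apart, 1.0 means they are perfectly separable.
    """
    n = len(dist)
    correct = 0
    for i in range(n):
        # Nearest neighbour of sample i, excluding i itself.
        j = min((k for k in range(n) if k != i), key=lambda k: dist[i][k])
        # Correct if the neighbour comes from the same set as sample i.
        if (i < n_gen) == (j < n_gen):
            correct += 1
    return correct / n


if __name__ == "__main__":
    # Toy 1-D example: two generated points far from two reference points.
    points = [0.0, 1.0, 10.0, 11.0]
    dist = [[abs(a - b) for b in points] for a in points]
    print(one_nna(dist, n_gen=2))
```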
Hi @ZENGXH,
Is there any indicator for judging whether stage 2 has been trained enough? I am using a different dataset, so I don't know at which epoch/iteration to stop.
Best regards,
Yingjie