vanilla DDPM with cosine beta schedule, obtain results worse than DDIM #25
Comments
Hi @jasonrayshd, the cosine noise schedule can suffer from severe numerical issues for t close to T. In my previous implementation for the DPM-Solver paper, I changed the start time from …
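One plausible way to see the issue near t = T is to look at the half log-SNR that DPM-Solver works with, λ_t = log(α_t / σ_t). The sketch below is only an illustration: it assumes the continuous cosine schedule from Improved DDPM with offset s = 0.008, and the helper names `cosine_alpha_bar` and `half_log_snr` are made up for this example, not taken from the repository.

```python
import math

def cosine_alpha_bar(u, s=0.008):
    """Cumulative signal level alpha_bar at normalized time u = t / T (assumed cosine form)."""
    f = lambda v: math.cos((v + s) / (1 + s) * math.pi / 2) ** 2
    return f(u) / f(0.0)

def half_log_snr(u, s=0.008):
    """lambda = log(alpha / sigma) = 0.5 * log(alpha_bar / (1 - alpha_bar))."""
    a = cosine_alpha_bar(u, s)
    return 0.5 * math.log(a / (1.0 - a))

# alpha_bar collapses toward 0 as t approaches T, so lambda falls steeply.
for u in (0.5, 0.9, 0.99, 0.999, 0.9999):
    print(f"t/T = {u:<6}  alpha_bar = {cosine_alpha_bar(u):.3e}  lambda = {half_log_snr(u):.2f}")
```

Because alpha_bar tends to 0 at t = T, λ_t is unbounded below there, which is consistent with the advice to start sampling slightly before T.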
Got it. Thank you for your great suggestions! Looking forward to your excellent work in the future!
@LuChengTHU @jasonrayshd I have tried the cosine scheduler, and the result is better than DDIM and DDPM, but I don't know the reason for it, and I'm not sure whether this observation also holds for high-dimensional data.
Hi guys, I've fixed the numerical issue in the cosine beta schedule; please try the newest file for dpmsolver and see the details in this function. You can also try this script with the …
Hi,
Thank you for your excellent code and the detailed documentation on how to incorporate DPM-Solver into our own projects!
I tried to replace DDIM with DPM-Solver but failed to obtain comparable results.
Training details of my diffusion model:
(1) Dataset: CelebA-HQ 256x256
(2) Vanilla DDPM (L2 loss, noise prediction), T=1000, UNet, trained in raw pixel space (no latent space used).
(3) Beta schedule: cosine schedule (as in Improved Denoising Diffusion Probabilistic Models)
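For reference, a cosine beta schedule of the kind described in (3) can be sketched as follows. This is a minimal illustration rather than the training code from this issue: the offset s = 0.008 and the clip at 0.999 follow the Improved DDPM paper, and the function names `cosine_alpha_bar` and `cosine_betas` are hypothetical.

```python
import math

def cosine_alpha_bar(u, s=0.008):
    """Unnormalized cumulative signal level at normalized time u = t / T."""
    return math.cos((u + s) / (1 + s) * math.pi / 2) ** 2

def cosine_betas(T=1000, s=0.008, max_beta=0.999):
    """Discrete betas: beta_t = 1 - alpha_bar(t) / alpha_bar(t - 1), clipped at max_beta."""
    betas = []
    for i in range(T):
        a_prev = cosine_alpha_bar(i / T, s)
        a_next = cosine_alpha_bar((i + 1) / T, s)
        # The normalization by alpha_bar(0) cancels in the ratio.
        betas.append(min(1.0 - a_next / a_prev, max_beta))
    return betas

betas = cosine_betas()
print(len(betas), betas[0], betas[-1])
```

The last beta hits the clip value because alpha_bar is essentially zero at t = T, which is the region where a sampler is most exposed to numerical trouble.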
Code snippet that uses DPM-Solver in my project:
Result sampled by DDIM after 500 steps:
Result sampled by DPM-Solver after 25 steps with cosine-schedule betas (the schedule used in training):
Result sampled by DPM-Solver after 25 steps with linear-schedule betas:
I have tried tuning the parameters of DPM-Solver, e.g. multi-step instead of single-step and more sampling steps, but none of these helped.
Is this result caused by the cosine schedule used when training the diffusion model? Could you please suggest any possible improvements? Thank you for your attention!