Reconstruction Error #10

Open
zwx8981 opened this issue Aug 14, 2023 · 4 comments

zwx8981 commented Aug 14, 2023

Very nice work! I have a question: since EDICT is invertible by design, why is the reconstruction error in Table 1 not zero?

Contributor

bram-w commented Aug 14, 2023

Thank you! The reconstruction error in Table 1 is measured at the pixel level, so the VAE encoding/decoding process introduces some error (which is why the LDM VAE and `EDICT` columns are all equal). EDICT is exact at the latent level (up to floating-point precision, which is too small to register in the table), but because the pixel-level measurement goes through the VAE, it inherits the VAE's error. Does that make sense?
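
To make the latent-level vs. pixel-level distinction concrete, here is a minimal toy sketch in NumPy. It is not EDICT's code or the LDM VAE: `encode`/`decode` are a hypothetical lossy stand-in (average pooling and upsampling), and `forward`/`inverse` are a generic exactly invertible coupled update chosen only for illustration. Under those assumptions, the latent round trip is exact up to floating-point error, while the pixel-level error matches the error of the toy autoencoder alone.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(img):
    """Toy lossy 'VAE' encoder (assumption): 2x2 average pooling."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def decode(lat):
    """Toy decoder (assumption): nearest-neighbour upsampling to pixel size."""
    return np.repeat(np.repeat(lat, 2, axis=0), 2, axis=1)

def forward(x, y, steps=50):
    """Generic invertible coupled update; each half is updated from the other."""
    for _ in range(steps):
        x = x + 0.1 * np.tanh(y)
        y = y + 0.1 * np.tanh(x)
    return x, y

def inverse(x, y, steps=50):
    """Undo `forward` by reversing the two sub-steps in the opposite order."""
    for _ in range(steps):
        y = y - 0.1 * np.tanh(x)
        x = x - 0.1 * np.tanh(y)
    return x, y

def mse(a, b):
    return float(np.mean((a - b) ** 2))

image = rng.random((64, 64))
latent = encode(image)

# Run the invertible process forward and back, entirely in latent space.
x_T, y_T = forward(latent, latent.copy())
x_0, y_0 = inverse(x_T, y_T)

latent_err = mse(latent, x_0)               # error of the invertible process alone
vae_only_err = mse(image, decode(latent))   # error of encode/decode alone
pipeline_err = mse(image, decode(x_0))      # error of the full pixel-level pipeline

print(f"latent-level error:   {latent_err:.3e}")
print(f"VAE-only pixel error: {vae_only_err:.3e}")
print(f"pipeline pixel error: {pipeline_err:.3e}")
```

In this toy setup the last two numbers coincide, which mirrors why the LDM VAE and `EDICT` columns in Table 1 are equal: the only measurable loss comes from the autoencoder, not from the inversion.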

Author

zwx8981 commented Aug 15, 2023

Thanks for the reply, that makes sense! Is it possible to modify the VAE to fix the error?

zjhJOJO commented Dec 24, 2023

> Thanks for the reply, that makes sense! Is it possible to modify the VAE to fix the error?

I also want to know how to reduce the error further by modifying the VAE. Can anyone provide some suggestions?

tvaranka commented Feb 6, 2024

> > Thanks for the reply, that makes sense! Is it possible to modify the VAE to fix the error?
>
> I also want to know how to reduce the error further by modifying the VAE. Can anyone provide some suggestions?

The VAE is trained separately from the diffusion model, and the diffusion model is trained on the latents that this particular VAE produces. If you modified the VAE, its latents would no longer be consistent with what the diffusion model expects.

So, to reduce the VAE's reconstruction error, you would have to retrain both the VAE and the diffusion model from scratch.

This is exactly what was done in the latest versions of Stable Diffusion, which ship an improved VAE; see the table below from the SDXL paper.

[image: table from the SDXL paper]
