
Why do most GAN-based models use 256x256 image data as input #35

Open
mjz0110 opened this issue Jul 3, 2024 · 2 comments


mjz0110 commented Jul 3, 2024

Hello! Thank you for your fantastic work. After reading your paper, I have some questions.
Why do most GAN-based models use 256x256 image data as input? What problems might arise from using other image sizes? Can UVCGANv2 be trained with images of size 540x640, for example?

usert5432 self-assigned this Jul 3, 2024
usert5432 added the question label Jul 3, 2024
usert5432 (Collaborator) commented

Hi @mjz0110, thank you for your interest in our work!

Why do most GAN-based models use 256x256 image data as input?

I am not entirely sure, but I believe it is mostly historical. When you develop a new GAN model, you have to compare it against its predecessors, which means using the same image sizes they used. Since many early works used 256x256 images, everyone now has to benchmark against that size. But this is only my guess.

What problems might arise from using other image sizes?

I think the biggest issue is that the discriminator architecture (a 70x70 PatchGAN) is tuned for the typical scales of objects in 256x256 images. Using a different image size may therefore lead to disappointing performance. I cannot think of any other immediate problems.
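
For context on where the 70x70 figure comes from: assuming the usual pix2pix/CycleGAN PatchGAN layout (three 4x4 stride-2 convolutions followed by two 4x4 stride-1 convolutions), the receptive field of the output units works out to 70 pixels. A small sketch of that arithmetic (not taken from the UVCGANv2 code):

```python
# Receptive-field arithmetic for a stack of conv layers, assuming the
# standard pix2pix/CycleGAN 70x70 PatchGAN layout. Illustration only,
# not part of the UVCGANv2 repository.

def receptive_field(layers):
    """Return the receptive field of a stack of (kernel, stride) conv layers."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf   += (kernel - 1) * jump
        jump *= stride
    return rf

patchgan = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
print(receptive_field(patchgan))   # -> 70
```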

Can UVCGANv2 be trained with images of size 540x640, for example?

In principle, as long as the image dimensions are divisible by 32, the default UVCGANv2 configurations can be trained on them.
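
Note that 540 itself is not a multiple of 32, so a 540x640 image would first need to be padded or resized, e.g. to 544x640. A minimal padding sketch, assuming PyTorch tensors in (C, H, W) layout and a hypothetical helper name `pad_to_multiple` (not part of the UVCGANv2 code):

```python
import torch
import torch.nn.functional as F

def pad_to_multiple(image, multiple=32):
    """Zero-pad a (C, H, W) tensor so H and W are divisible by `multiple`."""
    _, h, w = image.shape
    pad_h = (-h) % multiple
    pad_w = (-w) % multiple
    # F.pad pads the last two dimensions as (left, right, top, bottom)
    return F.pad(image, (0, pad_w, 0, pad_h))

x = torch.randn(3, 540, 640)
print(pad_to_multiple(x).shape)    # -> torch.Size([3, 544, 640])
```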

mjz0110 (Author) commented Jul 9, 2024

Thanks for your reply! It was very helpful.
