Do you have any suggestion on how to implement this model for 2D infinite generation? #10

ivanlen · 2021-07-02T21:40:00Z

I was wondering if you have any suggestions on how to adapt the code for 2D image generation, such as satellite images or 3D panorama images.

universome · 2021-07-08T12:55:28Z

Hi! In my head, it was supposed to work the following way. You sample 9 anchors and assemble them into a 3x3 grid (right now, 3 anchors are sampled and assembled into a 1x3 grid). This grid is projected onto the coordinate plane in such a way that its lower-left point is at location (-d, -d) and its upper-right point is at (d, d). Here, d is a hyperparameter denoting the distance between anchors; in the paper we used d=2 (assuming that a frame has a size of 1x1 units).
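To make the layout above concrete, here is a minimal sketch of where the anchors would sit on the coordinate plane. The function name `anchor_grid_positions` and its parameters are hypothetical and not part of the repository; it only assumes the anchors are evenly spaced by d and centered at the origin, as described above.

```python
def anchor_grid_positions(d=2.0, grid_size=3):
    """Return a grid_size x grid_size nested list of (x, y) anchor positions.

    Assumption: anchors are spaced d units apart and the grid is centered
    at the origin, so a 3x3 grid spans (-d, -d) to (d, d).
    Row 0 is the bottom row (smallest y), column 0 is the leftmost column.
    """
    half = (grid_size - 1) / 2
    return [
        [((col - half) * d, (row - half) * d) for col in range(grid_size)]
        for row in range(grid_size)
    ]


positions = anchor_grid_positions(d=2.0)
lower_left = positions[0][0]    # (-d, -d)
center = positions[1][1]        # (0, 0)
upper_right = positions[2][2]   # (d, d)
```

With d=2 and a 1x1 frame, neighbouring anchors are two frame widths apart.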

Now, you randomly sample a square frame on this grid, render it, and pass it to the discriminator. For each coordinate location, you interpolate the style codes from the above 3x3 anchor grid.
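The sampling and interpolation steps above could be sketched roughly as follows. This is plain Python rather than the repository's actual code: `sample_frame_center` and `interpolate_style` are hypothetical names, style codes are plain lists of floats, and standard bilinear weights are an assumption on my side (the real model may weight the anchors differently).

```python
import random


def sample_frame_center(d=2.0, frame_size=1.0, rng=random):
    """Sample the center of a square frame lying fully inside [-d, d]^2."""
    lo = -d + frame_size / 2
    hi = d - frame_size / 2
    return rng.uniform(lo, hi), rng.uniform(lo, hi)


def _lerp(a, b, t):
    """Linear interpolation between two equally sized vectors."""
    return [(1 - t) * u + t * v for u, v in zip(a, b)]


def interpolate_style(anchors, x, y, d=2.0):
    """Bilinearly interpolate style codes at coordinate (x, y).

    anchors[row][col] is a style code (list of floats). Row 0 is the bottom
    row and column 0 the leftmost column of a grid centered at the origin
    with anchor spacing d.
    """
    n = len(anchors)
    half = (n - 1) / 2
    # Continuous grid indices, clamped to the grid extent.
    gx = min(max(x / d + half, 0.0), n - 1)
    gy = min(max(y / d + half, 0.0), n - 1)
    # Lower-left anchor of the cell containing (x, y).
    col0 = min(int(gx), n - 2)
    row0 = min(int(gy), n - 2)
    tx, ty = gx - col0, gy - row0
    bottom = _lerp(anchors[row0][col0], anchors[row0][col0 + 1], tx)
    top = _lerp(anchors[row0 + 1][col0], anchors[row0 + 1][col0 + 1], tx)
    return _lerp(bottom, top, ty)


# Toy anchors whose "style code" is just [col, row], to make the result easy to read.
toy_anchors = [[[float(c), float(r)] for c in range(3)] for r in range(3)]
cx, cy = sample_frame_center(d=2.0, frame_size=1.0)
style_at_center = interpolate_style(toy_anchors, cx, cy, d=2.0)
```

In a real implementation this would be evaluated for every pixel coordinate inside the frame, which is where the per-location cost comes from.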

Note that it will be quite slow (i.e. ~2x) to train compared to the current 1D implementation if one does not implement a specialized CUDA kernel for the fused interpolate+multiply operation.

ivanlen · 2021-07-08T13:02:50Z

What you say makes complete sense. Thanks!
I am still reading the code, comparing it with StyleGAN2 and with your manuscript, and trying to understand some lines before starting the implementation. It's a big piece of code and it is taking me a while, but I am advancing slowly.
As soon as I understand the important lines, I will start coding and doing some tests.

By the way, congrats on the manuscript. It is super interesting and very well written; I enjoyed it a lot.

I will try not to spam here, but if I have any doubts about the implementation, I'll come back.

Cheers!

universome · 2021-07-11T17:51:52Z

I agree that the code is not too transparent and might take a lot of time to get one's head around. I looked through it, and I think these are the main places to change and think about:

- GridInput class here. It looks like you can remove input_column altogether: for 2D generation, one would need to generate vertical patches independently too, so this input_column would become an "input_pixel", which would make it quite useless.
- patchwise_op function here, which is hard-coded for vertical patches only.
- (the most difficult one) fast_bilinear_mult_row function here, which is hard-coded to interpolate only in 1D.
- Right now, we feed the central latent code w and its context ws_context in a very inconvenient way, as separate arguments everywhere (like here). I did it to preserve backward compatibility with StyleGAN2. Maybe it's worth dropping this and having a single w_grid argument of shape [batch_size, grid_h, grid_w, num_ws, w_dim] (or something like this), but I'm not sure about that. The problem with the current variant is that it is quite annoying to arrange w and ws_context into a grid every time you need to interpolate.
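As an illustration of the single w_grid idea, here is a rough NumPy sketch that packs a central code and its context into one array of shape [batch_size, grid_h, grid_w, num_ws, w_dim]. The helper `assemble_w_grid` is hypothetical, and so is the layout of ws_context: it assumes a 2D variant where ws_context has shape [batch_size, 8, num_ws, w_dim] and holds the 8 neighbours in row-major order with the center skipped. The current code may store the context differently.

```python
import numpy as np


def assemble_w_grid(w, ws_context):
    """Pack a central code and its 8 neighbours into a 3x3 grid.

    Assumed (hypothetical) shapes:
      w:          [batch_size, num_ws, w_dim]
      ws_context: [batch_size, 8, num_ws, w_dim], row-major, center skipped
    Returns:
      w_grid:     [batch_size, 3, 3, num_ws, w_dim]
    """
    batch_size, num_ws, w_dim = w.shape
    assert ws_context.shape == (batch_size, 8, num_ws, w_dim)
    # Insert the central code at flat index 4, i.e. grid position (1, 1).
    flat = np.concatenate(
        [ws_context[:, :4], w[:, None], ws_context[:, 4:]], axis=1
    )
    return flat.reshape(batch_size, 3, 3, num_ws, w_dim)


rng = np.random.default_rng(0)
w = rng.standard_normal((2, 4, 8))              # toy sizes: batch=2, num_ws=4, w_dim=8
ws_context = rng.standard_normal((2, 8, 4, 8))
w_grid = assemble_w_grid(w, ws_context)
```

With a single array like this, every layer could index or interpolate the grid directly instead of rebuilding it from two separate arguments.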

Also, it might be useful to think about which tricks can be borrowed from the recent Alias-Free GAN, which is also a coordinate-based GAN model.

Feel completely free to ask any further questions if you have any!

ivanlen · 2021-07-23T15:21:37Z

Hi @universome, thank you very much for your hints.
I was quite busy with other stuff, but over the next few days I will check the code together with your suggestions and see if I can make progress on the 2D implementation.

Thank you again for your answer, I think that it was what I needed to advance. Also I will check the Alias-Free GAN.
If I have further questions during the implementation I will definitely let you know.
Cheers!

zengxianyu · 2021-12-29T17:16:13Z

Have you made any progress on the 2D version? I'm also interested

ivanlen · 2021-12-29T18:38:10Z

> Have you made any progress on the 2D version? I'm also interested

Not really. I have some draft notebooks in which I was testing stuff, but they are still very far from something that can be shared or turned into a PR.

I don't have much free time lately... I hope to be able to continue this soon, but I don't know when I will be able to.
