Difference between prepare_celeba_tfrecords.py and prepare_celeba_hq_tfrecords.py #76

udithhaputhanthri · 2021-02-22T03:46:47Z

Hi, thanks for the great paper.

What is the purpose of prepare_celeba_hq_tfrecords.py compared to prepare_celeba_tfrecords.py?

I have successfully generated the CelebA dataset using the **prepare_celeba_tfrecords.py** script. Model training with those tfrecords also worked perfectly.

But for the CelebA-HQ dataset, even though prepare_celeba_hq_tfrecords.py is able to generate the tfrecords (~230GB), training did not start properly. It terminates when the script calls batches = make_dataloader()

So I changed prepare_celeba_tfrecords.py a bit to accommodate the CelebA-HQ dataset. The changes I made are:

- Removed all the preprocessing/dataset-organizing parts that use the list_eval_partition.txt and identity_CelebA.txt files from the CelebA dataset.
- The CelebA-HQ images were zipped, and are resized to 256x256 when read.
- The for loop (for i in range(5)) in prepare_celeba_tfrecords.py is changed to for i in range(6) to accommodate the extra resolution level.
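To make the effect of the loop change concrete, here is a minimal sketch of how the number of halving steps determines the resolution levels. It assumes the script writes the full-resolution level first and then halves the image once per loop iteration. The `halve` and `build_pyramid` helpers and the synthetic grayscale image are hypothetical and are not taken from the repository.

```python
def halve(image):
    """2x2 average-pool a square grayscale image given as a list of rows."""
    n = len(image) // 2
    return [
        [
            (image[2 * y][2 * x] + image[2 * y][2 * x + 1]
             + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(n)
        ]
        for y in range(n)
    ]


def build_pyramid(image, num_halvings):
    """Return the original image followed by one downsampled level per halving."""
    levels = [image]
    for _ in range(num_halvings):
        levels.append(halve(levels[-1]))
    return levels


# Hypothetical 256x256 input standing in for one resized CelebA-HQ image.
image = [[float((x + y) % 256) for x in range(256)] for y in range(256)]

# Six halvings (the range(6) change) on top of the 256x256 level.
pyramid = build_pyramid(image, num_halvings=6)
sizes = [len(level) for level in pyramid]
print(sizes)
```

Under that assumption, a 128x128 input with five halvings gives six levels (128 down to 4), and a 256x256 input with six halvings gives seven levels (256 down to 4).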

With these changes, I was able to generate the tfrecords with resolution levels 2 -> 8, as in the CelebA-HQ config file, and training also ran perfectly. The generated images also look realistic.

But my concern is that my generated tfrecords are only ~11GB, whereas in the previous case (generating tfrecords with prepare_celeba_hq_tfrecords.py) they were ~230GB (train and test).

So I would like to know where this large difference in dataset size comes from.

udithhaputhanthri · 2021-02-24T15:14:47Z

@podgorskiy
I have found that:

- prepare_celeba_hq_tfrecords.py generates data at resolution levels 1024, 512, 256, 128, 64, 32, 16 (7 levels)
- prepare_celeba_tfrecords.py generates data at resolution levels 128, 64, 32, 16, 8, 4 (6 levels)

I think this causes the larger size (~230GB) of the CelebA-HQ tfrecords. But now my question is: should the model be trained on the 1024x1024 CelebA-HQ dataset or on the 256x256 dataset? After going through the paper, I thought it should be 256, but line 34 of prepare_celeba_hq_tfrecords.py uses the 1024x1024 data, which leads to the ~230GB dataset mentioned above.
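As a rough check on the size argument, the sketch below counts raw pixel bytes per image for the two pyramids: 1024 down to 16, and 256 down to 4 (the levels of the modified script). It assumes uncompressed 8-bit RGB storage and ignores tfrecord serialization overhead and the number of images, so it only estimates the ratio, not the on-disk sizes. The function name `pyramid_bytes_per_image` is hypothetical.

```python
def pyramid_bytes_per_image(top_res, bottom_res, channels=3):
    """Sum raw uint8 bytes over all levels from top_res down to bottom_res."""
    total = 0
    res = top_res
    while res >= bottom_res:
        total += channels * res * res  # one byte per channel per pixel (assumed)
        res //= 2
    return total


hq = pyramid_bytes_per_image(1024, 16)   # levels 1024 ... 16
small = pyramid_bytes_per_image(256, 4)  # levels 256 ... 4
print(hq, small, hq / small)
```

Under these assumptions the 1024-based pyramid holds 16 times as much pixel data per image as the 256-based one, which is the same order of magnitude as the observed ~230GB versus ~11GB.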

I will be really thankful if you can give me a clue about what happened here.

udithhaputhanthri · 2021-02-24T15:18:22Z

accidentally closed

udithhaputhanthri closed this as completed Feb 24, 2021

udithhaputhanthri reopened this Feb 24, 2021
