Unable to Reproduce Paper Results with Pi-Model #11

iamshahd · 2022-09-07T22:32:11Z

Hi,

First of all, I'd like to thank you tremendously for sharing this clean TensorFlow implementation - it saved me a lot of time!

Second, have you compared your pi-model results with those in the original paper? I trained the pi-model on SVHN with 1000 labels, using the hyperparameters reported in the paper, but the model starts to overfit around the 50th-60th epoch, and the best checkpoint reaches only 85% test accuracy, versus almost 95% in the paper.

tensorfreitas · 2022-09-20T15:49:38Z

Hi @iamshahd,

At the time I did not have a good GPU, so I was not able to reproduce the paper's exact results, as I stated in the docs:

The results are not exactly the ones reported in the paper with 1000 labels, but I have to admit that I do not have the hardware to find the best parameters with structured batches (the experiments were run in a 860M NVIDIA card).

With proper hyperparameter tuning it should be close to the reported results.
