This repository has been archived by the owner on Jun 9, 2023. It is now read-only.

Unable to Reproduce Paper Results with Pi-Model #11

Open
iamshahd opened this issue Sep 7, 2022 · 1 comment

iamshahd commented Sep 7, 2022

Hi,

First of all, I'd like to thank you tremendously for sharing this clean TensorFlow implementation - it saved me a lot of time!

Second, have you by any chance compared your Pi-model results with the original paper? I tried training the Pi-model on SVHN with 1000 labels using the reported hyperparameters, but the model starts to overfit around epochs 50-60, and the best checkpoint reaches only 85% test accuracy, versus almost 95% in the paper.

tensorfreitas (Owner) commented Sep 20, 2022

Hi @iamshahd,

At the time I did not have a good GPU, so I was not able to match the paper's results exactly, as I stated in the docs:

The results are not exactly the ones reported in the paper with 1000 labels, but I have to admit that I do not have the hardware to find the best parameters with structured batches (the experiments were run in a 860M NVIDIA card).

With proper hyperparameter tuning it should be close to the reported results.
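
One hyperparameter that may be worth revisiting when tuning is the ramp-up of the unsupervised (consistency) loss weight. The sketch below assumes a Gaussian ramp-up of the form exp(-5 (1 - t)^2), as commonly described for the Pi-model. The function name `rampup_weight` and its default values are illustrative placeholders, not taken from this repository, and nothing here is confirmed to be the cause of the gap reported above.

```python
import math


def rampup_weight(epoch, rampup_epochs=80, max_weight=100.0, labeled_fraction=1.0):
    """Consistency-loss weight at a given epoch (illustrative sketch).

    Assumed schedule: max_weight * labeled_fraction * exp(-5 * (1 - t)^2),
    where t grows linearly from 0 to 1 over `rampup_epochs` and is then
    held at 1. All defaults are placeholders to be tuned.
    """
    if rampup_epochs <= 0:
        # No ramp-up requested: use the full weight from the start.
        return max_weight * labeled_fraction
    # Clamp progress to [0, 1] so the weight stays constant after ramp-up.
    t = min(max(epoch / rampup_epochs, 0.0), 1.0)
    return max_weight * labeled_fraction * math.exp(-5.0 * (1.0 - t) ** 2)


if __name__ == "__main__":
    # Print the schedule at a few epochs to inspect its shape.
    for epoch in (0, 20, 40, 60, 80, 100):
        print(epoch, rampup_weight(epoch))
```

Comparing this curve against the weight actually applied around epochs 50-60 is one way to check whether the consistency term is still small when the overfitting starts.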
