Eval with the schedules #20
Thanks for checking. If you try running the schedule a few more times, do the prob and mag values still stay the same?
In 'augmentation_transforms_hp.py', around line 164: as I understand your code, during search it saves only the list of (prob, mag) pairs, not the op names. When eval starts, it takes the HP_TRANSFORM keys and pairs them with the (prob, mag) list in the parse-log function, producing for example [('AutoContrast', 0.0, 0), ...]. However, if the HP_TRANSFORM keys come back in a random order, the wrong policy could be applied, for example [('Brightness', 0.0, 0), ...] -> 'AutoContrast' was intended to be placed first, but wasn't. Am I missing something? Thank you!
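To illustrate the concern, here is a minimal, self-contained sketch of the failure mode (the names are hypothetical stand-ins, not the exact identifiers from augmentation_transforms_hp.py):

```python
# (prob, mag) pairs written out during search, in whatever key order
# the search process happened to see.
saved = [(0.0, 0), (0.2, 3)]

# Key order seen by the search process:
keys_search = ['AutoContrast', 'Brightness']
# Key order a fresh eval process might see instead (new hash seed):
keys_eval = ['Brightness', 'AutoContrast']

print([(k, p, m) for k, (p, m) in zip(keys_search, saved)])
# [('AutoContrast', 0.0, 0), ('Brightness', 0.2, 3)]  <- intended policy
print([(k, p, m) for k, (p, m) in zip(keys_eval, saved)])
# [('Brightness', 0.0, 0), ('AutoContrast', 0.2, 3)]  <- ops misassigned
```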
Ah, thanks for the great catch! I've reordered the dict to the Python 2 ordering, which I believe stays consistent.
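The exact change isn't quoted here, but for reference: CPython 2 doesn't randomize string hashes by default, so its (still arbitrary) dict order is at least consistent across runs, while Python 3.3-3.5 re-seeds the hash per process. A minimal sketch of one way to pin the op order regardless of interpreter, with HP_TRANSFORMS as a hypothetical stand-in for the ops dict:

```python
import collections

# Hypothetical stand-in for the ops dict in augmentation_transforms_hp.py;
# the real dict maps op names to transform functions.
HP_TRANSFORMS = collections.OrderedDict([
    ('AutoContrast', None),
    ('Brightness', None),
    ('Sharpness', None),
])

# Iteration order is now fixed (alternatively: sorted(HP_TRANSFORMS)),
# so the i-th saved (prob, mag) pair always maps to the same op.
HP_TRANSFORM_NAMES = list(HP_TRANSFORMS.keys())
```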
Thanks for your help! I'm so happy to help you. :) 👍 I had a hard time reproducing the results, and I think this may have caused the problem. I'll try your new version!
Hi again @arcelien. I tried to reproduce the paper's results with your code, but I couldn't match what you presented. First I ran the search on the 'rcifar10' dataset with a population of 16 and took the schedule with the best test accuracy, pbt_policy_10.txt. Then I used WideResNet-28-10 to train on CIFAR-100, but I only got 82.11% accuracy. I also checked the baseline with a txt schedule filled only with zeros, and its accuracy was 82.27%. Hmm, did I miss something? Or is the applied policy so random that I just didn't get good results?
Hi, thanks for trying to reproduce! This is much lower than expected, and likely because of the all-zeros schedule you're seeing (it effectively applies no PBA augmentation at all). I'll investigate this in Python 3. Could you give me your configuration details, i.e. Python version, Ray version, TensorFlow version, which script you ran, and your terminal commands? Also, have you tried reproducing with the original schedule? Thanks!
Thanks for the reply! I used Google Colab for the baseline. Because of Colab's policy, I had to stop at checkpoint 150, restore from the checkpoint, and retrain up to 200 epochs. As far as I checked, Colab uses Python 3.6.8, ray 0.8.0, and TensorFlow 1.14, and I ran the provided eval script (with the hp_policy line changed to a policy filled only with zeros). The search and eval I ran on a local GPU server with Python 3.5.3, ray 0.8.0, and TensorFlow 1.4.1, using the provided script for search and an eval run that used the different hp_policy produced by that search. I haven't tried your best policy yet. I'll give it a try! Thanks!!
Apologies for the late reply. I tried to use your setup, but I couldn't reproduce your issue where you end up with a final schedule of all zeros. Perhaps you could try Python 2.7 and give it a shot to debug the difference?
Thanks for replying! My original issue was that I couldn't reproduce the high-accuracy results your paper showed. To be precise, I tried measuring the paper's baseline performance using a zero-filled schedule, and it actually returned better performance than the method given in your paper. Is there any way to check the baseline performance other than using a txt file filled with zeros? Anyway, I will try running all the processes (baseline, then search and eval) again on GCP with Python 2.7 as you suggested. I'll get back to you with the results!
If you want to just reproduce the results of the search without running it yourself, you can use the preexisting schedules at https://github.com/arcelien/pba/tree/master/schedules. These are automatically loaded when you run one of the provided scripts; for example, there is a script to run CIFAR-10 on WRN-28-10 and, for a faster check, one for Reduced CIFAR-10.
Hi,
As far as I know, in Python 3.5.x or Python 2.x, the 'dict' structure has an arbitrary key order. For example, if I do this:

```python
a = {'one': 1, 'two': 2, 'three': 3}
a.keys()
# ['one', 'two', 'three']  <- this was what I expected
# ['three', 'one', 'two']  <- however, the order is different every time the Python session is renewed
```
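(For what it's worth, on CPython 3.3-3.5 this behavior comes from string hash randomization, which is re-seeded for every interpreter process. A quick way to see it, assuming a CPython 3.5-style interpreter; on 3.7+ dicts keep insertion order, so both runs print the same thing:)

```python
import os
import subprocess
import sys

# Run the same one-liner under two different hash seeds; on Python 3.5
# the printed key orders typically differ.
snippet = "print(list({'one': 1, 'two': 2, 'three': 3}))"
for seed in ('1', '2'):
    env = dict(os.environ, PYTHONHASHSEED=seed)
    subprocess.check_call([sys.executable, '-c', snippet], env=env)
```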
I saw this kind of code in 'augmentation_transforms_hp.py', close to the end of the file. Is this random ordering on purpose, to perturb the policy?
I found this when I tried to visualize the schedule with pba.ipynb. When I restart the session in the IPython notebook, the schedule changes. For example:

('Brightness', 0.2, 3), ('AutoContrast', 0.4, 3), ... <- on the first try
('Sharpness', 0.2, 3), ... <- same prob and mag, but the op changed.
Thank you!