
Eval with the schedules #20

Open
serereuk opened this issue Jul 30, 2019 · 10 comments

@serereuk

Hi,

As far as I know, in Python 3.5.x (and Python 2.x), the dict structure has no guaranteed key order. For example,

If I do this

a = {'one': 1, 'two': 2, 'three': 3}
a.keys()
['one', 'two', 'three'] <- this was what I expected
['three', 'one', 'two'] <- however, the order is different every time the Python session is restarted.
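
A minimal sketch to observe this across processes (assuming CPython 3.5, where string hashing is randomized per process unless PYTHONHASHSEED is set):

# save as a script and run it twice; under Python 3.5 the printed order
# can differ between runs because string hashes are randomized per process
a = {'one': 1, 'two': 2, 'three': 3}
print(list(a.keys()))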

I saw this kind of code in 'augmentation_transforms_hp.py', near the end of the file.

Is this random ordering intentional, as a form of perturbation?

I found this when I tried to visualize the schedule with pba.ipynb. When I restart the session in the IPython notebook, the schedule changes. For example,

('Brightness', 0.2, 3), ('AutoContrast', 0.4, 3) .... <- on the first try
('Sharpness', 0.2, 3) ... <- same prob and mag, but the op changed.

Thank you!

@arcelien
Owner

Thanks for checking. The augmentations applied in pba.ipynb are explicitly shuffled. Do you have a line number you're referring to? I don't believe any dict is being read in order; they should be indexed by key.

If you try running the schedule a few more times, do the prob and mag values still stay the same?

@serereuk
Author

In 'augmentation_transforms_hp.py', line 164:

HP_TRANSFORM_NAMES = NAME_TO_TRANSFORM.keys()

As I understand your code, the search phase saves only the list of (prob, mag) pairs, not the ops. When the eval phase starts, it takes the HP_TRANSFORM keys and pairs them with the (prob, mag) list in the parse_log function, for example:

[('AutoContrast', 0.0, 0),
('Brightness', 0.1, 0),
...
]

However, if the HP_TRANSFORM keys come back in a random order, the wrong policy could be applied, for example:

[('Brightness', 0.0, 0), -> 'AutoContrast' was intended to be first, but isn't
('AutoContrast', 0.1, 0), -> same problem here
...
]
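
To make the failure mode concrete, here is a minimal sketch (the dict contents and the zip step are simplified stand-ins for NAME_TO_TRANSFORM and the parse_log logic, not the actual code):

# stand-in for NAME_TO_TRANSFORM in augmentation_transforms_hp.py
NAME_TO_TRANSFORM = {'AutoContrast': None, 'Brightness': None, 'Sharpness': None}

# search saves only the (prob, mag) pairs; the op names are implicit
# and recovered later by position
probs_mags = [(0.0, 0), (0.1, 0), (0.2, 3)]

# if keys() comes back in a different order at eval time than at search
# time, every (prob, mag) pair lands on the wrong op
policy = [(op, p, m) for op, (p, m) in zip(NAME_TO_TRANSFORM.keys(), probs_mags)]
print(policy)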

Am I missing something?

Thank you

@arcelien
Owner

Ah, thanks for the great catch! I've reordered the dict to match the Python 2 ordering, which I believe stays consistent.
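
For anyone hitting this on Python 3, one deterministic alternative (a sketch only, not the actual commit) is to pin the order explicitly instead of relying on dict iteration order:

# with NAME_TO_TRANSFORM as defined in augmentation_transforms_hp.py,
# a sorted list yields the same order in every session and Python version;
# note that the order must still match the one used at search time
HP_TRANSFORM_NAMES = sorted(NAME_TO_TRANSFORM.keys())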

@serereuk
Author

Thanks for your help! I'm happy I could help. :) 👍

I had a hard time reproducing the results, and I think this may have been the cause.

I'll try with your new version!

@serereuk
Author

serereuk commented Aug 7, 2019

Hi again @arcelien

I tried to reproduce the paper's results with your code, but I couldn't match the numbers you presented.

First, I ran the search on the 'rcifar10' dataset with a population of 16, and took the policy with the best test accuracy, pbt_policy_10.txt.

Then I used wideresnet_28_10 to train on CIFAR-100, but I got only 82.11% accuracy.

I also checked the baseline with a txt filled only with zeros, and its accuracy was 82.27%
(200 epochs, lr 0.1, wd 0.0005, all else the same).

Hmm, did I miss something? Or is the applied policy so random that I couldn't get good results?

@arcelien
Owner

arcelien commented Aug 8, 2019

Hi, thanks for trying to reproduce! This is much lower than expected, and likely because of the all-zeros schedule you're seeing (this effectively does not apply any PBA augmentation). I'll investigate this in Python 3.

Could you give me your configuration details, i.e. Python version, Ray version, TensorFlow version, which script you ran, and your terminal commands?
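
One convenient way to collect the Python-side versions (a hypothetical helper script, assuming ray and tensorflow are importable):

# print_versions.py: report the interpreter and library versions
import sys
import ray
import tensorflow as tf

print('python:', sys.version)
print('ray:', ray.__version__)
print('tensorflow:', tf.__version__)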

Also, have you tried reproducing with the original schedule using: pba/scripts/table_2_cifar100.sh wrn_28_10?

Thanks!

@serereuk
Author

serereuk commented Aug 10, 2019

Thanks for the reply!

I used Google Colab for the baseline. Because of Colab's session limits, I had to stop at the epoch-150 checkpoint, restore from it, and continue training to 200 epochs.

As far as I checked, Colab uses Python 3.6.8, Ray 0.8.0, and TensorFlow 1.14, and I ran

bash ./scripts/table_2_cifar100.sh wrn_28_10

(with the hp_policy line changed to the policy filled only with zeros)

For search and eval I used a local GPU server with Python 3.5.3, Ray 0.8.0, and TensorFlow 1.4.1, and I ran

bash ./scripts/search.sh rcifar10

for the search, and

bash ./scripts/table_2_cifar100.sh wrn_28_10

with a different hp_policy, the one produced by the search.

I haven't tried your best policy yet; I'll give it a try!

Thanks!!

@arcelien
Owner

Apologies for the late reply. I tried your setup, but I couldn't reproduce the issue where you end up with a final schedule of all zeros. Perhaps you can give Python 2.7 a shot to debug the difference?

@serereuk
Author

Thanks for replying!

My original issue was that I couldn't reproduce the high accuracy results that your paper showed.

To be precise, I tried measuring the paper's baseline performance using a zero-filled schedule, and it actually performed better than the method given in your paper.

Is there any way to check the baseline performance other than using a txt filled with zeros?

Anyway, I will try running the whole pipeline (baseline, search, and eval) again on GCP with Python 2.7 as you suggested. I'll get back to you with the results!

@arcelien
Owner

If you want to just reproduce the results of the search without running it yourself, you can use the preexisting schedules at https://github.com/arcelien/pba/tree/master/schedules.

These are automatically loaded when you run one of the provided scripts. So, for example, to run CIFAR10 on WRN-28-10, you can try:

bash scripts/table_1_cifar10.sh wrn_28_10

If you want a faster check, you can try reduced CIFAR10 instead with:

bash scripts/table_3_rcifar10.sh wrn_28_10
