by Lucas Beyer, Xiaohua Zhai, Amélie Royer, Larisa Markeeva, Rohan Anil, Alexander Kolesnikov
We publish all teacher models and configurations for the main experiments of the paper, as well as training logs and student models.
Please read the main big_vision README to learn how to run configs, and remember that each config file contains an example invocation in the top-level comment.
We provide the following colab to read and plot the logfiles of a few runs that we reproduced on Cloud.
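As a rough sketch of what the colab does, the logfiles can be parsed with a few lines of Python. This assumes the logs are stored as JSON lines with one metrics dict per logged step; the key names used here (`step`, `val_prec@1`) are illustrative placeholders, not guaranteed to match the released logs.

```python
# Hypothetical sketch: collect a metric curve from a JSON-lines logfile.
# Key names ("step", "val_prec@1") are placeholders for illustration.
import json

def read_metrics(path, key):
    """Return (step, value) pairs for `key` from a JSON-lines log."""
    points = []
    with open(path) as f:
        for line in f:
            record = json.loads(line)
            if key in record:
                points.append((record["step"], record[key]))
    return points
```

The resulting list of pairs can be fed directly into a plotting library to reproduce accuracy-over-steps curves.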
The file bit_i1k.py is the configuration that reproduces our distillation runs on ImageNet-1k, reported in Figures 1 and 5 (left) and in the first row of Table 1.
We release both student and teacher models:
| Model | Download link | Resolution | ImageNet top-1 acc. (paper) |
|---|---|---|---|
| BiT-R50x1 | link | 160 | 80.5 |
| BiT-R50x1 | link | 224 | 82.8 |
| BiT-R152x2 | link | 224 | 83.0 |
| BiT-R152x2 | link | 384 | 84.3 |
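After downloading a model, a quick way to inspect it is to list its parameter names and shapes. The sketch below assumes the released checkpoints are NumPy `.npz` archives of named weight arrays; the filename used in the usage note is a placeholder, not a real download path.

```python
# Hypothetical sketch: summarize an .npz checkpoint as {name: shape}.
# Assumes the checkpoint is a NumPy archive of named weight arrays.
import numpy as np

def summarize_checkpoint(path):
    """Return a dict mapping each parameter name to its array shape."""
    with np.load(path) as ckpt:
        return {name: ckpt[name].shape for name in ckpt.files}
```

For example, `summarize_checkpoint("bit_r50x1.npz")` would list every weight tensor and its shape, which is handy for verifying a download before plugging the model into training code.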
The files bigsweep_flowers_pet.py and bigsweep_food_sun.py can be used to reproduce the distillation runs on these datasets, shown in Figures 3, 4, and 9-12, and in Table 4.
While our open-source release does not currently support running hyper-parameter sweeps, we still provide example sweep definitions at the end of the configs for reference.
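Even without built-in sweep support, a sweep definition of the kind listed at the end of the configs can be expanded into individual runs with a small helper. This is a generic sketch: the parameter names (`lr`, `wd`) are illustrative and not taken from the released files.

```python
# Hypothetical sketch: expand a {param: [values]} grid into one
# override dict per run, as a sweep launcher might do.
# Parameter names are illustrative placeholders.
import itertools

def expand_sweep(grid):
    """Return a list of per-run config overrides covering the full grid."""
    keys = sorted(grid)
    return [dict(zip(keys, combo))
            for combo in itertools.product(*(grid[k] for k in keys))]
```

Each resulting dict could then be applied on top of the base config for one run of the sweep.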
Links to all teacher models we used can be found in common.py.