Hiring research interns for neural architecture search projects: [email protected]
This is a collection of our AutoML-NAS work.
- AutoFormer (NEW): AutoFormer: Searching Transformers for Visual Recognition
- Cream (@NeurIPS'20): Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search
- 💥 July, 2021: AutoFormer is now available on arXiv.
- 💥 Oct, 2020: Code for Cream is now released.
- 💥 Oct, 2020: Cream was accepted to NeurIPS'20.
Coming soon!!!
AutoFormer is a new one-shot architecture search framework dedicated to vision transformer search. It entangles the weights of different vision transformer blocks within the same layers during supernet training. Benefiting from this strategy, the trained supernet allows thousands of subnets to be very well trained: the performance of subnets with weights inherited from the supernet is comparable to that of the same subnets retrained from scratch.
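Below is a minimal sketch, in PyTorch, of the weight-entanglement idea described above. The `ElasticLinear` module, the width choices, and the training loop are illustrative assumptions for exposition only, not the code in this repository; the point is that candidate blocks of different widths slice and update the same underlying parameters.

```python
# Minimal sketch of weight entanglement for a single supernet layer (assumed names).
import random
import torch
import torch.nn as nn


class ElasticLinear(nn.Module):
    """A linear layer whose smaller candidates reuse slices of the largest weight."""

    def __init__(self, max_in: int, max_out: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(max_out, max_in) * 0.02)
        self.bias = nn.Parameter(torch.zeros(max_out))

    def forward(self, x: torch.Tensor, out_dim: int) -> torch.Tensor:
        in_dim = x.shape[-1]
        # Candidates of different widths are "entangled": each one is a slice
        # of the same full weight matrix, so they share (and jointly update) parameters.
        w = self.weight[:out_dim, :in_dim]
        b = self.bias[:out_dim]
        return torch.nn.functional.linear(x, w, b)


# Supernet training step: sample a random width each iteration, so every
# sampled subnet trains the shared, entangled weights.
layer = ElasticLinear(max_in=64, max_out=256)
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)
for _ in range(3):
    out_dim = random.choice([64, 128, 192, 256])  # hypothetical search space
    x = torch.randn(8, 64)
    loss = layer(x, out_dim).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```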
[Paper] [Models-Google Drive][Models-Baidu Disk (password: wqw6)] [Slides] [BibTex]
In this work, we present a simple yet effective architecture distillation method. The central idea is that subnetworks can learn collaboratively and teach each other throughout the training process, boosting the convergence of individual models. We introduce the concept of the prioritized path, which refers to architecture candidates that exhibit superior performance during training. Distilling knowledge from the prioritized paths boosts the training of subnetworks. Since the prioritized paths change on the fly depending on their performance and complexity, the final obtained paths are the cream of the crop.
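The sketch below illustrates one training step of this prioritized-path distillation idea in PyTorch. It is a simplified assumption of how such a step could look: `supernet`, `sample_path`, and `score_path` are hypothetical stand-ins, and picking a random teacher from the board simplifies the paper's teacher-selection mechanism.

```python
# Minimal sketch of prioritized-path distillation (assumed interfaces, not the repo's API).
import random
import torch
import torch.nn.functional as F


def distill_step(supernet, sample_path, score_path, board, x, y, optimizer,
                 board_size=5, kd_weight=1.0, temperature=1.0):
    """One step: a sampled subnet learns from labels and from a prioritized path."""
    path = sample_path()                      # random architecture candidate (student)
    logits = supernet(x, path)
    loss = F.cross_entropy(logits, y)

    if board:
        teacher_path = random.choice(board)   # a prioritized (well-performing) path
        with torch.no_grad():
            teacher_logits = supernet(x, teacher_path)
        kd = F.kl_div(
            F.log_softmax(logits / temperature, dim=-1),
            F.softmax(teacher_logits / temperature, dim=-1),
            reduction="batchmean",
        )
        loss = loss + kd_weight * kd

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Update the board on the fly: keep only the best-scoring paths, so the
    # teachers used for distillation remain the current "cream of the crop".
    board.append(path)
    board.sort(key=score_path, reverse=True)
    del board[board_size:]
    return loss.item()
```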
Licensed under the MIT License.