MLP-Mixer: An all-MLP Architecture for Vision (arXiv:2105.01601)

(Update 2021-11-15) Code has been released and ported weights have been uploaded.

Introduction

Convolutional Neural Networks (CNNs) are the go-to model for computer vision. Recently, attention-based networks, such as the Vision Transformer, have also become popular. In this paper we show that while convolutions and attention are both sufficient for good performance, neither of them is necessary. We present MLP-Mixer, an architecture based exclusively on multi-layer perceptrons (MLPs). MLP-Mixer contains two types of layers: one with MLPs applied independently to image patches (i.e. "mixing" the per-location features), and one with MLPs applied across patches (i.e. "mixing" spatial information). When trained on large datasets, or with modern regularization schemes, MLP-Mixer attains competitive scores on image classification benchmarks, with pre-training and inference cost comparable to state-of-the-art models. We hope that these results spark further research beyond the realms of well-established CNNs and Transformers.
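The two layer types described above can be illustrated with a minimal NumPy sketch of a single Mixer block. This is not the PASSL implementation: it assumes the block order used in the paper (token mixing, then channel mixing, each with a skip connection and a preceding LayerNorm), omits the learnable LayerNorm scale and shift, uses the tanh approximation of GELU, and picks toy widths. The helper names (`mlp`, `mixer_block`, `init_mlp`) are hypothetical.

```python
import numpy as np


def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def layer_norm(x, eps=1e-6):
    # Normalize over the last (channel) axis; learnable scale/shift omitted for brevity.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def mlp(x, w1, b1, w2, b2):
    # Two-layer perceptron acting on the last axis of x.
    return gelu(x @ w1 + b1) @ w2 + b2


def mixer_block(x, token_params, channel_params):
    """Apply one Mixer block to x of shape (num_patches, channels)."""
    # Token mixing: transpose so the MLP acts across patches, one channel at a time.
    y = layer_norm(x).T                  # (channels, num_patches)
    x = x + mlp(y, *token_params).T      # back to (num_patches, channels)
    # Channel mixing: the MLP acts on each patch's feature vector independently.
    x = x + mlp(layer_norm(x), *channel_params)
    return x


def init_mlp(rng, dim, hidden):
    # Small random weights, zero biases (an arbitrary choice for this sketch).
    return (rng.normal(0, 0.02, (dim, hidden)), np.zeros(hidden),
            rng.normal(0, 0.02, (hidden, dim)), np.zeros(dim))


rng = np.random.default_rng(0)
num_patches = (224 // 16) ** 2           # 196 patches for a 224x224 image, 16x16 patches
channels = 32                            # toy width, far smaller than the real models
x = rng.normal(size=(num_patches, channels))
token_params = init_mlp(rng, num_patches, 64)
channel_params = init_mlp(rng, channels, 128)
out = mixer_block(x, token_params, channel_params)
print(out.shape)  # (196, 32)
```

Because both MLPs sit on residual branches, the block preserves the `(num_patches, channels)` shape, so blocks can be stacked without any reshaping between them.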

For details see MLP-Mixer: An all-MLP Architecture for Vision by Ilya Tolstikhin, Neil Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Thomas Unterthiner, Jessica Yung, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, et al.

Model Zoo

All results are evaluated on the ImageNet2012 validation set.

Arch	Weight	Top-1 Acc	Top-5 Acc	Crop ratio	# Params
mlp_mixer_b16_224	pretrain 1k	76.60	92.23	0.875	60.0M
mlp_mixer_l16_224	pretrain 1k	72.06	87.67	0.875	208.2M

Note: "pretrain 1k" means the model is trained directly on the ImageNet-1k dataset.
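As a sanity check on the "# Params" column, the parameter count can be estimated by hand. The sketch below is an assumption-laden estimate, not PASSL code: the layer counts and MLP widths are taken from the model specifications in the MLP-Mixer paper (they do not appear in this README), and the layout assumed is a linear patch embedding, two LayerNorms per block with scale and shift, and a final LayerNorm before a 1000-class linear head. The function name `mixer_param_count` is hypothetical.

```python
def mixer_param_count(num_layers, hidden_dim, tokens_mlp_dim, channels_mlp_dim,
                      image_size=224, patch_size=16, in_channels=3, num_classes=1000):
    """Count weights and biases of an MLP-Mixer classifier under the assumed layout."""
    num_patches = (image_size // patch_size) ** 2

    def linear(n_in, n_out):
        return n_in * n_out + n_out  # weight matrix plus bias

    def layer_norm(dim):
        return 2 * dim  # scale and shift

    patch_embed = linear(patch_size * patch_size * in_channels, hidden_dim)
    token_mlp = linear(num_patches, tokens_mlp_dim) + linear(tokens_mlp_dim, num_patches)
    channel_mlp = linear(hidden_dim, channels_mlp_dim) + linear(channels_mlp_dim, hidden_dim)
    block = 2 * layer_norm(hidden_dim) + token_mlp + channel_mlp
    head = layer_norm(hidden_dim) + linear(hidden_dim, num_classes)
    return patch_embed + num_layers * block + head


# Hyperparameters assumed from the paper's B/16 and L/16 specifications.
b16 = mixer_param_count(num_layers=12, hidden_dim=768, tokens_mlp_dim=384, channels_mlp_dim=3072)
l16 = mixer_param_count(num_layers=24, hidden_dim=1024, tokens_mlp_dim=512, channels_mlp_dim=4096)
print(f"B/16: {b16 / 1e6:.1f}M, L/16: {l16 / 1e6:.1f}M")  # B/16: 59.9M, L/16: 208.2M
```

Under these assumptions the estimates land close to the 60.0M and 208.2M reported in the table. Most of the parameters sit in the channel-mixing MLPs; the token-mixing MLPs are comparatively small because their input width is the number of patches (196), not the hidden dimension.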

Usage

import paddle.nn as nn

from passl.modeling.backbones import build_backbone
from passl.modeling.heads import build_head
from passl.utils.config import get_config


class Model(nn.Layer):
    """Backbone plus head, both built from a PASSL config file."""

    def __init__(self, cfg_file):
        super().__init__()
        cfg = get_config(cfg_file)
        self.backbone = build_backbone(cfg.model.architecture)
        self.head = build_head(cfg.model.head)

    def forward(self, x):
        x = self.backbone(x)  # features from the MLP-Mixer backbone
        x = self.head(x)      # head output built from cfg.model.head
        return x


cfg_file = 'configs/mlp_mixer/mlp-mixer_b16_224.yaml'
m = Model(cfg_file)

Reference

@article{tolstikhin2021mlp,
  title={{MLP-Mixer}: An all-{MLP} Architecture for Vision},
  author={Tolstikhin, Ilya and Houlsby, Neil and Kolesnikov, Alexander and Beyer, Lucas and Zhai, Xiaohua and Unterthiner, Thomas and Yung, Jessica and Keysers, Daniel and Uszkoreit, Jakob and Lucic, Mario and others},
  journal={arXiv preprint arXiv:2105.01601},
  year={2021}
}
