Implements MLP-Mixer: An all-MLP Architecture for Vision.

Overview

MLP-Mixer-CIFAR10

This repository implements MLP-Mixer as proposed in MLP-Mixer: An all-MLP Architecture for Vision. The paper introduces an all-MLP (multi-layer perceptron) architecture for computer vision tasks. Yannic Kilcher walks through the architecture in this video.

Experiments reported in this repository are on CIFAR-10.

What's included?

  • Distributed training with mixed-precision.
  • Visualization of the token-mixing MLP weights.
  • A TensorBoard callback to keep track of the learned linear projections of the image patches.

Notebooks

Note: These notebooks are runnable on Colab. If you don't have access to a tensor-core GPU, please disable the mixed-precision block while running the code.
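
One way to make the mixed-precision block conditional is to check the GPU's CUDA compute capability first, since tensor cores are available from compute capability 7.0 (Volta) onward. The sketch below is an assumption about how such a guard could look, not the notebooks' code; `choose_precision_policy` is a hypothetical helper, and the returned strings are standard Keras dtype policy names.

```python
def choose_precision_policy(compute_capability):
    """Return a Keras dtype policy name for a GPU's compute capability.

    `compute_capability` is a (major, minor) tuple, or None when no GPU
    is present. Tensor cores first appeared with compute capability 7.0.
    """
    if compute_capability is not None and tuple(compute_capability) >= (7, 0):
        return "mixed_float16"
    return "float32"


# For example, a V100 reports (7, 0) and a K80 reports (3, 7).
print(choose_precision_policy((7, 0)))  # mixed_float16
print(choose_precision_policy((3, 7)))  # float32
print(choose_precision_policy(None))    # float32
```

In TensorFlow 2.4 or later, the returned name can be passed to `tf.keras.mixed_precision.set_global_policy`.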

Results

MLP-Mixer achieves competitive results. The figure below summarizes top-1 accuracies on the CIFAR-10 test set as the number of MLP blocks varies.


Notable hyperparameters are:

  • Image size: 72x72
  • Patch size: 9x9
  • Hidden dimension for patches: 64
  • Hidden dimension for channels: 128
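
With the image and patch sizes listed above, each image is split into a grid of non-overlapping patches. The arithmetic can be sketched as follows; `num_patches` is a hypothetical helper written for illustration and is not taken from the notebooks.

```python
def num_patches(image_size, patch_size):
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


IMAGE_SIZE = 72
PATCH_SIZE = 9
CHANNELS = 3  # CIFAR-10 images are RGB

patches = num_patches(IMAGE_SIZE, PATCH_SIZE)
values_per_patch = PATCH_SIZE * PATCH_SIZE * CHANNELS

print(patches)           # 64
print(values_per_patch)  # 243
```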

The table below reports the parameter counts for the different MLP-Mixer variants:


ResNet20 (0.571969 million parameters) achieves 78.14% under the exact same training configuration. Refer to this notebook for more details.

Models

You can reproduce the results reported above. The model files are available here.

Acknowledgements

Thanks to the ML-GDE Program for providing GCP credits.

You might also like...
An All-MLP solution for Vision, from Google AI

MLP Mixer - Pytorch An All-MLP solution for Vision, from Google AI, in Pytorch. No convolutions nor attention needed! Yannic Kilcher video Install $ p

Implementation of "A MLP-like Architecture for Dense Prediction"

A MLP-like Architecture for Dense Prediction (arXiv) Updates (22/07/2021) Initial release. Model Zoo We provide CycleMLP models pretrained on ImageNet

Model search is a framework that implements AutoML algorithms for model architecture search at scale

Model search (MS) is a framework that implements AutoML algorithms for model architecture search at scale. It aims to help researchers speed up their exploration process for finding the right model architecture for their classification problems (i.e., DNNs with different types of layers).

A task-agnostic vision-language architecture as a step towards General Purpose Vision

Towards General Purpose Vision Systems By Tanmay Gupta, Amita Kamath, Aniruddha Kembhavi, and Derek Hoiem Overview Welcome to the official code base f

MLP-Like Vision Permutator for Visual Recognition (PyTorch)

Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition (arxiv) This is a Pytorch implementation of our paper. We present Vision

code for paper "Does Unsupervised Architecture Representation Learning Help Neural Architecture Search?"

Does Unsupervised Architecture Representation Learning Help Neural Architecture Search? Code for paper: Does Unsupervised Architecture Representation

Implementation of ResMLP, an all MLP solution to image classification, in Pytorch

ResMLP - Pytorch Implementation of ResMLP, an all MLP solution to image classification out of Facebook AI, in Pytorch Install $ pip install res-mlp-py

Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch

Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch

Episodic Transformer (E.T.) is a novel attention-based architecture for vision-and-language navigation. E.T. is based on a multimodal transformer that encodes language inputs and the full episode history of visual observations and actions.
Comments
  • Could patches number != MLP token mixing dimension?

    I tried to change the model to the B/16 MLP-Mixer configuration. In this setting, the number of patches (sequence length) != the token-mixing MLP dimension, and the code reports an error when it runs "x = layers.Add()([x, token_mixing])" because the two operands have different shapes. For example, with the B/16 settings: image 32*32, hidden dimension 768, P*P = 16*16, token-mixing MLP dimension = 384, channel-mixing MLP dimension = 3072. The number of patches (sequence length) is therefore 4, and the table shape is (4, 768). When the code runs x = layers.Add()([x, token_mixing]) in the token-mixing layer, x has shape [4, 768] while token_mixing has shape [384, 768].

    How can the MLP-Mixer paper use settings where the number of patches (sequence length) != the token-mixing MLP dimension?

    opened by LouiValley 2
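
On the question above: in the MLP-Mixer paper, the token-mixing MLP acts on the transposed table. It expands the patch axis from S to a hidden width D_S and then projects it back to S, so the residual addition sees matching shapes even when D_S differs from S. The shape walk-through below is a minimal sketch of that bookkeeping in plain Python; the helper names are hypothetical and this is not the repository's code.

```python
def dense(shape, units):
    """Shape after a dense layer, which only changes the last axis."""
    return shape[:-1] + (units,)


def transpose(shape):
    """Shape after swapping the last two axes."""
    return shape[:-2] + (shape[-1], shape[-2])


def token_mixing_output_shape(x_shape, tokens_mlp_dim):
    """Track shapes through a token-mixing MLP as described in the paper."""
    num_patches = x_shape[-2]
    y = transpose(x_shape)        # (hidden, num_patches)
    y = dense(y, tokens_mlp_dim)  # expand:  (hidden, tokens_mlp_dim)
    y = dense(y, num_patches)     # project: (hidden, num_patches)
    return transpose(y)           # back to (num_patches, hidden)


# Settings from the question: 4 patches, hidden size 768, D_S = 384.
x_shape = (4, 768)
out_shape = token_mixing_output_shape(x_shape, tokens_mlp_dim=384)
print(out_shape)  # (4, 768)
assert out_shape == x_shape  # the skip connection can add them
```

A shape of [384, 768] at the addition, as reported, would be consistent with the second dense layer using the hidden width (384) instead of projecting back to the number of patches (4).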
  • Why the accuracy drops after epoch 100/100 (accuracy drops from 91% to 71%)

    I trained the network (NUM_MIXER_LAYERS = 4).

    At epoch 100:

    Epoch 100/100
    1/44 [..............................] - ETA: 1s - loss: 0.2472 - accuracy: 0.9160
    ...
    43/44 [============================>.] - ETA: 0s - loss: 0.2469 - accuracy: 0.9119
    44/44 [==============================] - 2s 46ms/step - loss: 0.2474 - accuracy: 0.9117 - val_loss: 1.1145 - val_accuracy: 0.7226

    After that, there is still an extra training run:

    1/313 [..............................] - ETA: 24:32 - loss: 0.5860 - accuracy: 0.8125
    8/313 [..............................] - ETA: 2s - loss: 1.2071 - accuracy: 0.6953
    .....
    313/313 [==============================] - ETA: 0s - loss: 1.0934 - accuracy: 0.7161
    313/313 [==============================] - 12s 22ms/step - loss: 1.0934 - accuracy: 0.7161
    Test accuracy: 71.61

    opened by LouiValley 1
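
A quick check on the step counts in the log above, offered as a hedged reading rather than a confirmed diagnosis. The CIFAR-10 test set has 10,000 images, and with a batch size of 32 (Keras's default for array inputs, assumed here because the log does not state it) one pass over it takes exactly 313 steps. That would make the 313-step progress bar an evaluation pass on the test set rather than extra training, with 91% being the training accuracy and roughly 72% the accuracy on held-out data.

```python
import math

TEST_IMAGES = 10_000  # size of the CIFAR-10 test set
BATCH_SIZE = 32       # assumed; Keras's default for array inputs

steps = math.ceil(TEST_IMAGES / BATCH_SIZE)
print(steps)  # 313
```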
  • Consider either turning off auto-sharding or switching the auto_shard_policy to DATA

    When I try to run it on the server, it prints:

    Consider either turning off auto-sharding or switching the auto_shard_policy to DATA to shard this dataset. You can do this by creating a new tf.data.Options() object then setting options.experimental_distribute.auto_shard_policy = AutoShardPolicy.DATA before applying the options object to the dataset via dataset.with_options(options).
    2021-11-21 11:59:20.861052: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.

    My TensorFlow version is 2.4.0. How can I fix this?

    opened by LouiValley 1
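
The steps named in the message above can be sketched as a small helper. This is a minimal example that assumes the input pipeline is a `tf.data.Dataset`; `with_data_auto_shard` is a hypothetical name and is not part of this repository.

```python
def with_data_auto_shard(dataset):
    """Return `dataset` with its auto-shard policy set to DATA."""
    # Imported inside the function so the helper can be defined
    # without loading TensorFlow first.
    import tensorflow as tf

    options = tf.data.Options()
    options.experimental_distribute.auto_shard_policy = (
        tf.data.experimental.AutoShardPolicy.DATA
    )
    return dataset.with_options(options)
```

It would be applied to each dataset before training, for example `train_ds = with_data_auto_shard(train_ds)`, where `train_ds` is a placeholder name.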
Releases(Models)
Owner
Sayak Paul
Trying to learn how machines learn.
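A YOLOv5 reproduction based on PaddlePaddle, with support for the PaddleDetection interface

PaddleDetection yolov5 https://github.com/Sharpiless/PaddleDetection-Yolov5 Introduction PaddleDetection, the PaddlePaddle object detection development kit, aims to help developers complete the full development workflow of building, training, optimizing, and deploying detection models faster and better. PaddleD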

36 Jan 07, 2023
PyTorch implementation of paper "IBRNet: Learning Multi-View Image-Based Rendering", CVPR 2021.

IBRNet: Learning Multi-View Image-Based Rendering PyTorch implementation of paper "IBRNet: Learning Multi-View Image-Based Rendering", CVPR 2021. IBRN

Google Interns 371 Jan 03, 2023
Membership Inference Attack against Graph Neural Networks

MIA GNN Project Starter If you meet the version mismatch error for Lasagne library, please use following command to upgrade Lasagne library. pip insta

6 Nov 09, 2022
Monocular Depth Estimation - Weighted-average prediction from multiple pre-trained depth estimation models

merged_depth runs (1) AdaBins, (2) DiverseDepth, (3) MiDaS, (4) SGDepth, and (5) Monodepth2, and calculates a weighted-average per-pixel absolute dept

Pranav 39 Nov 21, 2022
SustainBench: Benchmarks for Monitoring the Sustainable Development Goals with Machine Learning

Datasets | Website | Raw Data | OpenReview SustainBench: Benchmarks for Monitoring the Sustainable Development Goals with Machine Learning Christopher

67 Dec 17, 2022
Official implementation for paper: A Latent Transformer for Disentangled Face Editing in Images and Videos.

A Latent Transformer for Disentangled Face Editing in Images and Videos Official implementation for paper: A Latent Transformer for Disentangled Face

InterDigital 108 Dec 09, 2022
JupyterLite demo deployed to GitHub Pages 🚀

JupyterLite Demo JupyterLite deployed as a static site to GitHub Pages, for demo purposes. ✨ Try it in your browser ✨ ➡️ https://jupyterlite.github.io

JupyterLite 223 Jan 04, 2023
Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue

Realtime Unsupervised Depth Estimation from an Image This is the caffe implementation of our paper "Unsupervised CNN for single view depth estimation:

Ravi Garg 227 Nov 28, 2022
YOLOv5🚀 reproduction by Guo Quanhao using PaddlePaddle

YOLOv5-Paddle YOLOv5 🚀 reproduction by Guo Quanhao using PaddlePaddle Supports AutoBatch Supports AutoAnchor Supports GPU Memory Quick start Use the AIStudio high-performance environment to quickly set up YOLOv5 training (PaddlePa

QuanHao Guo 20 Nov 14, 2022
PyTorch implementation of Densely Connected Time Delay Neural Network

Densely Connected Time Delay Neural Network PyTorch implementation of Densely Connected Time Delay Neural Network (D-TDNN) in our paper "Densely Conne

Ya-Qi Yu 64 Oct 11, 2022
A copy of Ares that costs 30 fucking dollars.

In the end, I decided to abandon this idea; I behaved like an angry child. As some people told me, I have some c

Bleu 24 Apr 14, 2022
A Python library for Deep Graph Networks

PyDGN Wiki Description This is a Python library to easily experiment with Deep Graph Networks (DGNs). It provides automatic management of data splitti

Federico Errica 194 Dec 22, 2022
Simple PyTorch hierarchical models.

A Python package adding basic hierarchical networks in PyTorch for classification tasks. It implements a simple hierarchical network structure based on feed-backward outputs.

Rajiv Sarvepalli 5 Mar 06, 2022
A Re-implementation of the paper "A Deep Learning Framework for Character Motion Synthesis and Editing"

What is This This is a simple re-implementation of the paper "A Deep Learning Framework for Character Motion Synthesis and Editing"(1). Only Sections

102 Dec 14, 2022
A curated list of the top 10 computer vision papers in 2021 with video demos, articles, code and paper reference.

The Top 10 Computer Vision Papers of 2021 The top 10 computer vision papers in 2021 with video demos, articles, code, and paper reference. While the w

Louis-François Bouchard 118 Dec 21, 2022
Part-Aware Data Augmentation for 3D Object Detection in Point Cloud

Part-Aware Data Augmentation for 3D Object Detection in Point Cloud This repository contains a reference implementation of our Part-Aware Data Augment

Jaeseok Choi 62 Jan 03, 2023
Official implementation of EfficientPose

EfficientPose This is the official implementation of EfficientPose. We based our work on the Keras EfficientDet implementation xuannianz/EfficientDet

2 May 17, 2022
Pytorch Implementation of LNSNet for Superpixel Segmentation

LNSNet Overview Official implementation of Learning the Superpixel in a Non-iterative and Lifelong Manner (CVPR'21) Learning Strategy The proposed LNS

42 Oct 11, 2022
Identifying Stroke Indicators Using Rough Sets

Identifying Stroke Indicators Using Rough Sets With the spirit of reproducible research, this repository contains all the codes required to produce th

Muhammad Salman Pathan 0 Jun 09, 2022
Implementation of "With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition, BMVC, 2021" in PyTorch

Multimodal Temporal Context Network (MTCN) This repository implements the model proposed in the paper: Evangelos Kazakos, Jaesung Huh, Arsha Nagrani,

Evangelos Kazakos 13 Nov 24, 2022