Implementation of Hire-MLP: Vision MLP via Hierarchical Rearrangement and An Image Patch is a Wave: Phase-Aware Vision MLP.

Overview

Hire-Wave-MLP.pytorch

Implementation of Hire-MLP: Vision MLP via Hierarchical Rearrangement and An Image Patch is a Wave: Phase-Aware Vision MLP

Results and Models

Hire-MLP on ImageNet-1K Classification

Model Parameters FLOPs Top 1 Acc. Log Ckpt
Hire-MLP-Tiny 18M 2.1G 79.7% github github
Hire-MLP-Small 33M 4.2G 82.1% github github
Hire-MLP-Base 58M 8.1G 83.2% github github
Hire-MLP-Large 96M 13.4G 83.8%

Usage

Install

  • PyTorch (1.7.0)
  • torchvision (0.8.1)
  • timm (0.3.2)
  • torchprofile
  • mmcv (v1.3.0)
  • mmdetection (v2.11)
  • mmsegmentation (v0.11)

Data preparation

Download and extract ImageNet train and val images from http://image-net.org/. The directory structure is:

│path/to/imagenet/
├──train/
│  ├── n01440764
│  │   ├── n01440764_10026.JPEG
│  │   ├── n01440764_10027.JPEG
│  │   ├── ......
│  ├── ......
├──val/
│  ├── n01440764
│  │   ├── ILSVRC2012_val_00000293.JPEG
│  │   ├── ILSVRC2012_val_00002138.JPEG
│  │   ├── ......
│  ├── ......

Training

Training Hire-MLP

To train Hire-MLP-Tiny on ImageNet-1K on a single node with 8 gpus:

python -m torch.distributed.launch --nproc_per_node=8 train.py --data-path /your_path_to/imagenet/ --output_dir /your_path_to/output/ --model hire_mlp_tiny --batch-size 256 --apex-amp --input-size 224 --drop-path 0.0 --epochs 300 --test_freq 50 --test_epoch 260 --warmup-epochs 20 --warmup-lr 1e-6 --no-model-ema

To train Hire-MLP-Base on ImageNet-1K on a single node with 8 gpus:

python -m torch.distributed.launch --nproc_per_node=8 train.py --data-path /your_path_to/imagenet/ --output_dir /your_path_to/output/ --model hire_mlp_base --batch-size 128 --apex-amp --input-size 224 --drop-path 0.2 --epochs 300 --test_freq 50 --test_epoch 260 --warmup-epochs 20 --warmup-lr 1e-6 --no-model-ema

Training Wave-MLP

On a single node with 8 gpus, you can train the Wave-MLP family on ImageNet-1K as follows :

WaveMLP_T_dw:

python -m torch.distributed.launch --nproc_per_node 8 --nnodes=1 --node_rank=0 train_wave.py /your_path_to/imagenet/ --output /your_path_to/output/ --model WaveMLP_T_dw --sched cosine --epochs 300 --opt adamw -j 8 --warmup-lr 1e-6 --mixup .8 --cutmix 1.0 --model-ema --model-ema-decay 0.99996 --aa rand-m9-mstd0.5-inc1 --color-jitter 0.4 --warmup-epochs 5 --opt-eps 1e-8 --repeated-aug --remode pixel --reprob 0.25 --amp --lr 1e-3 --weight-decay .05 --drop 0 --drop-path 0.1 -b 128

WaveMLP_T:

python -m torch.distributed.launch --nproc_per_node 8 --nnodes=1 --node_rank=0 train_wave.py /your_path_to/imagenet/ --output /your_path_to/output/ --model WaveMLP_T --sched cosine --epochs 300 --opt adamw -j 8 --warmup-lr 1e-6 --mixup .8 --cutmix 1.0 --model-ema --model-ema-decay 0.99996 --aa rand-m9-mstd0.5-inc1 --color-jitter 0.4 --warmup-epochs 5 --opt-eps 1e-8 --repeated-aug --remode pixel --reprob 0.25 --amp --lr 1e-3 --weight-decay .05 --drop 0 --drop-path 0.1 -b 128

WaveMLP_S:

python -m torch.distributed.launch --nproc_per_node 8 --nnodes=1 --node_rank=0 train_wave.py /your_path_to/imagenet/ --output /your_path_to/output/ --model WaveMLP_S --sched cosine --epochs 300 --opt adamw -j 8 --warmup-lr 1e-6 --mixup .8 --cutmix 1.0 --model-ema --model-ema-decay 0.99996 --aa rand-m9-mstd0.5-inc1 --color-jitter 0.4 --warmup-epochs 5 --opt-eps 1e-8 --repeated-aug --remode pixel --reprob 0.25 --amp --lr 1e-3 --weight-decay .05 --drop 0 --drop-path 0.1 -b 128

WaveMLP_M:

python -m torch.distributed.launch --nproc_per_node 8 --nnodes=1 --node_rank=0 train_wave.py /your_path_to/imagenet/ --output /your_path_to/output/ --model WaveMLP_M --sched cosine --epochs 300 --opt adamw -j 8 --warmup-lr 1e-6 --mixup .8 --cutmix 1.0 --model-ema --model-ema-decay 0.99996 --aa rand-m9-mstd0.5-inc1 --color-jitter 0.4 --warmup-epochs 5 --opt-eps 1e-8 --repeated-aug --remode pixel --reprob 0.25 --amp --lr 1e-3 --weight-decay .05 --drop 0 --drop-path 0.1 -b 128

Evaluation

To evaluate a pre-trained Hire-MLP-Tiny on ImageNet validation set with a single GPU:

python -m torch.distributed.launch --nproc_per_node=1 train.py --data-path /your_path_to/imagenet/ --output_dir /your_path_to/output/ --batch-size 256 --input-size 224 --model hire_mlp_tiny --apex-amp --no-model-ema --resume /your_path_to/hire_mlp_tiny.pth --eval

Acknowledgement

This repo is based on DeiT, pytorch-image-models, MMDetection, MMSegmentation, Swin Transformer, CycleMLP and AS-MLP.

Citation

If you find this project useful in your research, please consider cite:

@article{guo2021hire,
  title={Hire-mlp: Vision mlp via hierarchical rearrangement},
  author={Guo, Jianyuan and Tang, Yehui and Han, Kai and Chen, Xinghao and Wu, Han and Xu, Chao and Xu, Chang and Wang, Yunhe},
  journal={arXiv preprint arXiv:2108.13341},
  year={2021}
}
@article{tang2021image,
  title={An Image Patch is a Wave: Phase-Aware Vision MLP},
  author={Tang, Yehui and Han, Kai and Guo, Jianyuan and Xu, Chang and Li, Yanxi and Xu, Chao and Wang, Yunhe},
  journal={arXiv preprint arXiv:2111.12294},
  year={2021}
}

License

License: MIT

You might also like...
object detection; robust detection; ACM MM21 grand challenge; Security AI Challenger Phase VII
object detection; robust detection; ACM MM21 grand challenge; Security AI Challenger Phase VII

赛题背景 在商品知识产权领域,知识产权体现为在线商品的设计和品牌。不幸的是,在每一天,存在着非法商户通过一些对抗手段干扰商标识别来逃避侵权,这带来了很高的知识产权风险和财务损失。为了促进先进的多媒体人工智能技术的发展,以保护企业来之不易的创作和想法免受恶意使用和剽窃,因此提出了鲁棒性标识检测挑战赛

[ICCV 2021] Amplitude-Phase Recombination: Rethinking Robustness of Convolutional Neural Networks in Frequency Domain
[ICCV 2021] Amplitude-Phase Recombination: Rethinking Robustness of Convolutional Neural Networks in Frequency Domain

Amplitude-Phase Recombination (ICCV'21) Official PyTorch implementation of "Amplitude-Phase Recombination: Rethinking Robustness of Convolutional Neur

Data-Uncertainty Guided Multi-Phase Learning for Semi-supervised Object Detection

An official implementation of paper Data-Uncertainty Guided Multi-Phase Learning for Semi-supervised Object Detection

FPGA: Fast Patch-Free Global Learning Framework for Fully End-to-End Hyperspectral Image Classification
FPGA: Fast Patch-Free Global Learning Framework for Fully End-to-End Hyperspectral Image Classification

FPGA & FreeNet Fast Patch-Free Global Learning Framework for Fully End-to-End Hyperspectral Image Classification by Zhuo Zheng, Yanfei Zhong, Ailong M

Patch SVDD for Image anomaly detection

Patch SVDD Patch SVDD for Image anomaly detection. Paper: https://arxiv.org/abs/2006.16067 (published in ACCV 2020). Original Code : https://github.co

Hierarchical Metadata-Aware Document Categorization under Weak Supervision (WSDM'21)

Hierarchical Metadata-Aware Document Categorization under Weak Supervision This project provides a weakly supervised framework for hierarchical metada

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Object Detection and Instance Segmentation.

Swin Transformer for Object Detection This repo contains the supported code and configuration files to reproduce object detection results of Swin Tran

HeatNet is a python package that provides tools to build, train and evaluate neural networks designed to predict extreme heat wave events globally on daily to subseasonal timescales.

HeatNet HeatNet is a python package that provides tools to build, train and evaluate neural networks designed to predict extreme heat wave events glob

Efficient electromagnetic solver based on rigorous coupled-wave analysis for 3D and 2D multi-layered structures with in-plane periodicity
Efficient electromagnetic solver based on rigorous coupled-wave analysis for 3D and 2D multi-layered structures with in-plane periodicity

Efficient electromagnetic solver based on rigorous coupled-wave analysis for 3D and 2D multi-layered structures with in-plane periodicity, such as gratings, photonic-crystal slabs, metasurfaces, surface-emitting lasers, nano-antennas, and more.

Comments
  • Difference of PATM between code and paper

    Difference of PATM between code and paper

    In Section 3.2 of the paper, the authors say that "The final output of the block is the summation of these three branches." However, it seems that there are extra reweight and projection after the summation. Maybe this design is from CycleMLP, but not mentioned in the paper, which may cause confusion.

    opened by sbl1996 1
  • CVE-2007-4559 Patch

    CVE-2007-4559 Patch

    Patching CVE-2007-4559

    Hi, we are security researchers from the Advanced Research Center at Trellix. We have began a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15 year old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsantized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks to see if all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

    If you have further questions you may contact us through this projects lead researcher Kasimir Schulz.

    opened by TrellixVulnTeam 0
Owner
Nevermore
Nevermore
Bootstrapped Unsupervised Sentence Representation Learning (ACL 2021)

Install first pip3 install -e . Training python3 training/unsupervised_tuning.py python3 training/supervised_tuning.py python3 training/multilingual_

yanzhang_nlp 26 Jul 22, 2022
Official implementation of the paper "Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering"

Light Field Networks Project Page | Paper | Data | Pretrained Models Vincent Sitzmann*, Semon Rezchikov*, William Freeman, Joshua Tenenbaum, Frédo Dur

Vincent Sitzmann 130 Dec 29, 2022
Bayesian inference for Permuton-induced Chinese Restaurant Process (NeurIPS2021).

Permuton-induced Chinese Restaurant Process Note: Currently only the Matlab version is available, but a Python version will be available soon! This is

NTT Communication Science Laboratories 3 Dec 17, 2022
Object DGCNN and DETR3D, Our implementations are built on top of MMdetection3D.

Object DGCNN & DETR3D This repo contains the implementations of Object DGCNN (https://arxiv.org/abs/2110.06923) and DETR3D (https://arxiv.org/abs/2110

Wang, Yue 539 Jan 07, 2023
USAD - UnSupervised Anomaly Detection on multivariate time series

USAD - UnSupervised Anomaly Detection on multivariate time series Scripts and utility programs for implementing the USAD architecture. Implementation

116 Jan 04, 2023
Official Pytorch implementation of C3-GAN

Official pytorch implemenation of C3-GAN Contrastive Fine-grained Class Clustering via Generative Adversarial Networks [Paper] Authors: Yunji Kim, Jun

NAVER AI 114 Dec 02, 2022
links and status of cool gradio demos

awesome-demos This is a list of some wonderful demos & applications built with Gradio. Here's how to contribute yours! 🖊️ Natural language processing

Gradio 96 Dec 30, 2022
D2LV: A Data-Driven and Local-Verification Approach for Image Copy Detection

Facebook AI Image Similarity Challenge: Matching Track —— Team: imgFp This is the source code of our 3rd place solution to matching track of Image Sim

16 Dec 25, 2022
Source code for CVPR 2021 paper "Riggable 3D Face Reconstruction via In-Network Optimization"

Riggable 3D Face Reconstruction via In-Network Optimization Source code for CVPR 2021 paper "Riggable 3D Face Reconstruction via In-Network Optimizati

130 Jan 02, 2023
⚖️🔁🔮🕵️‍♂️🦹🖼️ Code for *Measuring the Contribution of Multiple Model Representations in Detecting Adversarial Instances* paper.

Measuring the Contribution of Multiple Model Representations in Detecting Adversarial Instances This repository contains the code for Measuring the Co

Daniel Steinberg 0 Nov 06, 2022
A DCGAN to generate anime faces using custom mined dataset

Anime-Face-GAN-Keras A DCGAN to generate anime faces using custom dataset in Keras. Dataset The dataset is created by crawling anime database websites

Pavitrakumar P 190 Jan 03, 2023
Implementation of Restricted Boltzmann Machine (RBM) and its variants in Tensorflow

xRBM Library Implementation of Restricted Boltzmann Machine (RBM) and its variants in Tensorflow Installation Using pip: pip install xrbm Examples Tut

Omid Alemi 55 Dec 29, 2022
Python Interview Questions

Python Interview Questions Clone the code to your computer. You need to understand the code in main.py and modify the content in if __name__ =='__main

ClassmateLin 575 Dec 28, 2022
BrainGNN - A deep learning model for data-driven discovery of functional connectivity

A deep learning model for data-driven discovery of functional connectivity https://doi.org/10.3390/a14030075 Usman Mahmood, Zengin Fu, Vince D. Calhou

Usman Mahmood 3 Aug 28, 2022
Generalized Data Weighting via Class-level Gradient Manipulation

Generalized Data Weighting via Class-level Gradient Manipulation This repository is the official implementation of Generalized Data Weighting via Clas

18 Nov 12, 2022
VQMIVC - Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion

VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion (Interspeech

Disong Wang 262 Dec 31, 2022
🏖 Keras Implementation of Painting outside the box

Keras implementation of Image OutPainting This is an implementation of Painting Outside the Box: Image Outpainting paper from Standford University. So

Bendang 1.1k Dec 10, 2022
基于深度强化学习的原神自动钓鱼AI

原神自动钓鱼AI由YOLOX, DQN两部分模型组成。使用迁移学习,半监督学习进行训练。 模型也包含一些使用opencv等传统数字图像处理方法实现的不可学习部分。

4.2k Jan 01, 2023
Implementing DeepMind's Fast Reinforcement Learning paper

Fast Reinforcement Learning This is a repo where I implement the algorithms in the paper, Fast reinforcement learning with generalized policy updates.

Marcus Chiam 6 Nov 28, 2022