[CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator

Overview

involution

Official implementation of a neural operator as described in Involution: Inverting the Inherence of Convolution for Visual Recognition (CVPR'21)

By Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, and Qifeng Chen

TL; DR. involution is a general-purpose neural primitive that is versatile for a spectrum of deep learning models on different vision tasks. involution bridges convolution and self-attention in design, while being more efficient and effective than convolution, simpler than self-attention in form.

Getting Started

This repository is fully built upon the OpenMMLab toolkits. For each individual task, the config and model files follow the same directory organization as mmcls, mmdet, and mmseg respectively, so just copy-and-paste them to the corresponding locations to get started.

For example, in terms of evaluating detectors

git clone https://github.com/open-mmlab/mmdetection # and install

cp det/mmdet/models/backbones/* mmdetection/mmdet/models/backbones
cp det/mmdet/models/necks/* mmdetection/mmdet/models/necks
cp det/mmdet/models/utils/* mmdetection/mmdet/models/utils

cp det/configs/_base_/models/* mmdetection/mmdet/configs/_base_/models
cp det/configs/_base_/schedules/* mmdetection/mmdet/configs/_base_/schedules
cp det/configs/involution mmdetection/mmdet/configs -r

cd mmdetection
# evaluate checkpoints
bash tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

For more detailed guidance, please refer to the original mmcls, mmdet, and mmseg tutorials.

Currently, we provide an memory-efficient implementation of the involuton operator based on CuPy. Please install this library in advance. A customized CUDA kernel would bring about further acceleration on the hardware. Any contribution from the community regarding this is welcomed!

Model Zoo

The parameters/FLOPs↓ and performance↑ compared to the convolution baselines are marked in the parentheses. Part of these checkpoints are obtained in our reimplementation runs, whose performance may show slight differences with those reported in our paper. Models are trained with 64 GPUs on ImageNet, 8 GPUs on COCO, and 4 GPUs on Cityscapes.

Image Classification on ImageNet

Model Params(M) FLOPs(G) Top-1 (%) Top-5 (%) Config Download
RedNet-26 9.23(32.8%↓) 1.73(29.2%↓) 75.96 93.19 config model | log
RedNet-38 12.39(36.7%↓) 2.22(31.3%↓) 77.48 93.57 config model | log
RedNet-50 15.54(39.5%↓) 2.71(34.1%↓) 78.35 94.13 config model | log
RedNet-101 25.65(42.6%↓) 4.74(40.5%↓) 78.92 94.35 config model | log
RedNet-152 33.99(43.5%↓) 6.79(41.4%↓) 79.12 94.38 config model | log

Before finetuning on the following downstream tasks, download the ImageNet pre-trained RedNet-50 weights and set the pretrained argument in det/configs/_base_/models/*.py or seg/configs/_base_/models/*.py to your local path.

Object Detection and Instance Segmentation on COCO

Faster R-CNN

Backbone Neck Style Lr schd Params(M) FLOPs(G) box AP Config Download
RedNet-50-FPN convolution pytorch 1x 31.6(23.9%↓) 177.9(14.1%↓) 39.5(1.8↑) config model | log
RedNet-50-FPN involution pytorch 1x 29.5(28.9%↓) 135.0(34.8%↓) 40.2(2.5↑) config model | log

Mask R-CNN

Backbone Neck Style Lr schd Params(M) FLOPs(G) box AP mask AP Config Download
RedNet-50-FPN convolution pytorch 1x 34.2(22.6%↓) 224.2(11.5%↓) 39.9(1.5↑) 35.7(0.8↑) config model | log
RedNet-50-FPN involution pytorch 1x 32.2(27.1%↓) 181.3(28.5%↓) 40.8(2.4↑) 36.4(1.3↑) config model | log

RetinaNet

Backbone Neck Style Lr schd Params(M) FLOPs(G) box AP Config Download
RedNet-50-FPN convolution pytorch 1x 27.8(26.3%↓) 210.1(12.2%↓) 38.2(1.6↑) config model | log
RedNet-50-FPN involution pytorch 1x 26.3(30.2%↓) 199.9(16.5%↓) 38.2(1.6↑) config model | log

Semantic Segmentation on Cityscapes

Method Backbone Neck Crop Size Lr schd Params(M) FLOPs(G) mIoU Config download
FPN RedNet-50 convolution 512x1024 80000 18.5(35.1%↓) 293.9(19.0%↓) 78.0(3.6↑) config model | log
FPN RedNet-50 involution 512x1024 80000 16.4(42.5%↓) 205.2(43.4%↓) 79.1(4.7↑) config model | log
UPerNet RedNet-50 convolution 512x1024 80000 56.4(15.1%↓) 1825.6(3.6%↓) 80.6(2.4↑) config model | log

Citation

If you find our work useful in your research, please cite:

@InProceedings{Li_2021_CVPR,
author = {Li, Duo and Hu, Jie and Wang, Changhu and Li, Xiangtai and She, Qi and Zhu, Lei and Zhang, Tong and Chen, Qifeng},
title = {Involution: Inverting the Inherence of Convolution for Visual Recognition},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2021}
}
African language Speech Recognition - Speech-to-Text

Swahili-Speech-To-Text Table of Contents Swahili-Speech-To-Text Overview Scenario Approach Project Structure data: models: notebooks: scripts tests: l

2 Jan 05, 2023
Learnable Motion Coherence for Correspondence Pruning

Learnable Motion Coherence for Correspondence Pruning Yuan Liu, Lingjie Liu, Cheng Lin, Zhen Dong, Wenping Wang Project Page Any questions or discussi

liuyuan 41 Nov 30, 2022
Weight estimation in CT by multi atlas techniques

maweight A Python package for multi-atlas based weight estimation for CT images, including segmentation by registration, feature extraction and model

György Kovács 0 Dec 24, 2021
The repo contains the code to train and evaluate a system which extracts relations and explanations from dialogue.

The repo contains the code to train and evaluate a system which extracts relations and explanations from dialogue. How do I cite D-REX? For now, cite

Alon Albalak 6 Mar 31, 2022
Materials for upcoming beginner-friendly PyTorch course (work in progress).

Learn PyTorch for Deep Learning (work in progress) I'd like to learn PyTorch. So I'm going to use this repo to: Add what I've learned. Teach others in

Daniel Bourke 2.3k Dec 29, 2022
Coursera - Quiz & Assignment of Coursera

Coursera Assignments This repository is aimed to help Coursera learners who have difficulties in their learning process. The quiz and programming home

浅梦 828 Jan 04, 2023
《K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters》(2020)

K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters This repository is the implementation of the paper "K-Adapter: Infusing Knowledge

Microsoft 118 Dec 13, 2022
[NeurIPS'21] "AugMax: Adversarial Composition of Random Augmentations for Robust Training" by Haotao Wang, Chaowei Xiao, Jean Kossaifi, Zhiding Yu, Animashree Anandkumar, and Zhangyang Wang.

[NeurIPS'21] "AugMax: Adversarial Composition of Random Augmentations for Robust Training" by Haotao Wang, Chaowei Xiao, Jean Kossaifi, Zhiding Yu, Animashree Anandkumar, and Zhangyang Wang.

VITA 112 Nov 07, 2022
Official PyTorch Implementation of "AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent Forecasting".

AgentFormer This repo contains the official implementation of our paper: AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent Forecast

Ye Yuan 161 Dec 23, 2022
PoseViz – Multi-person, multi-camera 3D human pose visualization tool built using Mayavi.

PoseViz – 3D Human Pose Visualizer Multi-person, multi-camera 3D human pose visualization tool built using Mayavi. As used in MeTRAbs visualizations.

István Sárándi 79 Dec 30, 2022
git《Self-Attention Attribution: Interpreting Information Interactions Inside Transformer》(AAAI 2021) GitHub:

Self-Attention Attribution This repository contains the implementation for AAAI-2021 paper Self-Attention Attribution: Interpreting Information Intera

60 Dec 29, 2022
A Pytorch implementation of "Manifold Matching via Deep Metric Learning for Generative Modeling" (ICCV 2021)

Manifold Matching via Deep Metric Learning for Generative Modeling A Pytorch implementation of "Manifold Matching via Deep Metric Learning for Generat

69 Dec 10, 2022
IJON is an annotation mechanism that analysts can use to guide fuzzers such as AFL.

IJON SPACE EXPLORER IJON is an annotation mechanism that analysts can use to guide fuzzers such as AFL. Using only a small (usually one line) annotati

Chair for Sys­tems Se­cu­ri­ty 146 Dec 16, 2022
Generic Foreground Segmentation in Images

Pixel Objectness The following repository contains pretrained model for pixel objectness. Please visit our project page for the paper and visual resul

Suyog Jain 157 Nov 21, 2022
Deep Learning and Reinforcement Learning Library for Scientists and Engineers 🔥

TensorLayer is a novel TensorFlow-based deep learning and reinforcement learning library designed for researchers and engineers. It provides an extens

TensorLayer Community 7.1k Dec 27, 2022
Semi-supevised Semantic Segmentation with High- and Low-level Consistency

Semi-supevised Semantic Segmentation with High- and Low-level Consistency This Pytorch repository contains the code for our work Semi-supervised Seman

123 Dec 30, 2022
The materials used in the SaxonJS tutorial presented at Declarative Amsterdam, 2021

SaxonJS-Tutorial-2021, version 1.0.4 Last updated on 4 November, 2021. Table of contents Background Prerequisites Starting a web server Running a Java

Saxonica 11 Oct 23, 2022
Segmentation Training Pipeline

Segmentation Training Pipeline This package is a part of Musket ML framework. Reasons to use Segmentation Pipeline Segmentation Pipeline was developed

Musket ML 52 Dec 12, 2022
Source code for paper "Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling", AAAI 2021

ATLOP Code for AAAI 2021 paper Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling. If you make use of this co

Wenxuan Zhou 146 Nov 29, 2022
AdaDM: Enabling Normalization for Image Super-Resolution

AdaDM AdaDM: Enabling Normalization for Image Super-Resolution. You can apply BN, LN or GN in SR networks with our AdaDM. Pretrained models (EDSR*/RDN

58 Jan 08, 2023