GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond

Overview

GCNet for Object Detection

PWC PWC PWC PWC

By Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu.

This repo is a official implementation of "GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond" on COCO object detection based on open-mmlab's mmdetection. The core operator GC block could be find here. Many thanks to mmdetection for their simple and clean framework.

Update on 2020/12/07

The extension of GCNet got accepted by TPAMI (PDF).

Update on 2019/10/28

GCNet won the Best Paper Award at ICCV 2019 Neural Architects Workshop!

Update on 2019/07/01

The code is refactored. More results are provided and all configs could be found in configs/gcnet.

Notes: Both PyTorch official SyncBN and Apex SyncBN have some stability issues. During training, mAP may drops to zero and back to normal during last few epochs.

Update on 2019/06/03

GCNet is supported by the official mmdetection repo here. Thanks again for open-mmlab's work on open source projects.

Introduction

GCNet is initially described in arxiv. Via absorbing advantages of Non-Local Networks (NLNet) and Squeeze-Excitation Networks (SENet), GCNet provides a simple, fast and effective approach for global context modeling, which generally outperforms both NLNet and SENet on major benchmarks for various recognition tasks.

Citing GCNet

@article{cao2019GCNet,
  title={GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond},
  author={Cao, Yue and Xu, Jiarui and Lin, Stephen and Wei, Fangyun and Hu, Han},
  journal={arXiv preprint arXiv:1904.11492},
  year={2019}
}

Main Results

Results on R50-FPN with backbone (fixBN)

Back-bone Model Back-bone Norm Heads Context Lr schd Mem (GB) Train time (s/iter) Inf time (fps) box AP mask AP Download
R50-FPN Mask fixBN 2fc (w/o BN) - 1x 3.9 0.453 10.6 37.3 34.2 model
R50-FPN Mask fixBN 2fc (w/o BN) GC(c3-c5, r16) 1x 4.5 0.533 10.1 38.5 35.1 model
R50-FPN Mask fixBN 2fc (w/o BN) GC(c3-c5, r4) 1x 4.6 0.533 9.9 38.9 35.5 model
R50-FPN Mask fixBN 2fc (w/o BN) - 2x - - - 38.2 34.9 model
R50-FPN Mask fixBN 2fc (w/o BN) GC(c3-c5, r16) 2x - - - 39.7 36.1 model
R50-FPN Mask fixBN 2fc (w/o BN) GC(c3-c5, r4) 2x - - - 40.0 36.2 model

Results on R50-FPN with backbone (syncBN)

Back-bone Model Back-bone Norm Heads Context Lr schd Mem (GB) Train time (s/iter) Inf time (fps) box AP mask AP Download
R50-FPN Mask SyncBN 2fc (w/o BN) - 1x 3.9 0.543 10.2 37.2 33.8 model
R50-FPN Mask SyncBN 2fc (w/o BN) GC(c3-c5, r16) 1x 4.5 0.547 9.9 39.4 35.7 model
R50-FPN Mask SyncBN 2fc (w/o BN) GC(c3-c5, r4) 1x 4.6 0.603 9.4 39.9 36.2 model
R50-FPN Mask SyncBN 2fc (w/o BN) - 2x 3.9 0.543 10.2 37.7 34.3 model
R50-FPN Mask SyncBN 2fc (w/o BN) GC(c3-c5, r16) 2x 4.5 0.547 9.9 39.7 36.0 model
R50-FPN Mask SyncBN 2fc (w/o BN) GC(c3-c5, r4) 2x 4.6 0.603 9.4 40.2 36.3 model
R50-FPN Mask SyncBN 4conv1fc (SyncBN) - 1x - - - 38.8 34.6 model
R50-FPN Mask SyncBN 4conv1fc (SyncBN) GC(c3-c5, r16) 1x - - - 41.0 36.5 model
R50-FPN Mask SyncBN 4conv1fc (SyncBN) GC(c3-c5, r4) 1x - - - 41.4 37.0 model

Results on stronger backbones

Back-bone Model Back-bone Norm Heads Context Lr schd Mem (GB) Train time (s/iter) Inf time (fps) box AP mask AP Download
R101-FPN Mask fixBN 2fc (w/o BN) - 1x 5.8 0.571 9.5 39.4 35.9 model
R101-FPN Mask fixBN 2fc (w/o BN) GC(c3-c5, r16) 1x 7.0 0.731 8.6 40.8 37.0 model
R101-FPN Mask fixBN 2fc (w/o BN) GC(c3-c5, r4) 1x 7.1 0.747 8.6 40.8 36.9 model
R101-FPN Mask SyncBN 2fc (w/o BN) - 1x 5.8 0.665 9.2 39.8 36.0 model
R101-FPN Mask SyncBN 2fc (w/o BN) GC(c3-c5, r16) 1x 7.0 0.778 9.0 41.1 37.4 model
R101-FPN Mask SyncBN 2fc (w/o BN) GC(c3-c5, r4) 1x 7.1 0.786 8.9 41.7 37.6 model
X101-FPN Mask SyncBN 2fc (w/o BN) - 1x 7.1 0.912 8.5 41.2 37.3 model
X101-FPN Mask SyncBN 2fc (w/o BN) GC(c3-c5, r16) 1x 8.2 1.055 7.7 42.4 38.0 model
X101-FPN Mask SyncBN 2fc (w/o BN) GC(c3-c5, r4) 1x 8.3 1.037 7.6 42.9 38.5 model
X101-FPN Cascade Mask SyncBN 2fc (w/o BN) - 1x - - - 44.7 38.3 model
X101-FPN Cascade Mask SyncBN 2fc (w/o BN) GC(c3-c5, r16) 1x - - - 45.9 39.3 model
X101-FPN Cascade Mask SyncBN 2fc (w/o BN) GC(c3-c5, r4) 1x - - - 46.5 39.7 model
X101-FPN DCN Cascade Mask SyncBN 2fc (w/o BN) - 1x - - - 47.1 40.4 model
X101-FPN DCN Cascade Mask SyncBN 2fc (w/o BN) GC(c3-c5, r16) 1x - - - 47.9 40.9 model
X101-FPN DCN Cascade Mask SyncBN 2fc (w/o BN) GC(c3-c5, r4) 1x - - - 47.9 40.8 model

Notes

  • GC denotes Global Context (GC) block is inserted after 1x1 conv of backbone.
  • DCN denotes replace 3x3 conv with 3x3 Deformable Convolution in c3-c5 stages of backbone.
  • r4 and r16 denote ratio 4 and ratio 16 in GC block respectively.
  • Some of models are trained on 4 GPUs with 4 images on each GPU.

Requirements

  • Linux(tested on Ubuntu 16.04)
  • Python 3.6+
  • PyTorch 1.1.0
  • Cython
  • apex (Sync BN)

Install

a. Install PyTorch 1.1 and torchvision following the official instructions.

b. Install latest apex with CUDA and C++ extensions following this instructions. The Sync BN implemented by apex is required.

c. Clone the GCNet repository.

 git clone https://github.com/xvjiarui/GCNet.git 

d. Compile cuda extensions.

cd GCNet
pip install cython  # or "conda install cython" if you prefer conda
./compile.sh  # or "PYTHON=python3 ./compile.sh" if you use system python3 without virtual environments

e. Install GCNet version mmdetection (other dependencies will be installed automatically).

python(3) setup.py install  # add --user if you want to install it locally
# or "pip install ."

Note: You need to run the last step each time you pull updates from github. Or you can run python(3) setup.py develop or pip install -e . to install mmdetection if you want to make modifications to it frequently.

Please refer to mmdetection install instruction for more details.

Environment

Hardware

  • 8 NVIDIA Tesla V100 GPUs
  • Intel Xeon 4114 CPU @ 2.20GHz

Software environment

  • Python 3.6.7
  • PyTorch 1.1.0
  • CUDA 9.0
  • CUDNN 7.0
  • NCCL 2.3.5

Usage

Train

As in original mmdetection, distributed training is recommended for either single machine or multiple machines.

./tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> [optional arguments]

Supported arguments are:

  • --validate: perform evaluation every k (default=1) epochs during the training.
  • --work_dir <WORK_DIR>: if specified, the path in config file will be replaced.

Evaluation

To evaluate trained models, output file is required.

python tools/test.py <CONFIG_FILE> <MODEL_PATH> [optional arguments]

Supported arguments are:

  • --gpus: number of GPU used for evaluation
  • --out: output file name, usually ends wiht .pkl
  • --eval: type of evaluation need, for mask-rcnn, bbox segm would evaluate both bounding box and mask AP.
Owner
Jerry Jiarui XU
Part of the journey is the end
Jerry Jiarui XU
The official project of SimSwap (ACM MM 2020)

SimSwap: An Efficient Framework For High Fidelity Face Swapping Proceedings of the 28th ACM International Conference on Multimedia The official reposi

Six_God 2.6k Jan 08, 2023
Synthesize photos from PhotoDNA using machine learning 🌱

Ribosome Synthesize photos from PhotoDNA. See the blog post for more information. Installation Dependencies You can install Python dependencies using

Anish Athalye 112 Nov 23, 2022
This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning].

CG3 This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning]. R

12 Oct 28, 2022
Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images"

GANInversion_with_ConsecutiveImgs Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images" https://a

QingyangXu 38 Dec 07, 2022
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams

Adversarial Robustness Toolbox (ART) is a Python library for Machine Learning Security. ART provides tools that enable developers and researchers to defend and evaluate Machine Learning models and ap

3.4k Jan 04, 2023
[NeurIPS'20] Multiscale Deep Equilibrium Models

Multiscale Deep Equilibrium Models 💥 💥 💥 💥 This repo is deprecated and we will soon stop actively maintaining it, as a more up-to-date (and simple

CMU Locus Lab 221 Dec 26, 2022
ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels

ROCKET + MINIROCKET ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels. Data Mining and Knowledge D

298 Dec 26, 2022
Semi-Supervised 3D Hand-Object Poses Estimation with Interactions in Time

Semi Hand-Object Semi-Supervised 3D Hand-Object Poses Estimation with Interactions in Time (CVPR 2021).

96 Dec 27, 2022
"Learning and Analyzing Generation Order for Undirected Sequence Models" in Findings of EMNLP, 2021

undirected-generation-dev This repo contains the source code of the models described in the following paper "Learning and Analyzing Generation Order f

Yichen Jiang 0 Mar 25, 2022
190 Jan 03, 2023
Small utility to demangle Nim symbols in callgrind files

nim_callgrind A small utility to demangle Nim symbols from callgrind files. Usage Run your (Nim) program with something like this: valgrind --tool=cal

kraptor 3 Feb 15, 2022
Generate Contextual Directory Wordlist For Target Org

PathPermutor Generate Contextual Directory Wordlist For Target Org This script generates contextual wordlist for any target org based on the set of UR

8 Jun 23, 2021
Rotation-Only Bundle Adjustment

ROBA: Rotation-Only Bundle Adjustment Paper, Video, Poster, Presentation, Supplementary Material In this repository, we provide the implementation of

Seong 51 Nov 29, 2022
Repo for parser tensorflow(.pb) and tflite(.tflite)

tfmodel_parser .pb file is the format of tensorflow model .tflite file is the format of tflite model, which usually used in mobile devices before star

1 Dec 23, 2021
Uncertain natural language inference

Uncertain Natural Language Inference This repository hosts the code for the following paper: Tongfei Chen*, Zhengping Jiang*, Adam Poliak, Keisuke Sak

Tongfei Chen 14 Sep 01, 2022
PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset

PyTorch Large-Scale Language Model A Large-Scale PyTorch Language Model trained on the 1-Billion Word (LM1B) / (GBW) dataset Latest Results 39.98 Perp

Ryan Spring 114 Nov 04, 2022
A program that uses computer vision to detect hand gestures, used for controlling movie players.

HandGestureDetection This program uses a Haar Cascade algorithm to detect the presence of your hand, and then passes it on to a self-created and self-

2 Nov 22, 2022
Developed an optimized algorithm which finds the most optimal path between 2 points in a 3D Maze using various AI search techniques like BFS, DFS, UCS, Greedy BFS and A*

Developed an optimized algorithm which finds the most optimal path between 2 points in a 3D Maze using various AI search techniques like BFS, DFS, UCS, Greedy BFS and A*. The algorithm was extremely

1 Mar 28, 2022
Connecting Java/ImgLib2 + Python/NumPy

imglyb imglyb aims at connecting two worlds that have been seperated for too long: Python with numpy Java with ImgLib2 imglyb uses jpype to access num

ImgLib2 29 Dec 21, 2022
PyTorch Implementation of the paper Learning to Reweight Examples for Robust Deep Learning

Learning to Reweight Examples for Robust Deep Learning Unofficial PyTorch implementation of Learning to Reweight Examples for Robust Deep Learning. Th

Daniel Stanley Tan 325 Dec 28, 2022