DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

Last update: Oct 15, 2021

Paper | Project page | Demo (Youtube) | Demo (Bilibili)

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision.
Shiyi Lan, Zhiding Yu, Chris Choy, Subhashree Radhakrishnan, Guilin Liu, Yuke Zhu, Larry Davis, Anima Anandkumar
International Conference on Computer Vision (ICCV) 2021

This repository contains the official PyTorch implementation of DiscoBox, including training and evaluation code and pretrained models. DiscoBox is a state-of-the-art framework that jointly predicts high-quality instance segmentation and semantic correspondence from box annotations.

We use MMDetection v2.10.0 as the codebase.

All of our models are trained and tested with automatic mixed precision, which uses float16 to speed up computation and reduce GPU memory consumption.
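In MMDetection 2.x, mixed precision is usually switched on by adding an `fp16` entry to the config file. The fragment below is a minimal sketch of that convention. The loss scale of 512 is the value used in MMDetection's own fp16 example configs, and whether the DiscoBox configs use the same value is an assumption.

```python
# Config fragment (MMDetection 2.x convention).
# Assumption: a static loss scale of 512, as in MMDetection's fp16 example
# configs; the DiscoBox configs may use a different setting.
fp16 = dict(loss_scale=512.0)
```

Loss scaling multiplies the loss before the backward pass so that small gradients do not underflow in float16.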

Installation

This implementation is based on PyTorch==1.9.0, mmcv==2.13.0, and mmdetection==2.10.0.

Please refer to get_started.md for installation.
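Before training, it can help to confirm that the installed packages match the pinned versions. The following is a small sketch using only the standard library. The helper name `check_pins` and the distribution names `torch` and `mmdet` are assumptions; mmcv is left out because its distribution name (`mmcv` or `mmcv-full`) depends on how it was installed.

```python
from importlib import metadata

# Pins taken from this README; distribution names are assumptions.
PINNED = {"torch": "1.9.0", "mmdet": "2.10.0"}


def check_pins(pinned, lookup=metadata.version):
    """Return {name: (wanted, found_or_None)} for every package that mismatches."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        # Ignore local build suffixes such as "+cu111" when comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# Usage: an empty dict means every pinned package matches.
# print(check_pins(PINNED))
```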

Alternatively, you can download the Docker image from our Docker Hub repository.

Models

Results on COCO val 2017

Backbone	Weights	AP	AP@50	AP@75	AP@Small	AP@Medium	AP@Large
ResNet-50	download	30.7	52.6	30.6	13.3	34.1	45.6
ResNet-101-DCN	download	35.3	59.1	35.4	16.9	39.2	53.0
ResNeXt-101-DCN	download	37.3	60.4	39.1	17.8	41.1	55.4

Results on COCO test-dev

We also evaluate the models from the COCO val 2017 section on COCO test-dev, using the same weights.

Backbone	Weights	AP	AP@50	AP@75	AP@Small	AP@Medium	AP@Large
ResNet-50	download	32.0	53.6	32.6	11.7	33.7	48.4
ResNet-101-DCN	download	35.8	59.8	36.4	16.9	38.7	52.1
ResNeXt-101-DCN	download	37.9	61.4	40.0	18.0	41.1	53.9

Training

COCO

ResNet-50 (8 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_r50_fpn_3x.py 8

ResNet-101-DCN (8 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_r101_dcn_fpn_3x.py 8

ResNeXt-101-DCN (8 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_x101_dcn_fpn_3x.py 8

Pascal VOC 2012

ResNet-50 (4 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_voc_r50_fpn_6x.py 4

ResNet-101 (4 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_voc_r101_fpn_6x.py 4
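The last argument of each training command is the number of GPUs. MMDetection's documentation states that its default learning rates assume 8 GPUs with 2 images per GPU (batch size 16) and recommends scaling the learning rate linearly with the total batch size. The sketch below applies that rule. The helper name `scale_lr` is hypothetical, and whether the DiscoBox configs follow the same 8 x 2 default is an assumption, so check the config before changing anything.

```python
def scale_lr(base_lr, base_batch, gpus, samples_per_gpu):
    """Linear scaling rule: learning rate proportional to total batch size."""
    total_batch = gpus * samples_per_gpu
    return base_lr * total_batch / base_batch


# Example from the MMDetection docs: lr 0.02 at batch size 16
# becomes roughly 0.01 on 4 GPUs with 2 images each.
lr_4gpu = scale_lr(0.02, 16, gpus=4, samples_per_gpu=2)
```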

Testing

COCO

ResNet-50 (8 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_r50_fpn_3x.py \
     work_dirs/coco_r50_fpn_3x.pth 8 --eval segm

ResNet-101-DCN (8 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_r101_dcn_fpn_3x.py \
     work_dirs/coco_r101_dcn_fpn_3x.pth 8 --eval segm

ResNeXt-101-DCN (8 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_x101_dcn_fpn_3x_fp16.py \
     work_dirs/coco_x101_dcn_fpn_3x.pth 8 --eval segm

Pascal VOC 2012 (COCO API)

ResNet-50 (4 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_voc_r50_fpn_3x_fp16.py \
     work_dirs/voc_r50_6x.pth 4 --eval segm

ResNet-101 (4 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_voc_r101_fpn_3x_fp16.py \
     work_dirs/voc_r101_6x.pth 4 --eval segm

Pascal VOC 2012 (Matlab)

Step 1: generate results

ResNet-50 (4 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_voc_r50_fpn_3x_fp16.py \
     work_dirs/voc_r50_6x.pth 4 \
     --format-only \
     --options "jsonfile_prefix=work_dirs/voc_r50_results.json"

ResNet-101 (4 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_voc_r101_fpn_3x_fp16.py \
     work_dirs/voc_r101_6x.pth 4 \
     --format-only \
     --options "jsonfile_prefix=work_dirs/voc_r101_results.json"
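Before converting, it can be useful to sanity-check the generated results. The sketch below assumes the output is a standard COCO-style results file, that is, a JSON list of dicts with `image_id`, `category_id`, and `score` keys. The helper name `summarize_results` is hypothetical, and the exact file layout produced by this repository has not been verified here.

```python
import json
from collections import Counter


def summarize_results(path, score_thr=0.0):
    """Summarize a COCO-style results file (a JSON list of predictions)."""
    with open(path) as f:
        results = json.load(f)
    # Keep only predictions at or above the score threshold.
    kept = [r for r in results if r.get("score", 0.0) >= score_thr]
    return {
        "num_results": len(kept),
        "num_images": len({r["image_id"] for r in kept}),
        "per_category": dict(Counter(r["category_id"] for r in kept)),
    }


# Usage (path taken from the commands above):
# print(summarize_results("work_dirs/voc_r50_results.json", score_thr=0.3))
```

An empty summary or a single dominant category usually points to a wrong checkpoint or config rather than a conversion problem.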

Step 2: format conversion

ResNet-50:

python tools/json2mat.py work_dirs/voc_r50_results.json work_dirs/voc_r50_results.mat

ResNet-101:

python tools/json2mat.py work_dirs/voc_r101_results.json work_dirs/voc_r101_results.mat

Step 3: evaluation

Please visit BBTP for the evaluation code written in Matlab.

PF-Pascal

Please visit this repository.

LICENSE

Please check the LICENSE file. DiscoBox may be used non-commercially, meaning for research or evaluation purposes only. For business inquiries, please contact [email protected].

Citation

@article{lan2021discobox,
  title={DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision},
  author={Lan, Shiyi and Yu, Zhiding and Choy, Christopher and Radhakrishnan, Subhashree and Liu, Guilin and Zhu, Yuke and Davis, Larry S and Anandkumar, Anima},
  journal={arXiv preprint arXiv:2105.06464},
  year={2021}
}
