Object detection on multiple datasets with an automatically learned unified label space.

Overview

Simple multi-dataset detection

An object detector trained on multiple large-scale datasets with a unified label space; Winning solution of ECCV 2020 Robust Vision Challenges.

Simple multi-dataset detection,
Xingyi Zhou, Vladlen Koltun, Philipp Krähenbühl,
arXiv technical report (arXiv 2102.13086)

Contact: [email protected]. Any questions or discussions are welcomed!

Abstract

How do we build a general and broad object detection system? We use all labels of all concepts ever annotated. These labels span diverse datasets with potentially inconsistent taxonomies. In this paper, we present a simple method for training a unified detector on multiple large-scale datasets. We use dataset-specific training protocols and losses, but share a common detection architecture with dataset-specific outputs. We show how to automatically integrate these dataset-specific outputs into a common semantic taxonomy. In contrast to prior work, our approach does not require manual taxonomy reconciliation. Our multi-dataset detector performs as well as dataset-specific models on each training domain, but generalizes much better to new unseen domains. Entries based on the presented methodology ranked first in the object detection and instance segmentation tracks of the ECCV 2020 Robust Vision Challenge.

Features at a glance

  • We trained a unified object detector on 4 large-scale detection datasets: COCO, Objects365, OpenImages, and Mapillary, with state-of-the-art performance on all of them.

  • The model predicts class labels in a learned unified label space.

  • The model can be directly used to test on novel datasets outside the training datasets.

  • In this repo, we also provide state-of-the-art baselines for Objects365 and OpenImages.

Main results

COCO test-challenge OpenImages public test Mapillary test Objects365 val
52.9 60.6 25.3 33.7

Results are obtained using a Cascade-RCNN with ResNeSt200 trained in an 8x schedule.

  • Unified model vs. ensemble of dataset-specific models with known test domains.
COCO Objects365 OpenImages mean.
Unified 45.4 24.4 66.0 45.3
Dataset-specific models 42.5 24.9 65.7 44.4

Results are obtained using a Cascade-RCNN with Res50 trained in an 8x schedule.

  • Zero-shot cross dataset evaluation
VOC VIPER CityScapes ScanNet WildDash CrowdHuman KITTI mean
Unified 82.9 21.3 52.6 29.8 34.7 70.7 39.9 47.3
Oracle models 80.3 31.8 54.6 44.7 - 80.0 - -

Results are obtained using a Cascade-RCNN with Res50 trained in an 8x schedule.

More models can be found in our MODEL ZOO.

Installation

Our project is developed on detectron2. Please follow the official detectron2 installation. All our code is under projects/UniDet/. In theory, you should be able to copy-paste projects/UniDet/ to the latest detectron2 release or your own detectron2 repo to run our project. There might be API changes in future detectron2 releases that make it incompatible.

Demo

We use the same inference API as detectorn2. To run inference on an image folder using our pretrained model, run

python projects/UniDet/demo/demo.py --config-file projects/UniDet/configs/Unified_learned_OCIM_R50_6x+2x.yaml --input images/*.jpg --opts MODEL.WEIGHTS models/Unified_learned_OCIM_R50_6x+2x.pth

If setup correctly, the output should look like:

*The sample image is from WildDash dataset.

Note that the model predicts all labels in its label hierarchy tree (for example, both vehicle and car for a car), following the protocol in OpenImages.

Benchmark evaluation and training

After installation, follow the instructions in DATASETS.md to setup the (many) datasets. Then check REPRODUCE.md to reproduce the results in the paper.

License

All our code under projects/Unidet/ is under Apache 2.0 license. The code from detectron2 follows the original Apache 2.0 license.

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@inproceedings{zhou2021simple,
  title={Simple multi-dataset detection},
  author={Zhou, Xingyi and Koltun, Vladlen and Kr{\"a}henb{\"u}hl, Philipp},
  booktitle={arXiv preprint arXiv:2102.13086},
  year={2021}
}
Owner
Xingyi Zhou
CS Ph.D. student at UT Austin.
Xingyi Zhou
Temporal Dynamic Convolutional Neural Network for Text-Independent Speaker Verification and Phonemetic Analysis

TDY-CNN for Text-Independent Speaker Verification Official implementation of Temporal Dynamic Convolutional Neural Network for Text-Independent Speake

Seong-Hu Kim 16 Oct 17, 2022
Near-Duplicate Video Retrieval with Deep Metric Learning

Near-Duplicate Video Retrieval with Deep Metric Learning This repository contains the Tensorflow implementation of the paper Near-Duplicate Video Retr

2 Jan 24, 2022
💊 A 3D Generative Model for Structure-Based Drug Design (NeurIPS 2021)

A 3D Generative Model for Structure-Based Drug Design Coming soon... Citation @inproceedings{luo2021sbdd, title={A 3D Generative Model for Structu

Shitong Luo 118 Jan 05, 2023
A large-scale benchmark for co-optimizing the design and control of soft robots, as seen in NeurIPS 2021.

Evolution Gym A large-scale benchmark for co-optimizing the design and control of soft robots. As seen in Evolution Gym: A Large-Scale Benchmark for E

121 Dec 14, 2022
The code for "Deep Level Set for Box-supervised Instance Segmentation in Aerial Images".

Deep Levelset for Box-supervised Instance Segmentation in Aerial Images Wentong Li, Yijie Chen, Wenyu Liu, Jianke Zhu* This code is based on MMdetecti

sunshine.lwt 112 Jan 05, 2023
Self-Supervised Pre-Training for Transformer-Based Person Re-Identification

Self-Supervised Pre-Training for Transformer-Based Person Re-Identification [pdf] The official repository for Self-Supervised Pre-Training for Transfo

Hao Luo 116 Jan 04, 2023
Generic image compressor for machine learning. Pytorch code for our paper "Lossy compression for lossless prediction".

Lossy Compression for Lossless Prediction Using: Training: This repostiory contains our implementation of the paper: Lossy Compression for Lossless Pr

Yann Dubois 84 Jan 02, 2023
Yet Another Robotics and Reinforcement (YARR) learning framework for PyTorch.

Yet Another Robotics and Reinforcement (YARR) learning framework for PyTorch.

Stephen James 51 Dec 27, 2022
Coded illumination for improved lensless imaging

CodedCam Coded Illumination for Improved Lensless Imaging Paper | Supplementary results | Data and Code are available. Coded illumination for improved

Computational Sensing and Information Processing Lab 1 Nov 29, 2021
Out-of-Domain Human Mesh Reconstruction via Dynamic Bilevel Online Adaptation

DynaBOA Code repositoty for the paper: Out-of-Domain Human Mesh Reconstruction via Dynamic Bilevel Online Adaptation Shanyan Guan, Jingwei Xu, Michell

198 Dec 29, 2022
PyTorch inference for "Progressive Growing of GANs" with CelebA snapshot

Progressive Growing of GANs inference in PyTorch with CelebA training snapshot Description This is an inference sample written in PyTorch of the origi

320 Nov 21, 2022
GMFlow: Learning Optical Flow via Global Matching

GMFlow GMFlow: Learning Optical Flow via Global Matching Authors: Haofei Xu, Jing Zhang, Jianfei Cai, Hamid Rezatofighi, Dacheng Tao We streamline the

Haofei Xu 298 Jan 04, 2023
Generating Anime Images by Implementing Deep Convolutional Generative Adversarial Networks paper

AnimeGAN - Deep Convolutional Generative Adverserial Network PyTorch implementation of DCGAN introduced in the paper: Unsupervised Representation Lear

Rohit Kukreja 23 Jul 21, 2022
This repo provides a demo for the CVPR 2021 paper "A Fourier-based Framework for Domain Generalization" on the PACS dataset.

FACT This repo provides a demo for the CVPR 2021 paper "A Fourier-based Framework for Domain Generalization" on the PACS dataset. To cite, please use:

105 Dec 17, 2022
Supplementary code for the AISTATS 2021 paper "Matern Gaussian Processes on Graphs".

Matern Gaussian Processes on Graphs This repo provides an extension for gpflow with Matérn kernels, inducing variables and trainable models implemente

41 Dec 17, 2022
Code repo for "Transformer on a Diet" paper

Transformer on a Diet Reference: C Wang, Z Ye, A Zhang, Z Zhang, A Smola. "Transformer on a Diet". arXiv preprint arXiv (2020). Installation pip insta

cgraywang 31 Sep 26, 2021
Make a Turtlebot3 follow a figure 8 trajectory and create a robot arm and make it follow a trajectory

HW2 - ME 495 Overview Part 1: Makes the robot move in a figure 8 shape. The robot starts moving when launched on a real turtlebot3 and can be paused a

Devesh Bhura 0 Oct 21, 2022
LabelImg is a graphical image annotation tool.

LabelImgPlus LabelImg is a graphical image annotation tool. This project is not updated with new functions now. More functions are supported with Labe

lzx1413 200 Dec 20, 2022
Deep-learning-roadmap - All You Need to Know About Deep Learning - A kick-starter

Deep Learning - All You Need to Know Sponsorship To support maintaining and upgrading this project, please kindly consider Sponsoring the project deve

Instill AI 4.4k Dec 26, 2022
Code, Models and Datasets for OpenViDial Dataset

OpenViDial This repo contains downloading instructions for the OpenViDial dataset in 《OpenViDial: A Large-Scale, Open-Domain Dialogue Dataset with Vis

119 Dec 08, 2022