Implementation of " SESS: Self-Ensembling Semi-Supervised 3D Object Detection" (CVPR2020 Oral)

Related tags

Deep Learningsess
Overview

SESS: Self-Ensembling Semi-Supervised 3D Object Detection

Created by Na Zhao from National University of Singapore

teaser

Introduction

This repository contains the PyTorch implementation for our CVPR 2020 Paper "SESS: Self-Ensembling Semi-Supervised 3D Object Detection" by Na Zhao, Tat Seng Chua, Gim Hee Lee [paper]

The performance of existing point cloud-based 3D object detection methods heavily relies on large-scale high-quality 3D annotations. However, such annotations are often tedious and expensive to collect. Semi-supervised learning is a good alternative to mitigate the data annotation issue, but has remained largely unexplored in 3D object detection. Inspired by the recent success of self-ensembling technique in semi-supervised image classification task, we propose SESS, a self-ensembling semi-supervised 3D object detection framework. Specifically, we design a thorough perturbation scheme to enhance generalization of the network on unlabeled and new unseen data. Furthermore, we propose three consistency losses to enforce the consistency between two sets of predicted 3D object proposals, to facilitate the learning of structure and semantic invariances of objects. Extensive experiments conducted on SUN RGB-D and ScanNet datasets demonstrate the effectiveness of SESS in both inductive and transductive semi-supervised 3D object detection. Our SESS achieves competitive performance compared to the state-of-the-art fully-supervised method by using only 50% labeled data.

Setup

  • Install python --This repo is tested with python 3.6.8.
  • Install pytorch with CUDA -- This repo is tested with torch 1.1, CUDA 9.0. It may wrk with newer versions, but that is not gauranteed.
  • Install tensorflow (for Tensorboard) -- This repo is tested with tensorflow 1.14.
  • Compile the CUDA layers for PointNet++, which is used in the backbone network:
    cd pointnet2
    python setup.py install
    
  • Install dependencies
    pip install -r requirements.txt
    

Usage

Data preparation

For SUNRGB-D, follow the README under sunrgbd folder.

For ScanNet, follow the README under scannet folder.

Running experiments

For SUNRGB-D, using the following command to train and evaluate:

python scripts/run_sess_sunrgbd.py

For ScanNet, using the following command to train and evaluate:

python scripts/run_sess_scannet.py

Note that we have included the pretaining phase, training phase, and two evaluation phases (inductive and transductive semi-supervised learning) as four functions in each script. You are free to uncomment any function execution line to skip the corresponding phase.

Citation

Please cite our paper if it is helpful to your research:

@inproceedings{zhao2020sess,
  title={SESS: Self-Ensembling Semi-Supervised 3D Object Detection},
  author={Zhao, Na and Chua, Tat-Seng and Lee, Gim Hee},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={11079--11087},
  year={2020}
}

Acknowledgement

Our implementation leverages on the source code from the following repositories:

HybVIO visual-inertial odometry and SLAM system

HybVIO A visual-inertial odometry system with an optional SLAM module. This is a research-oriented codebase, which has been published for the purposes

Spectacular AI 320 Jan 03, 2023
Kaggle Feedback Prize - Evaluating Student Writing 15th solution

Kaggle Feedback Prize - Evaluating Student Writing 15th solution First of all, I would like to thank the excellent notebooks and discussions from http

Lingyuan Zhang 6 Mar 24, 2022
SCNet: Learning Semantic Correspondence

SCNet Code Region matching code is contributed by Kai Han ([email protected]). Dense

Kai Han 34 Sep 06, 2022
GBIM(Gesture-Based Interaction map)

手势交互地图 GBIM(Gesture-Based Interaction map),基于视觉深度神经网络的交互地图,通过电脑摄像头观察使用者的手势变化,进而控制地图进行简单的交互。网络使用PaddleX提供的轻量级模型PPYOLO Tiny以及MobileNet V3 small,使得整个模型大小约10MB左右,即使在CPU下也能快速定位和识别手势。

8 Feb 10, 2022
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics

Dataset Cartography Code for the paper Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics at EMNLP 2020. This repository cont

AI2 125 Dec 22, 2022
Implementation of TransGanFormer, an all-attention GAN that combines the finding from the recent GanFormer and TransGan paper

TransGanFormer (wip) Implementation of TransGanFormer, an all-attention GAN that combines the finding from the recent GansFormer and TransGan paper. I

Phil Wang 146 Dec 06, 2022
Dataset VSD4K includes 6 popular categories: game, sport, dance, vlog, interview and city.

CaFM-pytorch ICCV ACCEPT Introduction of dataset VSD4K Our dataset VSD4K includes 6 popular categories: game, sport, dance, vlog, interview and city.

96 Jul 05, 2022
Official Pytorch implementation of RePOSE (ICCV2021)

RePOSE: Iterative Rendering and Refinement for 6D Object Detection (ICCV2021) [Link] Abstract We present RePOSE, a fast iterative refinement method fo

Shun Iwase 68 Nov 15, 2022
An OpenAI Gym environment for multi-agent car racing based on Gym's original car racing environment.

Multi-Car Racing Gym Environment This repository contains MultiCarRacing-v0 a multiplayer variant of Gym's original CarRacing-v0 environment. This env

Igor Gilitschenski 56 Nov 01, 2022
A collection of papers about Transformer in the field of medical image analysis.

A collection of papers about Transformer in the field of medical image analysis.

Junyu Chen 377 Jan 05, 2023
Real-Time SLAM for Monocular, Stereo and RGB-D Cameras, with Loop Detection and Relocalization Capabilities

ORB-SLAM2 Authors: Raul Mur-Artal, Juan D. Tardos, J. M. M. Montiel and Dorian Galvez-Lopez (DBoW2) 13 Jan 2017: OpenCV 3 and Eigen 3.3 are now suppor

Raul Mur-Artal 7.8k Dec 30, 2022
This is the dataset and code release of the OpenRooms Dataset.

This is the dataset and code release of the OpenRooms Dataset.

Visual Intelligence Lab of UCSD 95 Jan 08, 2023
Reverse engineering recurrent neural networks with Jacobian switching linear dynamical systems

Reverse engineering recurrent neural networks with Jacobian switching linear dynamical systems This repository is the official implementation of Rever

6 Aug 25, 2022
Provide partial dates and retain the date precision through processing

Prefix date parser This is a helper class to parse dates with varied degrees of precision. For example, a data source might state a date as 2001, 2001

Friedrich Lindenberg 13 Dec 14, 2022
Machine learning framework for both deep learning and traditional algorithms

NeoML is an end-to-end machine learning framework that allows you to build, train, and deploy ML models. This framework is used by ABBYY engineers for

NeoML 704 Dec 27, 2022
Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting

Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting This is the origin Pytorch implementation of Informer in the followin

Haoyi 3.1k Dec 29, 2022
A lane detection integrated Real-time Instance Segmentation based on YOLACT (You Only Look At CoefficienTs)

Real-time Instance Segmentation and Lane Detection This is a lane detection integrated Real-time Instance Segmentation based on YOLACT (You Only Look

Jin 4 Dec 30, 2022
TensorFlow2 Classification Model Zoo playing with TensorFlow2 on the CIFAR-10 dataset.

Training CIFAR-10 with TensorFlow2(TF2) TensorFlow2 Classification Model Zoo. I'm playing with TensorFlow2 on the CIFAR-10 dataset. Architectures LeNe

Chia-Hung Yuan 16 Sep 27, 2022
yufan 81 Dec 08, 2022
Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding

Vision Longformer This project provides the source code for the vision longformer paper. Multi-Scale Vision Longformer: A New Vision Transformer for H

Microsoft 209 Dec 30, 2022