Repository of 3D Object Detection with Pointformer (CVPR2021)

Overview

3D Object Detection with Pointformer

This repository contains the code for the paper 3D Object Detection with Pointformer (CVPR 2021) [arXiv]. It is built on top of the MMDetection3D toolbox and includes the models and results on the SUN RGB-D and ScanNet datasets reported in the paper.

[Figure: Overall Structure]

More models and results on the KITTI and nuScenes datasets will be released soon.

Installation and Usage

The code was developed with MMDetection3D v0.6.1 and works well with v0.14.0.

Dependencies

  • NVIDIA GPU + CUDA 10.2
  • Python 3.8 (Anaconda recommended)
  • PyTorch == 1.8.0
  • mmcv-full == 1.3.7
  • mmdet == 2.11.0
  • mmsegmentation == 0.13.0
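
The pinned versions above can be checked against the active environment before installing anything else. The following is a minimal sketch, not part of the repository: the helper name `find_mismatches` is hypothetical, the names in `PINNED` are assumed to be the PyPI distribution names, and treating a local build tag such as `+cu102` as matching is an assumption as well.

```python
from importlib import metadata

# Version pins copied from the dependency list above (assumed PyPI names).
PINNED = {
    "torch": "1.8.0",
    "mmcv-full": "1.3.7",
    "mmdet": "2.11.0",
    "mmsegmentation": "0.13.0",
}


def find_mismatches(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for every pin that is not satisfied.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Assumption: a local build tag such as "1.8.0+cu102" counts as "1.8.0".
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result means every pinned package is installed at the listed version. CUDA and the GPU driver are not covered by this check.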

Installation

  1. Install the dependencies listed above, following each project's installation guide.
  2. Clone and install mmdet3d in develop mode.
git clone https://github.com/open-mmlab/mmdetection3d.git
cd mmdetection3d
python setup.py develop
  3. Copy the files in this repo into the corresponding directories in mmdet3d.
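
The last step amounts to overlaying one directory tree onto another. The sketch below shows one way to do that, assuming this repository mirrors the mmdet3d directory layout. The helper `overlay_tree` and both paths in the commented call are hypothetical.

```python
import shutil
from pathlib import Path


def overlay_tree(src, dst, skip=(".git",)):
    """Copy everything under `src` into `dst`, keeping the directory layout.

    Files in `dst` with the same relative path are overwritten; other files
    in `dst` are left untouched.
    """
    shutil.copytree(
        Path(src),
        Path(dst),
        dirs_exist_ok=True,  # merge into the existing checkout (Python 3.8+)
        ignore=shutil.ignore_patterns(*skip),
    )


# Hypothetical paths; adjust to where the two repositories were cloned.
# overlay_tree("path/to/this_repo", "path/to/mmdetection3d")
```

Because same-named files are overwritten, it is worth committing or backing up the mmdet3d checkout first.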

Training and Testing

Download the pretrained weights from Google Drive or Tsinghua Cloud and put them in the checkpoints folder. The commands below use votenet_ptr_sunrgbd-3d-10class as an example:

# Training
bash -x tools/dist_train.sh configs/pointformer/votenet_ptr_sunrgbd-3d-10class.py 8

# Testing 
bash tools/dist_test.sh configs/pointformer/votenet_ptr_sunrgbd-3d-10class.py checkpoints/votenet_ptr_sunrgbd-3d-10class.pth 8 --eval mAP
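
Both commands take the config path and the GPU count as positional arguments, and the test command takes the checkpoint path in between. The sketch below builds the same two commands for any config name. The helper `build_commands` is hypothetical, it assumes the checkpoint file is named after the config, and it omits the `-x` shell-tracing flag.

```python
import shlex


def build_commands(name, gpus=8, config_dir="configs/pointformer", ckpt_dir="checkpoints"):
    """Return the (train, test) argument lists for one config name."""
    config = f"{config_dir}/{name}.py"
    # Assumption: the checkpoint file shares the config's base name.
    checkpoint = f"{ckpt_dir}/{name}.pth"
    train = ["bash", "tools/dist_train.sh", config, str(gpus)]
    test = ["bash", "tools/dist_test.sh", config, checkpoint, str(gpus), "--eval", "mAP"]
    return train, test


train_cmd, test_cmd = build_commands("votenet_ptr_sunrgbd-3d-10class", gpus=8)
print(shlex.join(train_cmd))
print(shlex.join(test_cmd))
```

The argument lists can be passed to `subprocess.run` from the mmdet3d root directory, or printed as above and pasted into a shell.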

Results

SUN RGB-D

classes      AP_0.25  AR_0.25  AP_0.50  AR_0.50
bed          0.8343   0.9515   0.5556   0.7029
table        0.5353   0.8705   0.2344   0.4604
sofa         0.6588   0.9171   0.4979   0.6715
chair        0.7681   0.8700   0.5664   0.6703
toilet       0.9117   0.9931   0.5538   0.7103
desk         0.2458   0.8050   0.0754   0.3395
dresser      0.3626   0.8028   0.2357   0.4908
night_stand  0.6701   0.9020   0.4525   0.6196
bookshelf    0.3383   0.6809   0.0968   0.2624
bathtub      0.7821   0.8980   0.4259   0.5510
Overall      0.6107   0.8691   0.3694   0.5479
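
The Overall row appears consistent with an unweighted mean over the ten classes. The sketch below checks this for the AP_0.25 column using the values from the table. It is a reading of the reported numbers, not a description of how the evaluation code computes them.

```python
# Per-class AP_0.25 values copied from the SUN RGB-D table above.
ap_025 = {
    "bed": 0.8343,
    "table": 0.5353,
    "sofa": 0.6588,
    "chair": 0.7681,
    "toilet": 0.9117,
    "desk": 0.2458,
    "dresser": 0.3626,
    "night_stand": 0.6701,
    "bookshelf": 0.3383,
    "bathtub": 0.7821,
}

# Unweighted mean over classes: every class counts equally,
# regardless of how many instances it has in the dataset.
overall = sum(ap_025.values()) / len(ap_025)
print(f"{overall:.4f}")  # 0.6107, matching the Overall row
```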

ScanNet

classes         AP_0.25  AR_0.25  AP_0.50  AR_0.50
cabinet         0.4548   0.7930   0.1757   0.4435
bed             0.8839   0.9506   0.8006   0.8889
chair           0.9011   0.9386   0.7562   0.8136
sofa            0.8915   0.9794   0.6619   0.8041
table           0.6763   0.8714   0.4858   0.6971
door            0.5413   0.7216   0.2107   0.4283
window          0.4821   0.7021   0.1504   0.2979
bookshelf       0.5255   0.8701   0.4422   0.7273
picture         0.1815   0.3649   0.0748   0.1351
counter         0.6210   0.8654   0.2333   0.3846
desk            0.6859   0.9370   0.3774   0.6535
curtain         0.5522   0.7910   0.3156   0.4627
refrigerator    0.5215   0.9649   0.4028   0.7193
showercurtrain  0.6709   0.9643   0.1941   0.5000
toilet          0.9922   1.0000   0.8210   0.8793
sink            0.6361   0.7347   0.4119   0.5000
bathtub         0.8710   0.8710   0.8375   0.8387
garbagebin      0.4762   0.7264   0.2244   0.4604
Overall         0.6425   0.8359   0.4209   0.5908

For more details on the experiments, please refer to the paper.

Acknowledgement

This code is based on MMDetection3D.

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{Pan_2021_CVPR,
    author    = {Pan, Xuran and Xia, Zhuofan and Song, Shiji and Li, Li Erran and Huang, Gao},
    title     = {3D Object Detection With Pointformer},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {7463-7472}
}

@misc{pan20203d,
  title={3D Object Detection with Pointformer}, 
  author={Xuran Pan and Zhuofan Xia and Shiji Song and Li Erran Li and Gao Huang},
  year={2020},
  eprint={2012.11409},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
Owner
Zhuofan Xia