3D cascade RCNN for object detection on point cloud

Last update: Dec 02, 2022

Overview

3D Cascade RCNN

This is the implementation of 3D Cascade RCNN: High Quality Object Detection in Point Clouds.

We designed a 3D object detection model on point clouds by:

Presenting a simple yet effective 3D cascade architecture.
Analyzing the sparsity of point clouds and using a point completeness score to re-weight training samples.

Detection results on KITTI and the Waymo Open Dataset are reported in the sections below.
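As a rough illustration of the re-weighting idea in the list above, the sketch below scales each training sample's loss by a completeness score. The score definition (observed point count divided by a hypothetical expected count, clipped to [0, 1]), the `floor` parameter, and both function names are assumptions made for illustration; they are not taken from the paper or from this repository.

```python
def completeness_score(num_points, expected_points):
    """Hypothetical score: fraction of the expected points actually observed."""
    if expected_points <= 0:
        raise ValueError("expected_points must be positive")
    return max(0.0, min(1.0, num_points / expected_points))


def reweighted_loss(losses, scores, floor=0.1):
    """Weighted mean of per-sample losses.

    Each weight is the sample's completeness score, raised to at least
    `floor` so that very sparse samples still contribute (an assumption).
    """
    if len(losses) != len(scores):
        raise ValueError("losses and scores must have the same length")
    weights = [max(floor, s) for s in scores]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)
```

In this sketch sparse samples are down-weighted; the actual weighting scheme used by the model may differ.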

Results on KITTI

3D AP (%)	Easy Car	Moderate Car	Hard Car
AP 11	90.05	86.02	79.27
AP 40	93.20	86.19	83.48
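Assuming "AP 11" and "AP 40" refer to interpolated average precision sampled at 11 and 40 recall positions, as in the standard KITTI protocol, the difference between the two rows can be sketched as follows. The function name and the toy precision/recall inputs are illustrative only; this is not the evaluation code used by the repository.

```python
def interpolated_ap(recalls, precisions, num_positions):
    """Interpolated AP (in percent) over evenly spaced recall positions.

    num_positions=11 samples recall at 0, 0.1, ..., 1.0.
    num_positions=40 samples recall at 1/40, 2/40, ..., 1.0.
    """
    if num_positions == 11:
        positions = [i / 10 for i in range(11)]
    elif num_positions == 40:
        positions = [i / 40 for i in range(1, 41)]
    else:
        raise ValueError("num_positions must be 11 or 40")

    total = 0.0
    for r in positions:
        # Interpolated precision: best precision at any recall >= r.
        candidates = [p for rec, p in zip(recalls, precisions) if rec >= r - 1e-9]
        total += max(candidates, default=0.0)
    return 100.0 * total / len(positions)
```

For a toy curve with precision 1.0 up to recall 0.5 and precision 0.5 up to recall 1.0, this gives about 77.27 with 11 positions and 75.0 with 40 positions. The 11-position variant includes recall 0, which is one reason the two metrics are not directly comparable.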

Results on Waymo

Difficulty	Overall Vehicle	0-30m Vehicle	30-50m Vehicle	50m-Inf Vehicle
LEVEL_1 mAP	76.27	92.66	74.99	54.49
LEVEL_2 mAP	67.12	91.95	68.96	41.82
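The distance columns in the table above group vehicles by range. A minimal sketch of such a grouping is shown below, assuming the range is the Euclidean distance of the box center from the sensor origin; that definition, the constant `RANGE_BUCKETS`, and the function name are assumptions for illustration, not the official Waymo evaluation code.

```python
import math

# Bucket names follow the column headers of the table above.
RANGE_BUCKETS = (
    ("0-30m", 0.0, 30.0),
    ("30-50m", 30.0, 50.0),
    ("50m-Inf", 50.0, math.inf),
)


def range_bucket(center_xyz):
    """Return the range bucket name for a box center given as (x, y, z) in meters."""
    distance = math.sqrt(sum(c * c for c in center_xyz))
    for name, low, high in RANGE_BUCKETS:
        if low <= distance < high:
            return name
    raise ValueError("distance is not a valid number: {!r}".format(distance))
```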

Installation

Requirements. The code was tested in the following environment:

Ubuntu 16.04 with 4 V100 GPUs
Python 3.7
PyTorch 1.7
CUDA 10.1
spconv 1.2.1
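Before building, it can help to compare the local environment with the list above. The following sketch is one way to do that; the helper names are hypothetical, and it assumes `spconv` may not expose a `__version__` attribute, in which case it reports "unknown".

```python
import importlib
import sys

# Versions taken from the tested environment listed above.
EXPECTED = {"python": "3.7", "torch": "1.7", "cuda": "10.1", "spconv": "1.2.1"}


def installed_versions():
    """Collect locally installed versions; missing packages map to None."""
    versions = {"python": "{}.{}".format(*sys.version_info[:2]), "cuda": None}
    for name in ("torch", "spconv"):
        try:
            module = importlib.import_module(name)
        except Exception:  # not installed, or failed to load
            versions[name] = None
            continue
        versions[name] = getattr(module, "__version__", "unknown")
        if name == "torch":
            # CUDA version PyTorch was built with (None for CPU-only builds).
            versions["cuda"] = getattr(module.version, "cuda", None)
    return versions


def mismatches(versions, expected=EXPECTED):
    """Return {name: (found, wanted)} for entries that do not match by prefix."""
    return {
        name: (versions.get(name), want)
        for name, want in expected.items()
        if not str(versions.get(name) or "").startswith(want)
    }


if __name__ == "__main__":
    print(mismatches(installed_versions()))
```

A mismatch does not necessarily mean the code will fail; the list only records what the code was tested with.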

Build extensions

python setup.py develop

Getting Started

Prepare the data.

Please download the official KITTI dataset and generate the data infos with the following command:

python -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos tools/cfgs/kitti_dataset.yaml
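To sanity-check the output of the command above, the generated info files can be inspected with a few lines of Python. This sketch assumes each `.pkl` file contains a pickled list of per-frame dictionaries; the function name is hypothetical and the exact keys are not specified here.

```python
import pickle


def summarize_infos(path):
    """Return the number of entries and the keys of the first entry of an info file."""
    with open(path, "rb") as f:
        infos = pickle.load(f)
    if infos and isinstance(infos[0], dict):
        first_keys = sorted(infos[0].keys())
    else:
        first_keys = []
    return {"num_samples": len(infos), "first_keys": first_keys}
```

For example, `summarize_infos("data/kitti/kitti_infos_train.pkl")` reports how many training frames were indexed. Only unpickle files you generated yourself, since `pickle.load` can execute arbitrary code.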

The data folder should be organized as follows:

data
├── kitti
│   │── ImageSets
│   │── training
│   │   ├──calib & velodyne & label_2 & image_2
│   │── testing
│   │   ├──calib & velodyne & image_2
│   │── kitti_dbinfos_train.pkl
│   │── kitti_infos_train.pkl
│   │── kitti_infos_val.pkl
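The layout above can be checked programmatically before training. The sketch below reads "calib & velodyne & label_2 & image_2" as four separate subfolders, which is an interpretation of the listing; `EXPECTED_PATHS` and `missing_paths` are illustrative names, not part of the repository.

```python
import os

# Paths relative to data/kitti, following the folder listing above.
EXPECTED_PATHS = (
    "ImageSets",
    "training/calib",
    "training/velodyne",
    "training/label_2",
    "training/image_2",
    "testing/calib",
    "testing/velodyne",
    "testing/image_2",
    "kitti_dbinfos_train.pkl",
    "kitti_infos_train.pkl",
    "kitti_infos_val.pkl",
)


def missing_paths(kitti_root):
    """Return the expected entries that do not exist under kitti_root."""
    return [
        rel
        for rel in EXPECTED_PATHS
        if not os.path.exists(os.path.join(kitti_root, *rel.split("/")))
    ]
```

Calling `missing_paths("data/kitti")` should return an empty list once the dataset is in place and the info files have been generated.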

Training and evaluation.

The configuration file is tools/cfgs/3d_cascade_rcnn.yaml, and the training scripts are in tools/scripts.

cd tools
sh scripts/3d-cascade-rcnn.sh

Test a pre-trained model

The pre-trained KITTI model is at: model. Run with:

cd tools
sh scripts/3d-cascade-rcnn_test.sh

The evaluation results should look like:

2021-08-10 14:06:14,608   INFO  Car AP@0.70, 0.70, 0.70:
bbox AP:97.9644, 90.1199, 89.7076
bev  AP:90.6405, 89.0829, 88.4391
3d   AP:90.0468, 86.0168, 79.2661
aos  AP:97.91, 90.00, 89.48
Car AP_R40@0.70, 0.70, 0.70:
bbox AP:99.1663, 95.8055, 93.3149
bev  AP:96.3107, 92.4128, 89.9473
3d   AP:93.1961, 86.1857, 83.4783
aos  AP:99.13, 95.65, 93.03
Car AP@0.70, 0.50, 0.50:
bbox AP:97.9644, 90.1199, 89.7076
bev  AP:98.0539, 97.1877, 89.7716
3d   AP:97.9921, 90.1001, 89.7393
aos  AP:97.91, 90.00, 89.48
Car AP_R40@0.70, 0.50, 0.50:
bbox AP:99.1663, 95.8055, 93.3149
bev  AP:99.1943, 97.8180, 95.5420
3d   AP:99.1717, 95.8046, 95.4500
aos  AP:99.13, 95.65, 93.03
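To compare a run against the numbers above, the log can be parsed into a dictionary. This is a minimal sketch that assumes each block starts with a header line ending in a colon, followed by `bbox`, `bev`, `3d` and `aos` lines with three values (easy, moderate, hard); the function name and regular expression are illustrative.

```python
import re

METRIC_LINE = re.compile(
    r"^(bbox|bev|3d|aos)\s+AP:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)\s*$"
)


def parse_eval_log(text):
    """Map each header to {metric: (easy, moderate, hard)}."""
    results = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        match = METRIC_LINE.match(line)
        if match:
            if header is None:
                continue  # metric line before any header; skip it
            values = tuple(float(v) for v in match.group(2, 3, 4))
            results[header][match.group(1)] = values
        elif line.endswith(":"):
            # Drop an optional "<timestamp> INFO" logging prefix.
            header = line.split("INFO")[-1].strip().rstrip(":")
            results[header] = {}
    return results
```

The `3d` entry of the first two blocks corresponds to the "AP 11" and "AP 40" rows reported in the KITTI results table.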

Acknowledgement

The code is built on OpenPCDet and Voxel R-CNN.

Owner

Qi Cai
