Structure Information is the Key: Self-Attention RoI Feature Extractor in 3D Object Detection

Last update: Oct 07, 2022

Abstract: Unlike 2D object detection, where all RoI features come from grid pixels, RoI feature extraction in 3D point cloud object detection is more diverse. In this paper, we first compare and analyze the differences in structure and performance between two state-of-the-art models, PV-RCNN and Voxel-RCNN. We find that the performance gap between the two models comes not from point information but from structural information. Voxel features contain more structural information because they quantize the point cloud instead of downsampling it, so they retain essentially the complete information of the whole point cloud. In our experiments, this stronger structural information gives the detector higher performance even though voxel features lack accurate location information. We therefore argue that structural information is the key to 3D object detection. Based on this conclusion, we propose a Self-Attention RoI Feature Extractor (SARFE) to enhance the structural information of the features extracted from 3D proposals. SARFE is a plug-and-play module that can easily be used with existing 3D detectors. We evaluate SARFE on both the KITTI dataset and the Waymo Open Dataset. With SARFE, we improve the performance of state-of-the-art 3D detectors by a large margin on the cyclist class of the KITTI dataset while keeping real-time capability.
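The distinction the abstract draws between quantization and downsampling can be illustrated with a minimal sketch. This is not the authors' code (which is not yet released) and is not taken from PV-RCNN or Voxel-RCNN; the helper names `voxelize` and `random_downsample`, the toy points, and the voxel size are all assumptions made for illustration. It shows that every point contributes to some voxel, at the cost of exact coordinates, while downsampling keeps exact coordinates but drops points.

```python
import random
from collections import defaultdict


def voxelize(points, voxel_size):
    """Quantize points onto a regular grid (hypothetical helper).

    Returns a dict mapping an integer voxel index (i, j, k) to a tuple
    (centroid, count). Every input point falls into exactly one voxel,
    so no point is discarded, but its exact position is replaced by
    the centroid of its voxel.
    """
    buckets = defaultdict(list)
    for p in points:
        key = tuple(int(c // voxel_size) for c in p)
        buckets[key].append(p)
    return {
        key: (tuple(sum(axis) / len(pts) for axis in zip(*pts)), len(pts))
        for key, pts in buckets.items()
    }


def random_downsample(points, num_keep, seed=0):
    """Keep a random subset of points (hypothetical helper).

    Surviving points keep their exact coordinates, but the remaining
    points are dropped entirely.
    """
    rng = random.Random(seed)
    return rng.sample(points, num_keep)


# Toy point cloud; the values are made up for illustration only.
points = [
    (0.1, 0.2, 0.0),
    (0.3, 0.1, 0.1),
    (1.2, 0.4, 0.0),
    (1.4, 0.6, 0.2),
    (2.7, 2.9, 0.1),
]

voxels = voxelize(points, voxel_size=1.0)
kept = random_downsample(points, num_keep=3)

# All points are accounted for across the occupied voxels.
points_in_voxels = sum(count for _, count in voxels.values())
print(len(voxels), "occupied voxels covering", points_in_voxels, "points")
print(len(kept), "points survive downsampling out of", len(points))
```

In this toy setting the voxel grid still describes the overall layout of the whole cloud, while the downsampled set describes only the points that happened to be kept. That is the sense of "structural information" this sketch assumes.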

The source code will be published after the paper has been accepted to a conference.

Full paper

AP on KITTI Dataset

Submission link

AP on Waymo Open Dataset

Submission link

License

This code is released under the Apache 2.0 license.

Acknowledgement

Our code is mainly based on OpenPCDet. Thanks to its authors for their contributions!

Owner

DK. Zhang
