Refer-it-in-RGBD

This is the repository for our CVPR 2021 paper 'Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images'.

Paper - ArXiv - pdf (abs)
Project page: https://unclemedm.github.io/Refer-it-in-RGBD/

Introduction

We present a novel task of 3D visual grounding in single-view RGB-D images, where the referred objects are often only partially scanned. In contrast to previous works that directly generate object proposals for grounding in 3D scenes, we propose a bottom-up approach that gradually aggregates information, effectively addressing the challenge posed by partial scans. Our approach first fuses the language and visual features at the bottom level to generate a heatmap that coarsely localizes the relevant regions in the RGB-D image. It then performs an adaptive search based on the heatmap and carries out object-level matching with a second visio-linguistic fusion to ground the referred object. We evaluate the proposed method against state-of-the-art methods on both the RGB-D images extracted from the ScanRefer dataset and our newly collected SUN-Refer dataset. Experiments show that our method outperforms the previous methods by a large margin (by 11.1% and 11.2% Acc@0.5) on the two datasets.
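The accuracy figure quoted above is reported at an IoU threshold. As a rough illustration of how such a metric is commonly computed in 3D grounding work, the sketch below counts a prediction as correct when its 3D IoU with the ground-truth box is at least the threshold. It assumes axis-aligned boxes given as (xmin, ymin, zmin, xmax, ymax, zmax). The function names are hypothetical, and this is not the evaluation code from this repository, which may handle oriented boxes differently.

```python
from typing import Sequence, Tuple

# Assumed box layout (not taken from this repository):
# (xmin, ymin, zmin, xmax, ymax, zmax), axis-aligned.
Box = Tuple[float, float, float, float, float, float]


def box_volume(box: Box) -> float:
    """Volume of an axis-aligned box; degenerate boxes have volume 0."""
    volume = 1.0
    for axis in range(3):
        volume *= max(0.0, box[axis + 3] - box[axis])
    return volume


def iou_3d(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned 3D boxes."""
    intersection = 1.0
    for axis in range(3):
        low = max(a[axis], b[axis])
        high = min(a[axis + 3], b[axis + 3])
        intersection *= max(0.0, high - low)
    union = box_volume(a) + box_volume(b) - intersection
    return intersection / union if union > 0 else 0.0


def accuracy_at_iou(
    predictions: Sequence[Box],
    ground_truths: Sequence[Box],
    threshold: float = 0.5,
) -> float:
    """Fraction of predictions whose IoU with the matching ground truth
    is at least `threshold` (hypothetical helper for illustration)."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have equal length")
    if not predictions:
        return 0.0
    hits = sum(
        iou_3d(p, g) >= threshold for p, g in zip(predictions, ground_truths)
    )
    return hits / len(predictions)
```

Each referring expression contributes one predicted box and one ground-truth box, so the two sequences are expected to be aligned by index.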

Dataset

Download SUNREFER_v2 dataset
The SUNREFER dataset contains 38,495 referring expressions corresponding to 7,699 objects from the SUNRGBD dataset. Here is one example from the SUNREFER dataset:

Owner

Haolin Liu
