Single-stage Keypoint-based Category-level Object Pose Estimation from an RGB Image

Last update: Dec 27, 2022

CenterPose

Overview

This repository is the official implementation of the paper "Single-stage Keypoint-based Category-level Object Pose Estimation from an RGB Image" by Lin et al. (full citation below). In this work, we propose a single-stage, keypoint-based approach for category-level object pose estimation, which operates on unknown object instances within a known category using a single RGB image as input. The proposed network performs 2D object detection, detects 2D keypoints, estimates 6-DoF pose, and regresses relative 3D bounding cuboid dimensions. These quantities are estimated sequentially, using a convGRU to propagate information from easier tasks to more difficult ones. We favor simplicity in our design choices: generic cuboid vertex coordinates, a single-stage network, and monocular RGB input. We conduct extensive experiments on the challenging Objectron benchmark of real images, outperforming state-of-the-art methods on the 3D IoU metric (27.6% higher than the single-stage approach of MobilePose and 7.1% higher than the related two-stage approach). The algorithm runs at 15 fps on an NVIDIA GTX 1080Ti GPU.
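
To make the quantities in the overview concrete, the following is a minimal sketch (not the repository's code) of how a 6-DoF pose and a set of cuboid dimensions determine 2D keypoints in the image. The function names, the vertex ordering, the use of a centroid plus eight vertices, and the camera convention (+z pointing forward) are all assumptions made for illustration.

```python
from itertools import product


def cuboid_keypoints(dims):
    """Return the centroid followed by the 8 vertices of a cuboid centred at the origin.

    dims is (width, height, depth); the vertex order is an arbitrary choice.
    """
    w, h, d = dims
    corners = [(sx * w / 2, sy * h / 2, sz * d / 2)
               for sx, sy, sz in product((-1, 1), repeat=3)]
    return [(0.0, 0.0, 0.0)] + corners


def project(points, rotation, translation, fx, fy, cx, cy):
    """Apply a rigid transform (R, t), then a pinhole projection with +z forward."""
    pixels = []
    for p in points:
        # Camera-frame coordinates: X_cam = R @ p + t
        x, y, z = (sum(r * c for r, c in zip(row, p)) + t
                   for row, t in zip(rotation, translation))
        pixels.append((fx * x / z + cx, fy * y / z + cy))
    return pixels


# Hypothetical object: 0.2 x 0.1 x 0.3 m cuboid, 1 m in front of the camera.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
keypoints_3d = cuboid_keypoints((0.2, 0.1, 0.3))
keypoints_2d = project(keypoints_3d, identity, (0.0, 0.0, 1.0), 600, 600, 320, 240)
print(keypoints_2d[0])  # the centroid projects onto the principal point (cx, cy)
```

The network works in the opposite direction, from image evidence to pose and dimensions; the sketch only shows the geometric relationship between those outputs.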

Installation

The code was tested on Ubuntu 16.04 with Anaconda Python 3.6 and PyTorch 1.1.0. Newer versions should also work, possibly with small differences in accuracy. NVIDIA GPUs are needed for both training and testing.

Clone this repo:

CenterPose_ROOT=/path/to/clone/CenterPose
git clone https://github.com/NVlabs/CenterPose.git $CenterPose_ROOT

Create an Anaconda environment or your own virtual environment:

conda create -n CenterPose python=3.6
conda activate CenterPose
pip install -r requirements.txt
conda install -c conda-forge eigenpy

Compile the deformable convolutional layer:

git submodule init
git submodule update
cd $CenterPose_ROOT/src/lib/models/networks/DCNv2
./make.sh

[Optional] If you want to use a higher version of PyTorch, you need to download the latest version of DCNv2 and compile the library.

git submodule set-url src/lib/models/networks/DCNv2 https://github.com/jinfagang/DCNv2_latest.git
git submodule sync
git submodule update --init --recursive --remote
cd $CenterPose_ROOT/src/lib/models/networks/DCNv2
./make.sh

Download our pre-trained models for CenterPose and move all the .pth files to $CenterPose_ROOT/models/CenterPose/. We currently provide models for 9 categories: bike, book, bottle, camera, cereal_box, chair, cup, laptop, and shoe.
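
As a quick sanity check after downloading, a short script can report which of the nine categories have no checkpoint in the models directory. This is only a sketch: it assumes each checkpoint's file name contains its category name, which the text above does not state, and the helper names are hypothetical.

```python
from pathlib import Path

CATEGORIES = ["bike", "book", "bottle", "camera", "cereal_box",
              "chair", "cup", "laptop", "shoe"]


def find_checkpoints(model_dir):
    """Map each category to the .pth files whose names contain it (assumed naming)."""
    files = sorted(p.name for p in Path(model_dir).glob("*.pth"))
    return {cat: [f for f in files if cat in f] for cat in CATEGORIES}


def missing_categories(model_dir):
    """Return the categories with no matching checkpoint file."""
    return [cat for cat, files in find_checkpoints(model_dir).items() if not files]
```

For example, `missing_categories("/path/to/clone/CenterPose/models/CenterPose")` returns an empty list when every category has at least one matching file.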

Prepare training/testing data

We save all the training/testing data under $CenterPose_ROOT/data/.

For the Objectron dataset, we created our own data pre-processor to extract the data for training/testing. Refer to the data directory for more details.

Demo

We provide demos for images, videos, webcam input, and image folders. See $CenterPose_ROOT/images/CenterPose.

For category-level 6-DoF object pose estimation on images, videos, or image folders, run:

cd $CenterPose_ROOT/src
python demo.py --demo /path/to/image/or/folder/or/video --arch dlav1_34 --load_model ../path/to/model

You can also pass --debug 4 to save all intermediate and final outputs.
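
When running the demo over many inputs, it can help to assemble the command programmatically. The sketch below builds the argument list from the flags shown above (`--demo`, `--arch`, `--load_model`, `--debug`); the helper name `build_demo_command` is hypothetical and not part of the repository.

```python
import shlex


def build_demo_command(source, model_path, arch="dlav1_34", debug=None):
    """Assemble the argv list for demo.py using only the flags documented above."""
    cmd = ["python", "demo.py", "--demo", source,
           "--arch", arch, "--load_model", model_path]
    if debug is not None:
        cmd += ["--debug", str(debug)]
    return cmd


cmd = build_demo_command("/path/to/image/or/folder/or/video",
                         "../path/to/model", debug=4)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed to `subprocess.run(cmd, cwd=...)` with the working directory set to the repository's `src` folder.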

For the webcam demo, run the following (you may want to specify the camera intrinsics via --cam_intrinsic):

cd $CenterPose_ROOT/src
python demo.py --demo webcam --arch dlav1_34 --load_model ../path/to/model
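
If the webcam has not been calibrated, a rough intrinsic matrix can be estimated from the image size and the horizontal field of view. The sketch below uses the standard pinhole relation fx = width / (2 tan(FOV / 2)) and assumes square pixels and a centred principal point. It does not show the format that `--cam_intrinsic` expects, which should be checked in `$CenterPose_ROOT/src/lib/opts.py`; the function name is hypothetical.

```python
import math


def intrinsics_from_fov(width, height, horizontal_fov_deg):
    """Approximate a 3x3 pinhole intrinsic matrix from image size and horizontal FOV.

    Assumes square pixels (fx == fy) and a principal point at the image centre.
    """
    fx = width / (2.0 * math.tan(math.radians(horizontal_fov_deg) / 2.0))
    fy = fx
    cx, cy = width / 2.0, height / 2.0
    return [[fx, 0.0, cx],
            [0.0, fy, cy],
            [0.0, 0.0, 1.0]]


# Hypothetical 640x480 webcam with a 90-degree horizontal field of view.
K = intrinsics_from_fov(640, 480, 90.0)
for row in K:
    print(row)
```

A proper calibration will generally give more accurate values than this approximation.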

Training

We follow the approach of CenterNet for training the DLA network, reducing the learning rate by 10x after epochs 90 and 120, and stopping after 140 epochs.
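
The step schedule described above can be sketched as a small function. The base learning rate below is a hypothetical placeholder (the actual default lives in the training options), and epochs are assumed to be counted from 1.

```python
def learning_rate(epoch, base_lr, milestones=(90, 120), factor=0.1):
    """Learning rate used while training `epoch` (1-indexed).

    The rate is multiplied by `factor` once for every milestone already completed.
    """
    drops = sum(1 for m in milestones if epoch > m)
    return base_lr * factor ** drops


BASE_LR = 1e-4  # hypothetical value, for illustration only
for epoch in (1, 90, 91, 120, 121, 140):
    print(epoch, learning_rate(epoch, BASE_LR))
```

In PyTorch, a schedule of this shape is commonly expressed with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[90, 120], gamma=0.1)`, although the exact epoch at which each drop takes effect depends on how the training loop calls the scheduler.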

For debugging purposes, you can put all the local training parameters in the $CenterPose_ROOT/src/main_CenterPose.py script. You can also use the command line instead. More options are in $CenterPose_ROOT/src/lib/opts.py.

To start a new training job, simply do the following, which will use default parameter settings:

cd $CenterPose_ROOT/src
python main_CenterPose.py

The result will be saved in $CenterPose_ROOT/exp/object_pose/$dataset_$category_$arch_$time, e.g., objectron_bike_dlav1_34_2021-02-27-15-33.
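
To locate or generate such folder names in your own scripts, the naming pattern can be reproduced as follows. This is a sketch inferred from the example name above; the helper `experiment_name` and the timestamp format string are assumptions, not code from the repository.

```python
from datetime import datetime


def experiment_name(dataset, category, arch, when):
    """Build a name of the form dataset_category_arch_YYYY-MM-DD-HH-MM."""
    return f"{dataset}_{category}_{arch}_{when.strftime('%Y-%m-%d-%H-%M')}"


name = experiment_name("objectron", "bike", "dlav1_34",
                       datetime(2021, 2, 27, 15, 33))
print(name)  # objectron_bike_dlav1_34_2021-02-27-15-33
```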

You can then use TensorBoard to visualize the training process:

cd $path/to/folder
tensorboard --logdir=logs --host=XX.XX.XX.XX

Evaluation

We evaluate our method on the Objectron dataset. Please refer to the objectron_eval directory for more details.
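
For readers unfamiliar with the 3D IoU metric mentioned in the overview, the sketch below computes intersection over union for two axis-aligned boxes. This is a deliberate simplification: boxes in the benchmark are generally oriented, so this is not the evaluation code in objectron_eval, and the function name and box representation are assumptions.

```python
def axis_aligned_iou_3d(box_a, box_b):
    """IoU of two axis-aligned boxes, each given as (min_xyz, max_xyz)."""
    (a_min, a_max), (b_min, b_max) = box_a, box_b

    # The intersection volume is the product of the per-axis overlaps.
    intersection = 1.0
    for lo_a, hi_a, lo_b, hi_b in zip(a_min, a_max, b_min, b_max):
        overlap = min(hi_a, hi_b) - max(lo_a, lo_b)
        if overlap <= 0:
            return 0.0  # the boxes do not overlap along this axis
        intersection *= overlap

    def volume(lo, hi):
        v = 1.0
        for low, high in zip(lo, hi):
            v *= high - low
        return v

    union = volume(a_min, a_max) + volume(b_min, b_max) - intersection
    return intersection / union


unit = ((0, 0, 0), (1, 1, 1))
shifted = ((0.5, 0, 0), (1.5, 1, 1))
print(axis_aligned_iou_3d(unit, shifted))  # overlap 0.5, union 1.5
```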

Citation

Please cite CenterPose if you use this repository in your publications:

@article{lin2021single,
  title={Single-stage Keypoint-based Category-level Object Pose Estimation from an RGB Image},
  author={Lin, Yunzhi and Tremblay, Jonathan and Tyree, Stephen and Vela, Patricio A and Birchfield, Stan},
  journal={arXiv preprint arXiv:2109.06161},
  year={2021}
}

Licence

CenterPose is licensed under the NVIDIA Source Code License - Non-commercial.

Owner

NVIDIA Research Projects
