GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation. (CVPR 2021)

Overview

GDR-Net

This repo provides the PyTorch implementation of the work:

Gu Wang, Fabian Manhardt, Federico Tombari, Xiangyang Ji. GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation. In CVPR 2021. [Paper][ArXiv][Video][bibtex]

Requirements

  • Ubuntu 16.04/18.04, CUDA 10.1/10.2, Python >= 3.6, PyTorch >= 1.6, torchvision
  • Install detectron2 from source
  • sh scripts/install_deps.sh
  • Compile the C++ extension for farthest point sampling (fps):
    sh core/csrc/compile.sh
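
As a quick sanity check of the version requirements above, here is a minimal sketch in plain Python. The helper names `parse_version` and `meets_minimum` are hypothetical and not part of this repo. The minimum versions come from the list above, and the sketch only compares numeric version components.

```python
import re
import sys


def parse_version(version):
    """Turn a version string such as '1.8.1+cu111' into a tuple like (1, 8, 1)."""
    # Drop local build suffixes such as '+cu111', then keep the leading numeric parts.
    core = version.split("+")[0]
    parts = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum` (numeric comparison)."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so that '1.6' and '1.6.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Minimum versions taken from the requirements list above.
MINIMUMS = {"python": "3.6", "torch": "1.6"}

if __name__ == "__main__":
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    print("python ok:", meets_minimum(python_version, MINIMUMS["python"]))
    try:
        import torch  # only checked if it is already installed

        print("torch ok:", meets_minimum(torch.__version__, MINIMUMS["torch"]))
    except ImportError:
        print("torch is not installed")
```

Comparing integer tuples rather than raw strings avoids the pitfall that `"3.10"` sorts before `"3.6"` as text.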
    

Datasets

Download the 6D pose datasets (LM, LM-O, YCB-V) from the BOP website and VOC 2012 for background images. Please also download the image_sets and test_bboxes from here (BaiduNetDisk, OneDrive, password: qjfk).

The datasets folder should be structured as follows:

# recommend using soft links (ln -sf)
datasets/
├── BOP_DATASETS
│   ├── lm
│   ├── lmo
│   └── ycbv
├── lm_imgn  # the OpenGL rendered images for LM, 1k/obj
├── lm_renders_blender  # the Blender rendered images for LM, 10k/obj (pvnet-rendering)
└── VOCdevkit
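
To confirm that the layout above is in place before training, a small check such as the following may help. This is a minimal sketch, not part of the repo: `EXPECTED_DIRS` and `missing_dataset_dirs` are hypothetical names, and the folder list is copied from the tree above.

```python
from pathlib import Path

# Sub-folders named in the layout above, relative to the datasets root.
EXPECTED_DIRS = [
    "BOP_DATASETS/lm",
    "BOP_DATASETS/lmo",
    "BOP_DATASETS/ycbv",
    "lm_imgn",
    "lm_renders_blender",
    "VOCdevkit",
]


def missing_dataset_dirs(root):
    """Return the expected sub-folders that are absent under `root`.

    Path.is_dir() follows symbolic links, so soft-linked folders count as present.
    """
    root = Path(root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs("datasets")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All expected dataset folders are present.")
```

The check only looks at folder names. It does not verify the contents of each dataset.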

Training GDR-Net

./core/gdrn_modeling/train_gdrn.sh <config_path> <gpu_ids> (other args)

Example:

./core/gdrn_modeling/train_gdrn.sh configs/gdrn/lm/a6_cPnP_lm13.py 0  # multiple gpus: 0,1,2,3
# add --resume if you want to resume from an interrupted experiment.
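
When launching several experiments from a script, the command line above can be assembled programmatically. The sketch below is a hedged illustration: `build_train_command` and `run_training` are hypothetical helpers, and placing `--resume` in the "(other args)" position is an assumption based on the usage line above.

```python
import subprocess

# Script path taken from the usage line above.
TRAIN_SCRIPT = "./core/gdrn_modeling/train_gdrn.sh"


def build_train_command(config_path, gpu_ids, resume=False, extra_args=()):
    """Assemble the argument list: <config_path> <gpu_ids> (other args)."""
    # Several GPUs are passed as one comma-separated string, e.g. "0,1,2,3".
    gpu_arg = ",".join(str(g) for g in gpu_ids)
    command = [TRAIN_SCRIPT, config_path, gpu_arg]
    if resume:
        # Assumption: --resume is passed among the trailing "other args".
        command.append("--resume")
    command.extend(extra_args)
    return command


def run_training(config_path, gpu_ids, resume=False):
    """Launch training from the repo root; raises CalledProcessError on failure."""
    subprocess.run(build_train_command(config_path, gpu_ids, resume), check=True)


if __name__ == "__main__":
    cmd = build_train_command("configs/gdrn/lm/a6_cPnP_lm13.py", [0, 1, 2, 3], resume=True)
    print(" ".join(cmd))
```

Passing the command as a list avoids shell quoting problems with unusual config paths.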

Our trained GDR-Net models can be found here (BaiduNetDisk, OneDrive, password: kedv).
(Note that the models for the BOP setup in the supplement were trained using a refactored version of this repo (not compatible); they are slightly better than the models provided here.)

Evaluation

./core/gdrn_modeling/test_gdrn.sh <config_path> <gpu_ids> <ckpt_path> (other args)

Example:

./core/gdrn_modeling/test_gdrn.sh configs/gdrn/lmo/a6_cPnP_AugAAETrunc_BG0.5_lmo_real_pbr0.1_40e.py 0 output/gdrn/lmo/a6_cPnP_AugAAETrunc_BG0.5_lmo_real_pbr0.1_40e/gdrn_lmo_real_pbr.pth

Citation

If you find this useful in your research, please consider citing:

@InProceedings{Wang_2021_GDRN,
    title     = {{GDR-Net}: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation},
    author    = {Wang, Gu and Manhardt, Fabian and Tombari, Federico and Ji, Xiangyang},
    booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {16611-16621}
}
Comments
  • About the CUDA version

    Hello, can I train with CUDA 11 or later? I only have A6000 and A100 GPUs, which are not compatible with CUDA versions below 11. When I train with CUDA 11.1 and torch 1.8 or 1.9, I get double free or corruption (!prev) and RuntimeError: DataLoader worker (pid(s) xxxxx) exited unexpectedly.

    help wanted 
    opened by fn6767 9
  • Pipeline to print inferred 3D bounding boxes on images

    Hello! I find this work really interesting. After successfully testing inference (LMO and YCB), I wanted to plot the inference results as 3D bounding boxes on RGB images. While inspecting the code, I came across:

    https://github.com/THU-DA-6D-Pose-Group/GDR-Net/blob/5fb30c3dc53f46bac24a8a83a373eac7a8038556/core/gdrn_modeling/gdrn_evaluator.py#L516

    This is the function used for inference. It seems to report the results in terms of the different metrics, but it does not produce the graphical results I am looking for.

    In the same file I noticed the function: https://github.com/THU-DA-6D-Pose-Group/GDR-Net/blob/5fb30c3dc53f46bac24a8a83a373eac7a8038556/core/gdrn_modeling/gdrn_evaluator.py#L634

    which seems structurally similar but takes somewhat different inputs. In particular, I would like to ask whether the input dataloader can be built for gdrn_inference_on_dataset in the same way as for save_result_of_dataset, as in

    https://github.com/THU-DA-6D-Pose-Group/GDR-Net/blob/5fb30c3dc53f46bac24a8a83a373eac7a8038556/core/gdrn_modeling/engine.py#L135-L137

    I ask because preliminary debugging suggests that it is not possible to access the "image" field of the input sample in

    https://github.com/THU-DA-6D-Pose-Group/GDR-Net/blob/5fb30c3dc53f46bac24a8a83a373eac7a8038556/core/gdrn_modeling/gdrn_evaluator.py#L678

    Possibly related issue: https://github.com/THU-DA-6D-Pose-Group/GDR-Net/issues/56

    opened by AlbertoRemus 8
  • Some question of the paper

    Hello, the paper describes the conversion from MXYZ to M2D-3D as follows: "$M_{2D-3D}$ can then be derived by stacking $M_{XYZ}$ onto the corresponding 2D pixel coordinates". However, I still do not understand why the $3\times64\times64$ $M_{XYZ}$ becomes a $2\times64\times64$ $M_{2D-3D}$. And why is this conversion needed at all? Would it not work to normalize the predicted XYZ and concatenate it directly with MSRA?

    opened by Mr2er0 8
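  • About switching to custom data

    Hello, Dr. Wang! Your work has helped me a great deal, and thank you very much for providing this open-source project. I would now like to train your model on my own data. In earlier issues you only mentioned how to process and organize custom data, not which parts of the code need to be modified to use it. Since training on the LM dataset required generating some files beforehand, I suspect that many places may need to change before the model can be applied to my own data. Could you explain this in detail? I look forward to your reply, and thanks again!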

    opened by micki-37 7
  • Questions about LM-O evaluation results

    Hi! Thanks for your great work. I ran the following command to get the evaluation results on LM-O, where 'GDR-Net-DATA' is the folder where I put the trained models:

    ./core/gdrn_modeling/test_gdrn.sh configs/gdrn/lmo/a6_cPnP_AugAAETrunc_BG0.5_lmo_real_pbr0.1_40e.py 1 GDR-Net-DATA/gdrn/lmo/a6_cPnP_AugAAETrunc_BG0.5_lmo_real_pbr0.1_40e/gdrn_lmo_real_pbr.pth

    [screenshot]

    Is 'ad_10' the 'Average Recall (%) of ADD(-S)' mentioned in Table 2 of the paper?

    opened by Liuchongpei 7
  • Zero recall value while evaluating on LMO dataset

    Hello @wangg12

    I tried to evaluate the GDR-Net model on the LMO dataset using the pretrained models you shared on OneDrive. I used the following command to run the evaluation:

    python core/gdrn_modeling/main_gdrn.py --config-file configs/gdrn/lmo/a6_cPnP_AugAAETrunc_BG0.5_lmo_real_pbr0.1_40e.py \
     --num-gpus 1 \
    --eval-only  \
    --opts MODEL.WEIGHTS=output/gdrn/lmo/a6_cPnP_AugAAETrunc_BG0.5_lmo_real_pbr0.1_40e/gdrn_lmo_real_pbr.pth
    

    However, it is showing zero recall values. Please see the screenshot below. Could you please help?

    Thank you, Supriya

    [image]

    opened by supriya-gdptl 6
  • evaluation failed for lmoSO

    Hi,

    When I train GDR-Net on the ape object of the LMO dataset with

    ./core/gdrn_modeling/train_gdrn.sh configs/gdrn/lmoSO/a6_cPnP_AugAAETrunc_BG0.5_lmo_real_pbr0.1_80e_SO/a6_cPnP_AugAAETrunc_BG0.5_lmo_real_pbr0.1_80e_ape.py 1
    

    I get the unexpected output at the end of log.txt:

    core.gdrn_modeling.test_utils [email protected]: evaluation failed.
    core.gdrn_modeling.test_utils [email protected]: =====================================================================
    core.gdrn_modeling.test_utils [email protected]: output/gdrn/lmoSO/a6_cPnP_AugAAETrunc_lmo_real_pbr0.1_80e_SO/ape/inference_model_final/lmo_test/a6-cPnP-AugAAETrunc-BG0.5-lmo-real-pbr0.1-80e-ape-test-iter0_lmo-test-bb8/error:ad_ntop:1 does not exist.
    

    Could you suggest how to fix it? Thanks!

    opened by RuyiLian 6
  • One drive link seems not working

    Hi, unfortunately the OneDrive link for the pretrained models gives the following error on different browsers. Do you have any insight about this?

    Thanks in advance,

    Alberto

    Screenshot from 2022-09-08 18-15-36

    opened by AlbertoRemus 5
  • About generating xyz_crop

    Hello, Dr. Wang. I ran into the following problem while using tools/lm/lm_pbr_1_gen_xyz_crop.py to generate the xyz_crop files.

    Traceback (most recent call last):
      File "tools/lm/lmo_pbr_1_gen_xyz_crop.py", line 228, in <module>
        xyz_gen.main()
      File "tools/lm/lmo_pbr_1_gen_xyz_crop.py", line 137, in main
        bgr_gl, depth_gl = self.get_renderer().render(render_obj_id, IM_W, IM_H, K, R, t, near, far)
      File "tools/lm/lmo_pbr_1_gen_xyz_crop.py", line 98, in get_renderer
        self.renderer = Renderer(
      File "/data/hsm/gdr/tools/lm/../../lib/meshrenderer/meshrenderer_phong.py", line 26, in __init__
        self._fbo = gu.Framebuffer(
      File "/data/hsm/gdr/tools/lm/../../lib/meshrenderer/gl_utils/fbo.py", line 22, in __init__
        glNamedFramebufferTexture(self.__id, k, attachement.id, 0)
      File "/data/hsm/env/gdrn2/lib/python3.8/site-packages/OpenGL/platform/baseplatform.py", line 415, in __call__
        return self( *args, **named )
    ctypes.ArgumentError: argument 1: <class 'TypeError'>: wrong type

    I think the error may be caused by a type mismatch for the k passed in at https://github.com/THU-DA-6D-Pose-Group/GDR-Net/blob/main/lib/meshrenderer/gl_utils/fbo.py#L19. The image shows the argument types that the glNamedFramebufferTexture function requires, as displayed in the debugger.

    [image]

    I could not find a similar problem in the existing issues. Does anyone have suggestions for solving this?

    need-more-info 
    opened by hellohaley 5
  • CUDA out of memory

    We run training with PBR-rendered data on eight GPUs in parallel (NVIDIA 2080 Ti, 12 GB of graphics memory). It barely starts training with a batch size of 8 (the original is 24), but when we resume training, CUDA runs out of memory.

    We'd like to know the author's training configuration...

    opened by GabrielleTse 5
  • Loss_region unable to converge

    The other losses decline significantly, but Loss_region drops only slightly. My training uses the config configs/gdrn/lm/a6_cPnP_lm13.py. Choosing 4, 16, or 64 for the regions does not bring any improvement.

    opened by lu-ming-lei 5
  • Generating test_bboxes/faster_R50_FPN_AugCosyAAE_HalfAnchor_lmo_pbr_lmo_fuse_real_all_8e_test_480x640.json file

    Hello @wangg12,

    Sorry to bother you again.

    Could you please tell me how to generate faster_R50_FPN_AugCosyAAE_HalfAnchor_lmo_pbr_lmo_fuse_real_all_8e_test_480x640.json in the lmo/test/test_bboxes folder?

    Which code did you run to obtain this file?

    Thank you, Supriya

    opened by supriya-gdptl 1