Extreme Rotation Estimation using Dense Correlation Volumes

This repository contains a PyTorch implementation of the paper:

Extreme Rotation Estimation using Dense Correlation Volumes [Project page] [Arxiv]

Ruojin Cai, Bharath Hariharan, Noah Snavely, Hadar Averbuch-Elor

CVPR 2021

Introduction

We present a technique for estimating the relative 3D rotation of an RGB image pair in an extreme setting, where the images have little or no overlap. We observe that, even when images do not overlap, there may be rich hidden cues as to their geometric relationship, such as light source directions, vanishing points, and symmetries present in the scene. We propose a network design that can automatically learn such implicit cues by comparing all pairs of points between the two input images. Our method therefore constructs dense feature correlation volumes and processes these to predict relative 3D rotations. Our predictions are formed over a fine-grained discretization of rotations, bypassing difficulties associated with regressing 3D rotations. We demonstrate our approach on a large variety of extreme RGB image pairs, including indoor and outdoor images captured under different lighting conditions and geographic locations. Our evaluation shows that our model can successfully estimate relative rotations among non-overlapping images without compromising performance over overlapping image pairs.
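As a rough illustration of predicting over a discretization of rotations instead of regressing them, the sketch below maps an angle to a bin and decodes a per-bin score vector by picking the best-scoring bin. The bin count (360), the bin-center convention, and all function names are assumptions made for illustration; they are not taken from this repository.

```python
def angle_to_bin(angle_deg, num_bins=360):
    """Map an angle in degrees to a bin index over [-180, 180), wrapping around."""
    width = 360.0 / num_bins
    return int((angle_deg + 180.0) // width) % num_bins


def bin_to_angle(index, num_bins=360):
    """Return the center angle (degrees) of a bin. Center convention is an assumption."""
    width = 360.0 / num_bins
    return -180.0 + (index + 0.5) * width


def decode_angle(scores):
    """Decode a per-bin score vector by taking the highest-scoring bin."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return bin_to_angle(best, num_bins=len(scores))


# Toy example: a one-hot "distribution" peaked at the bin containing 42.3 degrees.
scores = [0.0] * 360
scores[angle_to_bin(42.3)] = 1.0
decoded = decode_angle(scores)
```

With 1° bins, the decoded angle is within half a degree of the true angle, which is the price of treating the problem as classification.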

Overview of our Method:

Given a pair of images, a shared-weight Siamese encoder extracts feature maps. We compute a 4D correlation volume using the inner product of features, from which our model predicts the relative rotation (here, as distributions over Euler angles).
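The correlation step can be sketched in a few lines. The example below uses NumPy on random arrays that stand in for encoder outputs; the function name, the feature shapes, and the absence of any normalization are assumptions, and the repository's own PyTorch implementation may differ.

```python
import numpy as np


def correlation_volume(feat_a, feat_b):
    """Inner product between every pair of spatial locations.

    feat_a, feat_b: feature maps of shape (C, H, W), e.g. from a shared encoder.
    Returns a 4D array of shape (H, W, H, W).
    """
    if feat_a.shape != feat_b.shape:
        raise ValueError("feature maps must have the same shape")
    # corr[i, j, k, l] = <feat_a[:, i, j], feat_b[:, k, l]>
    return np.einsum("cij,ckl->ijkl", feat_a, feat_b)


# Random stand-ins for the two encoder outputs (hypothetical sizes).
rng = np.random.default_rng(0)
fa = rng.standard_normal((8, 4, 4))
fb = rng.standard_normal((8, 4, 4))
corr = correlation_volume(fa, fb)
```

Each entry of `corr` compares one location in the first image with one location in the second, so the volume covers all pairs of points regardless of whether the images overlap.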

Dependencies

# Create conda environment with python 3.6, torch 1.3.1 and CUDA 10.0
conda env create -f ./tools/environment.yml
conda activate rota

Dataset

Perspective images with a resolution of 256 × 256 and a 90° FoV are randomly sampled from panoramas. Yaw angles are sampled uniformly over the range [−180°, 180°]. To avoid generating textureless images that focus on the ceiling/sky or the floor, we limit the pitch angles to [−30°, 30°] for the indoor datasets and [−45°, 45°] for the outdoor dataset.
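The sampling ranges above can be sketched as follows. This is only an illustration: the function and dictionary names are hypothetical, and drawing pitch uniformly within its limits is an assumption (the text only states the limits). The actual image pairs are defined by the metadata files described below.

```python
import random

# Pitch limits in degrees, as stated above for indoor and outdoor datasets.
PITCH_LIMIT_DEG = {"indoor": 30.0, "outdoor": 45.0}


def sample_view_angles(kind, rng):
    """Sample (yaw, pitch) in degrees for one perspective crop.

    kind: "indoor" or "outdoor".
    rng: a random.Random instance, passed in for reproducibility.
    """
    limit = PITCH_LIMIT_DEG[kind]
    yaw = rng.uniform(-180.0, 180.0)
    # Assumption: pitch is drawn uniformly within the stated limits.
    pitch = rng.uniform(-limit, limit)
    return yaw, pitch


rng = random.Random(0)
yaw, pitch = sample_view_angles("outdoor", rng)
```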

Download InteriorNet, SUN360, and StreetLearn datasets to obtain the full panoramas.

Metadata files for the training and test image pairs are available in the following Google Drive: link. Download the metadata.zip file, unzip it, and put it under the project root directory.

We build on this MATLAB toolbox, which extracts perspective images from an input panorama. Before running PanoBasic/pano2perspective_script.m, modify the paths to the datasets and metadata files in the script.

Pretrained Model

Pretrained models are available in the following Google Drive: link. To use the pretrained models, download the pretrained.zip file, unzip it, and put it under the project root directory.

Testing the pretrained model:

The following commands test the performance of the pretrained models on the rotation estimation task. The commands output the mean and median geodesic error, and the percentage of pairs with a relative rotation error under 10°, for different levels of overlap on the test set.
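For reference, the reported metrics can be computed as sketched below. This uses the standard geodesic distance between rotation matrices, the angle of R_pred^T R_gt; the function names and the strict "under 10°" comparison are assumptions, and test.py may compute these quantities differently.

```python
import numpy as np


def rot_z(deg):
    """Rotation matrix about the z axis, used only to build toy inputs."""
    t = np.radians(deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def geodesic_error_deg(r_pred, r_gt):
    """Angle (degrees) of the rotation taking r_pred to r_gt."""
    cos_angle = (np.trace(r_pred.T @ r_gt) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))


def summarize(errors_deg, threshold_deg=10.0):
    """Mean, median, and percentage of errors strictly below the threshold."""
    e = np.asarray(errors_deg, dtype=float)
    return {
        "mean": float(e.mean()),
        "median": float(np.median(e)),
        "under_threshold_pct": float(100.0 * (e < threshold_deg).mean()),
    }


# Toy predictions and ground truths as (predicted, true) rotations about z.
pairs = [(10, 12), (50, 45), (0, 90)]
errors = [geodesic_error_deg(rot_z(p), rot_z(g)) for p, g in pairs]
stats = summarize(errors)
```

In this toy example one large failure pulls the mean far above the median, the same pattern visible in the non-overlapping rows of the results.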

# Usage:
# python test.py <config> --pretrained <checkpoint_filename>

python test.py configs/sun360/sun360_cv_distribution.yaml \
    --pretrained pretrained/sun360_cv_distribution.pt

python test.py configs/interiornet/interiornet_cv_distribution.yaml \
    --pretrained pretrained/interiornet_cv_distribution.pt

python test.py configs/interiornetT/interiornetT_cv_distribution.yaml \
    --pretrained pretrained/interiornetT_cv_distribution.pt

python test.py configs/streetlearn/streetlearn_cv_distribution.yaml \
    --pretrained pretrained/streetlearn_cv_distribution.pt

python test.py configs/streetlearnT/streetlearnT_cv_distribution.yaml \
    --pretrained pretrained/streetlearnT_cv_distribution.pt

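Rotation estimation results of the pretrained models are listed below. Avg and Med are the mean and median geodesic error in degrees, and <10° is the percentage of pairs with an error under 10°. Rows are grouped by dataset and by overlap level (Large, Small, None, All).

Dataset         Overlap   Avg (°)   Med (°)   <10°
InteriorNet     Large        1.82      0.88    98.76%
InteriorNet     Small        4.31      1.16    96.58%
InteriorNet     None        37.69      3.15    61.97%
InteriorNet     All         13.49      1.18    86.90%
InteriorNet-T   Large        8.86      1.86    93.13%
InteriorNet-T   Small       30.43      2.63    74.07%
InteriorNet-T   None        49.44      4.17    58.36%
InteriorNet-T   All         29.68      2.58    75.10%
SUN360          Large        1.37      1.09    99.51%
SUN360          Small        6.13      1.77    95.86%
SUN360          None        34.92      4.43    61.39%
SUN360          All         20.45      2.23    78.30%
StreetLearn     Large        1.38      1.12   100.00%
StreetLearn     Small        3.25      1.41    98.34%
StreetLearn     None         5.46      1.65    96.60%
StreetLearn     All          4.10      1.46    97.70%
StreetLearn-T   Large       24.98      2.50    78.95%
StreetLearn-T   Small       27.84      3.19    74.76%
StreetLearn-T   None        32.43      3.64    72.69%
StreetLearn-T   All         29.85      3.19    74.30%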

Training

# Usage:
# python train.py <config>

python train.py configs/interiornet/interiornet_cv_distribution.yaml

python train.py configs/interiornetT/interiornetT_cv_distribution.yaml

python train.py configs/sun360/sun360_cv_distribution_overlap.yaml
python train.py configs/sun360/sun360_cv_distribution.yaml --resume --pretrained <checkpoint_filename>

python train.py configs/streetlearn/streetlearn_cv_distribution_overlap.yaml
python train.py configs/streetlearn/streetlearn_cv_distribution.yaml --resume --pretrained <checkpoint_filename>

python train.py configs/streetlearnT/streetlearnT_cv_distribution_overlap.yaml
python train.py configs/streetlearnT/streetlearnT_cv_distribution.yaml --resume --pretrained <checkpoint_filename>

For the SUN360 and StreetLearn datasets, first train with only overlapping pairs (the *_overlap.yaml configs), then fine-tune from that pretrained model at epoch 10 by passing its checkpoint via --resume --pretrained. Additional configs for the baselines can be found in the folder configs/sun360.

Cite

Please cite our work if you find it useful:

@inproceedings{Cai2021Extreme,
 title={Extreme Rotation Estimation using Dense Correlation Volumes},
 author={Cai, Ruojin and Hariharan, Bharath and Snavely, Noah and Averbuch-Elor, Hadar},
 booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
 year={2021}
}

Acknowledgment

This work was supported in part by the National Science Foundation (IIS-2008313) and by the generosity of Eric and Wendy Schmidt by recommendation of the Schmidt Futures program and the Zuckerman STEM leadership program.
