MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera

Last update: Jan 06, 2023

Overview

MonoRec

Paper | Video (CVPR) | Video (Reconstruction) | Project Page

This repository is the official implementation of the paper:

MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera

Felix Wimbauer*, Nan Yang*, Lukas von Stumberg, Niclas Zeller and Daniel Cremers

CVPR 2021 (arXiv)

If you find our work useful, please consider citing our paper:

@InProceedings{wimbauer2020monorec,
  title = {{MonoRec}: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera},
  author = {Wimbauer, Felix and Yang, Nan and von Stumberg, Lukas and Zeller, Niclas and Cremers, Daniel},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2021},
}

🏗️ Setup

The conda environment for this project can be set up by running the following command:

conda env create -f environment.yml

🏃 Running the Example Script

We provide a sample from the KITTI Odometry test set and a script to run MonoRec on it in example/. To download the pretrained model and put it into the right place, run download_model.sh. Alternatively, you can do this manually by downloading the weights from here and unpacking the file to saved/checkpoints/monorec_depth_ref.pth. The example script will plot the keyframe, depth prediction and mask prediction.

cd example
python test_monorec.py
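
Before running the example, it can help to confirm that the checkpoint ended up at the path named above. The following is a minimal sketch, assuming it is launched from the repository root; the helper name checkpoint_ready is hypothetical and not part of the repository.

```python
from pathlib import Path

# Path taken from the instructions above, relative to the repository root.
CHECKPOINT = Path("saved/checkpoints/monorec_depth_ref.pth")


def checkpoint_ready(repo_root="."):
    """Return True if the pretrained weights file exists and is non-empty.

    Hypothetical helper for illustration; not part of the MonoRec code base.
    """
    path = Path(repo_root) / CHECKPOINT
    return path.is_file() and path.stat().st_size > 0


if __name__ == "__main__":
    if checkpoint_ready():
        print(f"Found {CHECKPOINT}")
    else:
        print(f"Missing {CHECKPOINT}; run download_model.sh first")
```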

🗃️ Data

In all of our experiments we used the KITTI Odometry dataset for training. For additional evaluations, we used the KITTI, Oxford RobotCar, TUM Mono-VO and TUM RGB-D datasets. All data paths can be specified in the respective configuration files. In our experiments, we put all datasets into a separate folder ../data.

KITTI Odometry

To set up KITTI Odometry, download the color images and calibration files from the official website (around 145 GB). Instead of the given velodyne laser data files, we use the improved ground truth depth for evaluation, which can be downloaded from here.

Unzip the color images and calibration files into ../data. The lidar depth maps can be extracted into the given folder structure by running data_loader/scripts/preprocess_kitti_extract_annotated_depth.py.

For training and evaluation, we use the poses estimated by Deep Virtual Stereo Odometry (DVSO). They can be downloaded from here and should be placed under ../data/{kitti_path}/poses_dso. This folder structure is ensured when unpacking the zip file in the {kitti_path} directory.

The auxiliary moving object masks can be downloaded from here. They should be placed under ../data/{kitti_path}/sequences/{seq_num}/mvobj_mask. This folder structure is ensured when unpacking the zip file in the {kitti_path} directory.
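
The folder layout described above can be checked with a short script before evaluation. This is a sketch under stated assumptions: the function name missing_kitti_paths is hypothetical, poses_dso is assumed to be a directory, and the root path and sequence ids in the usage part are placeholders for {kitti_path} and {seq_num}.

```python
from pathlib import Path


def missing_kitti_paths(kitti_root, sequences):
    """List expected KITTI Odometry sub-folders that do not exist yet.

    Hypothetical helper; it only checks the two locations named in this README.
    """
    root = Path(kitti_root)
    expected = [root / "poses_dso"]
    for seq in sequences:
        expected.append(root / "sequences" / seq / "mvobj_mask")
    return [str(p) for p in expected if not p.is_dir()]


if __name__ == "__main__":
    # Placeholder root and sequence ids; replace with your own {kitti_path} and {seq_num}.
    for path in missing_kitti_paths("../data/dataset", ["00", "01"]):
        print("missing:", path)
```

An empty result means both the DVSO poses folder and the mask folders for the listed sequences are in place.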

Oxford RobotCar

To set up Oxford RobotCar, download the camera model files and the large sample from the official website. The code and the camera extrinsics need to be downloaded from the official GitHub repository. Please move the content of the python folder to data_loaders/oxford_robotcar/. The folders extrinsics/, models/ and sample/ need to be moved to ../data/oxford_robotcar/. Note that for poses we use the official visual odometry poses, which are not provided in the large sample. They need to be downloaded manually from the raw dataset and unpacked into the sample folder.

TUM Mono-VO

Unfortunately, TUM Mono-VO images are provided only in the original, distorted form. Therefore, they need to be undistorted before being fed into MonoRec. To obtain poses for the sequences, we run the publicly available version of Direct Sparse Odometry.
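
As an illustration of what such an undistortion step can look like, here is a minimal NumPy sketch for the FOV (arctangent) distortion model, where the distorted radius is r_d = atan(2 r_u tan(omega / 2)) / omega. Whether this model and these parameters match your calibration files is an assumption to verify against the dataset documentation. The function name undistort_fov is hypothetical, the intrinsics are assumed to be in pixels, and nearest-neighbour sampling is used only to keep the example short.

```python
import numpy as np


def undistort_fov(image, fx, fy, cx, cy, omega):
    """Resample `image` to remove FOV-model distortion with parameter `omega`.

    Assumptions: intrinsics are in pixel units and are reused unchanged for
    the output image; pixels that map outside the input are set to zero.
    """
    h, w = image.shape[:2]
    # Pixel grid of the undistorted output image.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Normalized undistorted coordinates and their radius.
    x = (u - cx) / fx
    y = (v - cy) / fy
    r_u = np.sqrt(x * x + y * y)
    # FOV model: distorted radius as a function of the undistorted radius.
    r_d = np.arctan(2.0 * r_u * np.tan(omega / 2.0)) / omega
    # Ratio r_d / r_u; at the principal point use its analytic limit.
    centre_scale = 2.0 * np.tan(omega / 2.0) / omega
    scale = np.where(r_u > 1e-12, r_d / np.maximum(r_u, 1e-12), centre_scale)
    # Location in the distorted input image to sample from (nearest neighbour).
    u_d = np.rint(fx * x * scale + cx).astype(int)
    v_d = np.rint(fy * y * scale + cy).astype(int)
    inside = (u_d >= 0) & (u_d < w) & (v_d >= 0) & (v_d < h)
    out = np.zeros_like(image)
    out[inside] = image[v_d[inside], u_d[inside]]
    return out


if __name__ == "__main__":
    # Synthetic image and made-up intrinsics, purely for demonstration.
    img = np.random.rand(480, 640).astype(np.float32)
    result = undistort_fov(img, fx=300.0, fy=300.0, cx=320.0, cy=240.0, omega=0.9)
    print(result.shape)
```

For real data, bilinear interpolation and a deliberate choice of output intrinsics would usually be preferable to this nearest-neighbour version.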

TUM RGB-D

The official sequences can be downloaded from the official website and need to be unpacked under ../data/tumrgbd/{sequence_name}. Note that our provided dataset implementation assumes intrinsics from fr3 sequences. The data loader for this dataset also relies on the code from the Oxford RobotCar dataset.

🏋️ Training & Evaluation

Please stay tuned! Training code will be published soon!

We provide checkpoints for each training stage:

Training stage	Download
Depth Bootstrap	Link
Mask Bootstrap	Link
Mask Refinement	Link
Depth Refinement (final model)	Link

Run download_model.sh to download the final model. It will automatically get moved to saved/checkpoints.

To reproduce the evaluation results on different datasets, run the following commands:

python evaluate.py --config configs/evaluate/eval_monorec.json        # KITTI Odometry
python evaluate.py --config configs/evaluate/eval_monorec_oxrc.json   # Oxford RobotCar

☁️ Pointclouds

To reproduce the pointclouds depicted in the paper and video, use the following commands:

python create_pointcloud.py --config configs/test/pointcloud_monorec.json       # KITTI Odometry
python create_pointcloud.py --config configs/test/pointcloud_monorec_oxrc.json  # Oxford RobotCar
python create_pointcloud.py --config configs/test/pointcloud_monorec_tmvo.json  # TUM Mono-VO
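
The evaluation and point cloud commands listed in this README can also be run in sequence from one script. The sketch below only reuses the script and config paths shown above; the helper names build_command and run_all are hypothetical, and it assumes the script is started from the repository root inside the conda environment.

```python
import subprocess
import sys

# Script/config pairs copied from the commands in this README.
RUNS = [
    ("evaluate.py", "configs/evaluate/eval_monorec.json"),               # KITTI Odometry
    ("evaluate.py", "configs/evaluate/eval_monorec_oxrc.json"),          # Oxford RobotCar
    ("create_pointcloud.py", "configs/test/pointcloud_monorec.json"),      # KITTI Odometry
    ("create_pointcloud.py", "configs/test/pointcloud_monorec_oxrc.json"), # Oxford RobotCar
    ("create_pointcloud.py", "configs/test/pointcloud_monorec_tmvo.json"), # TUM Mono-VO
]


def build_command(script, config):
    """Build the argument list for one run using the current Python interpreter."""
    return [sys.executable, script, "--config", config]


def run_all(dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for script, config in RUNS:
        cmd = build_command(script, config)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the batch at the first failing run.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

With dry_run=True the script only prints the commands, which is a cheap way to check the list before starting long runs.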

Owner

Felix Wimbauer
