DRO: Deep Recurrent Optimizer for Structure-from-Motion

This is the official PyTorch implementation of DRO-sfm. For technical details, please refer to:

DRO: Deep Recurrent Optimizer for Structure-from-Motion
Xiaodong Gu*, Weihao Yuan*, Zuozhuo Dai, Chengzhou Tang, Siyu Zhu, Ping Tan
[Paper]

BibTeX

If you find this code useful in your research, please cite:

@article{gu2021dro,
  title={DRO: Deep Recurrent Optimizer for Structure-from-Motion},
  author={Gu, Xiaodong and Yuan, Weihao and Dai, Zuozhuo and Tang, Chengzhou and Zhu, Siyu and Tan, Ping},
  journal={arXiv preprint arXiv:2103.13201},
  year={2021}
}

Contents

  1. Install
  2. Datasets
  3. Training
  4. Evaluation
  5. Models

Install

  • We recommend using nvidia-docker2 for a reproducible environment.
git clone https://github.com/aliyun/dro-sfm.git
cd dro-sfm
sudo make docker-build
sudo make docker-start-interactive

You can also download the prebuilt Docker image directly from dro-sfm-image.tar and load it:

docker load < dro-sfm-image.tar
  • If you do not use Docker, you can create an environment by following the steps in the Dockerfile.
# Environment variables
export PYTORCH_VERSION=1.4.0
export TORCHVISION_VERSION=0.5.0
export NCCL_VERSION=2.4.8-1+cuda10.1
export HOROVOD_VERSION=65de4c961d1e5ad2828f2f6c4329072834f27661
# Install NCCL
sudo apt-get install libnccl2=${NCCL_VERSION} libnccl-dev=${NCCL_VERSION}

# Install Open MPI
mkdir /tmp/openmpi && \
    cd /tmp/openmpi && \
    wget https://www.open-mpi.org/software/ompi/v4.0/downloads/openmpi-4.0.0.tar.gz && \
    tar zxf openmpi-4.0.0.tar.gz && \
    cd openmpi-4.0.0 && \
    ./configure --enable-orterun-prefix-by-default && \
    make -j $(nproc) all && \
    make install && \
    ldconfig && \
    rm -rf /tmp/openmpi

# Install PyTorch
pip install torch==${PYTORCH_VERSION} torchvision==${TORCHVISION_VERSION} && ldconfig

# Install horovod (for distributed training)
sudo ldconfig /usr/local/cuda/targets/x86_64-linux/lib/stubs && HOROVOD_GPU_ALLREDUCE=NCCL HOROVOD_GPU_BROADCAST=NCCL HOROVOD_WITH_PYTORCH=1 pip install --no-cache-dir git+https://github.com/horovod/horovod.git@${HOROVOD_VERSION} && sudo ldconfig

To verify that the environment is set up correctly, you can run a simple overfitting test:

# download a tiny subset of KITTI
cd dro-sfm
curl -s https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/dro-sfm/datasets/KITTI_tiny.tar | tar xv -C /data/datasets/kitti/
# in docker
./run.sh "python scripts/train.py configs/overfit_kitti_mf_gt.yaml" log.txt
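
Before launching the overfitting test, it can help to confirm that the pinned package versions are the ones actually installed. The following is a minimal sketch and not part of the repository: the helper name `check_environment` and the `EXPECTED` table are assumptions for illustration, with the version numbers copied from the environment variables above. It relies only on the standard library (`importlib.metadata`, available from Python 3.8), so it runs even when PyTorch is missing.

```python
from importlib import metadata

# Versions pinned in the install steps above (PYTORCH_VERSION, TORCHVISION_VERSION).
# Horovod is installed from a git commit, so only its presence is checked (None).
EXPECTED = {"torch": "1.4.0", "torchvision": "0.5.0", "horovod": None}


def check_environment(expected=EXPECTED):
    """Return {package: (installed_version_or_None, ok)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed is None:
            ok = False
        elif wanted is None:
            ok = True
        else:
            # Local build tags such as "1.4.0+cu101" still count as a match.
            ok = installed.split("+")[0] == wanted
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_environment().items():
        status = "OK" if ok else "MISMATCH or missing"
        print(f"{name:12s} installed={installed} -> {status}")
```

A mismatch here does not necessarily mean training will fail; it only flags a difference from the versions listed in the Dockerfile steps.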

Datasets

Datasets are expected under /data/datasets/ (this path can be a symbolic link).
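
If the data lives on another disk, the symbolic link can be created once up front. The sketch below is an illustration only: `link_dataset_root` is a hypothetical helper, not something the repository provides, and creating a link under /data usually requires sufficient permissions.

```python
import os
from pathlib import Path


def link_dataset_root(actual_dir, link_path="/data/datasets"):
    """Make link_path a symbolic link to actual_dir if nothing exists there yet."""
    actual = Path(actual_dir).resolve()
    link = Path(link_path)
    if not actual.is_dir():
        raise FileNotFoundError(f"dataset directory not found: {actual}")
    if link.is_symlink() or link.exists():
        # Leave an existing directory or link untouched; report where it points.
        return link.resolve()
    link.parent.mkdir(parents=True, exist_ok=True)
    os.symlink(actual, link, target_is_directory=True)
    return link.resolve()


# Example (hypothetical storage path):
# link_dataset_root("/mnt/storage/datasets")
```

The helper never overwrites an existing /data/datasets, so calling it repeatedly is harmless.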

KITTI

The KITTI (raw) dataset used in our experiments can be downloaded from the KITTI website. For convenience, you can also download the data from packnet or here.

Tiny KITTI

For simple tests, you can download a "tiny" version of KITTI (KITTI_tiny.tar; the download command is given in the Install section).

ScanNet

The ScanNet (raw) dataset used in our experiments can be downloaded from the ScanNet website. For convenience, you can also download the data from here.

DeMoN

Download DeMoN.

bash download_traindata.sh
python ./dataset/preparation/preparedata_train.py
bash download_testdata.sh
python ./dataset/preparation/preparedata_test.py

Training

Any training, including fine-tuning, can be done by passing either a .yaml config file or a .ckpt model checkpoint to scripts/train.py:

# kitti, checkpoints will be saved in ./results/mdoel/
./run.sh 'python scripts/train.py  configs/train_kitti_mf_gt.yaml' logs/kitti_sup.txt
./run.sh 'python scripts/train.py  configs/train_kitti_mf_selfsup.yaml' logs/kitti_selfsup.txt 

# scannet
./run.sh 'python scripts/train.py  configs/train_scannet_mf_gt_view3.yaml' logs/scannet_sup.txt
./run.sh 'python scripts/train.py  configs/train_scannet_mf_selfsup_view3.yaml' logs/scannet_selfsup.txt
./run.sh 'python scripts/train.py  configs/train_scannet_mf_gt_view5.yaml' logs/scannet_sup_view5.txt

# demon
./run.sh 'python scripts/train.py  configs/train_demon_mf_gt.yaml' logs/demon_sup.txt
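
Because `scripts/train.py` accepts either kind of file, a wrapper script that launches several runs may want to check what it is about to pass. The following is a minimal sketch based only on the file extensions named above; `training_input_kind` is a hypothetical helper and does not describe how the repository itself parses its argument.

```python
from pathlib import Path


def training_input_kind(path):
    """Classify a train.py argument as 'config' (.yaml) or 'checkpoint' (.ckpt)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".yaml":
        return "config"      # start from a configuration file
    if suffix == ".ckpt":
        return "checkpoint"  # resume or fine-tune from a saved model
    raise ValueError(f"expected a .yaml or .ckpt file, got: {path}")
```

Rejecting other extensions early avoids starting a long job with a mistyped path.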

Evaluation

python scripts/eval.py --checkpoint <checkpoint.ckpt> [--config <config.yaml>]
# example: KITTI; results will be saved in results/depth/
python scripts/eval.py --checkpoint ckpt/outdoor_kitti.ckpt --config configs/train_kitti_mf_gt.yaml

You can also run inference directly on a video, a folder of images, or a single image:

# video or folder
# indoor-scannet
python scripts/infer_video.py --checkpoint ckpt/indoor_sacnnet.ckpt --input /path/to/video_or_folder --output /path/to/save_folder --sample_rate 1 --data_type scannet --ply_mode
# indoor-general
python scripts/infer_video.py --checkpoint ckpt/indoor_sacnnet.ckpt --input /path/to/video_or_folder --output /path/to/save_folder --sample_rate 1 --data_type general --ply_mode

# outdoor
python scripts/infer_video.py --checkpoint ckpt/outdoor_kitti.ckpt --input /path/to/video_or_folder --output /path/to/save_folder --sample_rate 1 --data_type kitti --ply_mode

# image
python scripts/infer.py --checkpoint <checkpoint.ckpt> --input <image or folder> --output <image or folder>
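
When running `scripts/infer_video.py` over many inputs, assembling the command in Python avoids quoting mistakes with paths that contain spaces. This is a sketch under stated assumptions: `build_infer_video_cmd` is a hypothetical helper, it uses only the flags that appear in the commands above, and it restricts `--data_type` to the three values shown there (the script may accept others).

```python
import shlex


def build_infer_video_cmd(checkpoint, input_path, output_dir,
                          data_type, sample_rate=1, ply_mode=True):
    """Return the argument list for scripts/infer_video.py."""
    # Only the data types used in the example commands are accepted here.
    if data_type not in ("scannet", "general", "kitti"):
        raise ValueError(f"unknown data_type: {data_type}")
    cmd = [
        "python", "scripts/infer_video.py",
        "--checkpoint", str(checkpoint),
        "--input", str(input_path),
        "--output", str(output_dir),
        "--sample_rate", str(sample_rate),
        "--data_type", data_type,
    ]
    if ply_mode:
        cmd.append("--ply_mode")
    return cmd


cmd = build_infer_video_cmd("ckpt/outdoor_kitti.ckpt", "/path/to/my video.mp4",
                            "/path/to/save_folder", data_type="kitti")
# shlex.join (Python 3.8+) renders the list as a safely quoted shell command.
print(shlex.join(cmd))
```

The returned list can be passed as-is to `subprocess.run(cmd, check=True)` from the repository root.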

Models

Model               Abs.Rel.  Sqr.Rel  RMSE   RMSElog  a1     a2     a3     SILog  L1_inv  rot_ang  t_ang   t_cm
Kitti_sup           0.045     0.193    2.570  0.080    0.971  0.994  0.998  0.079  0.003   -        -       -
Kitti_selfsup       0.053     0.346    3.037  0.102    0.962  0.990  0.996  0.101  0.004   -        -       -
scannet_sup         0.053     0.017    0.165  0.080    0.967  0.994  0.998  0.078  0.033   0.472    9.297   1.160
scannet_sup(view5)  0.047     0.014    0.151  0.072    0.976  0.996  0.999  0.071  0.030   0.456    8.502   1.163
scannet_selfsup     0.143     0.345    0.656  0.274    0.896  0.954  0.969  0.272  0.106   0.609    10.779  1.393
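
For readers unfamiliar with the depth columns, the sketch below computes Abs.Rel., Sqr.Rel, RMSE, RMSElog and the a1/a2/a3 accuracies using the definitions commonly used in the monocular-depth literature. It is an assumption that the repository's evaluation follows these exact formulas; its own code may additionally apply masking, depth capping or scale alignment. SILog, L1_inv and the pose columns are left out. `depth_metrics` is a hypothetical helper written in plain Python for clarity.

```python
import math


def depth_metrics(gt, pred):
    """Compute common depth-error metrics over paired positive depth values."""
    # Keep only pairs where both ground truth and prediction are valid (> 0).
    pairs = [(g, p) for g, p in zip(gt, pred) if g > 0 and p > 0]
    if not pairs:
        raise ValueError("no valid depth pairs")
    n = len(pairs)
    abs_rel = sum(abs(g - p) / g for g, p in pairs) / n
    sqr_rel = sum((g - p) ** 2 / g for g, p in pairs) / n
    rmse = math.sqrt(sum((g - p) ** 2 for g, p in pairs) / n)
    rmse_log = math.sqrt(
        sum((math.log(g) - math.log(p)) ** 2 for g, p in pairs) / n
    )
    # Threshold accuracy: share of pixels with max(gt/pred, pred/gt) < 1.25**k.
    ratios = [max(g / p, p / g) for g, p in pairs]
    a1, a2, a3 = (sum(r < 1.25 ** k for r in ratios) / n for k in (1, 2, 3))
    return {
        "abs_rel": abs_rel, "sqr_rel": sqr_rel, "rmse": rmse,
        "rmse_log": rmse_log, "a1": a1, "a2": a2, "a3": a3,
    }
```

Lower is better for the four error metrics, and higher is better for a1, a2 and a3.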

Acknowledgements

Thanks to Toyota Research Institute for open-sourcing their excellent work packnet-sfm. Thanks to Zachary Teed for open-sourcing his excellent work RAFT.
