Instance-wise Occlusion and Depth Orders in Natural Scenes (CVPR 2022)

Last update: Dec 27, 2022

Official source code for the paper, which appears at CVPR 2022.

This repository provides InstaOrder, a new dataset for understanding the geometric relationships between instances in an image. The dataset consists of 2.9M annotations of geometric orderings for class-labeled instances in 101K natural scenes. The scenes were annotated by 3,659 crowd-workers with (1) an occlusion order that identifies the occluder and occludee, and (2) a depth order that describes ordinal relations based on relative distance from the camera. This repository also introduces a geometric order prediction network called InstaOrderNet, which outperforms state-of-the-art approaches in the paper's experiments.
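The two kinds of pairwise order described above can be pictured with a small data structure. The sketch below is purely illustrative: the class names, fields, and label sets are assumptions made for this example and do not reflect the actual annotation schema of InstaOrder.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Occlusion(Enum):
    """Hypothetical labels for the occlusion order of an instance pair (a, b)."""
    NONE = "neither instance occludes the other"
    A_OCCLUDES_B = "a is the occluder, b the occludee"
    B_OCCLUDES_A = "b is the occluder, a the occludee"


class Depth(Enum):
    """Hypothetical labels for the ordinal depth order of an instance pair (a, b)."""
    A_CLOSER = "a is closer to the camera"
    B_CLOSER = "b is closer to the camera"


class PairOrder(NamedTuple):
    """Geometric order between two instances of one image (illustrative only)."""
    a: int  # id of the first instance
    b: int  # id of the second instance
    occlusion: Occlusion
    depth: Depth

    def occluder(self) -> Optional[int]:
        """Return the id of the occluder, or None when there is no occlusion."""
        if self.occlusion is Occlusion.A_OCCLUDES_B:
            return self.a
        if self.occlusion is Occlusion.B_OCCLUDES_A:
            return self.b
        return None

    def closer(self) -> int:
        """Return the id of the instance that is closer to the camera."""
        return self.a if self.depth is Depth.A_CLOSER else self.b


pair = PairOrder(a=3, b=7, occlusion=Occlusion.B_OCCLUDES_A, depth=Depth.B_CLOSER)
print(pair.occluder(), pair.closer())
```

Keeping occlusion and depth as separate fields mirrors the fact that the dataset annotates them as two distinct orders.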

Installation

This code was developed with Anaconda (Python 3.6), PyTorch 1.7.1, torchvision 0.8.2, and CUDA 10.1. Set up the environment as follows:

# build conda environment
conda create --name order python=3.6
conda activate order

# install requirements
pip install -r requirements.txt

# install COCO API
pip install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
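After installation, it can help to confirm that the environment matches the versions listed above. The following is a minimal sketch of such a check; the helper `check_versions` and the `EXPECTED` mapping are hypothetical names introduced here and are not part of the repository.

```python
import importlib

# Versions named in the installation notes above; this mapping is an
# illustrative helper, not something shipped with the repository.
EXPECTED = {"torch": "1.7.1", "torchvision": "0.8.2"}


def check_versions(expected, import_module=importlib.import_module):
    """Return {package: (expected, found)} for packages that differ or are missing."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = getattr(import_module(name), "__version__", None)
        except ImportError:
            found = None  # the package is not installed
        # Ignore local build suffixes such as "+cu101" when comparing.
        if found is None or str(found).split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Calling `check_versions(EXPECTED)` inside the activated `order` environment returns an empty dict when both packages match, and otherwise maps each package to the expected and found versions.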

Visualization

See InstaOrder_vis.ipynb to visualize the InstaOrder dataset, including instance masks, occlusion order, and depth order.

Training

The experiments folder contains the training and test scripts for the experiments reported in the paper.

To train {MODEL} with {DATASET}:

Download {DATASET} as described in the Datasets section.
Set ${base_dir} correctly in experiments/{DATASET}/{MODEL}/config.yaml
(Optional) To train InstaDepthNet, download the MiDaS-v2.1 checkpoint model-f6b98070.pt into ${base_dir}/data/out/InstaOrder_ckpt

Run the training script as follows:

sh experiments/{DATASET}/{MODEL}/train.sh

# Example of training InstaOrderNet^o (Table 3 in the main paper) from scratch
sh experiments/InstaOrder/InstaOrderNet_o/train.sh
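The script path follows the experiments/{DATASET}/{MODEL}/ pattern shown above, so it can be assembled programmatically when several experiments are launched in a row. This is a sketch only; `script_path` and `build_command` are hypothetical helpers that are not part of the repository.

```python
from pathlib import Path


def script_path(dataset, model, stage, repo_root="."):
    """Build experiments/{DATASET}/{MODEL}/{stage}.sh relative to repo_root."""
    if stage not in ("train", "test"):
        raise ValueError(f"stage must be 'train' or 'test', got {stage!r}")
    return Path(repo_root) / "experiments" / dataset / model / f"{stage}.sh"


def build_command(dataset, model, stage, repo_root="."):
    """Return the argv list for `sh <script>`, failing early if the script is absent."""
    script = script_path(dataset, model, stage, repo_root)
    if not script.is_file():
        raise FileNotFoundError(f"no such script: {script}")
    return ["sh", str(script)]


print(script_path("InstaOrder", "InstaOrderNet_o", "train").as_posix())
# prints: experiments/InstaOrder/InstaOrderNet_o/train.sh
```

The list returned by `build_command` can be passed to `subprocess.run` from the repository root; checking that the script exists first makes a typo in {DATASET} or {MODEL} fail with a clear message.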

Inference

Download the pretrained models InstaOrder_ckpt.zip (3.5G) and unzip them into the structure below. Pretrained models are named {DATASET}_{MODEL}.pth.tar.

${base_dir}
|--data
|    |--out
|    |    |--InstaOrder_ckpt
|    |    |    |--COCOA_InstaOrderNet_o.pth.tar
|    |    |    |--COCOA_OrderNet.pth.tar
|    |    |    |--COCOA_pcnet_m.pth.tar
|    |    |    |--InstaOrder_InstaDepthNet_d.pth.tar
|    |    |    |--InstaOrder_InstaDepthNet_od.pth.tar
|    |    |    |--InstaOrder_InstaOrderNet_d.pth.tar
|    |    |    |--InstaOrder_InstaOrderNet_o.pth.tar
|    |    |    |--InstaOrder_InstaOrderNet_od.pth.tar
|    |    |    |--InstaOrder_OrderNet.pth.tar
|    |    |    |--InstaOrder_OrderNet_ext.pth.tar  
|    |    |    |--InstaOrder_pcnet_m.pth.tar
|    |    |    |--KINS_InstaOrderNet_o.pth.tar
|    |    |    |--KINS_OrderNet.pth.tar
|    |    |    |--KINS_pcnet_m.pth.tar

(Optional) To test InstaDepthNet, download the MiDaS-v2.1 checkpoint model-f6b98070.pt into ${base_dir}/data/out/InstaOrder_ckpt
Set ${base_dir} correctly in experiments/{DATASET}/{MODEL}/config.yaml
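Because the checkpoints follow the {DATASET}_{MODEL}.pth.tar naming scheme under ${base_dir}/data/out/InstaOrder_ckpt, their locations can be derived rather than typed by hand. The sketch below assumes only that layout; `checkpoint_path`, `missing_checkpoints`, and `PAIRS` are hypothetical names introduced for illustration.

```python
from pathlib import Path


def checkpoint_path(base_dir, dataset, model):
    """Return ${base_dir}/data/out/InstaOrder_ckpt/{DATASET}_{MODEL}.pth.tar."""
    ckpt_dir = Path(base_dir) / "data" / "out" / "InstaOrder_ckpt"
    return ckpt_dir / f"{dataset}_{model}.pth.tar"


def missing_checkpoints(base_dir, pairs):
    """List the expected checkpoint files that are not present on disk."""
    paths = (checkpoint_path(base_dir, d, m) for d, m in pairs)
    return [p for p in paths if not p.is_file()]


# A few (DATASET, MODEL) pairs taken from the tree above.
PAIRS = [
    ("InstaOrder", "InstaOrderNet_o"),
    ("COCOA", "OrderNet"),
    ("KINS", "pcnet_m"),
]
```

Running `missing_checkpoints(base_dir, PAIRS)` before a test script gives a quick list of files that still need to be unzipped into place.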

To test {MODEL} with {DATASET}, run the test script as follows:

sh experiments/{DATASET}/{MODEL}/test.sh

# Example of reproducing the accuracy of InstaOrderNet^o (Table 3 in the main paper)
sh experiments/InstaOrder/InstaOrderNet_o/test.sh

Datasets

InstaOrder dataset

To use InstaOrder, download the files and arrange them in the following structure:

${base_dir}
|--data
|    |--COCO
|    |    |--train2017/
|    |    |--val2017/
|    |    |--annotations/
|    |    |    |--instances_train2017.json
|    |    |    |--instances_val2017.json
|    |    |    |--InstaOrder_train2017.json
|    |    |    |--InstaOrder_val2017.json

COCOA dataset

To use COCOA, download the files and arrange them in the following structure:

${base_dir}
|--data
|    |--COCO
|    |    |--train2014/
|    |    |--val2014/
|    |    |--annotations/
|    |    |    |--COCO_amodal_train2014.json 
|    |    |    |--COCO_amodal_val2014.json

KINS dataset

To use KINS, download the files and arrange them in the following structure:

${base_dir}
|--data
|    |--KINS
|    |    |--training/
|    |    |--testing/
|    |    |--instances_val.json
|    |    |--instances_train.json

DIW dataset

To use DIW, download the files and arrange them in the following structure:

${base_dir}
|--data
|    |--DIW
|    |    |--DIW_test/
|    |    |--DIW_Annotations
|    |    |    |--DIW_test.csv
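Since the training and test scripts depend on these layouts, a quick check of ${base_dir} can catch a misplaced file before a run starts. The sketch below copies the relative paths from the directory trees in this section; the `LAYOUTS` table and the `missing_entries` helper are hypothetical and not part of the repository.

```python
from pathlib import Path

# Paths relative to ${base_dir}, copied from the directory trees above.
LAYOUTS = {
    "InstaOrder": [
        "data/COCO/train2017",
        "data/COCO/val2017",
        "data/COCO/annotations/instances_train2017.json",
        "data/COCO/annotations/instances_val2017.json",
        "data/COCO/annotations/InstaOrder_train2017.json",
        "data/COCO/annotations/InstaOrder_val2017.json",
    ],
    "COCOA": [
        "data/COCO/train2014",
        "data/COCO/val2014",
        "data/COCO/annotations/COCO_amodal_train2014.json",
        "data/COCO/annotations/COCO_amodal_val2014.json",
    ],
    "KINS": [
        "data/KINS/training",
        "data/KINS/testing",
        "data/KINS/instances_val.json",
        "data/KINS/instances_train.json",
    ],
    "DIW": [
        "data/DIW/DIW_test",
        "data/DIW/DIW_Annotations/DIW_test.csv",
    ],
}


def missing_entries(base_dir, dataset):
    """Return the expected files or folders of `dataset` that do not exist yet."""
    base = Path(base_dir)
    return [rel for rel in LAYOUTS[dataset] if not (base / rel).exists()]
```

An empty list from `missing_entries(base_dir, "KINS")`, for example, means every path listed for KINS is in place under the given base directory.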

Citing InstaOrder

If you find this code or data useful in your research, please cite our paper:

@inproceedings{lee2022instaorder,
  title={{Instance-wise Occlusion and Depth Orders in Natural Scenes}},
  author={Hyunmin Lee and Jaesik Park},
  booktitle={Proceedings of the {IEEE} Conference on Computer Vision and Pattern Recognition},
  year={2022}
}

Acknowledgement

We referred to and borrowed from the implementations by Xiaohang Zhan.
