This is the official implementation of the paper "Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation".

Last update: May 03, 2022

ObjProp

Introduction

This is the official implementation of the paper "Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation".

Installation

This repo is built using mmdetection. To install the dependencies, first clone the repository locally:

git clone https://github.com/anirudh-chakravarthy/objprop.git

Then, install PyTorch 1.1.0, torchvision 0.3.0, mmcv 0.2.12:

conda install pytorch==1.1.0 torchvision==0.3.0 -c pytorch
pip install mmcv==0.2.12

Then, install the CocoAPI for YouTube-VIS:

conda install cython scipy
pip install git+https://github.com/youtubevos/cocoapi.git#"egg=pycocotools&subdirectory=PythonAPI"
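After installing, it can help to confirm that the environment actually holds the pinned versions listed above. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `check_pins` are hypothetical, and it assumes the packages are registered under the distribution names `torch`, `torchvision`, and `mmcv`.

```python
# Pinned versions taken from the installation commands above.
PINNED = {"torch": "1.1.0", "torchvision": "0.3.0", "mmcv": "0.2.12"}


def installed_version(name):
    """Return the installed version string for a distribution, or None."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        # Older interpreters: fall back to setuptools' pkg_resources.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_pins(pins=PINNED, lookup=installed_version):
    """Map each package to (expected, found, matches).

    A local build suffix such as "+cu100" is ignored when comparing.
    """
    report = {}
    for name, expected in pins.items():
        found = lookup(name)
        base = found.split("+")[0] if found else None
        report[name] = (expected, found, base == expected)
    return report


if __name__ == "__main__":
    for name, (expected, found, ok) in check_pins().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: expected {expected}, found {found} [{status}]")
```

Running the script in the conda environment prints one line per package; any line marked MISMATCH points at a package to reinstall.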

Training

First, download and prepare the YouTube-VIS dataset using the following instructions.

To train ObjProp, run the following command:

python3 tools/train.py configs/masktrack_rcnn_r50_fpn_1x_youtubevos_objprop.py

To change settings such as the dataset directory, learning rate, or number of GPUs, edit the configuration file configs/masktrack_rcnn_r50_fpn_1x_youtubevos_objprop.py.
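Before editing the configuration, it can be useful to print the values it currently holds. The sketch below assumes the config is a plain Python file made of top-level assignments, as mmdetection configs of this era generally are. The key names `data_root`, `optimizer`, and `total_epochs` are assumptions about what the file contains, and `load_config` and `summarize` are hypothetical helpers, not part of the repository.

```python
import runpy


def load_config(path):
    """Execute a Python config file and return its public top-level names."""
    namespace = runpy.run_path(path)
    return {k: v for k, v in namespace.items() if not k.startswith("_")}


def summarize(cfg, keys=("data_root", "optimizer", "total_epochs")):
    """Pick out a few settings if present; the key names are assumptions."""
    return {k: cfg[k] for k in keys if k in cfg}


if __name__ == "__main__":
    cfg = load_config("configs/masktrack_rcnn_r50_fpn_1x_youtubevos_objprop.py")
    for key, value in summarize(cfg).items():
        print(f"{key} = {value!r}")
```

Keys that are absent from the file are simply skipped, so the output shows which of the assumed names the config really defines.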

Inference

To perform inference using ObjProp, run the following command:

python3 tools/test_video.py configs/masktrack_rcnn_r50_fpn_1x_youtubevos_objprop.py [MODEL_PATH] --out [OUTPUT_PATH.json] --eval segm

A JSON file with the inference results will be saved at OUTPUT_PATH.json. To evaluate the performance, submit the result file to the evaluation server.
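A quick sanity check on the result file before submitting can catch an empty or truncated output. The sketch below assumes the file is a JSON list of per-instance records that each carry a `video_id` and a `score`, which is the layout commonly used for YouTube-VIS submissions. That layout is an assumption here, not something this README states, and `summarize_results` is a hypothetical helper.

```python
import json
from collections import Counter


def summarize_results(path, score_thr=0.0):
    """Count predicted instances per video with score >= score_thr.

    Assumes a JSON list of dicts with "video_id" and "score" fields.
    """
    with open(path) as f:
        results = json.load(f)
    per_video = Counter()
    for inst in results:
        if inst.get("score", 0.0) >= score_thr:
            per_video[inst["video_id"]] += 1
    return dict(per_video)


if __name__ == "__main__":
    import sys

    # Usage: python summarize_results.py OUTPUT_PATH.json
    counts = summarize_results(sys.argv[1])
    print(f"{sum(counts.values())} instances across {len(counts)} videos")
```

If the printed counts are zero, the inference run likely did not produce usable predictions and is worth rechecking before submission.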

License

ObjProp is released under the Apache 2.0 license.

Citation

@article{Chakravarthy2021ObjProp,
  author = {Anirudh S Chakravarthy and Won-Dong Jang and Zudi Lin and Donglai Wei and Song Bai and Hanspeter Pfister},  
  title = {Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation},
  journal = {CoRR},
  volume = {abs/2111.07529},
  year = {2021},
  url = {https://arxiv.org/abs/2111.07529}
}

Owner

Anirudh S Chakravarthy
