FLAVR is a fast, flow-free frame interpolation method capable of single shot multi-frame prediction

Last update: Dec 23, 2022

Overview

FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation (CVPR 2021)

FLAVR is a fast, flow-free frame interpolation method capable of single-shot multi-frame prediction. It uses a customized encoder-decoder architecture with spatio-temporal convolutions and channel gating to capture and interpolate complex motion trajectories between frames, generating realistic high-frame-rate videos. This repository contains the original source code for the paper, which was accepted to CVPR 2021.

Dependencies

We used the following to train and test the model.

Ubuntu 18.04
Python==3.7.4
numpy==1.19.2
PyTorch==1.5.0, torchvision==0.6.0, cudatoolkit==10.1

Model

Training model on Vimeo-90K septuplets

To train your own model on the Vimeo-90K dataset, use the following command. You can download the dataset from this link. The results reported in the paper come from models trained on 8 GPUs.

python main.py --batch_size 32 --test_batch_size 32 --dataset vimeo90K_septuplet --loss 1*L1 --max_epoch 200 --lr 0.0002 --data_root <dataset_path> --n_outputs 1

Training on the GoPro dataset is similar; change n_outputs to 7 for 8x interpolation.
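The relation between the interpolation factor and n_outputs can be sketched as below. This is only an illustration: the helper names are hypothetical and not part of the repository, and the rule n_outputs = factor - 1 is an assumption generalized from the two cases stated above (1 output for 2x, 7 outputs for 8x). The uniformly spaced time positions are likewise an assumption.

```python
def n_outputs_for_factor(factor: int) -> int:
    """Number of intermediate frames predicted between two consecutive input frames.

    Assumption: n_outputs = factor - 1, which matches the README's
    examples (2x -> 1, 8x -> 7).
    """
    if factor < 2:
        raise ValueError("interpolation factor must be at least 2")
    return factor - 1


def intermediate_times(factor: int) -> list:
    """Normalized time positions of the predicted frames.

    0.0 is the earlier input frame and 1.0 is the later one.
    Assumption: the predicted frames are uniformly spaced in time.
    """
    return [k / factor for k in range(1, factor)]


if __name__ == "__main__":
    for factor in (2, 4, 8):
        print(factor, n_outputs_for_factor(factor), intermediate_times(factor))
```

Under these assumptions, the value passed to --n_outputs for a given factor would be the result of n_outputs_for_factor(factor).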

Testing using trained model

Trained Models

You can download the pretrained FLAVR models from the following links.

Method	Trained Model
2x	Link
4x	Link
8x	Link

2x Interpolation

To test a pretrained model on the Vimeo-90K septuplet validation set, run the following command:

python test.py --dataset vimeo90K_septuplet --data_root <data_path> --load_from <saved_model> --n_outputs 1

8x Interpolation

To test a multi-frame interpolation model, use the same command as above with a multi-frame FLAVR model and set n_outputs accordingly (for example, 7 for 8x interpolation).

Time Benchmarking

In addition to computing PSNR and SSIM values, the testing script also reports the inference time and interpolation speed.
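For readers unfamiliar with these measurements, here is a minimal sketch of how PSNR and average inference time are commonly computed. It is not the repository's test script: the function names psnr and benchmark are hypothetical, images are represented as flat lists of pixel values in [0, 1], and SSIM is left out for brevity.

```python
import math
import time


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def benchmark(fn, n_runs=10):
    """Return (mean seconds per call, calls per second) for a zero-argument callable."""
    start = time.perf_counter()
    for _ in range(n_runs):
        fn()
    elapsed = time.perf_counter() - start
    mean = elapsed / n_runs
    return mean, (1.0 / mean if mean > 0 else math.inf)


if __name__ == "__main__":
    print(psnr([0.0, 1.0], [0.1, 0.9]))
    print(benchmark(lambda: sum(range(10000))))
```

When the timed function runs on a GPU, the device usually has to be synchronized before the clock is read, because CUDA kernels are launched asynchronously; otherwise the measured time can be too small.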

Evaluation on Middlebury

To evaluate on the public Middlebury benchmark, run the following.

python Middleburry_Test.py --data_root <data_path> --load_from <model_path>

The interpolated images will be saved to the folder Middleburry in a format that can be readily uploaded to the leaderboard.

SloMo-Filter on custom video

You can apply the slomo filter to your own video using our trained models (requires OpenCV 4.2.0). For example, to convert a 30 FPS video to a 240 FPS video, run

python interpolate.py --input_video <input_video> --factor 8 --load_model <model_path>

using our pretrained model for 8x interpolation. To convert a 30 FPS video to a 60 FPS video, use a 2x model with --factor 2.
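The frame-rate arithmetic in the examples above (30 FPS to 240 FPS with factor 8, 30 FPS to 60 FPS with factor 2) can be sketched as follows. The helper factor_for_fps and the SUPPORTED_FACTORS tuple are hypothetical names used for illustration only; the supported set is assumed from the pretrained models listed in this README (2x, 4x, 8x).

```python
# Assumption: only factors with a pretrained model listed in this README.
SUPPORTED_FACTORS = (2, 4, 8)


def factor_for_fps(input_fps: float, target_fps: float) -> int:
    """Return the interpolation factor that turns input_fps into target_fps."""
    if input_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    ratio = target_fps / input_fps
    factor = round(ratio)
    # The target must be an exact multiple of the input frame rate.
    if abs(ratio - factor) > 1e-9:
        raise ValueError("target FPS is not an integer multiple of input FPS")
    if factor not in SUPPORTED_FACTORS:
        raise ValueError(f"no pretrained model assumed for factor {factor}")
    return factor


if __name__ == "__main__":
    print(factor_for_fps(30, 240))
    print(factor_for_fps(30, 60))
```

The returned value is what you would pass as --factor, together with the matching pretrained model.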

Baseline Models

We also trained several prior methods in our setting and provide the resulting models for all of them. Complete benchmarking scripts will be released soon.

Method	PSNR on Vimeo	Trained Model
FLAVR	36.3	Model
AdaCoF	35.3	Model
QVI	35.15	Model
DAIN	34.19	Model
SuperSloMo*	32.90	Model

* SuperSloMo is implemented using the code repository from here. Other baselines are implemented using the official codebases.

Google Colab

Coming soon ... !

Acknowledgement

The code borrows heavily from Facebook's official PyTorch video repository and CAIN.

Cite

If this code helps in your work, please consider citing us.

@inproceedings{kalluri2021flavr,
  title={FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation},
  author={Kalluri, Tarun and Pathak, Deepak and Chandraker, Manmohan and Tran, Du},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2021}
}

Owner

Tarun K
