Compressed Video Action Recognition

Last update: Dec 26, 2022

Chao-Yuan Wu, Manzil Zaheer, Hexiang Hu, R. Manmatha, Alexander J. Smola, Philipp Krähenbühl.
In CVPR, 2018. [Project Page]

Overview

This is a reimplementation of CoViAR in PyTorch (the original paper uses MXNet). The code currently supports UCF-101 and HMDB-51; Charades support is coming soon. (This is a work in progress. Any suggestions are appreciated.)

Results

This code produces results comparable to or better than those of the original paper:
HMDB-51: 52% (I-frame), 40% (motion vector), 43% (residuals), 59.2% (CoViAR).
UCF-101: 87% (I-frame), 70% (motion vector), 80% (residuals), 90.5% (CoViAR).
(Average of 3 splits, without optical flow.)
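The "CoViAR" column reports the three streams combined. As a rough illustration of how such a combination can work, here is a minimal sketch of late fusion by weighted score summation. The function names, the example scores, and the weights are all hypothetical and are not taken from this repository; the fusion actually used may differ.

```python
from typing import Dict, List


def fuse_scores(stream_scores: Dict[str, List[float]],
                weights: Dict[str, float]) -> List[float]:
    """Return the weighted sum of per-class scores over all streams.

    Streams missing from `weights` get a weight of 1.0 (an assumption
    made for this sketch only).
    """
    num_classes = len(next(iter(stream_scores.values())))
    fused = [0.0] * num_classes
    for name, scores in stream_scores.items():
        if len(scores) != num_classes:
            raise ValueError(f"stream {name!r} has {len(scores)} scores, "
                             f"expected {num_classes}")
        w = weights.get(name, 1.0)
        for c, s in enumerate(scores):
            fused[c] += w * s
    return fused


def predict(fused: List[float]) -> int:
    """Return the index of the class with the highest fused score."""
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical per-class scores for one video with three classes.
scores = {
    "iframe":   [0.2, 0.7, 0.1],
    "mv":       [0.5, 0.3, 0.2],
    "residual": [0.4, 0.4, 0.2],
}
# Hypothetical weights; in practice they would be tuned on held-out data.
weights = {"iframe": 2.0, "mv": 1.0, "residual": 1.0}

fused = fuse_scores(scores, weights)
label = predict(fused)
```

In this toy example the motion-vector stream alone would pick class 0, but the fused scores favor class 1, which shows how a stronger stream can correct a weaker one.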

Data loader

We provide a Python data loader that takes a compressed video directly and returns the compressed representation (I-frames, motion vectors, and residuals) as a NumPy array. We can thus train the model without extracting and storing all representations as image files.

In our experiments, the loader is fast enough that it does not delay GPU training. Please see GETTING_STARTED.md for details and instructions.
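To address a frame in a compressed video, it helps to know which group of pictures (GOP) the frame belongs to and where it sits inside that group. The sketch below shows that bookkeeping. It assumes fixed-size GOPs that each start with one I-frame followed by P-frames; the GOP size of 12 is only an illustrative default, and `FramePosition` and `locate_frame` are hypothetical names, not this repository's API.

```python
from typing import NamedTuple

GOP_SIZE = 12  # assumption: illustrative fixed GOP length, not read from a video


class FramePosition(NamedTuple):
    gop_index: int    # which group of pictures the frame falls in
    pos_in_gop: int   # offset inside that group (0 is the I-frame)
    is_iframe: bool   # True when the frame is the group's leading I-frame


def locate_frame(frame_index: int, gop_size: int = GOP_SIZE) -> FramePosition:
    """Map a global frame index to its GOP index and in-GOP offset."""
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    if gop_size <= 0:
        raise ValueError("gop_size must be positive")
    gop_index, pos_in_gop = divmod(frame_index, gop_size)
    return FramePosition(gop_index, pos_in_gop, pos_in_gop == 0)


first = locate_frame(0)    # leading I-frame of the first GOP
later = locate_frame(25)   # second frame of the third GOP (a P-frame)
```

Under these assumptions, motion vectors and residuals are only meaningful where `is_iframe` is false, because an I-frame is coded without reference to other frames.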

Using CoViAR

Please see GETTING_STARTED.md for instructions for training and inference.

Citation

If you find this model useful for your research, please use the following BibTeX entry.

@inproceedings{wu2018coviar,
  title={Compressed Video Action Recognition},
  author={Wu, Chao-Yuan and Zaheer, Manzil and Hu, Hexiang and Manmatha, R and Smola, Alexander J and Kr{\"a}henb{\"u}hl, Philipp},
  booktitle={CVPR},
  year={2018}
}

Acknowledgment

This implementation borrows largely from tsn-pytorch by yjxiong. Part of the data loader implementation is modified from this tutorial and the FFmpeg extract_mv example.

Owner

Chao-Yuan Wu
