Code release for Convolutional Two-Stream Network Fusion for Video Action Recognition

Last update: Dec 31, 2022

================================================================================

Convolutional Two-Stream Network Fusion for Video Action Recognition

This repository contains the code for our CVPR 2016 paper:

Christoph Feichtenhofer, Axel Pinz, Andrew Zisserman
"Convolutional Two-Stream Network Fusion for Video Action Recognition"
in Proc. CVPR 2016

If you find the code useful for your research, please cite our paper:

    @inproceedings{feichtenhofer2016convolutional,
      title={Convolutional Two-Stream Network Fusion for Video Action Recognition},
      author={Feichtenhofer, Christoph and Pinz, Axel and Zisserman, Andrew},
      booktitle={Conference on Computer Vision and Pattern Recognition (CVPR)},
      year={2016}
    }

Requirements

The code was tested on Ubuntu 14.04 and Windows 10 using MATLAB R2015b and NVIDIA Titan X or Z GPUs.
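Before compiling, it can help to confirm which MATLAB release and GPU are visible to MATLAB. The following is a minimal sketch, not part of the released code. It assumes the Parallel Computing Toolbox is installed, since `gpuDeviceCount` and `gpuDevice` belong to that toolbox and will raise an error without it.

```matlab
% Environment check (sketch, not part of the repository).
% gpuDeviceCount and gpuDevice require the Parallel Computing Toolbox.
fprintf('MATLAB release: R%s\n', version('-release'));

if gpuDeviceCount > 0
    g = gpuDevice();   % returns the selected GPU, selecting the default one if needed
    fprintf('GPU: %s, %.1f GB total memory\n', g.Name, g.TotalMemory / 2^30);
else
    disp('No supported GPU found; training would have to run on the CPU.');
end
```

Releases or GPUs other than the ones listed above may still work, but they are untested.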

If you have questions regarding the implementation, please contact:

Christoph Feichtenhofer

================================================================================

Setup

1. Download the code: git clone --recursive https://github.com/feichtenhofer/twostreamfusion
2. Compile the code by running compile.m.
   - This will also compile a modified (and older) version of the MatConvNet toolbox. In case of any issues, please follow the installation instructions on the MatConvNet homepage.
3. Edit the file cnn_setup_environment.m to adjust the model and data paths.
4. Download the pretrained model files and the datasets linked below, and unpack them into your models and data directories.
   - Optionally, you can pretrain your own two-stream models by running
     1. cnn_ucf101_spatial(); to train the appearance network stream.
     2. cnn_ucf101_temporal(); to train the optical flow network stream.
5. Run cnn_ucf101_fusion(); this will use the downloaded models and demonstrate training of our final architecture on UCF101/HMDB51.
   - To train on the CPU, clear the variable opts.train.gpus.
   - If you encounter memory issues on your GPU, consider decreasing the cudnnWorkspaceLimit (512 MB is the default).
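
The MATLAB side of the steps above can be sketched as a short session. This is a minimal sketch, assuming MATLAB is started in the repository root after cloning and that cnn_setup_environment.m already points to your models and data directories. The function names are the ones listed above. Where exactly opts.train.gpus is defined inside the training scripts is an assumption, so the CPU setting is shown only as a comment.

```matlab
% Sketch of the workflow; run from the repository root.
% Assumes cnn_setup_environment.m has been edited to match your paths.

compile;                   % builds the code and the bundled MatConvNet version

% Optional: pretrain the two single-stream networks yourself
% instead of using the downloaded models.
% cnn_ucf101_spatial();    % appearance (RGB) stream
% cnn_ucf101_temporal();   % optical flow stream

cnn_ucf101_fusion();       % trains the fusion architecture on top of the stream models

% CPU training: in the training script, where opts.train.gpus is set
% (exact location assumed), replace the GPU index list with an empty array:
%   opts.train.gpus = [];
```

Training on the CPU is likely to be much slower than on a GPU, so the empty setting is mainly useful for checking that the pipeline runs.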

Pretrained models

Download our baseline networks trained on UCF101 here:

Data

Pre-computed optical flow images and resized RGB frames for the UCF101 and HMDB51 datasets:

UCF101 RGB: part1 part2 part3
UCF101 Flow: part1 part2 part3
HMDB51 RGB: part1
HMDB51 Flow: part1

Use it on your own dataset

Our optical flow extraction tool provides OpenCV wrappers for optical flow extraction on a GPU.
