MoCoGAN: Decomposing Motion and Content for Video Generation

Last update: Dec 18, 2022

Overview

This repository contains an implementation and further details of MoCoGAN: Decomposing Motion and Content for Video Generation by Sergey Tulyakov, Ming-Yu Liu, Xiaodong Yang, Jan Kautz.

CVPR Poster:

Representation

MoCoGAN is a generative model that produces videos from random inputs. It keeps separate representations of motion and content, which gives control over what is generated. For example, MoCoGAN can generate the same object performing different actions, as well as the same action performed by different objects.
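To make the separation concrete, here is a minimal sketch of how a per-frame latent sequence could be assembled, assuming a content code that is sampled once per video and a motion code that changes from frame to frame. The function names, the dimensions, and the Gaussian random walk used for the motion trajectory are illustrative assumptions, not code from this repository; in a trained model a learned recurrent network would typically take the place of the random walk.

```python
import random


def sample_code(dim, rng):
    """Draw a latent vector with i.i.d. standard normal entries."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def sample_motion_codes(num_frames, dim, rng, step=0.1):
    """Return one motion code per frame.

    Assumption: a Gaussian random walk stands in for a learned
    recurrent network, purely for illustration.
    """
    code = sample_code(dim, rng)
    codes = [code]
    for _ in range(num_frames - 1):
        code = [c + step * rng.gauss(0.0, 1.0) for c in code]
        codes.append(code)
    return codes


def compose_video_latents(content_code, motion_codes):
    """Concatenate the shared content code with each frame's motion code."""
    return [content_code + m for m in motion_codes]


rng = random.Random(0)
content = sample_code(4, rng)             # sampled once, shared by all frames
motion = sample_motion_codes(5, 2, rng)   # one code per frame
latents = compose_video_latents(content, motion)

print(len(latents), len(latents[0]))
```

Each entry of `latents` would be fed to an image generator to produce one frame. Because the first four entries are identical in every frame, only the motion part can change what happens over time.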

Examples of generated videos

We trained MoCoGAN on the MUG Facial Expression Database to generate facial expressions. With the content code fixed and the motion code varied, it generates the same person performing different expressions. With the motion code fixed and the content code varied, it generates different people performing the same expression. In the figure below, each column has a fixed identity and each row shows the same action:

We trained MoCoGAN on a human action dataset in which content is represented by the performer, who executes several actions. With the content code fixed and the motion code varied, it generates the same person performing different actions. With the motion code fixed and the content code varied, it generates different people performing the same action. Each pair of images shows the same action executed by different people:

We have collected a large-scale TaiChi dataset including 4.5K videos of TaiChi performers. Below are videos generated by MoCoGAN.
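The content and motion swaps described in this section can be pictured as a grid of code pairs, where every column reuses one content code and every row reuses one motion code. The sketch below is a hypothetical illustration: `make_swap_grid` and the code sizes are assumptions, and each motion code is treated as a single vector for brevity, whereas a video model would use a per-frame sequence.

```python
import random


def make_swap_grid(content_codes, motion_codes):
    """Build grid[row][col] = (content code of column, motion code of row)."""
    return [[(c, m) for c in content_codes] for m in motion_codes]


rng = random.Random(0)
# Three hypothetical identities (content) and two hypothetical actions (motion).
identities = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(3)]
actions = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(2)]

grid = make_swap_grid(identities, actions)

print(len(grid), len(grid[0]))
```

Reading down a column keeps the identity fixed while the action changes; reading across a row keeps the action fixed while the identity changes.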

Training MoCoGAN

Please refer to the wiki page.

Citation

If you use MoCoGAN in your research, please cite our paper:

Sergey Tulyakov, Ming-Yu Liu, Xiaodong Yang, Jan Kautz, "MoCoGAN: Decomposing Motion and Content for Video Generation"

@inproceedings{Tulyakov:2018:MoCoGAN,
  title={{MoCoGAN}: Decomposing motion and content for video generation},
  author={Tulyakov, Sergey and Liu, Ming-Yu and Yang, Xiaodong and Kautz, Jan},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={1526--1535},
  year={2018}
}
