TensorFlow Implementation of ECCV'18 paper: Multimodal Human Motion Synthesis

Last update: Oct 02, 2022

Overview

MT-VAE for Multimodal Human Motion Synthesis

This is the code for the ECCV 2018 paper MT-VAE: Learning Motion Transformations to Generate Multimodal Human Dynamics by Xinchen Yan, Akash Rastogi, Ruben Villegas, Kalyan Sunkavalli, Eli Shechtman, Sunil Hadap, Ersin Yumer, and Honglak Lee.

Please follow the instructions below to run the code.

Requirements

MT-VAE requires or works with:

Mac OS X or Linux
NVIDIA GPU

Installing Dependency

Install TensorFlow
Note: this implementation has been tested with TensorFlow 1.3.
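
Because the code has only been tested with TensorFlow 1.3, it can help to confirm which version is installed before running the scripts. The following is a minimal sketch, not part of the repository: the helper names `major_minor` and `is_tested_version` are hypothetical, and the check only compares the major and minor version numbers.

```python
import re

TESTED_VERSION = (1, 3)  # from the note above: tested with TensorFlow 1.3


def major_minor(version_string):
    """Return (major, minor) parsed from a version string such as '1.3.0'."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("Unrecognized version string: {!r}".format(version_string))
    return int(match.group(1)), int(match.group(2))


def is_tested_version(version_string):
    """True if the version has the same major.minor as the tested release."""
    return major_minor(version_string) == TESTED_VERSION


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed.")
    else:
        status = "matches" if is_tested_version(tf.__version__) else "differs from"
        print("TensorFlow {} {} the tested 1.3 release.".format(tf.__version__, status))
```

A different version is not necessarily a failure, but it means the scripts are running outside the tested configuration.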

Data Preprocessing

For the Human3.6M dataset, please download the pre-processed dataset by running the following script.

bash prep_human36m_joints.sh

Disclaimer: Please check the license of the Human3.6M dataset if you download this pre-processed version.

Training (MT-VAE)

To train the MT-VAE human motion generator, please run the following script (training usually takes 1 day on a single Titan GPU).

bash demo_human36m_trainMTVAE.sh

Alternatively, to download the pre-trained MT-VAE model, please run the following script.

bash prep_human36m_model.sh

Motion Synthesis Using Pre-trained MT-VAE Model

Please run the following command to generate multiple diverse human motions given an initial motion.

bash demo_human36m_inferMTVAE.sh

Motion Analogy-making Using Pre-trained MT-VAE Model

Please run the following command to execute motion analogy-making.

bash demo_human36m_analogyMTVAE.sh
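
The steps from data preprocessing through analogy-making can be chained from a small driver script. The following is a minimal sketch under stated assumptions: the script names are those listed in this README, but the helper functions `pipeline_steps` and `run_pipeline` are hypothetical, and the ordering simply follows the order in which the README describes the steps. The default is a dry run that only prints the commands.

```python
import subprocess


def pipeline_steps(train_from_scratch=False):
    """Return the README's shell scripts in the order they are described."""
    model_step = ("demo_human36m_trainMTVAE.sh" if train_from_scratch
                  else "prep_human36m_model.sh")
    return [
        "prep_human36m_joints.sh",        # download pre-processed Human3.6M data
        model_step,                       # train MT-VAE or fetch the pre-trained model
        "demo_human36m_inferMTVAE.sh",    # motion synthesis
        "demo_human36m_analogyMTVAE.sh",  # motion analogy-making
    ]


def run_pipeline(train_from_scratch=False, dry_run=True):
    """Run each script with bash, stopping at the first failure."""
    commands = [["bash", script] for script in pipeline_steps(train_from_scratch)]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.check_call(command)  # raises CalledProcessError on failure
    return commands


run_pipeline(dry_run=True)
```

Calling `run_pipeline(train_from_scratch=True, dry_run=False)` would train the model instead of downloading it; run it from the repository root so the scripts can be found.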

Hierarchical Video Synthesis Using Pre-trained Image Generation Model

Please download full Human3.6M videos into the workspace/Human3.6M/ folder.
We use a pre-trained model from the ICML 2017 HierchVid repository. Please run the following command for image synthesis given a generated motion sequence.

CUDA_VISIBLE_DEVICES=0 python h36m_hierach_gensample.py
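
The command above pins the script to a single GPU through an environment variable. The variable that CUDA reads is `CUDA_VISIBLE_DEVICES`, and it can also be set from inside Python, provided this happens before TensorFlow is imported. The following is a minimal sketch; the helper name `select_gpu` is hypothetical and not part of the repository.

```python
import os


def select_gpu(index):
    """Expose only one GPU to CUDA; call this before importing TensorFlow."""
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


select_gpu(0)
# import tensorflow as tf  # import only after the variable has been set
```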

Disclaimer: Please double-check the license in that repository and cite the HierchVid paper when you use it.

Citation

If you find this useful, please cite our work as follows:

@inproceedings{yan2018mt,
  title={MT-VAE: Learning Motion Transformations to Generate Multimodal Human Dynamics},
  author={Yan, Xinchen and Rastogi, Akash and Villegas, Ruben and Sunkavalli, Kalyan and Shechtman, Eli and Hadap, Sunil and Yumer, Ersin and Lee, Honglak},
  booktitle={European Conference on Computer Vision},
  pages={276--293},
  year={2018},
  organization={Springer}
}

Acknowledgements

We would like to thank the amazing developers and the open-source community. Our implementation has especially benefited from the following excellent repositories:

Attribute2Image: https://github.com/xcyan/eccv16_attr2img
TensorFlow-PTN: https://github.com/tensorflow/models/tree/master/research/ptn
VideoGAN: https://github.com/cvondrick/videogan
MoCoGAN: https://github.com/sergeytulyakov/mocogan
HierchVid: https://github.com/rubenvillegas/icml2017hierchvid
Sketch-RNN: https://github.com/tensorflow/magenta/tree/master/magenta/models/sketch_rnn
VRNN: https://github.com/jych/nips2015_vrnn
SVG: https://github.com/edenton/svg
