Contrastively Disentangled Sequential Variational Audoencoder

Last update: Dec 24, 2022

Related tags

Overview

Contrastively Disentangled Sequential Variational Audoencoder (C-DSVAE)

Overview

This is the implementation for our C-DSVAE, a novel self-supervised disentangled sequential representation learning method.

Requirements

Python 3
PyTorch 1.7
Numpy 1.18.5

Dataset

Sprites

We provide the raw Sprites .npy files. One can also find the dataset on a third-party repo.

For each split (train/test), we expect the following components for each sequence sample

x: raw sample of shape [8, 3, 64, 64]
c_aug: content augmentation of shape [8, 3, 64, 64]
m_aug: motion augmentation of shape [8, 3, 64, 64]
motion factors: action (3 classes), direction (3 classes)
content factors: skin, tops, pants, hair (each with 6 classes)

Running

Train

./run_cdsvae.sh

Test

./run_test_sprite.sh

Classification Judge

The judge classifiers are pretrained with full supervision separately.

Sprites judge

C-DSVAE Checkpoints

We provide a sample Sprites checkpoint. Checkpoint parameters can be found in ./run_test_sprite.sh.

Paper

If you are inspired by our work, please cite the following paper:

@inproceedings{bai2021contrastively,
  title={Contrastively Disentangled Sequential Variational Autoencoder},
  author={Bai, Junwen and Wang, Weiran and Gomes, Carla},
  booktitle={Advances in Neural Information Processing Systems},
  volume={},
  year={2021}
}

Contrastively Disentangled Sequential Variational Audoencoder

Related tags

Overview

Contrastively Disentangled Sequential Variational Audoencoder (C-DSVAE)

Overview

Requirements

Dataset

Sprites

Running

Train

Test

Classification Judge

C-DSVAE Checkpoints

Paper

Owner

Junwen Bai

Joint project of the duo Hacker Ninjas

Differentiable Simulation of Soft Multi-body Systems

“Robust Lightweight Facial Expression Recognition Network with Label Distribution Training”, AAAI 2021.

Use Python, OpenCV, and MediaPipe to control a keyboard with facial gestures

Part-Aware Data Augmentation for 3D Object Detection in Point Cloud

Predicting Auction Sale Price using the kaggle bulldozer auction sales data: Modeling with Ensembles vs Neural Network

Deep Image Search is an AI-based image search engine that includes deep transfor learning features Extraction and tree-based vectorized search.

Official repository of PanoAVQA: Grounded Audio-Visual Question Answering in 360° Videos (ICCV 2021)

DECAF: Deep Extreme Classification with Label Features

DiffStride: Learning strides in convolutional neural networks

Credo AI Lens is a comprehensive assessment framework for AI systems. Lens standardizes model and data assessment, and acts as a central gateway to assessments created in the open source community.

Codes for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing

A demo of how to use JAX to create a simple gravity simulation

Train emoji embeddings based on emoji descriptions.

Spatial Intention Maps for Multi-Agent Mobile Manipulation (ICRA 2021)

Training Very Deep Neural Networks Without Skip-Connections

A PyTorch implementation of "Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning", IJCAI-21

A dead simple python wrapper for darknet that works with OpenCV 4.1, CUDA 10.1

Code release for the ICML 2021 paper "PixelTransformer: Sample Conditioned Signal Generation".

The Adapter-Bot: All-In-One Controllable Conversational Model