Auto-Encoding Score Distribution Regression for Action Quality Assessment

Last update: Nov 16, 2022

DAE-AQA

This is the open-source reference implementation of the paper Auto-Encoding Score Distribution Regression for Action Quality Assessment.

1.Introduction

DAE is a model for action quality assessment (AQA). It combines the advantages of regression algorithms and label distribution learning (LDL). Specifically, it encodes videos into distributions and uses the reparameterization trick from variational auto-encoders (VAE) to sample scores, which establishes a more accurate mapping between video and score. It can be applied to many scenarios, e.g., judging the accuracy of an operation or estimating the score of a diving athlete’s performance.
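The reparameterization trick mentioned above can be illustrated with a minimal sketch. It is not the repository's code: it assumes the encoder outputs a mean and a log-variance for each video, and the function name `sample_score` and the numeric values are hypothetical. The point is only that a sampled score is written as a deterministic function of the distribution parameters plus independent noise.

```python
import math
import random

def sample_score(mu, log_var, rng):
    """Draw one score from N(mu, sigma^2) via the reparameterization trick."""
    sigma = math.exp(0.5 * log_var)  # standard deviation from the log-variance
    eps = rng.gauss(0.0, 1.0)        # noise that does not depend on mu or sigma
    return mu + sigma * eps          # deterministic in (mu, sigma) once eps is fixed

rng = random.Random(0)
# Hypothetical encoder outputs for one video: mean 85, variance 4.
mu, log_var = 85.0, math.log(4.0)
samples = [sample_score(mu, log_var, rng) for _ in range(10000)]
mean = sum(samples) / len(samples)
print(f"empirical mean of sampled scores: {mean:.2f}")
```

In an autograd framework, writing the sample this way lets gradients flow back to the mean and variance, because the randomness is isolated in `eps`.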

2.Datasets

MTL-AQA dataset

The MTL-AQA dataset was originally presented in the paper What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment (CVPR 2019) [arXiv], where the authors provide the YouTube links to the untrimmed long videos and the corresponding annotations here. The processed MTL-AQA dataset (frames) can be downloaded through the following links:

1.[Google Drive]

2.[Baidu Drive](Password:SEU1)

The whole data structure should be:

DAE_AQA
├── data
|  ├── frames
|  └── info
...
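Before training, it may help to confirm that the downloaded frames were unpacked into the layout shown above. The following is a small sketch; the helper `missing_dirs` is hypothetical and not part of the repository, and it only checks for the two folders named in the tree. The demonstration builds a temporary directory so that it runs anywhere.

```python
import tempfile
from pathlib import Path

def missing_dirs(root, required=("data/frames", "data/info")):
    """Return the required sub-directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).is_dir()]

# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "DAE_AQA"
    (root / "data" / "frames").mkdir(parents=True)
    before = missing_dirs(root)  # "data/info" has not been created yet
    (root / "data" / "info").mkdir()
    after = missing_dirs(root)   # both folders are now present

print(before, after)
```

On a real checkout, calling `missing_dirs("DAE_AQA")` from the parent directory would list whichever of the two folders is still absent.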

JIGSAWS dataset

The JIGSAWS dataset was presented in the paper JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS): A Surgical Activity Dataset for Human Motion Modeling (MICCAI workshop 2014), where the raw videos can be downloaded here. We are still tidying up this part of the code and will release it soon. The data structure is the same as for MTL-AQA. The processed JIGSAWS dataset (frames) can be downloaded through the following links:

1.[Google Drive]

2.[Baidu Drive](Password:SEU1)

3.Training

Training the DAE model:

$ python DAE.py --log_info=DAE --num_workers=16 --gpu=0 --train_batch_size=8 --test_batch_size=32 --num_epochs=100

Training the DAE-MT model:

$ python DAE_MT.py --log_info=DAE-MT --num_workers=16 --gpu=0 --train_batch_size=8 --test_batch_size=32 --num_epochs=100

All default parameters are set in config.py. Because video processing consumes a large amount of GPU memory, we suggest training with a small batch size.
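As an illustration of how command-line flags like those above can override defaults, here is a hedged sketch using `argparse`. It is not the contents of config.py: the flag names are taken from the commands above, but the types and default values are assumptions (the defaults simply mirror the values in the example commands), and `build_parser` is a hypothetical name.

```python
import argparse

def build_parser():
    """Hypothetical parser mirroring the flags used in the training commands."""
    parser = argparse.ArgumentParser(description="Sketch of the training flags")
    parser.add_argument("--log_info", type=str, default="DAE")       # run label (assumed default)
    parser.add_argument("--num_workers", type=int, default=16)       # data-loading workers
    parser.add_argument("--gpu", type=str, default="0")              # GPU id, kept as a string
    parser.add_argument("--train_batch_size", type=int, default=8)
    parser.add_argument("--test_batch_size", type=int, default=32)
    parser.add_argument("--num_epochs", type=int, default=100)
    return parser

# Override only the training batch size, e.g. to fit a GPU with less memory.
args = build_parser().parse_args(["--gpu=0", "--train_batch_size=4"])
print(args.train_batch_size, args.test_batch_size, args.num_epochs)
```

Flags left off the command line fall back to their defaults, so lowering the batch size is a one-flag change.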

4.Testing

We provide a pre-trained DAE-MT model weight that reaches a correlation coefficient of 0.9449 on the MTL-AQA test set. You can download it through the following links:

1.[Google Drive]

2.[Baidu Drive](Password:SEU1)
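The text above does not say which correlation coefficient is reported. AQA work commonly uses Spearman's rank correlation between predicted and ground-truth scores, so the sketch below assumes that metric; it is a plain-Python illustration, not the repository's evaluation code, and the example scores are made up.

```python
import math

def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tie group spanning positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(pred, true):
    """Spearman's rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(pred), average_ranks(true))

# Made-up scores: the predictions preserve the ground-truth ordering exactly.
rho = spearman([60.5, 72.0, 88.1, 95.3], [58.0, 70.0, 90.0, 99.0])
print(rho)
```

Because only the ordering matters, a model can score well on this metric even when its absolute scores are offset from the ground truth.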

CONTACT US:

If you have any questions or run into any bugs, please contact us!

E-mail: [email protected]
