ASFormer: Transformer for Action Segmentation

This repo provides training & inference code for the BMVC 2021 paper: ASFormer: Transformer for Action Segmentation.

Environment

PyTorch == 1.1.0, torchvision == 0.3.0, Python == 3.6, CUDA == 10.1
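To compare a local setup against the versions listed above, a small check along the following lines may help. It is only a sketch: the helper names (`matches`, `installed_versions`, `report`) are hypothetical and not part of this repo, and it treats a version as matching when its leading dotted components equal the pinned ones.

```python
import sys

# Versions pinned in this README.
EXPECTED = {"python": "3.6", "torch": "1.1.0", "torchvision": "0.3.0", "cuda": "10.1"}


def matches(installed, expected):
    """Return True if `installed` starts with the dotted components of `expected`."""
    if installed is None:
        return False
    # Drop local build suffixes such as "+cu101" before comparing.
    parts = installed.split("+")[0].split(".")
    wanted = expected.split(".")
    return parts[:len(wanted)] == wanted


def installed_versions():
    """Collect the versions found in the current interpreter (None if missing)."""
    found = {
        "python": "{}.{}".format(sys.version_info[0], sys.version_info[1]),
        "torch": None,
        "torchvision": None,
        "cuda": None,
    }
    try:
        import torch
        found["torch"] = torch.__version__
        found["cuda"] = torch.version.cuda  # None on CPU-only builds
    except ImportError:
        pass
    try:
        import torchvision
        found["torchvision"] = torchvision.__version__
    except ImportError:
        pass
    return found


def report():
    """Print one line per component and mark anything that differs."""
    found = installed_versions()
    for name, expected in EXPECTED.items():
        status = "ok" if matches(found[name], expected) else "MISMATCH"
        print("{:<12} expected {:<6} found {!s:<12} {}".format(
            name, expected, found[name], status))
```

Calling `report()` prints one line per component. A mismatch does not necessarily mean the code will fail; the list above is simply the configuration we used.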

Reproduce our results

1. Download the dataset data.zip at (https://mega.nz/#!O6wXlSTS!wcEoDT4Ctq5HRq_hV-aWeVF1_JB3cacQBQqOLjCIbc8) or (https://zenodo.org/record/3625992#.Xiv9jGhKhPY). 
2. Unzip the data.zip file to the current folder. There are three datasets in the ./data folder, i.e. ./data/breakfast, ./data/50salads, ./data/gtea.
3. Download the pre-trained models at (https://pan.baidu.com/s/1zf-d-7eYqK-IxroBKTxDfg). There are pre-trained models for three datasets, i.e. ./models/50salads, ./models/breakfast, ./models/gtea.
4. Run python main.py --action=predict --dataset=50salads/gtea/breakfast --split=1/2/3/4/5 to generate predicted results for each split. Pass one dataset name and one split number per run.
5. Run python eval.py --dataset=50salads/gtea/breakfast --split=0/1/2/3/4/5 to evaluate the performance. **NOTE**: split=0 evaluates the average results over all splits, so run it only after you have completed the predictions for every split.
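Steps 4 and 5 can be scripted so that every split is predicted before the averaged evaluation runs. The following is a minimal sketch, not part of this repo: `build_commands`, `build_eval_command`, `run_all` and the `NUM_SPLITS` table are hypothetical names, and the split counts (5 for 50salads, 4 for gtea and breakfast) are an assumption you should check against the split files in ./data.

```python
import subprocess
import sys

# Assumption: number of cross-validation splits per dataset.
# Verify against the split files shipped in ./data before relying on it.
NUM_SPLITS = {"50salads": 5, "gtea": 4, "breakfast": 4}


def build_commands(dataset, action="predict"):
    """Build one main.py command per split for the given dataset."""
    if dataset not in NUM_SPLITS:
        raise ValueError("unknown dataset: {}".format(dataset))
    return [
        [sys.executable, "main.py",
         "--action={}".format(action),
         "--dataset={}".format(dataset),
         "--split={}".format(split)]
        for split in range(1, NUM_SPLITS[dataset] + 1)
    ]


def build_eval_command(dataset, split=0):
    """Build the eval.py command; split=0 averages over all splits."""
    return [sys.executable, "eval.py",
            "--dataset={}".format(dataset),
            "--split={}".format(split)]


def run_all(dataset):
    """Predict every split in order, then evaluate the average."""
    for cmd in build_commands(dataset) + [build_eval_command(dataset)]:
        # check=True stops at the first failing command.
        subprocess.run(cmd, check=True)
```

Run `run_all("gtea")` from the repo root so that main.py and eval.py resolve. Passing `action="train"` to `build_commands` yields the corresponding training commands instead.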

Train your own model

You can also retrain the model yourself with the following command.

python main.py --action=train --dataset=50salads/gtea/breakfast --split=1/2/3/4/5

The training process is very stable in our experiments. It converges very fast and is not sensitive to the number of training epochs.

Demo for using ASFormer as your backbone

In our paper, we replace the original TCN-based backbone model MS-TCN in ASRF with our ASFormer. The new model achieves even higher results on the 50salads dataset than the original ASRF. Code is Here.

If you find our repo useful, please give us a star and cite:

@inproceedings{chinayi_ASformer,  
	author={Fangqiu Yi and Hongyu Wen and Tingting Jiang}, 
	booktitle={The British Machine Vision Conference (BMVC)},   
	title={ASFormer: Transformer for Action Segmentation},
	year={2021},  
}

Feel free to open an issue if you have trouble with our code.
