Implementation of Uniformer, a simple attention and 3D convolutional net that achieved state-of-the-art (SOTA) results on a number of video classification tasks

Last update: Nov 24, 2022

Overview

Uniformer - Pytorch

Install

$ pip install uniformer-pytorch

Usage

Uniformer-S

import torch
from uniformer_pytorch import Uniformer

model = Uniformer(
    num_classes = 1000,                 # number of output classes
    dims = (64, 128, 256, 512),         # feature dimensions per stage (4 stages)
    depths = (3, 4, 8, 3),              # depth at each stage
    mhsa_types = ('l', 'l', 'g', 'g')   # aggregation type at each stage, 'l' stands for local, 'g' stands for global
)

video = torch.randn(1, 3, 8, 224, 224)  # (batch, channels, time, height, width)

logits = model(video) # (1, 1000)
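The logits above are unnormalized scores. As a minimal sketch of turning them into predictions, the helper below switches the model to eval mode, disables gradients, and returns the top-k class probabilities and indices. `predict_topk` is a hypothetical name, not part of `uniformer_pytorch`; it only assumes the model maps a video tensor to `(batch, num_classes)` logits, as in the example above.

```python
import torch
from torch import nn

@torch.no_grad()
def predict_topk(model: nn.Module, video: torch.Tensor, k: int = 5):
    """Return (probabilities, class indices) of the k most likely classes per clip."""
    was_training = model.training
    model.eval()                       # disable dropout for inference
    logits = model(video)              # (batch, num_classes)
    probs = logits.softmax(dim = -1)   # normalize logits to probabilities
    model.train(was_training)          # restore the previous train/eval mode
    return probs.topk(k, dim = -1)     # values and indices, each (batch, k)
```

With the model and video defined above, `probs, classes = predict_topk(model, video, k = 5)` would give two tensors of shape `(1, 5)`.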

Uniformer-B

import torch
from uniformer_pytorch import Uniformer

model = Uniformer(
    num_classes = 1000,         # number of output classes
    depths = (5, 8, 20, 7)      # depth at each stage (deeper than Uniformer-S)
)
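To compare the size of the two configurations, one option is to count trainable parameters. This is a generic PyTorch sketch; `count_parameters` is a hypothetical helper, not something provided by `uniformer_pytorch`.

```python
from torch import nn

def count_parameters(model: nn.Module) -> int:
    """Number of trainable parameters in the model."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)
```

Calling `count_parameters(model)` on the Uniformer-S and Uniformer-B models above shows how much the deeper `depths` setting adds.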

Citations

@inproceedings{anonymous2022uniformer,
    title     = {UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning},
    author    = {Anonymous},
    booktitle = {Submitted to The Tenth International Conference on Learning Representations},
    year      = {2022},
    url       = {https://openreview.net/forum?id=nBU_u6DLvoK},
    note      = {under review}
}

Comments

About the re-implementation of Layernorm

Hi @lucidrains ,

Thanks for the implementation of Uniformer. After checking the code, I found that you have re-implemented the layer norm operation here: https://github.com/lucidrains/uniformer-pytorch/blob/78397000e647fd4035a7fa2bede999d59d4c3ded/uniformer_pytorch/uniformer_pytorch.py#L13-L23

I wonder if there are any differences between your re-implementation and the official one provided by PyTorch (i.e., torch.nn.LayerNorm(dim)).

Looking forward to your reply. Thx.
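One way to explore this question is sketched below. It is not the repository's code: it assumes a custom layer norm that normalizes over the channel axis (dim 1) of a channels-first `(batch, channels, time, height, width)` tensor, whereas `torch.nn.LayerNorm(dim)` normalizes over the trailing dimension. `ChannelLayerNorm` and `layernorm_channels_last` are hypothetical names used only for illustration.

```python
import torch
from torch import nn

class ChannelLayerNorm(nn.Module):
    """Hypothetical layer norm over the channel axis (dim 1) of a (b, c, t, h, w) tensor."""
    def __init__(self, dim, eps = 1e-5):
        super().__init__()
        self.eps = eps
        self.g = nn.Parameter(torch.ones(1, dim, 1, 1, 1))   # per-channel scale
        self.b = nn.Parameter(torch.zeros(1, dim, 1, 1, 1))  # per-channel shift

    def forward(self, x):
        mean = x.mean(dim = 1, keepdim = True)
        var = x.var(dim = 1, unbiased = False, keepdim = True)
        return (x - mean) * (var + self.eps).rsqrt() * self.g + self.b

def layernorm_channels_last(x, norm):
    """Apply nn.LayerNorm to a channels-first tensor by moving channels last and back."""
    return norm(x.movedim(1, -1)).movedim(-1, 1)

x = torch.randn(2, 64, 4, 7, 7)                       # (batch, channels, time, height, width)
a = ChannelLayerNorm(64)(x)
b = layernorm_channels_last(x, nn.LayerNorm(64))
print(torch.allclose(a, b, atol = 1e-5))              # compares the two outputs numerically
```

Under these assumptions the two variants compute the same statistics, and the custom module mainly avoids permuting the tensor before and after normalization. Whether the linked implementation matches this sketch exactly (for example in where `eps` is added) would need to be checked against the code.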

opened by ChongjianGE 2

Pretrained weights

Did you pretrain the Uniformer on any of the benchmark datasets (Kinetics, Something-Something, ...)? If you did, would you be able to share the state dicts?

Thanks in advance!

opened by Fritskee 3

Releases (0.0.4)

0.0.4 (Apr 22, 2022)

0.0.3 (Nov 17, 2021)

0.0.2 (Nov 16, 2021)

0.0.1 (Nov 16, 2021)

Owner

Phil Wang
