【ACMMM 2021】DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning

Overview

DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning (ACMMM 2021)

1

Overview

We release the code of the DSANet (Dynamic Segment Aggregation Network). We introduce the DSA module to capture relationship among snippets for video-level representation learning. Equipped with DSA modules, the top-1 accuracy of I3D ResNet-50 is improved to 78.2% on Kinetics-400.

The core code to implement the Dynamic Segment Aggregation Module is codes/models/modules_maker/DSA.py.

[July 7, 2021] We release the core code of DSANet.

[July 3, 2021] DSANet has been accepted by ACMMM 2021.

Prerequisites

All dependencies can be installed using pip:

python -m pip install -r requirements.txt

Our experiments run on Python 3.7 and PyTorch 1.5. Other versions should work but are not tested.

Download Pretrained Models

  • Download ImageNet pre-trained models for offline environment
cd pretrained
sh download_imgnet.sh
  • Download K400 pre-trained models for inference

TODO

Data Preparation

We follow the same data process with MVFNet for data preparation.

Model Zoo

TODO

Testing

bash dist_test_recognizer.sh CONFIG_PATH CHECKPOINT_PATH 8 

Training

This implementation supports multi-gpu, DistributedDataParallel training, which is faster and simpler.

  • For example, to train DSANet with 8 gpus, you can run:
bash dist_train_recognizer.sh configs/kinetics/r50_e100.py 8

Acknowledgements

We especially thank the contributors of the MVFNet and mmaction codebase for providing helpful code.

License

This repository is released under the Apache-2.0. license as found in the LICENSE file.

Related Work

MVFNet: Multi-View Fusion Network for Efficient Video Recognition, AAAI2021 Paper | Code

Citation

If you think our work is useful, please feel free to cite our paper 😆 :

@inproceedings{wu2021dsanet,
  title={DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning},
  author={Wu, Wenhao and Zhao, Yuxiang and Xu, Yanwu and Tan, Xiao and He, Dongliang and Zou, Zhikang and Ye, Jin and Li, Yingying and Yao, Mingde and Dong, Zichao and others},
  booktitle = {ACMMM},
  year={2021}
}

Contact

For any question, please file an issue or contact

Wenhao Wu: [email protected]
Yuxiang Zhao: [email protected]
使用OpenCV部署全景驾驶感知网络YOLOP,可同时处理交通目标检测、可驾驶区域分割、车道线检测,三项视觉感知任务,包含C++和Python两种版本的程序实现。本套程序只依赖opencv库就可以运行, 从而彻底摆脱对任何深度学习框架的依赖。

YOLOP-opencv-dnn 使用OpenCV部署全景驾驶感知网络YOLOP,可同时处理交通目标检测、可驾驶区域分割、车道线检测,三项视觉感知任务,依然是包含C++和Python两种版本的程序实现 onnx文件从百度云盘下载,链接:https://pan.baidu.com/s/1A_9cldU

178 Jan 07, 2023
PyTorch implementation of Towards Accurate Alignment in Real-time 3D Hand-Mesh Reconstruction (ICCV 2021).

Towards Accurate Alignment in Real-time 3D Hand-Mesh Reconstruction Introduction This is official PyTorch implementation of Towards Accurate Alignment

TANG Xiao 96 Dec 27, 2022
A general python framework for visual object tracking and video object segmentation, based on PyTorch

PyTracking A general python framework for visual object tracking and video object segmentation, based on PyTorch. 📣 Two tracking/VOS papers accepted

2.6k Jan 04, 2023
The world's largest toxicity dataset.

The Toxicity Dataset by Surge AI Saving the internet is fun. Combing through thousands of online comments to build a toxicity dataset isn't. That's wh

Surge AI 134 Dec 19, 2022
"Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices", official implementation

Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices This repository contains the official PyTorch implemen

Yandex Research 21 Oct 18, 2022
Code for Contrastive-Geometry Networks for Generalized 3D Pose Transfer

Code for Contrastive-Geometry Networks for Generalized 3D Pose Transfer

18 Jun 28, 2022
Prososdy Morph: A python library for manipulating pitch and duration in an algorithmic way, for resynthesizing speech.

ProMo (Prosody Morph) Questions? Comments? Feedback? Chat with us on gitter! A library for manipulating pitch and duration in an algorithmic way, for

Tim 71 Jan 02, 2023
SNIPS: Solving Noisy Inverse Problems Stochastically

SNIPS: Solving Noisy Inverse Problems Stochastically This repo contains the official implementation for the paper SNIPS: Solving Noisy Inverse Problem

Bahjat Kawar 35 Nov 09, 2022
LF-YOLO (Lighter and Faster YOLO) is used to detect defect of X-ray weld image.

This project is based on ultralytics/yolov3. LF-YOLO (Lighter and Faster YOLO) is used to detect defect of X-ray weld image. Download $ git clone http

26 Dec 13, 2022
On Generating Extended Summaries of Long Documents

ExtendedSumm This repository contains the implementation details and datasets used in On Generating Extended Summaries of Long Documents paper at the

Georgetown Information Retrieval Lab 76 Sep 05, 2022
Code release for the paper “Worldsheet Wrapping the World in a 3D Sheet for View Synthesis from a Single Image”, ICCV 2021.

Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image This repository contains the code for the following paper: R. Hu,

Meta Research 37 Jan 04, 2023
Breaching - Breaching privacy in federated learning scenarios for vision and text

Breaching - A Framework for Attacks against Privacy in Federated Learning This P

Jonas Geiping 139 Jan 03, 2023
[NeurIPS 2021] Garment4D: Garment Reconstruction from Point Cloud Sequences

Garment4D [PDF] | [OpenReview] | [Project Page] Overview This is the codebase for our NeurIPS 2021 paper Garment4D: Garment Reconstruction from Point

Fangzhou Hong 112 Dec 23, 2022
Official source code to CVPR'20 paper, "When2com: Multi-Agent Perception via Communication Graph Grouping"

When2com: Multi-Agent Perception via Communication Graph Grouping This is the PyTorch implementation of our paper: When2com: Multi-Agent Perception vi

34 Nov 09, 2022
Pytorch implementation of Straight Sampling Network For Point Cloud Learning (ICIP2021).

Pytorch code for SS-Net This is a pytorch implementation of Straight Sampling Network For Point Cloud Learning (ICIP2021). Environment Code is tested

Sun Ran 1 May 18, 2022
Performant, differentiable reinforcement learning

deluca Performant, differentiable reinforcement learning Notes This is pre-alpha software and is undergoing a number of core changes. Updates to follo

Google 114 Dec 27, 2022
Pre-training of Graph Augmented Transformers for Medication Recommendation

G-Bert Pre-training of Graph Augmented Transformers for Medication Recommendation Intro G-Bert combined the power of Graph Neural Networks and BERT (B

101 Dec 27, 2022
Code & Data for Enhancing Photorealism Enhancement

Code & Data for Enhancing Photorealism Enhancement

Intel ISL (Intel Intelligent Systems Lab) 1.1k Jan 08, 2023
Code for GNMR in ICDE 2021

GNMR Code for GNMR in ICDE 2021 Please unzip data files in Datasets/MultiInt-ML10M first. Run labcode_preSamp.py (with graph sampling) for ECommerce-c

7 Oct 27, 2022
This example implements the end-to-end MLOps process using Vertex AI platform and Smart Analytics technology capabilities

MLOps with Vertex AI This example implements the end-to-end MLOps process using Vertex AI platform and Smart Analytics technology capabilities. The ex

Google Cloud Platform 238 Dec 21, 2022