Official Pytorch Implementation of 'Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization' (ICCV-21 Oral)

Overview

Learning-Action-Completeness-from-Points

Official Pytorch Implementation of 'Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization' (ICCV 2021 Oral)

architecture

Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization
Pilhyeon Lee (Yonsei Univ.), Hyeran Byun (Yonsei Univ.)

Paper: https://arxiv.org/abs/2108.05029

Abstract: We tackle the problem of localizing temporal intervals of actions with only a single frame label for each action instance for training. Owing to label sparsity, existing work fails to learn action completeness, resulting in fragmentary action predictions. In this paper, we propose a novel framework, where dense pseudo-labels are generated to provide completeness guidance for the model. Concretely, we first select pseudo background points to supplement point-level action labels. Then, by taking the points as seeds, we search for the optimal sequence that is likely to contain complete action instances while agreeing with the seeds. To learn completeness from the obtained sequence, we introduce two novel losses that contrast action instances with background ones in terms of action score and feature similarity, respectively. Experimental results demonstrate that our completeness guidance indeed helps the model to locate complete action instances, leading to large performance gains especially under high IoU thresholds. Moreover, we demonstrate the superiority of our method over existing state-of-the-art methods on four benchmarks: THUMOS'14, GTEA, BEOID, and ActivityNet. Notably, our method even performs comparably to recent fully-supervised methods, at the 6 times cheaper annotation cost.

Prerequisites

Recommended Environment

  • Python 3.6
  • Pytorch 1.6
  • Tensorflow 1.15 (for Tensorboard)
  • CUDA 10.2

Depencencies

You can set up the environments by using $ pip3 install -r requirements.txt.

Data Preparation

  1. Prepare THUMOS'14 dataset.

    • We excluded three test videos (270, 1292, 1496) as previous work did.
  2. Extract features with two-stream I3D networks

    • We recommend extracting features using this repo.
    • For convenience, we provide the features we used. You can find them here.
  3. Place the features inside the dataset folder.

    • Please ensure the data structure is as below.
├── dataset
   └── THUMOS14
       ├── gt.json
       ├── split_train.txt
       ├── split_test.txt
       ├── fps_dict.json
       ├── point_gaussian
           └── point_labels.csv
       └── features
           ├── train
               ├── rgb
                   ├── video_validation_0000051.npy
                   ├── video_validation_0000052.npy
                   └── ...
               └── flow
                   ├── video_validation_0000051.npy
                   ├── video_validation_0000052.npy
                   └── ...
           └── test
               ├── rgb
                   ├── video_test_0000004.npy
                   ├── video_test_0000006.npy
                   └── ...
               └── flow
                   ├── video_test_0000004.npy
                   ├── video_test_0000006.npy
                   └── ...

Usage

Running

You can easily train and evaluate the model by running the script below.

If you want to try other training options, please refer to options.py.

$ bash run.sh

Evaulation

The pre-trained model can be found here. You can evaluate the model by running the command below.

$ bash run_eval.sh

References

We note that this repo was built upon our previous models.

  • Background Suppression Network for Weakly-supervised Temporal Action Localization (AAAI 2020) [paper] [code]
  • Weakly-supervised Temporal Action Localization by Uncertainty Modeling (AAAI 2021) [paper] [code]

We referenced the repos below for the code.

In addition, we referenced a part of code in the following repo for the greedy algorithm implementation.

Citation

If you find this code useful, please cite our paper.

@inproceedings{lee2021completeness,
  title={Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization},
  author={Pilhyeon Lee and Hyeran Byun},
  booktitle={IEEE/CVF International Conference on Computer Vision},
  year={2021},
}

Contact

If you have any question or comment, please contact the first author of the paper - Pilhyeon Lee ([email protected]).

Owner
Pilhyeon Lee
* Ph.D. student in Yonsei Univ. (2018.03.~present)            
Pilhyeon Lee
A modification of Daniel Russell's notebook merged with Katherine Crowson's hq-skip-net changes

Edits made to this repo by Katherine Crowson I have added several features to this repository for use in creating higher quality generative art (featu

Paul Fishwick 10 May 07, 2022
SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs

SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs SMORE is a a versatile framework that scales multi-hop query emb

Google Research 135 Dec 27, 2022
Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning using 🤗 transformers

hierarchical-transformer-1d Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning using 🤗 transformers In Progress!! 2021.

MyungHoon Jin 7 Nov 06, 2022
Temporal Segment Networks (TSN) in PyTorch

TSN-Pytorch We have released MMAction, a full-fledged action understanding toolbox based on PyTorch. It includes implementation for TSN as well as oth

1k Jan 03, 2023
Evaluation toolkit of the informative tracking benchmark comprising 9 scenarios, 180 diverse videos, and new challenges.

Informative-tracking-benchmark Informative tracking benchmark (ITB) higher diversity. It contains 9 representative scenarios and 180 diverse videos. m

Xin Li 15 Nov 26, 2022
A public available dataset for road boundary detection in aerial images

Topo-boundary This is the official github repo of paper Topo-boundary: A Benchmark Dataset on Topological Road-boundary Detection Using Aerial Images

Zhenhua Xu 79 Jan 04, 2023
Using LSTM write Tang poetry

本教程将通过一个示例对LSTM进行介绍。通过搭建训练LSTM网络,我们将训练一个模型来生成唐诗。本文将对该实现进行详尽的解释,并阐明此模型的工作方式和原因。并不需要过多专业知识,但是可能需要新手花一些时间来理解的模型训练的实际情况。为了节省时间,请尽量选择GPU进行训练。

56 Dec 15, 2022
Pretrained SOTA Deep Learning models, callbacks and more for research and production with PyTorch Lightning and PyTorch

Pretrained SOTA Deep Learning models, callbacks and more for research and production with PyTorch Lightning and PyTorch

Pytorch Lightning 1.4k Jan 01, 2023
Codebase for "ProtoAttend: Attention-Based Prototypical Learning."

Codebase for "ProtoAttend: Attention-Based Prototypical Learning." Authors: Sercan O. Arik and Tomas Pfister Paper: Sercan O. Arik and Tomas Pfister,

47 2 May 17, 2022
Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled - "A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek"

Ancient Greek BERT The first and only available Ancient Greek sub-word BERT model! State-of-the-art post fine-tuning on Part-of-Speech Tagging and Mor

Pranaydeep Singh 22 Dec 08, 2022
One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking

One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking This is an official implementation for NEAS presented in CVPR

Multimedia Research 19 Sep 08, 2022
BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training

BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training By Likun Cai, Zhi Zhang, Yi Zhu, Li Zhang, Mu Li, Xiangyang Xue. This

290 Dec 29, 2022
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

DeCLIP Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm. Our paper is available in arxiv Updates ** Ou

Sense-GVT 470 Dec 30, 2022
Patch Rotation: A Self-Supervised Auxiliary Task for Robustness and Accuracy of Supervised Models

Patch-Rotation(PatchRot) Patch Rotation: A Self-Supervised Auxiliary Task for Robustness and Accuracy of Supervised Models Submitted to Neurips2021 To

4 Jul 12, 2021
Official Pytorch implementation of RePOSE (ICCV2021)

RePOSE: Iterative Rendering and Refinement for 6D Object Detection (ICCV2021) [Link] Abstract We present RePOSE, a fast iterative refinement method fo

Shun Iwase 68 Nov 15, 2022
Sample and Computation Redistribution for Efficient Face Detection

Introduction SCRFD is an efficient high accuracy face detection approach which initially described in Arxiv. Performance Precision, flops and infer ti

Sajjad Aemmi 13 Mar 05, 2022
Python framework for Stochastic Differential Equations modeling

SDElearn: a Python package for SDE modeling This package implements functionalities for working with Stochastic Differential Equations models (SDEs fo

4 May 10, 2022
Collection of tasks for fast prototyping, baselining, finetuning and solving problems with deep learning.

Collection of tasks for fast prototyping, baselining, finetuning and solving problems with deep learning Installation

Pytorch Lightning 1.6k Jan 08, 2023
Jittor implementation of Recursive-NeRF: An Efficient and Dynamically Growing NeRF

Recursive-NeRF: An Efficient and Dynamically Growing NeRF This is a Jittor implementation of Recursive-NeRF: An Efficient and Dynamically Growing NeRF

33 Nov 30, 2022
Code release for NeX: Real-time View Synthesis with Neural Basis Expansion

NeX: Real-time View Synthesis with Neural Basis Expansion Project Page | Video | Paper | COLAB | Shiny Dataset We present NeX, a new approach to novel

536 Dec 20, 2022