Kinetics-Data-Preprocessing

Last update: Oct 27, 2022

Kinetics-400 and Kinetics-600 are common video recognition datasets used by popular video understanding projects such as SlowFast and PyTorchVideo. However, their dataset preparation instructions are brief, so this project provides more detailed instructions for Kinetics-400/-600 data preprocessing.

Download the raw videos

There are multiple ways to download the raw videos of Kinetics-400 and Kinetics-600. Here, I list two common choices:

1. Download the videos via the official scripts. I found this option to be very slow, so I personally recommend the next choice.
2. Download the compressed videos from the Common Visual Data Foundation (CVDF) servers by following their repository. This is much faster, as they have organized the 650,000 independent video clips into several compressed files.
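
Once the compressed files are downloaded, they need to be unpacked into a video folder. The following is a minimal sketch of that step, assuming the archives are `.tar.gz` files sitting in one directory and come from a source you trust; the function name `extract_archives` and both directory arguments are hypothetical and not part of this project.

```python
import tarfile
from pathlib import Path


def extract_archives(archive_dir, output_dir):
    """Extract every .tar.gz archive found in archive_dir into output_dir."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    # Assumption: archives use the .tar.gz extension; adjust the pattern if not.
    for archive in sorted(Path(archive_dir).glob("*.tar.gz")):
        with tarfile.open(archive, "r:gz") as tar:
            # Only extract archives from a trusted source.
            tar.extractall(output_dir)
        extracted.append(archive.name)
    return extracted
```

The function returns the names of the archives it processed, which makes it easy to check that none were skipped.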

Resize the videos

The common data preprocessing for Kinetics requires all videos to be resized so that the short edge is 256 pixels. I use the moviepy package to do so, which can be installed with the following command:

pip install moviepy

Then, you can use resize_video.py to resize all the videos within a given folder with the following command:

python resize_video.py --size 256 --path YOUR_VIDEO_CONTAINER
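
For reference, resizing to a short edge of 256 means scaling the video so that the smaller of its width and height becomes 256 while the aspect ratio is kept. The sketch below shows only that arithmetic; it is not the code in resize_video.py. The function name `short_edge_size` is hypothetical, and rounding to even dimensions is my own assumption, made because common codecs such as H.264 with 4:2:0 chroma expect even sizes.

```python
def short_edge_size(width, height, target=256):
    """Return (new_width, new_height) with the shorter side scaled to target."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = target / min(width, height)

    def to_even(value):
        # Assumption: round to the nearest even integer for codec compatibility.
        return max(2, int(round(value / 2)) * 2)

    return to_even(width * scale), to_even(height * scale)


print(short_edge_size(1920, 1080))  # landscape input: (456, 256)
print(short_edge_size(480, 640))    # portrait input: (256, 342)
```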

IMPORTANT! Note that resize_video.py will replace the original mp4 files. If you want to keep the original files, please make copies before resizing.
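
One possible way to keep the originals is to copy every mp4 file into a separate backup folder first. This is a small sketch using only the Python standard library; `backup_videos` and its arguments are hypothetical names, and the backup folder is assumed to live outside the video folder.

```python
import shutil
from pathlib import Path


def backup_videos(src_dir, backup_dir):
    """Copy every .mp4 under src_dir into backup_dir, keeping the folder layout."""
    src_dir, backup_dir = Path(src_dir), Path(backup_dir)
    copied = []
    # sorted() collects the file list before any copy is made.
    for video in sorted(src_dir.rglob("*.mp4")):
        target = backup_dir / video.relative_to(src_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(video, target)  # copy2 also preserves file timestamps
        copied.append(target)
    return copied
```

Run it before resize_video.py, for example `backup_videos("YOUR_VIDEO_CONTAINER", "YOUR_BACKUP_FOLDER")`, where the second path is a placeholder of your choosing.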

Prepare the csv annotation files

Following SlowFast, we also need to prepare csv annotation files for the training, validation, and testing sets, named train.csv, val.csv, and test.csv. The format of each csv file is:

path_to_video_1 label_1
path_to_video_2 label_2
path_to_video_3 label_3
...
path_to_video_N label_N
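
To make the format concrete, here is a minimal sketch that writes and reads files in this layout. It assumes that each label is an integer class id and that the path and label are separated by a single space, as shown above; the function names are hypothetical and are not taken from SlowFast or from this project.

```python
def write_annotation_file(rows, csv_path):
    """Write (video_path, label_id) pairs as one 'path label' line each."""
    with open(csv_path, "w", encoding="utf-8") as f:
        for video_path, label_id in rows:
            f.write(f"{video_path} {int(label_id)}\n")


def read_annotation_file(csv_path):
    """Read the file back into a list of (video_path, label_id) pairs."""
    rows = []
    with open(csv_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Split on the last space so paths that contain spaces still parse.
            video_path, label = line.rsplit(" ", 1)
            rows.append((video_path, int(label)))
    return rows
```

Reading the file back after writing it is a cheap sanity check before starting a long training run.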

The original annotations can be found on the Kinetics website, or you can directly use the download links for the Kinetics-400 annotations and Kinetics-600 annotations. The official annotations come in two file types, csv and json, but neither matches the above format. Therefore, I also provide a Python script that converts the json files into csv files with the correct format. It takes two kinds of input: the container paths of the videos and the path to the official json annotation files. The output annotations are named 'output_XXX.csv' and are saved in the same folder. The label-to-id mapping dictionary is saved as 'label2id.json'. The following command is an example from my setup:

python kinetics_annotation.py --train_path /home/kaihua/datasets/kinetics-train/ \
    --test_path /home/kaihua/datasets/kinetics-test/ \
    --val_path /home/kaihua/datasets/kinetics-val/ \
    --anno_path /home/kaihua/datasets/kinetics400-anno/
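
The core of such a conversion can be sketched as follows. This is not the code in kinetics_annotation.py; it is an illustration under two assumptions that you should verify against your own files: each json entry is keyed by the YouTube id and stores its class name under `["annotations"]["label"]`, and each video file name starts with the 11-character YouTube id. The function names are hypothetical.

```python
from pathlib import Path


def build_label2id(annotations):
    """Map each class name to an integer id, in alphabetical order."""
    labels = sorted({entry["annotations"]["label"] for entry in annotations.values()})
    return {label: idx for idx, label in enumerate(labels)}


def build_rows(video_dir, annotations, label2id):
    """Pair every annotated .mp4 under video_dir with its label id."""
    rows = []
    for video in sorted(Path(video_dir).rglob("*.mp4")):
        # Assumption: the file name begins with the 11-character YouTube id.
        youtube_id = video.name[:11]
        entry = annotations.get(youtube_id)
        if entry is None:
            continue  # skip videos that have no annotation
        rows.append((str(video), label2id[entry["annotations"]["label"]]))
    return rows
```

In practice the `annotations` dictionary would come from `json.load` on an official json file, and the mapping should be built once (for example from the training annotations) and reused for the validation and testing sets so that ids stay consistent.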

Owner

Kaihua Tang
