Multiple-Object Tracking with Transformer

Last update: Jan 04, 2023

Overview

TransTrack: Multiple-Object Tracking with Transformer

Introduction

TransTrack: Multiple-Object Tracking with Transformer (arXiv:2012.15460)

Models

Training data	Training time	Validation MOTA	download
crowdhuman, mot_half	36h + 1h	65.4	model
crowdhuman	36h	53.8	model
mot_half	8h	61.6	model

Models are also available on Baidu Drive with the code m4iv.

Notes

The CrowdHuman-trained model and the MOT-trained model are evaluated with different commands; see Steps.
We observe about 1 point of noise in MOTA.
If the MOTA of your self-trained model is lower than expected, adjusting --track_thresh sometimes improves performance.
Training times are measured on 8 NVIDIA V100 GPUs with a batch size of 16.
We use models pre-trained on ImageNet.
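
For readers unfamiliar with the metric, MOTA is commonly defined (CLEAR MOT) as one minus the ratio of total errors (missed detections, false positives, and identity switches) to the number of ground-truth objects. The following is a minimal sketch of that formula only. The function name and the example counts are hypothetical; they are not taken from the TransTrack evaluation scripts or from the results in the table.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Return MOTA as a percentage, using the standard CLEAR MOT definition."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 100.0 * (1.0 - errors / num_gt)


# Hypothetical counts, chosen only to illustrate the arithmetic.
score = mota(false_negatives=250, false_positives=40, id_switches=10, num_gt=1000)
print(f"MOTA = {score:.1f}")
```

Under this definition, a 1-point change in MOTA corresponds to a change in the error count equal to 1% of the ground-truth objects, which gives a sense of scale for the noise mentioned above.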

Demo

Installation

The codebase is built on top of Deformable DETR and CenterTrack.

Requirements

Linux, CUDA>=9.2, GCC>=5.4
Python>=3.7
PyTorch>=1.5 and a torchvision version that matches the PyTorch installation. Installing both together from pytorch.org ensures this.
OpenCV is optional; it is needed only for the demo and visualization.
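
Before building, it can help to check the Python-side requirements programmatically. The sketch below is an assumption-laden convenience script, not part of TransTrack: the function names are hypothetical, GCC is not checked, and the CUDA check reads the CUDA version PyTorch was built against (`torch.version.cuda`), which only approximates the toolkit requirement listed above.

```python
import importlib
import re
import sys


def parse_version(text):
    """Turn a version string such as '1.5.0+cu101' into a tuple like (1, 5, 0)."""
    numbers = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def check_environment():
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 7):
        problems.append("Python >= 3.7 is required")
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if parse_version(torch.__version__) < (1, 5):
            problems.append("PyTorch >= 1.5 is required")
        cuda = torch.version.cuda  # None for builds without CUDA support
        if cuda is None or parse_version(cuda) < (9, 2):
            problems.append("A PyTorch build with CUDA >= 9.2 is required")
    try:
        importlib.import_module("torchvision")
    except ImportError:
        problems.append("torchvision is not installed")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

The script prints nothing when every check passes.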

Steps

Install and build libs

git clone https://github.com/PeizeSun/TransTrack.git
cd TransTrack
cd models/ops
python setup.py build install
cd ../..
pip install -r requirements.txt

Prepare dataset

mkdir -p crowdhuman/annotations
cp -r /path_to_crowdhuman_dataset/annotations/CrowdHuman_val.json crowdhuman/annotations/CrowdHuman_val.json
cp -r /path_to_crowdhuman_dataset/annotations/CrowdHuman_train.json crowdhuman/annotations/CrowdHuman_train.json
cp -r /path_to_crowdhuman_dataset/CrowdHuman_train crowdhuman/CrowdHuman_train
cp -r /path_to_crowdhuman_dataset/CrowdHuman_val crowdhuman/CrowdHuman_val
mkdir -p mot
cp -r /path_to_mot_dataset/train mot/train
cp -r /path_to_mot_dataset/test mot/test
python track_tools/convert_mot_to_coco.py

The CrowdHuman dataset is available from CrowdHuman. We provide the annotations in JSON format.

The MOT dataset is available from MOT Challenge.
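
After copying the data, a quick check of the directory layout can catch a mistyped path before a long training run. The sketch below only verifies the paths created by the copy commands above, relative to the TransTrack root; the names `EXPECTED_PATHS` and `missing_dataset_paths` are hypothetical, and the output of `convert_mot_to_coco.py` is not checked because its location is not stated here.

```python
from pathlib import Path

# Paths created by the "Prepare dataset" commands, relative to the TransTrack root.
EXPECTED_PATHS = [
    "crowdhuman/annotations/CrowdHuman_val.json",
    "crowdhuman/annotations/CrowdHuman_train.json",
    "crowdhuman/CrowdHuman_train",
    "crowdhuman/CrowdHuman_val",
    "mot/train",
    "mot/test",
]


def missing_dataset_paths(root="."):
    """Return the expected dataset paths that do not exist under root."""
    base = Path(root)
    return [p for p in EXPECTED_PATHS if not (base / p).exists()]


if __name__ == "__main__":
    missing = missing_dataset_paths()
    if missing:
        print("Missing dataset paths:")
        for path in missing:
            print("  ", path)
    else:
        print("Dataset layout looks complete.")
```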

Pre-train on crowdhuman

sh track_exps/crowdhuman_train.sh
python track_tools/crowdhuman_model_to_mot.py

The pre-trained model is available as crowdhuman_final.pth.

Train TransTrack

sh track_exps/crowdhuman_mot_trainhalf.sh

Evaluate TransTrack

sh track_exps/mot_val.sh
sh track_exps/mot_eval.sh

Visualize TransTrack

python track_tools/txt2video.py
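
To inspect tracking results without rendering a video, the text output can be parsed directly. The sketch below assumes the result files follow the MOT Challenge text format (comma-separated rows of frame, track id, left, top, width, height, followed by extra fields); that assumption, the function name `parse_mot_results`, and the sample rows are all illustrative and should be checked against the files your run actually produces.

```python
from collections import defaultdict


def parse_mot_results(lines):
    """Group MOT Challenge style rows by frame.

    Each row is assumed to be: frame, track_id, left, top, width, height, ...
    Returns {frame: [(track_id, left, top, width, height), ...]}.
    """
    frames = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(",")
        frame, track_id = int(float(fields[0])), int(float(fields[1]))
        left, top, width, height = (float(v) for v in fields[2:6])
        frames[frame].append((track_id, left, top, width, height))
    return dict(frames)


# Hypothetical rows for illustration only.
sample = [
    "1,1,100.0,200.0,50.0,120.0,-1,-1,-1,-1",
    "1,2,300.0,180.0,45.0,110.0,-1,-1,-1,-1",
    "2,1,104.0,201.0,50.0,120.0,-1,-1,-1,-1",
]
tracks = parse_mot_results(sample)
print("frames:", sorted(tracks))
print("boxes in frame 1:", len(tracks[1]))
```

To read a real file, pass an open file handle instead of the sample list, for example `parse_mot_results(open(path))`, where `path` is your own results file.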

Notes

Evaluate pre-trained CrowdHuman model on MOT

sh track_exps/det_val.sh
sh track_exps/mot_eval.sh

License

TransTrack is released under the MIT License.

Citing

If you use TransTrack in your research or wish to refer to the baseline results published here, please use the following BibTeX entries:

@article{transtrack,
  title   =  {TransTrack: Multiple-Object Tracking with Transformer},
  author  =  {Peize Sun and Yi Jiang and Rufeng Zhang and Enze Xie and Jinkun Cao and Xinting Hu and Tao Kong and Zehuan Yuan and Changhu Wang and Ping Luo},
  journal =  {arXiv preprint arXiv:2012.15460},
  year    =  {2020}
}

Owner

Peize Sun
