
Overview

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

Ning Wang, Wengang Zhou, Jie Wang, and Houqiang Li

Accepted by CVPR 2021 (Oral). [Paper Link]

This repository contains the Python (PyTorch) implementation of the TrDiMP and TrSiam trackers, to appear in CVPR 2021.

Abstract

In video object tracking, there exist rich temporal contexts among successive frames, which have been largely overlooked in existing trackers. In this work, we bridge the individual video frames and explore the temporal contexts across them via a transformer architecture for robust object tracking. Different from classic usage of the transformer in natural language processing tasks, we separate its encoder and decoder into two parallel branches and carefully design them within the Siamese-like tracking pipelines. The transformer encoder promotes the target templates via attention-based feature reinforcement, which benefits the high-quality tracking model generation. The transformer decoder propagates the tracking cues from previous templates to the current frame, which facilitates the object searching process. Our transformer-assisted tracking framework is neat and trained in an end-to-end manner. With the proposed transformer, a simple Siamese matching approach is able to outperform the current top-performing trackers. By combining our transformer with the recent discriminative tracking pipeline, our method sets several new state-of-the-art records on prevalent tracking benchmarks.
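
As a rough sketch of the two-branch design described above (illustrative only, not the repository's actual code; all names, shapes, and the shared query/key projection are assumptions), the encoder self-attends over template features while the decoder cross-attends from the current search frame to the encoded templates, using lightweight single-head attention without feed-forward layers:

    import torch
    import torch.nn as nn

    class SingleHeadAttention(nn.Module):
        """Lightweight single-head attention; query and key share one
        dimension-reducing linear projection (an assumption for this sketch)."""
        def __init__(self, dim, key_dim=64):
            super().__init__()
            self.qk_proj = nn.Linear(dim, key_dim, bias=False)

        def forward(self, query, key, value):
            # query: (N_q, B, C); key, value: (N_k, B, C)
            q = self.qk_proj(query)
            k = self.qk_proj(key)
            attn = torch.einsum('qbd,kbd->bqk', q, k) / k.shape[-1] ** 0.5
            attn = attn.softmax(dim=-1)
            return torch.einsum('bqk,kbc->qbc', attn, value)

    class TwoBranchTransformer(nn.Module):
        """Encoder reinforces templates; decoder propagates cues to the search frame."""
        def __init__(self, dim):
            super().__init__()
            self.encoder_attn = SingleHeadAttention(dim)
            self.decoder_attn = SingleHeadAttention(dim)

        def forward(self, template_feat, search_feat):
            # Encoder branch: attention-based feature reinforcement of the templates
            enc = template_feat + self.encoder_attn(template_feat, template_feat, template_feat)
            # Decoder branch: propagate tracking cues from templates to the search frame
            dec = search_feat + self.decoder_attn(search_feat, enc, enc)
            return enc, dec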

Tracking Results and Pretrained Model

Tracking results: the raw results of TrDiMP/TrSiam on 7 benchmarks including OTB, UAV, NFS, VOT2018, GOT-10k, TrackingNet, and LaSOT can be found here.

Pretrained model: please download the TrDiMP model and put it in the pytracking/networks folder.
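
For example, assuming the downloaded checkpoint is named trdimp_net.pth.tar (check the tracker's parameter files for the exact filename your setup expects):

mkdir -p pytracking/networks
mv trdimp_net.pth.tar pytracking/networks/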

TrDiMP and TrSiam share the same model. The main difference between TrDiMP and TrSiam lies in the tracking model generation. TrSiam does not utilize the background information and simply crops the target/foreground area to generate the tracking model, which can be regarded as the initialization step of TrDiMP.
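
Schematically (a hypothetical sketch, not the repository's implementation):

    import torch

    def generate_trsiam_model(template_feat: torch.Tensor, fg_mask: torch.Tensor) -> torch.Tensor:
        # TrSiam: keep only the target (foreground) features and discard the
        # background -- roughly the initialization step of the DiMP optimizer.
        return template_feat * fg_mask

    # TrDiMP instead runs the full discriminative DiMP model optimizer on the
    # template features, which also exploits the background information.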

Environment Setup

Clone the Git repository.

git clone https://github.com/594422814/TransformerTrack.git

Clone the submodules.

In the repository directory, run the command:

git submodule update --init  

Install dependencies

Run the installation script to install all the dependencies. You need to provide the conda install path (e.g. ~/anaconda3) and the name for the created conda environment (here pytracking).

bash install.sh conda_install_path pytracking

This script will also download the default networks and set up the environment.

Note: The install script has been tested on an Ubuntu 18.04 system. In case of issues, check the detailed installation instructions.

Our code is based on the PyTracking framework. For more details, please refer to PyTracking.

Training the TrDiMP/TrSiam Model

Please refer to the README in the ltr folder.

Testing the TrDiMP/TrSiam Tracker

Please refer to the README in the pytracking folder. As shown in pytracking/README.md, you can either use this PyTracking toolkit or GOT-10k toolkit to reproduce the tracking results.
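
For example, with the PyTracking toolkit you can evaluate on a single sequence or a whole dataset; the commands below are illustrative (Biker is an OTB sequence, and the available dataset names depend on your local configuration):

cd pytracking
python run_tracker.py trdimp trdimp --sequence Biker
python run_tracker.py trdimp trsiam --dataset otb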

Citation

If you find this work useful for your research, please consider citing our work:

@inproceedings{Wang_2021_Transformer,
    title={Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking},
    author={Wang, Ning and Zhou, Wengang and Wang, Jie and Li, Houqiang},
    booktitle={The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2021}
}

Acknowledgment

Our transformer-assisted tracker is based on PyTracking. We sincerely thank the authors Martin Danelljan and Goutam Bhat for providing this great framework.

Contact

If you have any questions, please feel free to contact [email protected]

Comments
  • some questions on testing VOT18

    some questions on testing VOT18

    When I use GOT10K_VOT.py to test VOT2018, some errors occur in ./pytracking/tracker/trdimp.py; it seems some parameters do not match the VOT2018 test. But when I use GOT10K_GOT.py to test GOT-10k, these errors do not happen, because ./pytracking/tracker/trdimp_for_GOT.py works. Do you have a specialized trdimp_for_VOT.py for testing VOT2018? Your prompt attention to my question would be highly appreciated!

    opened by yuyudiandian 4
  • Difference between TrSiam and TrDiMP

    Difference between TrSiam and TrDiMP

    Hi,

    Thanks for your good work!

    When I was reading your paper and your code, I found that during the training phase there is no difference between TrSiam and TrDiMP. I want to know whether the difference between these two algorithms only occurs during the tracking phase. For TrSiam, you also used the DCF during the training phase, is this true?

    Thanks

    opened by YanWanquan 3
  • Training

    Training

    In your code I found that you import Lasot, Got10k, TrackingNet and MSCOCOSeq. I want to know if you used all these datasets for training.

    In your paper, you said the batch size you used is 36 image pairs and there are 1500 iterations per epoch in total. However, 1500*36=54,000, which is much smaller than the number of images in the COCO dataset. I remember batch_size * iterations = dataset_size. I am new to the tracking field and just a little confused about this. Thanks.

    opened by YanWanquan 3
  • what is your hardware for training?

    what is your hardware for training?

    What is your hardware for training? SuperDiMP uses one TITAN X for training, and your training setting is slightly different; I wonder what the reason is.

    opened by Lightning980729 3
  • No matching checkpoint file found

    No matching checkpoint file found

    I run with: python run_tracker.py trdimp trdimp --sequence Biker

    The reported error is: "No matching checkpoint file found"

    How should I fix this? Or how should I set the checkpoint file?

    Thanks a lot.

    opened by gyc9709 3
  • some questions on testing

    some questions on testing

    I made some changes to the structure of the code, but when testing GOT or OTB, a "CUDA out of memory" error occurs. The same problem does not happen when testing VOT2018, so do you have any suggestions on this issue? Or how can I reduce CUDA memory usage when testing a dataset? Thank you.

    opened by yuyudiandian 2
  • no matching checkpoint file found

    no matching checkpoint file found

    Hi, when I want to run python run_training dimp transformer_dimp, it always shows no matching checkpoint file found. Does this model need a checkpoint file before training? If yes, where can I download this file?

    When execution reaches line 52 of ltr_trainer.py (for i, data in enumerate(loader, 1)), it shows an out-of-memory error. My computer has 16 GB of RAM and my GPU has 32 GB of memory, so I think it is large enough to run the model. Can you give any suggestions about this problem? Thanks.

    opened by YanWanquan 2
  • Question about FFN

    Question about FFN

    Thanks for your work! I noticed that in the paper you claim that "To achieve a good balance of speed and performance, we slim the classic transformer by omitting the fully-connected feed-forward layers and maintaining the lightweight single-head attention", and it seems that you also do not use the FFN layer in your code. I'm wondering what the performance would be, speed aside, if you used the classic transformer including the FFN?

    opened by 3bobo 2
  • error when testing with the GOT-10k toolkit

    error when testing with the GOT-10k toolkit

    segmentation_dir: /trdimp/trdimp_vot
    Files already downloaded.
    Running tracker GOT_Tracker on VOT...
    Running supervised experiment...
    --Sequence 1/60: ants1 Repetition: 1
    Traceback (most recent call last):
      File "GOT10k_VOT.py", line 45, in <module>
        experiment.run(tracker, visualize=False)
      File "/home/admin312/anaconda3/envs/pytracking/lib/python3.7/site-packages/got10k/experiments/vot.py", line 71, in run
        self.run_supervised(tracker, visualize)
      File "/home/admin312/anaconda3/envs/pytracking/lib/python3.7/site-packages/got10k/experiments/vot.py", line 126, in run_supervised
        tracker.init(frame, anno_rects[0])
      File "GOT10k_VOT.py", line 32, in init
        self.tracker.initialize(image, box)
      File "../pytracking/tracker/trdimp/trdimp.py", line 55, in initialize
        state = info['init_bbox']
    IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices

    It looks like a version-related error? Have you encountered this before, or do you have any ideas for a fix?

    opened by 1145284121 1
  • training and testing stop without any error being reported

    training and testing stop without any error being reported

    Hello, while using your code I used my own data for training and testing. However, during training or testing the following message appears; no error is reported, but the program is no longer running.

    /home/miniconda3/envs/pytracking/bin/python /home/TransformerTrack-main/pytracking/run_tracker.py trdimp trsiam --dataset eotb --sequence val
    Evaluating    1 trackers on    31 sequences
    Tracker: trdimp trsiam None ,  Sequence: test_seq_474
    /home/miniconda3/envs/pytracking/lib/python3.8/site-packages/torch/nn/functional.py:3060: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
      warnings.warn("Default upsampling behavior when mode={} is changed "
    Using /home/.cache/torch_extensions as PyTorch extensions root...
    

    Because of the mechanics of the DiMP code, pressing Ctrl+C during training makes the code restart and then run normally, but this does not happen during testing. What could be the reason?

    opened by Jee-King 1
  • Some questions about train setting?

    Some questions about train setting?

    Some questions about your baseline: it seems that you use the same parameters as SuperDiMP rather than DiMP. Based on SuperDiMP, the method does achieve the performance described in your paper, but the DiMP-based performance is doubtful. That said, your paper is developed from the perspective of temporal features, which is novel and great.

    opened by yxxxqqq 1
  • Question about source code of Transformer part

    Question about source code of Transformer part

    As illustrated in the paper, the encoder is used only to enhance the template feature (train_feat in DiMP), and the decoder is used to produce the decoded search feature. But in the transformer's forward function, the decoder is also applied to train_feat, which is not described in the paper. Could you please explain why?

    opened by ChuzzZz 0
  • linear transformation for key and query

    linear transformation for key and query

    I noticed that the linear transformations used to reduce the dimensions of the key and query share the same weights. Is this for computational efficiency, or does using different weights degrade performance?

    opened by ChuzzZz 0
  • Problems in Training process

    Problems in Training process

    Hi, when I run python run_training dimp transformer_dimp, it shows 'No matching checkpoint file found' and 'ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])'. The batch size is set to 6; I don't know why it changes to 1 during training. Do you have any solutions to this problem? Thanks very much!

    The error message reported during the training process is shown below.

    Training: dimp transformer_dimp
    No matching checkpoint file found
    Using /tmp/torch_extensions as PyTorch extensions root...
    Using /tmp/torch_extensions as PyTorch extensions root...
    Detected CUDA files, patching ldflags
    Emitting ninja build file /tmp/torch_extensions/_prroi_pooling/build.ninja...
    Building extension module _prroi_pooling...
    Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
    Using /tmp/torch_extensions as PyTorch extensions root...
    No modifications detected for re-loaded extension module _prroi_pooling, skipping build step...
    Loading extension module _prroi_pooling...
    ninja: no work to do.
    Loading extension module _prroi_pooling...
    Loading extension module _prroi_pooling...
    Training crashed at epoch 1
    Traceback for the error!
    Traceback (most recent call last):
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/trainers/base_trainer.py", line 70, in train
        self.train_epoch()
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/trainers/ltr_trainer.py", line 80, in train_epoch
        self.cycle_dataset(loader)
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/trainers/ltr_trainer.py", line 61, in cycle_dataset
        loss, stats = self.actor(data)
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/actors/tracking.py", line 97, in __call__
        test_proposals=data['test_proposals'])
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward
        outputs = self.parallel_apply(replicas, inputs, kwargs)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply
        return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
        output.reraise()
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
        raise self.exc_type(msg)
    ValueError: Caught ValueError in replica 0 on device 0.

    Original Traceback (most recent call last):
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
        output = module(*input, **kwargs)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/models/tracking/dimpnet.py", line 75, in forward
        iou_pred = self.bb_regressor(train_feat_iou, test_feat_iou, train_bb, test_proposals)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/models/bbreg/atom_iou_net.py", line 86, in forward
        modulation = self.get_modulation(feat1, bb1)
      File "/data1/user4/tracker/new_tracker/TransformerTrack/ltr/models/bbreg/atom_iou_net.py", line 162, in get_modulation
        fc3_r = self.fc3_1r(roi3r)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/container.py", line 100, in forward
        input = module(input)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 106, in forward
        exponential_average_factor, self.eps)
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/functional.py", line 1919, in batch_norm
        _verify_batch_size(input.size())
      File "/data1/user4/anaconda3/envs/Transformer/lib/python3.7/site-packages/torch/nn/functional.py", line 1902, in _verify_batch_size
        raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
    ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])

    opened by Chenlulu1993 2
Owner
NingWang
PhD student at the University of Science and Technology of China (USTC)