Codes for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing

Overview

Contrast and Mix (CoMix)

The repository contains the codes for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing part of Advances in Neural Information Processing Systems (NeurIPS) 2021.

Aadarsh Sahoo1, Rutav Shah1, Rameswar Panda2, Kate Saenko2,3, Abir Das1

1 IIT Kharagpur, 2 MIT-IBM Watson AI Lab, 3 Boston University

[Paper] [Project Page]

 

Fig. Temporal Contrastive Learning with Background Mixing and Target Pseudo-labels. Temporal contrastive loss (left) contrasts a single temporally augmented positive (same video, different speed) per anchor against rest of the videos in a mini-batch as negatives. Incorporating background mixing (middle) provides additional positives per anchor possessing same action semantics with a different background alleviating background shift across domains. Incorporating target pseudo-labels (right) additionally enhances the discriminabilty by contrasting the target videos with the same pseudo-label as positives against rest of the videos as negatives.

 

Preparing the Environment

Conda

Please use the comix_environment.yml file to create the conda environment comix as:

conda env create -f comix_environment.yml

Pip

Please use the requirements.txt file to install all the required dependencies as:

pip install -r requirements.txt

Data Directory Structure

All the datasets should be stored in the folder ./data following the convention ./data/ and it must be passed as an argument to base_dir=./data/ .

UCF - HMDB

For ucf_hmdb dataset with base_dir=./data/ucf_hmdb the structure would be as follows:

.
├── ...
├── data
│   ├── ucf_hmdb
│   │   ├── ucf_videos
|   |   |   ├── 
   
    
|   |   |   |   ├── 
    
     
|   |   |   |   ├── 
     
      
|   |   |   |   ├── ...
|   |   |   ├── 
      
       
|   |   |   ├── ...
│   │   ├── hmdb_videos
|   |   ├── ucf_BG
|   |   └── hmdb_BG
│   └──
└──

      
     
    
   
Jester

For Jester dataset with base_dir=./data/jester the structure would be as follows

.
├── ...
├── data
│   ├── jester
|   |   ├── jester_videos
|   |   |   ├── 
   
    
|   |   |   |   ├── 
    
     
|   |   |   |   ├── 
     
      
|   |   |   |   ├── ...
|   |   |   ├── 
      
       
|   |   |   ├── ...
|   |   ├── jester_BG
|   |   |   ├── 
       
         | | | | ├── 
        
          | | | ├── ... └── └── └── 
        
       
      
     
    
   
Epic-Kitchens

For Epic Kitchens dataset with base_dir=./data/epic_kitchens the structure would be as follows (we follow the same structure as in the original dataset) :

.
├── ...
├── data
│   ├── epic_kitchens
|   |   ├── epic_kitchens_videos
|   |   |   ├── train
|   |   |   |   ├── D1
|   |   |   |   |   ├── 
   
    
|   |   |   |   |   |   ├── 
    
     
|   |   |   |   |   |   ├── 
     
      
|   |   |   |   |   |   ├── ...
|   |   |   |   |   ├── 
      
       
|   |   |   |   |   ├── ...
|   |   |   |   ├── D2
|   |   |   |   └── D3
|   |   |   └── test
└── └── └── epic_kitchens_BG

      
     
    
   

For using datasets stored in some other directories, please pass the parameter base_dir accordingly.

Background Extraction using Temporal Median Filtering

Please refer to the folder ./background_extraction for the codes to extract backgrounds using temporal median filtering.

Data

All the required split files are provided inside the directory ./video_splits.

The official download links for the datasets used for this paper are: [UCF-101] [HMDB-51] [Jester] [Epic Kitchens]

Training CoMix

Here are some of the sample and recomended commands to train CoMix for the transfer task of:

UCF -> HMDB from UCF-HMDB dataset:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --manual_seed 1 --dataset_name UCF-HMDB --src_dataset UCF --tgt_dataset HMDB --batch_size 8 --model_root ./checkpoints_ucf_hmdb --save_in_steps 500 --log_in_steps 50 --eval_in_steps 50 --pseudo_threshold 0.7 --warmstart_models True --num_iter_warmstart 4000 --num_iter_adapt 10000 --learning_rate 0.01 --learning_rate_ws 0.01 --lambda_bgm 0.1 --lambda_tpl 0.01 --base_dir ./data/ucf_hmdb

S -> T from Jester dataset:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --manual_seed 1 --dataset_name Jester --src_dataset S --tgt_dataset T --batch_size 8 --model_root ./checkpoints_jester --save_in_steps 500 --log_in_steps 50 --eval_in_steps 50 --pseudo_threshold 0.7 --warmstart_models True --num_iter_warmstart 4000 --num_iter_adapt 10000 --learning_rate 0.01 --learning_rate_ws 0.01 --lambda_bgm 0.1 --lambda_tpl 0.1 --base_dir ./data/jester

D1 -> D2 from Epic-Kitchens dataset:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --manual_seed 1 --dataset_name Epic-Kitchens --src_dataset D1 --tgt_dataset D2 --batch_size 8 --model_root ./checkpoints_epic_d1_d2 --save_in_steps 500 --log_in_steps 50 --eval_in_steps 50 --pseudo_threshold 0.7 --warmstart_models True --num_iter_warmstart 4000 --num_iter_adapt 10000 --learning_rate 0.01 --learning_rate_ws 0.01 --lambda_bgm 0.01 --lambda_tpl 0.01 --base_dir ./data/epic_kitchens

For detailed description regarding the arguments, use:

python main.py --help

Citing CoMix

If you use codes in this repository, consider citing CoMix. Thanks!

@article{sahoo2021contrast,
  title={Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing},
  author={Sahoo, Aadarsh and Shah, Rutav and Panda, Rameswar and Saenko, Kate and Das, Abir},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}
Owner
Computer Vision and Intelligence Research (CVIR)
The Computer Vision and Intelligence Research (CVIR) group is part of the Department of Computer Science and Engineering at IIT Kharagpur.
Computer Vision and Intelligence Research (CVIR)
QKeras: a quantization deep learning library for Tensorflow Keras

QKeras github.com/google/qkeras QKeras 0.8 highlights: Automatic quantization using QKeras; Stochastic behavior (including stochastic rouding) is disa

Google 437 Jan 03, 2023
Supplementary materials to "Spin-optomechanical quantum interface enabled by an ultrasmall mechanical and optical mode volume cavity" by H. Raniwala, S. Krastanov, M. Eichenfield, and D. R. Englund, 2022

Supplementary materials to "Spin-optomechanical quantum interface enabled by an ultrasmall mechanical and optical mode volume cavity" by H. Raniwala,

Stefan Krastanov 1 Jan 17, 2022
Extremely easy multi instancing software for minecraft speedrunning.

Easy Multi Extremely easy multi/single instancing software for minecraft speedrunning. A couple of goals of this project: Setup multi in minutes No fi

Duncan 8 Jul 16, 2022
source code and pre-trained/fine-tuned checkpoint for NAACL 2021 paper LightningDOT

LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval This repository contains source code and pre-trained/fine-tun

Siqi 65 Dec 26, 2022
Benchmark for Answering Existential First Order Queries with Single Free Variable

EFO-1-QA Benchmark for First Order Query Estimation on Knowledge Graphs This repository contains an entire pipeline for the EFO-1-QA benchmark. EFO-1

HKUST-KnowComp 14 Oct 24, 2022
Tools to create pixel-wise object masks, bounding box labels (2D and 3D) and 3D object model (PLY triangle mesh) for object sequences filmed with an RGB-D camera.

Tools to create pixel-wise object masks, bounding box labels (2D and 3D) and 3D object model (PLY triangle mesh) for object sequences filmed with an RGB-D camera. This project prepares training and t

305 Dec 16, 2022
Continual learning with sketched Jacobian approximations

Continual learning with sketched Jacobian approximations This repository contains the code for reproducing figures and results in the paper ``Provable

Machine Learning and Information Processing Laboratory 1 Jun 30, 2022
TensorFlow for Raspberry Pi

TensorFlow on Raspberry Pi It's officially supported! As of TensorFlow 1.9, Python wheels for TensorFlow are being officially supported. As such, this

Sam Abrahams 2.2k Dec 16, 2022
Collect some papers about transformer with vision. Awesome Transformer with Computer Vision (CV)

Awesome Visual-Transformer Collect some Transformer with Computer-Vision (CV) papers. If you find some overlooked papers, please open issues or pull r

dkliang 2.8k Jan 08, 2023
👨‍💻 run nanosaur in simulation with Gazebo/Ingnition

🦕 👨‍💻 nanosaur_gazebo nanosaur The smallest NVIDIA Jetson dinosaur robot, open-source, fully 3D printable, based on ROS2 & Isaac ROS. Designed & ma

nanosaur 9 Jul 19, 2022
Simulated garment dataset for virtual try-on

Simulated garment dataset for virtual try-on This repository contains the dataset used in the following papers: Self-Supervised Collision Handling via

33 Dec 20, 2022
A Simplied Framework of GAN Inversion

Framework of GAN Inversion Introcuction You can implement your own inversion idea using our repo. We offer a full range of tuning settings (in hparams

Kangneng Zhou 13 Sep 27, 2022
Code for EMNLP2020 long paper: BERT-Attack: Adversarial Attack Against BERT Using BERT

BERT-ATTACK Code for our EMNLP2020 long paper: BERT-ATTACK: Adversarial Attack Against BERT Using BERT Dependencies Python 3.7 PyTorch 1.4.0 transform

Linyang Li 142 Jan 04, 2023
Source code for the Paper: CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints}

CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints Installation Run pipenv install (at your own risk with --skip-lo

Autonomous Learning Group 65 Dec 27, 2022
Repo for EchoVPR: Echo State Networks for Visual Place Recognition

EchoVPR Repo for EchoVPR: Echo State Networks for Visual Place Recognition Currently under development Dirs: data: pre-collected hidden representation

Anil Ozdemir 4 Oct 04, 2022
deep_image_prior_extension

Code for "Is Deep Image Prior in Need of a Good Education?" Project page: https://jleuschn.github.io/docs.educated_deep_image_prior/. Supplementary Ma

riccardo barbano 7 Jan 09, 2022
The code for our paper submitted to RAL/IROS 2022: OverlapTransformer: An Efficient and Rotation-Invariant Transformer Network for LiDAR-Based Place Recognition.

OverlapTransformer The code for our paper submitted to RAL/IROS 2022: OverlapTransformer: An Efficient and Rotation-Invariant Transformer Network for

HAOMO.AI 136 Jan 03, 2023
Code and real data for the paper "Counterfactual Temporal Point Processes", available at arXiv.

counterfactual-tpp This is a repository containing code and real data for the paper Counterfactual Temporal Point Processes. Pre-requisites This code

Networks Learning 11 Dec 09, 2022
Code & Models for Temporal Segment Networks (TSN) in ECCV 2016

Temporal Segment Networks (TSN) We have released MMAction, a full-fledged action understanding toolbox based on PyTorch. It includes implementation fo

1.4k Jan 01, 2023
Code for binary and multiclass model change active learning, with spectral truncation implementation.

Model Change Active Learning Paper (To Appear) Python code for doing active learning in graph-based semi-supervised learning (GBSSL) paradigm. Impleme

Kevin Miller 1 Jul 24, 2022