The code for MM2021 paper "Multi-Level Counterfactual Contrast for Visual Commonsense Reasoning"

Last update: Apr 20, 2022

Related tags

Overview

The Code for MM2021 paper "Multi-Level Counterfactual Contrast for Visual Commonsense Reasoning"

Setting up and using the repo

Get the dataset. Follow the steps in data/README.md. This includes the steps to get the pretrained BERT embeddings and visual representations.
Install cuda 11.0 if it's not available already.
Install anaconda if it's not available already, and create a new environment. You need to install a few things, namely, pytorch 1.7.1, torchvision, and allennlp.

wget https://repo.anaconda.com/archive/Anaconda3-5.2.0-Linux-x86_64.sh
conda update -n base -c defaults conda
conda create --name MCC python=3.6
source activate MCC

conda install numpy pyyaml setuptools cmake cffi tqdm pyyaml scipy ipython mkl mkl-include cython typing h5py pandas nltk spacy numpydoc scikit-learn jpeg

conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=11.0 -c pytorch

pip install -r allennlp-requirements.txt
pip install --no-deps allennlp==0.8.0
python -m spacy download en_core_web_sm


# this one is optional but it should help make things faster
pip uninstall pillow && CC="cc -mavx2" pip install -U --force-reinstall pillow-simd

That's it! Now to set up the environment, run source activate MCC.

Train/Evaluate models

Please refer to models/README.md.

Acknowledgement

We refer to the repo r2c and tab-vcr for preprocessing codes.

Cite

@inproceedings{zhang2021multi,
  title={Multi-Level Counterfactual Contrast for Visual Commonsense Reasoning},
  author={Zhang, Xi and Zhang, Feifei and Xu, Changsheng},
  booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
  pages={1793--1802},
  year={2021}
}

The code for MM2021 paper "Multi-Level Counterfactual Contrast for Visual Commonsense Reasoning"

Related tags

Overview

The Code for MM2021 paper "Multi-Level Counterfactual Contrast for Visual Commonsense Reasoning"

Setting up and using the repo

Train/Evaluate models

Acknowledgement

Cite

Owner

A new framework, collaborative cascade prediction based on graph neural networks (CCasGNN) to jointly utilize the structural characteristics, sequence features, and user profiles.

A Streamlit demo demonstrating the Deep Dream technique. Adapted from the TensorFlow Deep Dream tutorial.

SiamMOT is a region-based Siamese Multi-Object Tracking network that detects and associates object instances simultaneously.

Tensorflow Implementation of the paper "Spectral Normalization for Generative Adversarial Networks" (ICML 2017 workshop)

Code and dataset for AAAI 2021 paper FixMyPose: Pose Correctional Describing and Retrieval Hyounghun Kim, Abhay Zala, Graham Burri, Mohit Bansal.

The implementation our EMNLP 2021 paper "Enhanced Language Representation with Label Knowledge for Span Extraction".

Code for the paper "Unsupervised Contrastive Learning of Sound Event Representations", ICASSP 2021.

Vignette is a face tracking software for characters using osu!framework.

Code for the paper One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation, CVPR 2021.

Code and data for the paper "Hearing What You Cannot See"

My take on a practical implementation of Linformer for Pytorch.

Code for "Universal inference meets random projections: a scalable test for log-concavity"

Measuring and Improving Consistency in Pretrained Language Models

TensorFlow Implementation of Unsupervised Cross-Domain Image Generation

Short and long time series classification using convolutional neural networks

Stacked Recurrent Hourglass Network for Stereo Matching

Python script for performing depth completion from sparse depth and rgb images using the msg_chn_wacv20. model in Tensorflow Lite.

Reinforcement learning algorithms in RLlib

iBOT: Image BERT Pre-Training with Online Tokenizer

(ICCV 2021) Official code of "Dressing in Order: Recurrent Person Image Generation for Pose Transfer, Virtual Try-on and Outfit Editing."