Certified Patch Robustness via Smoothed Vision Transformers

Last update: Dec 14, 2022

This repository contains the code for replicating the results of our paper:

Certified Patch Robustness via Smoothed Vision Transformers
Hadi Salman*, Saachi Jain*, Eric Wong*, Aleksander Madry

Paper
Blog post Part I.
Blog post Part II.

    @article{salman2021certified,
        title={Certified Patch Robustness via Smoothed Vision Transformers},
        author={Hadi Salman and Saachi Jain and Eric Wong and Aleksander Madry},
        journal={arXiv preprint arXiv:2110.07719},
        year={2021}
    }

Getting started

Our code relies on the MadryLab public robustness library, which will be automatically installed when you follow the instructions below.

Clone our repo: git clone https://github.mit.edu/hady/smoothed-vit

Install dependencies:

conda create -n smoothvit python=3.8
conda activate smoothvit
pip install -r requirements.txt

Full pipeline for building smoothed ViTs.

Now, we will walk you through the steps to create a smoothed ViT on the CIFAR-10 dataset. Similar steps can be followed for other datasets.

The entry point of our code is main.py (see the file for a full description of arguments).

First, we train the base classifier using ablations as data augmentation. Then we apply derandomized smoothing to build a smoothed version of the model that is certifiably robust.
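To make the idea of a column ablation concrete, here is a minimal sketch in plain Python. It is not the code in this repository: the function names, the single-channel nested-list image, the zero fill value, and the wrap-around at the right edge are all assumptions made for illustration. An ablation keeps a narrow band of columns and blanks out everything else; enumerating one ablation per column position, rather than sampling positions at random, is what makes the smoothing "derandomized".

```python
from typing import List

# Hypothetical image type: rows of pixel values, one channel for brevity.
Image = List[List[float]]


def column_ablate(image: Image, start: int, size: int, fill: float = 0.0) -> Image:
    """Keep `size` adjacent columns beginning at `start` and fill the rest.

    Columns wrap around the right edge (an assumption of this sketch).
    """
    width = len(image[0])
    kept = {(start + offset) % width for offset in range(size)}
    return [
        [pixel if col in kept else fill for col, pixel in enumerate(row)]
        for row in image
    ]


def all_column_ablations(image: Image, size: int) -> List[Image]:
    """Enumerate one ablation per column position (stride 1)."""
    width = len(image[0])
    return [column_ablate(image, start, size) for start in range(width)]


image = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0, 10.0, 11.0, 12.0],
]
# Keeps columns 4, 5 and (wrapping around) 0; every other pixel becomes 0.0.
ablated = column_ablate(image, start=4, size=3)
```

During training, one such ablation would be applied to each input as augmentation; at certification time, the base classifier is evaluated on every ablation of the image.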

Training the base classifier

The first step is to train the base classifier (here a ViT-Tiny) with ablations.

python src/main.py \
      --dataset cifar10 \
      --data /tmp \
      --arch deit_tiny_patch16_224 \
      --pytorch-pretrained \
      --out-dir OUTDIR \
      --exp-name demo \
      --epochs 30 \
      --lr 0.01 \
      --step-lr 10 \
      --batch-size 128 \
      --weight-decay 5e-4 \
      --adv-train 0 \
      --freeze-level -1 \
      --drop-tokens \
      --cifar-preprocess-type simple224 \
      --ablate-input \
      --ablation-type col \
      --ablation-size 4

Once training is done, the model is saved in OUTDIR/demo/.

Certifying the smoothed classifier

Now we are ready to apply derandomized smoothing to obtain certificates for each datapoint against adversarial patches. To do so, simply run:

python src/main.py \
      --dataset cifar10 \
      --data /tmp \
      --arch deit_tiny_patch16_224 \
      --out-dir OUTDIR \
      --exp-name demo \
      --batch-size 128 \
      --adv-train 0 \
      --freeze-level -1 \
      --drop-tokens \
      --cifar-preprocess-type simple224 \
      --resume \
      --eval-only 1 \
      --certify \
      --certify-out-dir OUTDIR_CERT \
      --certify-mode col \
      --certify-ablation-size 4 \
      --certify-patch-size 5

This will calculate the standard and certified accuracies of the smoothed model. The results will be dumped into OUTDIR_CERT/demo/.
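As a rough illustration of what a certificate means here, the sketch below applies the voting rule from derandomized smoothing to a list of per-ablation predictions. It is a simplified stand-in, not the repository's implementation. The function names are hypothetical, and it assumes stride-1 column ablations, so a patch of width m can touch at most m + s - 1 ablations of width s. It also uses a strict margin in place of exact tie-breaking. The values 4 and 5 are reused only as example numbers; how main.py interprets --certify-ablation-size and --certify-patch-size relative to the 224-pixel input is not assumed here.

```python
from collections import Counter
from typing import List, Tuple


def affected_ablations(patch_size: int, ablation_size: int) -> int:
    """Upper bound on the column ablations a patch can overlap (stride 1)."""
    return patch_size + ablation_size - 1


def predict_and_certify(
    votes: List[int], patch_size: int, ablation_size: int
) -> Tuple[int, bool]:
    """Return the majority class and whether the vote margin certifies it.

    `votes` holds the base classifier's predicted class for each ablation.
    """
    counts = Counter(votes)
    # Sort by descending count, then ascending class index to break ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    top_class, top_count = ranked[0]
    runner_up_count = ranked[1][1] if len(ranked) > 1 else 0
    delta = affected_ablations(patch_size, ablation_size)
    # A patch can flip at most `delta` votes away from the top class and
    # toward a rival, so the margin must exceed 2 * delta.
    certified = top_count > runner_up_count + 2 * delta
    return top_class, certified


def accuracies(
    all_votes: List[List[int]], labels: List[int], patch_size: int, ablation_size: int
) -> Tuple[float, float]:
    """Standard accuracy and certified accuracy over a set of examples."""
    correct = 0
    correct_and_certified = 0
    for votes, label in zip(all_votes, labels):
        predicted, certified = predict_and_certify(votes, patch_size, ablation_size)
        if predicted == label:
            correct += 1
            if certified:
                correct_and_certified += 1
    total = len(labels)
    return correct / total, correct_and_certified / total


# Two toy examples with 32 ablations each; the true label is class 3 in both.
strong = [3] * 28 + [5] * 4   # margin 24 exceeds 2 * (5 + 4 - 1) = 16
weak = [3] * 20 + [5] * 12    # margin 8 does not
standard_acc, certified_acc = accuracies([strong, weak], [3, 3], 5, 4)
```

Both toy examples are classified correctly, but only the first has a large enough margin to be certified. This is why certified accuracy can never exceed standard accuracy.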

That's it! Now you can replicate all the results of our paper.

Download our ImageNet models

If you find our pretrained models useful, please consider citing our work.

Models trained with column ablations

Model	Ablation Size = 19
ResNet-18	LINK
ResNet-50	LINK
WRN-101-2	LINK
ViT-T	LINK
ViT-S	LINK
ViT-B	LINK

We have uploaded the most important models. If you need any other model (for example, those from the sweeps), please let us know and we will be happy to provide it.

Maintainers

Owner

Madry Lab
