Code for the paper: Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization (https://arxiv.org/abs/2002.11798)

Last update: Dec 07, 2022

Overview

Representation Robustness Evaluations

Our implementation is based on code from MadryLab's robustness package and Devon Hjelm's Deep InfoMax. For all the scripts, we assume the working directory to be the root folder of our code.

Get ready a pre-trained model

We have two methods to pre-train a model for evaluation. Method 1: Follow instructions from MadryLab's robustness package to train a standard model or a robust model with a given PGD setting. For example, to train a robust ResNet18 with l-inf constraint of eps 8/255

python -m robustness.main --dataset cifar \
--data /path/to/dataset \
--out-dir /path/to/output \
--arch resnet18 \
--epoch 150 \
--adv-train 1 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--exp-name resnet18_adv

Method 2: Use our wrapped code and set task=train-model. Optional commands:

--classifier-loss = robust (adversarial training) / standard (standard training)
--arch = baseline_mlp (baseline-h with last two layer as mlp) / baseline_linear (baseline-h with last two layer as linear classifier) / vgg16 / ...

Our results presented in Figure 1 and 2 use model architecture: baseline_mlp, resnet18, vgg16, resnet50, DenseNet121. For example, to train a baseline-h model with l-inf constraint of eps 8/255

python main.py --dataset cifar \
--task train-model \
--data /path/to/dataset \
--out-dir /path/to/output \
--arch baseline_mlp \
--epoch 500 --lr 1e-4 --step-lr 10000 --workers 2 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--classifier-loss robust \
--exp-name baseline_mlp_adv

To parse the store file, run

from cox import store
s = store.Store('/path/to/model/parent-folder', 'model-folder')
print(s['logs'].df)
s.close()

Evaluate the representation robustness (Figure 1, 2, 3)

Set task=estimate-mi to load a pre-trained model and test the mutual information between input and representation. By subtracting the normal-case and worst-case mutual information we have the representation vulnerability. Optional commands:

--estimator-loss = worst (worst-case mutual information estimation) / normal (normal-case mutual information estimation)

For example, to test the worst-case mutual information of ResNet18, run

python main.py --dataset cifar \
--data /path/to/dataset \
--out-dir /path/to/output \
--task estimate-mi \
--representation-type layer \
--estimator-loss worst \
--arch resnet18 \
--epoch 500 --lr 1e-4 --step-lr 10000 --workers 2 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--resume /path/to/saved/model/checkpoint.pt.best \
--exp-name estimator_worst__resnet18_adv \
--no-store

or to test on the baseline-h, run

python main.py --dataset cifar \
--data /path/to/dataset \
--out-dir /path/to/output \
--task estimate-mi \
--representation-type layer \
--estimator-loss worst \
--arch baseline_mlp \
--epoch 500 --lr 1e-4 --step-lr 10000 --workers 2 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--resume /path/to/saved/model/checkpoint.pt.best \
--exp-name estimator_worst__baseline_mlp_adv \
--no-store

Learn Representations

Set task=train-encoder to learn a representation using our training principle. For train by worst-case mutual information maximization, we can use other lower-bound of mutual information as surrogate for our target, which may have slightly better empirical performance (e.g. nce). Please refer to arxiv.org/abs/1808.06670 for more information. Optional commands:

--estimator-loss = worst (worst-case mutual information maximization) / normal (normal-case mutual information maximization)
--va-mode = dv (Donsker-Varadhan representation) / nce (Noise-Contrastive Estimation) / fd (fenchel dual representation)
--arch = basic_encoder (Hjelm et al.) / ...

Example:

python main.py --dataset cifar \
--task train-encoder \
--data /path/to/dataset \
--out-dir /path/to/output \
--arch basic_encoder \
--representation-type layer \
--estimator-loss worst \
--epoch 500 --lr 1e-4 --step-lr 10000 --workers 2 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--exp-name learned_encoder

Test on Downstream Classifications (Figure 4, 5, 6; Table 1, 3)

Set task=train-classifier to test the classification accuracy of learned representations. Optional commands:

--classifier-loss = robust (adversarial classification) / standard (standard classification)
--classifier-arch = mlp (mlp as downstream classifier) / linear (linear classifier as downstream classifier)

Example:

python main.py --dataset cifar \
--task train-classifier \
--data /path/to/dataset \
--out-dir /path/to/output \
--arch basic_encoder \
--classifier-arch mlp \
--representation-type layer \
--classifier-loss robust \
--epoch 500 --lr 1e-4 --step-lr 10000 --workers 2 \
--attack-lr=1e-2 --constraint inf --eps 8/255 \
--resume /path/to/saved/model/checkpoint.pt.latest \
--exp-name test_learned_encoder

Code for the paper: Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization (https://arxiv.org/abs/2002.11798)

Related tags

Overview

Representation Robustness Evaluations

Get ready a pre-trained model

Evaluate the representation robustness (Figure 1, 2, 3)

Learn Representations

Test on Downstream Classifications (Figure 4, 5, 6; Table 1, 3)

Owner

Sicheng

Tensorflow implementation of Swin Transformer model.

Synthesize photos from PhotoDNA using machine learning 🌱

Guided Internet-delivered Cognitive Behavioral Therapy Adherence Forecasting

Repository for the electrical and ICT benchmark model developed in the ERIGrid 2.0 project.

OMLT: Optimization and Machine Learning Toolkit

A torch.Tensor-like DataFrame library supporting multiple execution runtimes and Arrow as a common memory format

Predicting the duration of arrival delays for commercial flights.

Animate molecular orbital transitions using Psi4 and Blender

Official code repository for the publication "Latent Equilibrium: A unified learning theory for arbitrarily fast computation with arbitrarily slow neurons"

QRec: A Python Framework for quick implementation of recommender systems (TensorFlow Based)

Algorithms for outlier, adversarial and drift detection

PyTorch/TorchScript compiler for NVIDIA GPUs using TensorRT

[3DV 2021] A Dataset-Dispersion Perspective on Reconstruction Versus Recognition in Single-View 3D Reconstruction Networks

A deep learning based semantic search platform that computes similarity scores between provided query and documents

Code needed to reproduce the examples found in "The Temporal Robustness of Stochastic Signals"

Repository of 3D Object Detection with Pointformer (CVPR2021)

Pneumonia Detection using machine learning - with PyTorch

Python implementation of 3D facial mesh exaggeration using the techniques described in the paper: Computational Caricaturization of Surfaces.

A PyTorch Implementation of the Luna: Linear Unified Nested Attention

Unsupervised Video Interpolation using Cycle Consistency