Peek-a-Boo: What (More) is Disguised in a Randomly Weighted Neural Network, and How to Find It Efficiently

Overview

Peek-a-Boo: What (More) is Disguised in a Randomly Weighted Neural Network, and How to Find It Efficiently

This repository is the official implementation for the following paper Analytic-LISTA networks proposed in the following paper:

"Peek-a-Boo: What (More) is Disguised in a Randomly Weighted Neural Network, and How to Find It Efficiently" by Xiaohan Chen, Jason Zhang and Zhangyang Wang from the VITA Research Group.

The code implements the Peek-a-Boo (PaB) algorithm for various convolutional networks and is tested in Linux environment with Python: 3.7.2, PyTorch 1.7.0+.

Getting Started

Dependency

pip install tqdm

Prerequisites

  • Python 3.7+
  • PyTorch 1.7.0+
  • tqdm

Data Preparation

To run ImageNet experiments, download and extract ImageNet train and val images from http://image-net.org/. The directory structure is the standard layout for the torchvision datasets.ImageFolder, and the training and validation data is expected to be in the train/ folder and val/ folder respectively as shown below. A useful script for automatic extraction can be found here.

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class/2
      img4.jpeg

How to Run Experiments

CIFAR-10/100 Experiments

To apply PaB w/ PSG to a ResNet-18 network on CIFAR-10/100 datasets, use the following command:

python main.py --use-cuda 0 \
    --arch PsgResNet18 --init-method kaiming_normal \
    --optimizer BOP --ar 1e-3 --tau 1e-6 \
    --ar-decay-freq 45 --ar-decay-ratio 0.15 --epochs 180 \
    --pruner SynFlow --prune-epoch 0 \
    --prune-ratio 3e-1 --prune-iters 100 \
    --msb-bits 8 --msb-bits-weight 8 --msb-bits-grad 16 \
    --psg-threshold 1e-7 --psg-no-take-sign --psg-sparsify \
    --exp-name cifar10_resnet18_pab-psg

To break down the above complex command, PaB includes two stages (pruning and Bop training) and consists of three components (a pruner, a Bop optimizer and a PSG module).

[Pruning module] The pruning module is controlled by the following arguments:

  • --pruner - A string that indicates which pruning method to be used. Valid choices are ['Mag', 'SNIP', 'GraSP', 'SynFlow'].
  • --prune-epoch - An integer, the epoch index of when (the last) pruning is performed.
  • --prune-ratio - A float, the ratio of non-zero parameters remained after (the last) pruning
  • --prune-iters - An integeer, the number of pruning iterations in one run of pruning. Check the SynFlow paper for what this means.

[Bop optimizer] Bop has several hyperparameters that are essential to its successful optimizaiton as shown below. More details can be found in the original Bop paper.

  • --optimizer - A string that specifies the Bop optimizer. You can pass 'SGD' to this argument for a standard training of SGD. Check here.
  • --ar - A float, corresponding to the adativity rate for the calculation of gradient moving average.
  • --tau - A float, corresponding to the threshold that decides if a binary weight needs to be flipped.
  • --ar-decay-freq - An integer, interval in epochs between decays of the adaptivity ratio.
  • --ar-decay-ratio - A float, the decay ratio of the adaptivity ratio decaying.

[PSG module] PSG stands for Predictive Sign Gradient, which was originally proposed in the E2-Train paper. PSG uses low-precision computation during backward passes to save computational cost. It is controlled by several arguments.

  • --msb-bits, --msb-bits-weight, --msb-bits-grad - Three floats, the bit-width for the inputs, weights and output errors during back-propagation.
  • --psg-threshold - A float, the threshold that filters out coarse gradients with small magnitudes to reduce gradient variance.
  • --psg-no-take-sign - A boolean that indicates to bypass the "taking-the-sign" step in the original PSG method.
  • --psg-sparsify - A boolean. The filtered small gradients are set to zero when it is true.

ImageNet Experiments

For PaB experiments on ImageNet, we run the pruning and Bop training in a two-stage manner, implemented in main_imagenet_prune.py and main_imagenet_train.py, respectively.

To prune a ResNet-50 network at its initialization, we first run the following command to perform SynFlow, which follows a similar manner for the arguments as in CIFAR experiments:

export prune_ratio=0.5  # 50% remaining parameters.

# Run SynFlow pruning
python main_imagenet_prune.py \
    --arch resnet50 --init-method kaiming_normal \
    --pruner SynFlow --prune-epoch 0 \
    --prune-ratio $prune_ratio --prune-iters 100 \
    --pruned-save-name /path/to/the/pruning/output/file \
    --seed 0 --workers 32 /path/to/the/imagenet/dataset

We then train the pruned model using Bop with PSG on one node with multi-GPUs.

# Bop hyperparameters
export bop_ar=1e-3
export bop_tau=1e-6
export psg_threshold="-5e-7"

python main_imagenet_train.py \
    --arch psg_resnet50 --init-method kaiming_normal \
    --optimizer BOP --ar $bop_ar --tau $bop_tau \
    --ar-decay-freq 30 --ar-decay-ratio 0.15 --epochs 100 \
    --msb-bits 8 --msb-bits-weight 8 --msb-bits-grad 16 \
    --psg-sparsify --psg-threshold " ${psg_threshold}" --psg-no-take-sign \
    --savedir /path/to/the/output/dir \
    --resume /path/to/the/pruning/output/file \
    --exp-name 'imagenet_resnet50_pab-psg' \
    --dist-url 'tcp://127.0.0.1:2333' \
    --dist-backend 'nccl' --multiprocessing-distributed \
    --world-size 1 --rank 0 \
    --seed 0 --workers 32 /path/to/the/imagenet/dataset 

Acknowledgement

Thank you to Jason Zhang for helping with the development of the code repo, the research that we conducted with it and the consistent report after his movement to CMU. Thank you to Prof. Zhangyang Wang for the guidance and unreserved help with this project.

Cite this work

If you find this work or our code implementation helpful for your own resarch or work, please cite our paper.

@inproceedings{
chen2022peek,
title={Peek-a-Boo: What (More) is Disguised in a Randomly Weighted Neural Network, and How to Find It Efficiently},
author={Xiaohan Chen and Jason Zhang and Zhangyang Wang},
booktitle={International Conference on Learning Representations},
year={2022},
url={https://openreview.net/forum?id=moHCzz6D5H3},
}
Owner
VITA
Visual Informatics Group @ University of Texas at Austin
VITA
Fuzzing JavaScript Engines with Aspect-preserving Mutation

DIE Repository for "Fuzzing JavaScript Engines with Aspect-preserving Mutation" (in S&P'20). You can check the paper for technical details. Environmen

gts3.org (<a href=[email protected])"> 190 Dec 11, 2022
Human4D Dataset tools for processing and visualization

HUMAN4D: A Human-Centric Multimodal Dataset for Motions & Immersive Media HUMAN4D constitutes a large and multimodal 4D dataset that contains a variet

tofis 15 Nov 09, 2022
GrabGpu_py: a scripts for grab gpu when gpu is free

GrabGpu_py a scripts for grab gpu when gpu is free. WaitCondition: gpu_memory

tianyuluan 3 Jun 18, 2022
[ICLR 2022] Contact Points Discovery for Soft-Body Manipulations with Differentiable Physics

CPDeform Code and data for paper Contact Points Discovery for Soft-Body Manipulations with Differentiable Physics at ICLR 2022 (Spotlight). @InProceed

(Lester) Sizhe Li 29 Nov 29, 2022
The mini-MusicNet dataset

mini-MusicNet A music-domain dataset for multi-label classification Music transcription is sequence-to-sequence prediction problem: given an audio per

John Thickstun 4 Nov 09, 2022
A Fast Sequence Transducer Implementation with PyTorch Bindings

transducer A Fast Sequence Transducer Implementation with PyTorch Bindings. The corresponding publication is Sequence Transduction with Recurrent Neur

Awni Hannun 184 Dec 18, 2022
Code and experiments for "Deep Neural Networks for Rank Consistent Ordinal Regression based on Conditional Probabilities"

corn-ordinal-neuralnet This repository contains the orginal model code and experiment logs for the paper "Deep Neural Networks for Rank Consistent Ord

Raschka Research Group 14 Dec 27, 2022
Faster RCNN pytorch windows

Faster-RCNN-pytorch-windows Faster RCNN implementation with pytorch for windows Open cmd, compile this comands: cd lib python setup.py build develop T

Hwa-Rang Kim 1 Nov 11, 2022
Age Progression/Regression by Conditional Adversarial Autoencoder

Age Progression/Regression by Conditional Adversarial Autoencoder (CAAE) TensorFlow implementation of the algorithm in the paper Age Progression/Regre

Zhifei Zhang 603 Dec 22, 2022
Learn the Deep Learning for Computer Vision in three steps: theory from base to SotA, code in PyTorch, and space-repetition with Anki

DeepCourse: Deep Learning for Computer Vision arthurdouillard.com/deepcourse/ This is a course I'm giving to the French engineering school EPITA each

Arthur Douillard 113 Nov 29, 2022
Coarse implement of the paper "A Simultaneous Denoising and Dereverberation Framework with Target Decoupling", On DNS-2020 dataset, the DNSMOS of first stage is 3.42 and second stage is 3.47.

SDDNet Coarse implement of the paper "A Simultaneous Denoising and Dereverberation Framework with Target Decoupling", On DNS-2020 dataset, the DNSMOS

Cyril Lv 43 Nov 21, 2022
A multi-scale unsupervised learning for deformable image registration

A multi-scale unsupervised learning for deformable image registration Shuwei Shao, Zhongcai Pei, Weihai Chen, Wentao Zhu, Xingming Wu and Baochang Zha

ShuweiShao 2 Apr 13, 2022
Evaluation toolkit of the informative tracking benchmark comprising 9 scenarios, 180 diverse videos, and new challenges.

Informative-tracking-benchmark Informative tracking benchmark (ITB) higher diversity. It contains 9 representative scenarios and 180 diverse videos. m

Xin Li 15 Nov 26, 2022
This program writes christmas wish programmatically. It is using turtle as a pen pointer draw christmas trees and stars.

Introduction This is a simple program is written in python and turtle library. The objective of this program is to wish merry Christmas programmatical

Gunarakulan Gunaretnam 1 Dec 25, 2021
Code for Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights

Piggyback: https://arxiv.org/abs/1801.06519 Pretrained masks and backbones are available here: https://uofi.box.com/s/c5kixsvtrghu9yj51yb1oe853ltdfz4q

Arun Mallya 165 Nov 22, 2022
CharacterGAN: Few-Shot Keypoint Character Animation and Reposing

CharacterGAN Implementation of the paper "CharacterGAN: Few-Shot Keypoint Character Animation and Reposing" by Tobias Hinz, Matthew Fisher, Oliver Wan

Tobias Hinz 181 Dec 27, 2022
🗣️ Microsoft Edge TTS for Home Assistant, no need for app_key

Microsoft Edge TTS for Home Assistant This component is based on the TTS service of Microsoft Edge browser, no need to apply for app_key. Install Down

152 Dec 31, 2022
Segcache: a memory-efficient and scalable in-memory key-value cache for small objects

Segcache: a memory-efficient and scalable in-memory key-value cache for small objects This repo contains the code of Segcache described in the followi

TheSys Group @ CMU CS 78 Jan 07, 2023
PyTorch Implementation for AAAI'21 "Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies for Multi-turn Response Selection"

UMS for Multi-turn Response Selection Implements the model described in the following paper Do Response Selection Models Really Know What's Next? Utte

Taesun Whang 47 Nov 22, 2022
KDD CUP 2020 Automatic Graph Representation Learning: 1st Place Solution

KDD CUP 2020: AutoGraph Team: aister Members: Jianqiang Huang, Xingyuan Tang, Mingjian Chen, Jin Xu, Bohang Zheng, Yi Qi, Ke Hu, Jun Lei Team Introduc

96 May 30, 2022