Iterative Training: Finding Binary Weight Deep Neural Networks with Layer Binarization

Overview

This repository contains the source code for the paper (link will be posted).

Requirements

  • GPU
  • Python 3
  • PyTorch 1.9
    • Earlier versions may work but are untested.
  • pip install -r requirements.txt
  • If running ResNet-21 or ImageNet experiments, first download and prepare the ImageNet 2012 dataset with the bin/imagenet_prep.sh script.
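
The requirements above can be checked before launching a run. The following is a minimal sketch and is not part of the repository: the function name `check_environment` is hypothetical, and it only reports what it finds rather than enforcing the listed versions.

```python
import importlib.util
import sys


def check_environment():
    """Report the Python major version and, if PyTorch is installed, its version and GPU visibility."""
    report = {
        "python3": sys.version_info.major == 3,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        import torch  # imported lazily so the check also works without PyTorch

        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `torch_installed` or `cuda_available` is reported as False, the training scripts below will likely not run as described.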

Running

For non-ImageNet experiments, the main python file is main.py. To see its arguments:

python main.py --help

The first run can take a little longer because the MNIST and CIFAR-10 datasets are downloaded automatically.

For ImageNet experiments, the main python files are main_imagenet_float.py and main_imagenet_binary.py. To see their arguments:

python main_imagenet_float.py --help

and

python main_imagenet_binary.py --help

The ImageNet dataset must already be downloaded and prepared. See the Requirements section for details.

Scripts

The main python file has many options. The following scripts run training with the hyper-parameters given in the paper. Output includes a run-log text file and tensorboard files. These files are saved to ./logs and reused for subsequent runs.
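
To see what a run has written, the ./logs tree can be walked after training. This is a minimal sketch, assuming only that outputs live somewhere under ./logs as stated above; the helper name `list_run_outputs` is hypothetical and no particular file names are assumed.

```python
from pathlib import Path


def list_run_outputs(log_root="./logs"):
    """Return every file under log_root, grouped by the directory that holds it."""
    root = Path(log_root)
    grouped = {}
    if not root.is_dir():
        return grouped  # nothing has been written yet
    for path in sorted(root.rglob("*")):
        if path.is_file():
            key = str(path.parent.relative_to(root))
            grouped.setdefault(key, []).append(path.name)
    return grouped


if __name__ == "__main__":
    for directory, files in list_run_outputs().items():
        print(directory, files)
```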

300-100-10

Sensitivity Pre-training

# Layer 1. Learning rate 0.1.
./scripts/mnist/300/sensitivity/layer.sh sensitivity forward 0.1 0
# Layer 2. Learning rate 0.1.
./scripts/mnist/300/sensitivity/layer.sh sensitivity 231 0.1 0
# Layer 3. Learning rate 0.1.
./scripts/mnist/300/sensitivity/layer.sh sensitivity reverse 0.1 0

Output files and run-log are written to ./logs/mnist/val/sensitivity/.

Hyperparam Search

For floating-point training:

# Learning rate 0.1.
./scripts/mnist/300/val/float.sh hyperparam 0.1 0

For full binary training:

# Learning rate 0.1.
./scripts/mnist/300/val/binary.sh hyperparam 0.1 0

For iterative training:

# Forward order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam forward 0.1 0
# Reverse order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam reverse 0.1 0
# 1, 3, 2 order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam 132 0.1 0
# 2, 1, 3 order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam 213 0.1 0
# 2, 3, 1 order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam 231 0.1 0
# 3, 1, 2 order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam 312 0.1 0

Output files and run-log are written to ./logs/mnist/val/hyperparam/.
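
The six iterative-training commands above cover every ordering of the three layers, so they can be generated rather than typed. This is a minimal sketch under two assumptions: that `forward` and `reverse` stand for the orders 1, 2, 3 and 3, 2, 1, and that the trailing `0` can be passed through unchanged (this README does not state its meaning). The function name is hypothetical.

```python
from itertools import permutations

SCRIPT = "./scripts/mnist/300/val/layer.sh"

# Assumption: "forward" and "reverse" name the orders 1,2,3 and 3,2,1.
ALIASES = {"123": "forward", "321": "reverse"}


def order_commands(learning_rate="0.1", last_arg="0"):
    """Build one hyperparam-search command per ordering of the three layers."""
    commands = []
    for perm in permutations("123"):
        digits = "".join(perm)
        order = ALIASES.get(digits, digits)
        # last_arg is copied verbatim from the README commands.
        commands.append([SCRIPT, "hyperparam", order, learning_rate, last_arg])
    return commands


if __name__ == "__main__":
    for cmd in order_commands():
        print(" ".join(cmd))
```

The sketch only prints the command lines; each one can then be run from the repository root.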

Full Training

For floating-point training:

# Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/float.sh full 0.1 316 0

For full binary training:

# Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/binary.sh full 0.1 316 0

For iterative training:

# Forward order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full forward 0.1 316 0
# Reverse order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full reverse 0.1 316 0
# 1, 3, 2 order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full 132 0.1 316 0
# 2, 1, 3 order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full 213 0.1 316 0
# 2, 3, 1 order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full 231 0.1 316 0
# 3, 1, 2 order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full 312 0.1 316 0

Output files and run-log are written to ./logs/mnist/run/full/.

784-100-10

Sensitivity Pre-training

# Layer 1. Learning rate 0.1.
./scripts/mnist/784/sensitivity/layer.sh sensitivity forward 0.1 0
# Layer 2. Learning rate 0.1.
./scripts/mnist/784/sensitivity/layer.sh sensitivity 231 0.1 0
# Layer 3. Learning rate 0.1.
./scripts/mnist/784/sensitivity/layer.sh sensitivity reverse 0.1 0

Output files and run-log are written to ./logs/mnist/val/sensitivity/.

Hyperparam Search

For floating-point training:

# Learning rate 0.1.
./scripts/mnist/784/val/float.sh hyperparam 0.1 0

For full binary training:

# Learning rate 0.1.
./scripts/mnist/784/val/binary.sh hyperparam 0.1 0

For iterative training:

# Forward order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam forward 0.1 0
# Reverse order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam reverse 0.1 0
# 1, 3, 2 order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam 132 0.1 0
# 2, 1, 3 order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam 213 0.1 0
# 2, 3, 1 order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam 231 0.1 0
# 3, 1, 2 order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam 312 0.1 0

Output files and run-log are written to ./logs/mnist/val/hyperparam/.

Full Training

For floating-point training:

# Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/float.sh full 0.1 316 0

For full binary training:

# Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/binary.sh full 0.1 316 0

For iterative training:

# Forward order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full forward 0.1 316 0
# Reverse order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full reverse 0.1 316 0
# 1, 3, 2 order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full 132 0.1 316 0
# 2, 1, 3 order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full 213 0.1 316 0
# 2, 3, 1 order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full 231 0.1 316 0
# 3, 1, 2 order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full 312 0.1 316 0

Output files and run-log are written to ./logs/mnist/run/full/.

Vgg-5

Sensitivity Pre-training

# Layer 1. Learning rate 0.1.
./scripts/cifar10/vgg5/sensitivity/layer.sh sensitivity 1 0.1 0
# Layer 2. Learning rate 0.1.
./scripts/cifar10/vgg5/sensitivity/layer.sh sensitivity 2 0.1 0
# Layer 5. Learning rate 0.1.
./scripts/cifar10/vgg5/sensitivity/layer.sh sensitivity 5 0.1 0

Output files and run-log are written to ./logs/cifar10/val/sensitivity/.

Hyperparam Search

For floating-point training:

# Learning rate 0.1.
./scripts/cifar10/vgg5/val/float.sh hyperparam 0.1 0

For full binary training:

# Learning rate 0.1.
./scripts/cifar10/vgg5/val/binary.sh hyperparam 0.1 0

For iterative training:

# Forward order. Learning rate 0.1.
./scripts/cifar10/vgg5/val/layer.sh hyperparam forward 0.1 0
# Ascend order. Learning rate 0.1.
./scripts/cifar10/vgg5/val/layer.sh hyperparam ascend 0.1 0
# Reverse order. Learning rate 0.1.
./scripts/cifar10/vgg5/val/layer.sh hyperparam reverse 0.1 0
# Descend order. Learning rate 0.1.
./scripts/cifar10/vgg5/val/layer.sh hyperparam descend 0.1 0
# Random order. Learning rate 0.1.
./scripts/cifar10/vgg5/val/layer.sh hyperparam random 0.1 0

Output files and run-log are written to ./logs/cifar10/val/hyperparam/.

Full Training

For floating-point training:

# Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/float.sh full 0.1 316 0

For full binary training:

# Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/binary.sh full 0.1 316 0

For iterative training:

# Forward order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/layer.sh full forward 0.1 316 0
# Ascend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/layer.sh full ascend 0.1 316 0
# Reverse order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/layer.sh full reverse 0.1 316 0
# Descend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/layer.sh full descend 0.1 316 0
# Random order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/layer.sh full random 0.1 316 0

Output files and run-log are written to ./logs/cifar10/run/full/.

Vgg-9

Sensitivity Pre-training

# Layer 1. Learning rate 0.1.
./scripts/cifar10/vgg9/sensitivity/layer.sh sensitivity 1 0.1 0
# Layer 2. Learning rate 0.1.
./scripts/cifar10/vgg9/sensitivity/layer.sh sensitivity 2 0.1 0
# Layer 5. Learning rate 0.1.
./scripts/cifar10/vgg9/sensitivity/layer.sh sensitivity 5 0.1 0

Output files and run-log are written to ./logs/cifar10/val/sensitivity/.

Hyperparam Search

For floating-point training:

# Learning rate 0.1.
./scripts/cifar10/vgg9/val/float.sh hyperparam 0.1 0

For full binary training:

# Learning rate 0.1.
./scripts/cifar10/vgg9/val/binary.sh hyperparam 0.1 0

For iterative training:

# Forward order. Learning rate 0.1.
./scripts/cifar10/vgg9/val/layer.sh hyperparam forward 0.1 0
# Ascend order. Learning rate 0.1.
./scripts/cifar10/vgg9/val/layer.sh hyperparam ascend 0.1 0
# Reverse order. Learning rate 0.1.
./scripts/cifar10/vgg9/val/layer.sh hyperparam reverse 0.1 0
# Descend order. Learning rate 0.1.
./scripts/cifar10/vgg9/val/layer.sh hyperparam descend 0.1 0
# Random order. Learning rate 0.1.
./scripts/cifar10/vgg9/val/layer.sh hyperparam random 0.1 0

Output files and run-log are written to ./logs/cifar10/val/hyperparam/.

Full Training

For floating-point training:

# Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/float.sh full 0.1 316 0

For full binary training:

# Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/binary.sh full 0.1 316 0

For iterative training:

# Forward order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/layer.sh full forward 0.1 316 0
# Ascend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/layer.sh full ascend 0.1 316 0
# Reverse order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/layer.sh full reverse 0.1 316 0
# Descend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/layer.sh full descend 0.1 316 0
# Random order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/layer.sh full random 0.1 316 0

Output files and run-log are written to ./logs/cifar10/run/full/.

ResNet-20

Sensitivity Pre-training

# Layer 1. Learning rate 0.1.
./scripts/cifar10/resnet20/sensitivity/layer.sh sensitivity 1 0.1 0
# Layer 2. Learning rate 0.1.
./scripts/cifar10/resnet20/sensitivity/layer.sh sensitivity 2 0.1 0
# ...
# Layer 20. Learning rate 0.1.
./scripts/cifar10/resnet20/sensitivity/layer.sh sensitivity 20 0.1 0

Output files and run-log are written to ./logs/cifar10/val/sensitivity/.
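
The elided commands for layers 3 through 19 follow the same pattern as the ones shown, so the full list can be generated. This is a minimal sketch with a hypothetical function name; the trailing `0` is copied from the commands above without interpreting it.

```python
SCRIPT = "./scripts/cifar10/resnet20/sensitivity/layer.sh"


def sensitivity_commands(num_layers=20, learning_rate="0.1", last_arg="0"):
    """Build one sensitivity pre-training command per layer, numbered from 1."""
    return [
        [SCRIPT, "sensitivity", str(layer), learning_rate, last_arg]
        for layer in range(1, num_layers + 1)
    ]


if __name__ == "__main__":
    for cmd in sensitivity_commands():
        print(" ".join(cmd))
```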

Hyperparam Search

For floating-point training:

# Learning rate 0.1
./scripts/cifar10/resnet20/val/float.sh hyperparam 0.1 0

For full binary training:

# Learning rate 0.1
./scripts/cifar10/resnet20/val/binary.sh hyperparam 0.1 0

For iterative training:

# Forward order. Learning rate 0.1
./scripts/cifar10/resnet20/val/layer.sh hyperparam forward 0.1 0
# Ascend order. Learning rate 0.1
./scripts/cifar10/resnet20/val/layer.sh hyperparam ascend 0.1 0
# Reverse order. Learning rate 0.1
./scripts/cifar10/resnet20/val/layer.sh hyperparam reverse 0.1 0
# Descend order. Learning rate 0.1
./scripts/cifar10/resnet20/val/layer.sh hyperparam descend 0.1 0
# Random order. Learning rate 0.1
./scripts/cifar10/resnet20/val/layer.sh hyperparam random 0.1 0

Output files and run-log are written to ./logs/cifar10/val/hyperparam/.

Full Training

For floating-point training:

# Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/float.sh full 0.1 316 0

For full binary training:

# Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/binary.sh full 0.1 316 0

For iterative training:

# Forward order. Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/layer.sh full forward 0.1 316 0
# Ascend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/layer.sh full ascend 0.1 316 0
# Reverse order. Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/layer.sh full reverse 0.1 316 0
# Descend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/layer.sh full descend 0.1 316 0
# Random order. Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/layer.sh full random 0.1 316 0

Output files and run-log are written to ./logs/cifar10/run/full/.
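
When repeating full training over several orders or seeds, a small driver keeps the runs consistent. This is a minimal sketch, not part of the repository: the function names are hypothetical, only seed 316 appears in this README (any other seed is the user's choice), and the trailing `0` is passed through as written above. It prints the commands by default and executes them only when `dry_run` is False.

```python
import subprocess

SCRIPT = "./scripts/cifar10/resnet20/run/layer.sh"
ORDERS = ("forward", "ascend", "reverse", "descend", "random")


def full_training_commands(seeds=(316,), learning_rate="0.1", last_arg="0"):
    """Build one full-training command per (seed, order) pair."""
    return [
        [SCRIPT, "full", order, learning_rate, str(seed), last_arg]
        for seed in seeds
        for order in ORDERS
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises on the first failing run


if __name__ == "__main__":
    run_all(full_training_commands())
```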

ResNet-21

To run experiments for ResNet-21, first download and prepare the ImageNet dataset, as described in the Requirements section at the beginning of this readme. The commands below assume the prepared dataset is located at ./imagenet.

Sensitivity Pre-training

# Layer 1. Learning rate 0.01.
./scripts/imagenet/layer.sh sensitivity ./imagenet 20 "[20]" 20 1 0.01
# Layer 2. Learning rate 0.01.
./scripts/imagenet/layer.sh sensitivity ./imagenet 20 "[20]" 20 2 0.01
# Layer 21. Learning rate 0.01.
./scripts/imagenet/layer.sh sensitivity ./imagenet 20 "[20]" 20 21 0.01

Output files and run-log are written to ./logs/imagenet/sensitivity/.

Full Training

For floating-point training:

# Learning rate 0.01.
./scripts/imagenet/float.sh full ./imagenet 67 "[42,57]" 0.01

For full binary training:

# Learning rate 0.01.
./scripts/imagenet/binary.sh full ./imagenet 67 "[42,57]" 0.01

For layer-by-layer training:

# Forward order
./scripts/imagenet/layer.sh full ./imagenet 67 "[42,57]" 2 forward 0.01
# Ascending order
./scripts/imagenet/layer.sh full ./imagenet 67 "[42,57]" 2 ascend 0.01

For all scripts, output files and run-log are written to ./logs/imagenet/full/.

License

See LICENSE

Contributing

See the contributing guide for details of how to participate in development of the module.

Owner
Rakuten Group, Inc.