[ICLR 2022 Oral] F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization

Overview

F8Net
Fixed-Point 8-bit Only Multiplication for Network Quantization (ICLR 2022 Oral)

OpenReview | arXiv | PDF | Model Zoo | BibTex

PyTorch implementation of neural network quantization with fixed-point 8-bit only multiplication.

F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
Qing Jin1,2, Jian Ren1, Richard Zhuang1, Sumant Hanumante1, Zhengang Li2, Zhiyu Chen3, Yanzhi Wang2, Kaiyuan Yang3, Sergey Tulyakov1
1Snap Inc., 2Northeastern University, 3Rice University
ICLR 2022 Oral.

Overview Neural network quantization implements efficient inference via reducing the weight and input precisions. Previous methods for quantization can be categorized as simulated quantization, integer-only quantization, and fixed-point quantization, with the former two involving high-precision multiplications with 32-bit floating-point or integer scaling. In contrast, fixed-point models can avoid such high-demanding requirements but demonstrates inferior performance to the other two methods. In this work, we study the problem of how to train such models. Specifically, we conduct statistical analysis on values for quantization and propose to determine the fixed-point format from data during training with some semi-empirical formula. Our method demonstrates that high-precision multiplication is not necessary for the quantized model to achieve comparable performance as their full-precision counterparts.

Getting Started

Requirements
  1. Please check the requirements and download packages.

  2. Prepare ImageNet-1k data following pytorch example, and create a softlink to the ImageNet data path to data under current the code directory (ln -s /path/to/imagenet data).

Model Training
Conventional training
  • We train the model with the file distributed_run.sh and the command
    bash distributed_run.sh /path/to/yml_file batch_size
    
  • We set batch_size=2048 for conventional training of floating-/fixed-point ResNet18 and MobileNet V1/V2.
  • Before training, please update the dataset_dir and log_dir arguments in the yaml files for training the floating-/fixed-point models.
  • To train the floating-point model, please use the yaml file ***_floating_train.yml in the conventional subfolder under the corresponding folder of the model.
  • To train the fixed-point model, please first train the floating-point model as the initialization. Please use the yaml file ***_fix_quant_train.yml in the conventional subfolder under the corresponding folder of the model. Please make sure the argument fp_pretrained_file directs to the correct path for the corresponding floating-point checkpoint. We also provide our pretrained floating-point models in the Model Zoo below.
Tiny finetuning
  • We finetune the model with the file run.sh and the command

    bash run.sh /path/to/yml_file batch_size
    
  • We set batch_size=128 and use one GPU for tiny-finetuning of fixed-point ResNet18/50.

  • Before fine-tuning, please update the dataset_dir and log_dir arguments in the yaml files for finetuning the fixed-point models.

  • To finetune the fixed-point model, please use the yaml file ***_fix_quant_***_pretrained_train.yml in the tiny_finetuning subfolder under the corresponding folder of the model. For model pretrained with PytorchCV (Baseline of ResNet18 and Baseline#1 of ResNet50), the floating-point checkpoint will be downloaded automatically during code running. For the model pretrained by Nvidia (Baseline#2 of ResNet50), please download the checkpoint first and make sure the argument nvidia_pretrained_file directs to the correct path of this checkpoint.

Model Testing
  • We test the model with the file run.sh and the command

    bash run.sh /path/to/yml_file batch_size
    
  • We set batch_size=128 and use one GPU for model testing.

  • Before testing, please update the dataset_dir and log_dir arguments in the yaml files. Please update the argument integize_file_path and int_op_only_file_path arguments in the yaml files ***_fix_quant_test***_integize.yml and ***_fix_quant_test***_int_op_only.yml, respectively. Please also update other arguments like nvidia_pretrained_file if necessary (even if they are not used during testing).

  • We use the yaml file ***_floating_test.yml for testing the floating-point model; ***_fix_quant***_test.yml for testing the fixed-point model with the same setting as during training/tiny-finetuning; ***_fix_quant***_test_int_model.yml for testing the fixed-point model on GPU with all quantized weights, bias and inputs implemented with integers (but with float dtype as GPU does not support integer operations) and use the original modules during training (e.g. with batch normalization layers); ***_fix_quant***_test_integize.yml for testing the fixed-point model on GPU with all quantized weights, bias and inputs implemented with integers (but with float dtype as GPU does not support integer operations) and a new equivalent model with only convolution, pooling and fully-connected layers; ***_fix_quant***_test_int_op_only.yml for testing the fixed-point model on CPU with all quantized weights, bias and inputs implemented with integers (with int dtype) and a new equivalent model with only convolution, pooling and fully-connected layers. Note that the accuracy from the four testing files can differ a little due to numerical error.

Model Export
  • We export fixed-point model with integer weights, bias and inputs to run on GPU and CPU during model testing with ***_fix_quant_test_integize.yml and ***_fix_quant_test_int_op_only.yml files, respectively.

  • The exported onnx files are saved to the path given by the arguments integize_file_path and int_op_only_file_path.

F8Net Model Zoo

All checkpoints and onnx files are available at here.

Conventional

Model Type Top-1 Acc.a Checkpoint
ResNet18 FP
8-bit
70.3
71.0
Res18_32
Res18_8
MobileNet-V1 FP
8-bit
72.4
72.8
MBV1_32
MBV1_8
MobileNet-V2b FP
8-bit
72.7
72.6
MBV2b_32
MBV2b_8

Tiny Finetuning

Model Type Top-1 Acc.a Checkpoint
ResNet18 FP
8-bit
73.1
72.3
Res18_32p
Res18_8p
ResNet50b (BL#1) FP
8-bit
77.6
77.6
Res50b_32p
Res50b_8p
ResNet50b (BL#2) FP
8-bit
78.5
78.1
Res50b_32n
Res50b_8n

a The accuracies are obtained from the inference step during training. Test accuracy for the final exported model might have some small accuracy difference due to numerical error.

Technical Details

The main techniques for neural network quantization with 8-bit fixed-point multiplication includes the following:

  • Quantized methods/modules including determining fixed-point formats from statistics or by grid-search, fusing convolution and batch normalization layers, and reformulating PACT with fixed-point quantization are implemented in models/fix_quant_ops.
  • Clipping-level sharing and private fractional length for residual blocks are implemented in the ResNet (models/fix_resnet) and MobileNet V2 (models/fix_mobilenet_v2).

Acknowledgement

This repo is based on AdaBits.

Citation

If our code or models help your work, please cite our paper:

@inproceedings{
  jin2022fnet,
  title={F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization},
  author={Qing Jin and Jian Ren and Richard Zhuang and Sumant Hanumante and Zhengang Li and Zhiyu Chen and Yanzhi Wang and Kaiyuan Yang and Sergey Tulyakov},
  booktitle={International Conference on Learning Representations},
  year={2022},
  url={https://openreview.net/forum?id=_CfpJazzXT2}
}
Owner
Snap Research
Snap Research
The missing CMake project initializer

cmake-init - The missing CMake project initializer Opinionated CMake project initializer to generate CMake projects that are FetchContent ready, separ

1k Jan 01, 2023
Learning to Map Large-scale Sparse Graphs on Memristive Crossbar

Release of AutoGMap:Learning to Map Large-scale Sparse Graphs on Memristive Crossbar For reproduction of our searched model, the Ubuntu OS is recommen

2 Aug 23, 2022
Learning Continuous Image Representation with Local Implicit Image Function

LIIF This repository contains the official implementation for LIIF introduced in the following paper: Learning Continuous Image Representation with Lo

Yinbo Chen 1k Dec 25, 2022
Towards Interpretable Deep Metric Learning with Structural Matching

DIML Created by Wenliang Zhao*, Yongming Rao*, Ziyi Wang, Jiwen Lu, Jie Zhou This repository contains PyTorch implementation for paper Towards Interpr

Wenliang Zhao 75 Nov 11, 2022
Expert Finding in Legal Community Question Answering

Expert Finding in Legal Community Question Answering Arian Askari, Suzan Verberne, and Gabriella Pasi. Expert Finding in Legal Community Question Answ

Arian Askari 3 Oct 31, 2022
Aerial Single-View Depth Completion with Image-Guided Uncertainty Estimation (RA-L/ICRA 2020)

Aerial Depth Completion This work is described in the letter "Aerial Single-View Depth Completion with Image-Guided Uncertainty Estimation", by Lucas

ETHZ V4RL 70 Dec 22, 2022
Implementation for Shape from Polarization for Complex Scenes in the Wild

sfp-wild Implementation for Shape from Polarization for Complex Scenes in the Wild project website | paper Code and dataset will be released soon. Int

Chenyang LEI 41 Dec 23, 2022
PyTorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision.

PyTorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision @misc{CV2018, author = {Donny You ( Donny You 40 Sep 14, 2022

3D Pose Estimation for Vehicles

3D Pose Estimation for Vehicles Introduction This work generates 4 key-points and 2 key-edges from vertices and edges of vehicles as ground truth. The

Jingyi Wang 1 Nov 01, 2021
PyTorch implementation of "Learn to Dance with AIST++: Music Conditioned 3D Dance Generation."

Learn to Dance with AIST++: Music Conditioned 3D Dance Generation. Installation pip install -r requirements.txt Prepare Dataset bash data/scripts/pre

Zj Li 8 Sep 07, 2021
Code release for Hu et al. Segmentation from Natural Language Expressions. in ECCV, 2016

Segmentation from Natural Language Expressions This repository contains the code for the following paper: R. Hu, M. Rohrbach, T. Darrell, Segmentation

Ronghang Hu 88 May 24, 2022
Delta Conformity Sociopatterns Analysis - Delta Conformity Sociopatterns Analysis

Delta_Conformity_Sociopatterns_Analysis ∆-Conformity is a local homophily measur

2 Jan 09, 2022
Re-implement CycleGAN in Tensorlayer

CycleGAN_Tensorlayer Re-implement CycleGAN in TensorLayer Original CycleGAN Improved CycleGAN with resize-convolution Prerequisites: TensorLayer Tenso

89 Aug 15, 2022
Buffon’s needle: one of the oldest problems in geometric probability

Buffon-s-Needle Buffon’s needle is one of the oldest problems in geometric proba

3 Feb 18, 2022
An unofficial implementation of "Unpaired Image Super-Resolution using Pseudo-Supervision." CVPR2020

UnpairedSR An unofficial implementation of "Unpaired Image Super-Resolution using Pseudo-Supervision." CVPR2020 turn RCAN(modified) -- xmodel(xilinx

JiaKui Hu 10 Oct 28, 2022
g9.py - Torch interactive graphics

g9.py - Torch interactive graphics A Torch toy in the browser. Demo at https://srush.github.io/g9py/ This is a shameless copy of g9.js, written in Pyt

Sasha Rush 13 Nov 16, 2022
Code for IntraQ, PyTorch implementation of our paper under review

IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization paper Requirements Python = 3.7.10 Pytorch == 1.7

1 Nov 19, 2021
License Plate Detection Application

LicensePlate_Project 🚗 🚙 [Project] 2021.02 ~ 2021.09 License Plate Detection Application Overview 1. 데이터 수집 및 라벨링 차량 번호판 이미지를 직접 수집하여 각 이미지에 대해 '번호판

4 Oct 10, 2022
Rotary Transformer

[中文|English] Rotary Transformer Rotary Transformer is an MLM pre-trained language model with rotary position embedding (RoPE). The RoPE is a relative

325 Jan 03, 2023
[SIGGRAPH'22] StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets

[Project] [PDF] This repository contains code for our SIGGRAPH'22 paper "StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets" by Axel Sauer, Katja

742 Jan 04, 2023