Joint Channel and Weight Pruning for Model Acceleration on Mobile Devices

Last update: Dec 30, 2022

Related tags

Overview

Joint Channel and Weight Pruning for Model Acceleration on Mobile Devices

Abstract

For practical deep neural network design on mobile devices, it is essential to consider the constraints incurred by the computational resources and the inference latency in various applications. Among deep network acceleration related approaches, pruning is a widely adopted practice to balance the computational resource consumption and the accuracy, where unimportant connections can be removed either channel-wisely or randomly with a minimal impact on model accuracy. The channel pruning instantly results in a significant latency reduction, while the random weight pruning is more flexible to balance the latency and accuracy. In this paper, we present a unified framework with Joint Channel pruning and Weight pruning (JCW), and achieves a better Pareto-frontier between the latency and accuracy than previous model compression approaches. To fully optimize the trade-off between the latency and accuracy, we develop a tailored multi-objective evolutionary algorithm in the JCW framework, which enables one single search to obtain the optimal candidate architectures for various deployment requirements. Extensive experiments demonstrate that the JCW achieves a better trade-off between the latency and accuracy against various state-of-the-art pruning methods on the ImageNet classification dataset.

Framework

Evaluation

Resnet18

Method	Latency/ms	Accuracy
Uniform 1x	537	69.8
DMCP	341	69.7
APS	363	70.3
JCW	160	69.2
	194	69.7
	196	69.9
	224	70.2

MobileNetV1

Method	Latency/ms	Accuracy
Uniform 1x	167	70.9
Uniform 0.75x	102	68.4
Uniform 0.5x	53	64.4
AMC	94	70.7
Fast	61	68.4
AutoSlim	99	71.5
AutoSlim	55	67.9
USNet	102	69.5
USNet	53	64.2
JCW	31	69.1
	39	69.9
	43	69.8
	54	70.3
	69	71.4

MobileNetV2

Method	Latency/ms	Accuracy
Uniform 1x	114	71.8
Uniform 0.75x	71	69.8
Uniform 0.5x	41	65.4
APS	110	72.8
APS	64	69.0
DMCP	83	72.4
DMCP	45	67.0
DMCP	43	66.1
Fast	89	72.0
Fast	62	70.2
JCW	30	69.1
	40	69.9
	44	70.8
	59	72.2

Requirements

torch
torchvision
numpy
scipy

Usage

The JCW works in a two-step fashion. i.e. the search step and the training step. The search step seaches for the layer-wise channel numbers and weight sparsity for Pareto-optimal models. The training steps trains the searched models with ADMM. We give a simple example for resnet18.

The search step

Modify the configuration file

First, open the file experiments/res18-search.yaml:
```
vim experiments/res18-search.yaml
```
Go to the 44th line and find the following codes:
```
DATASET:
  data: ImageNet
  root: /path/to/imagenet
  ...
```
and modify the root property of DATASET to the path of ImageNet dataset on your machine.
Apply the search

After modifying the configuration file, you can simply start the search by:
```
python emo_search.py --config experiments/res18-search.yaml | tee experiments/res18-search.log
```
After searching, the search results will be saved in experiments/search.pth

The training step

After searching, we can train the searched models by:

Modify the base configuration file

Open the file experiments/res18-train.yaml:
```
vim experiments/res18-train.yaml
```
Go to the 5th line, find the following codes:
```
root: &root /path/to/imagenet
```
and modify the root property to the path of ImageNet dataset on your machine.
Generate configuration files for training

After modifying the base configuration file, we are ready to generate the configuration files for training. To do that, simply run the following command:
```
python scripts/generate_training_configs.py --base-config experiments/res18-train.yaml --search-result experiments/search.pth --output ./train-configs 
```
After running the above command, the training configuration files will be written into ./train-configs/model-{id}/train.yaml.
Apply the training

After generating the configuration files, simply run the following command to train one certain model:
```
python train.py --config xxxx/xxx/train.yaml | tee xxx/xxx/train.log
```

Joint Channel and Weight Pruning for Model Acceleration on Mobile Devices

Related tags

Overview

Joint Channel and Weight Pruning for Model Acceleration on Mobile Devices

Abstract

Framework

Evaluation

Resnet18

MobileNetV1

MobileNetV2

Requirements

Usage

The search step

The training step

Owner

Graph InfoClust: Leveraging cluster-level node information for unsupervised graph representation learning

A configurable, tunable, and reproducible library for CTR prediction

SNE-RoadSeg in PyTorch, ECCV 2020

A PyTorch implementation of DenseNet.

Unofficial Pytorch Implementation of WaveGrad2

Multi-Modal Machine Learning toolkit based on PyTorch.

This is the code of "Multi-view Contrastive Graph Clustering" in NeurlPS 2021.

Scribble-Supervised LiDAR Semantic Segmentation, CVPR 2022 (ORAL)

Code repository for paper `Skeleton Merger: an Unsupervised Aligned Keypoint Detector`.

Official respository for "Modeling Defocus-Disparity in Dual-Pixel Sensors", ICCP 2020

Deep Reinforcement Learning based Trading Agent for Bitcoin

The Ludii general game system, developed as part of the ERC-funded Digital Ludeme Project.

[ICCV2021] Official code for "Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition"

Code repo for "Transformer on a Diet" paper

Bounding Wasserstein distance with couplings

GeneDisco is a benchmark suite for evaluating active learning algorithms for experimental design in drug discovery.

Answer a series of contextually-dependent questions like they may occur in natural human-to-human conversations.

The pyrelational package offers a flexible workflow to enable active learning with as little change to the models and datasets as possible

Adaout is a practical and flexible regularization method with high generalization and interpretability

[ICCV 2021] Self-supervised Monocular Depth Estimation for All Day Images using Domain Separation