PyTorch Implementation of Auto-Compressing Subset Pruning for Semantic Image Segmentation

Introduction

ACoSP is an online pruning algorithm that compresses convolutional neural networks during training. It learns to select a subset of channels from convolutional layers through a sigmoid function, as shown in the figure. For each channel, a weight w_i is used to scale the activations.

The segmentation maps display compressed PSPNet-50 models trained on Cityscapes. The models are up to 16 times smaller.
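The channel selection described above can be illustrated with a minimal, framework-free sketch. It assumes the gate is a sigmoid centred on a threshold between the k-th and (k+1)-th largest weight and sharpened by a temperature; the function name `soft_channel_gates`, the threshold choice and the example values are illustrative assumptions, not the repository's code.

```python
import math


def soft_channel_gates(weights, k, temperature):
    """Sigmoid gates that softly select the k largest per-channel weights.

    Assumption: the gate is centred halfway between the k-th and (k+1)-th
    largest weight. The actual ACoSP layers may centre it differently.
    """
    if not 0 < k < len(weights):
        raise ValueError("k must satisfy 0 < k < len(weights)")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    ranked = sorted(weights, reverse=True)
    threshold = 0.5 * (ranked[k - 1] + ranked[k])
    # A lower temperature pushes every gate towards 0 or 1.
    # (Extremely small temperatures can overflow math.exp in this naive form.)
    return [1.0 / (1.0 + math.exp(-(w - threshold) / temperature)) for w in weights]


weights = [0.9, 0.1, 0.5, 0.3]  # hypothetical per-channel weights w_i
warm = soft_channel_gates(weights, k=2, temperature=1.0)   # gates close to 0.5
cold = soft_channel_gates(weights, k=2, temperature=0.01)  # gates nearly binary
```

With a high temperature all channels pass a similar fraction of their activations; as the temperature drops, the two channels with the largest weights are kept and the others are suppressed.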

Repository

This repository is a PyTorch implementation of ACoSP based on hszhao/semseg. It was used to run all experiments for the publication and is meant to guarantee reproducibility and auditability of our results.

The training, test and configuration infrastructure is kept close to semseg, with only some minor modifications to enable more reproducibility and integrate our pruning code. The model/ package contains the PSPNet50 and SegNet model definitions. In acosp/ all code required to prune during training is defined.

The current configs expect a special folder structure (but can be easily adapted):

/data: Datasets, Pretrained-weights
/logs/exp: Folder to store experiments

Installation

Clone the repository:

git clone git@github.com:merantix/acosp.git

Install ACoSP including requirements:
```
pip install .
```

Using ACoSP

The implementation of ACoSP is encapsulated in /acosp, and using it independently of all other experimentation code is straightforward.

Create a pruner and adapt the model:

from acosp.pruner import SoftTopKPruner
import acosp.inject

# Create pruner object
pruner = SoftTopKPruner(
    starting_epoch=0,
    ending_epoch=100,  # Pruning duration
    final_sparsity=0.5,  # Final sparsity
)
# Add sigmoid soft k masks to model
pruner.configure_model(model)

In your training loop, update the temperature of all masking layers:

# Update the temperature in all masking layers
pruner.update_mask_layers(model, epoch)
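To give an intuition for what a per-epoch temperature update might do, here is a small sketch of an annealing schedule over the pruning window. The exponential shape, the function name `annealed_temperature` and the default temperatures are hypothetical; the schedule actually used by `SoftTopKPruner` may differ.

```python
def annealed_temperature(epoch, starting_epoch, ending_epoch,
                         initial_temperature=1.0, final_temperature=0.001):
    """Exponentially interpolate a temperature over the pruning window.

    Hypothetical schedule for illustration only, not taken from acosp/.
    """
    if ending_epoch <= starting_epoch:
        raise ValueError("ending_epoch must be greater than starting_epoch")
    progress = (epoch - starting_epoch) / (ending_epoch - starting_epoch)
    progress = min(max(progress, 0.0), 1.0)  # clamp outside the window
    return initial_temperature * (final_temperature / initial_temperature) ** progress


# Temperatures at a few epochs for the window used in the example above.
schedule = [annealed_temperature(e, starting_epoch=0, ending_epoch=100)
            for e in (0, 25, 50, 75, 100)]
```

The temperature stays at its initial value before `starting_epoch`, decreases monotonically inside the window, and stays at the final value afterwards.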

Convert the soft pruning to hard pruning when ending_epoch is reached:

if epoch == pruner.ending_epoch:
    # Convert to binary channel mask
    acosp.inject.soft_to_hard_k(model)
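Conceptually, the conversion replaces each soft gate with a binary keep/drop decision. The sketch below shows one way to build such a mask from per-channel weights in plain Python; `hard_top_k_mask` is a hypothetical helper for illustration and does not reproduce the internals of `acosp.inject.soft_to_hard_k`.

```python
def hard_top_k_mask(weights, k):
    """Return a 0/1 mask that keeps the k channels with the largest weights."""
    if not 0 <= k <= len(weights):
        raise ValueError("k must satisfy 0 <= k <= len(weights)")
    # Indices sorted by descending weight; ties keep their original order.
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    keep = set(order[:k])
    return [1 if i in keep else 0 for i in range(len(weights))]


mask = hard_top_k_mask([0.9, 0.1, 0.5, 0.3], k=2)
```

Channels with a mask value of 0 contribute nothing after the conversion, so their filters can be removed from the stored model.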

Experiments

Highlight:

All initialization models and trained models are available. The folder structure is:

| init/  # initial models
| exp/
|-- ade20k/  # ade20k/camvid/cityscapes/voc2012/cifar10
| |-- pspnet50_{SPARSITY}/  # the sparsity refers to the relative amount of weights that are removed. I.e. sparsity=0.75 <==> compression_ratio=4 
|   |-- model # model files
|   |-- ... # config/train/test files
|-- evals/  # all results with class-wise IoU/Acc
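The relation between sparsity and compression ratio noted in the folder structure (sparsity=0.75 corresponds to compression_ratio=4) can be written as a pair of small helpers. The function names are illustrative and not part of the repository.

```python
def compression_ratio(sparsity):
    """Compression ratio for a given fraction of removed weights."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    return 1.0 / (1.0 - sparsity)


def sparsity_for_ratio(ratio):
    """Fraction of weights removed for a given compression ratio."""
    if ratio < 1.0:
        raise ValueError("ratio must be at least 1")
    return 1.0 - 1.0 / ratio


# Sparsity values for the compression ratios reported in this README.
sparsities = {ratio: sparsity_for_ratio(ratio) for ratio in (2, 4, 8, 16)}
```

Under this relation, the ACoSP-2, ACoSP-4, ACoSP-8 and ACoSP-16 models correspond to sparsities of 0.5, 0.75, 0.875 and 0.9375.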

Hardware Requirements: At least 60GB (PSPNet50) / 16GB (SegNet) of GPU RAM. Can be distributed to multiple GPUs.
Train:
- Download related datasets and symlink the paths to them as follows (you can alternatively modify the relevant paths specified in folder config):
```
mkdir -p /data
ln -s /path_to_ade20k_dataset /data/ade20k
```
- Download ImageNet pre-trained models and put them under folder /data for weight initialization. Remember to use the right dataset format detailed in FAQ.md.
- Specify the GPU to use in the config, then start training. (Training with ACoSP has only been carried out on a single GPU and has not been tested with DDP.) The general structure to access individual configs is as follows:
```
sh tool/train.sh ${DATASET} ${CONFIG_NAME_WITHOUT_DATASET}
```
  E.g. to train a PSPNet50 on the ade20k dataset and use the config `config/ade20k/ade20k_pspnet50.yaml`, execute:
```
sh tool/train.sh ade20k pspnet50
```
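The mapping from the two script arguments to a config file, as implied by the example above, can be sketched as follows. The helper `config_path` is hypothetical; `tool/train.sh` may resolve the path differently.

```python
from pathlib import PurePosixPath


def config_path(dataset, config_name, config_root="config"):
    """Build the config location implied by the naming scheme above."""
    return PurePosixPath(config_root) / dataset / f"{dataset}_{config_name}.yaml"


path = config_path("ade20k", "pspnet50")
```

For the arguments `ade20k pspnet50`, this yields `config/ade20k/ade20k_pspnet50.yaml`.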
Test:
- Download trained segmentation models and put them under folder specified in config or modify the specified paths.
- For full testing (get listed performance):
```
sh tool/test.sh ade20k pspnet50
```
Visualization: tensorboardX is incorporated for visualizing training progress.
```
tensorboard --logdir=/logs/exp/ade20k
```
Other:
- Resources: GoogleDrive LINK contains shared models, visual predictions and data lists.
- Models: ImageNet pre-trained models and trained segmentation models can be accessed. Note that our ImageNet pre-trained models differ slightly from the original ResNet implementation in the first layers.
- Predictions: Visual predictions of several models can be accessed.
- Datasets: attributes (names and colors) are in folder dataset and some sample lists can be accessed.
- Some FAQs: FAQ.md.

Performance

Description: mIoU/mAcc stand for mean IoU and mean per-class accuracy, respectively. General parameters shared across the different datasets are listed below:

Network: {NETWORK} @ ACoSP-{COMPRESSION_RATIO}
Train Parameters: sync_bn(True), scale_min(0.5), scale_max(2.0), rotate_min(-10), rotate_max(10), zoom_factor(8), aux_weight(0.4), base_lr(1e-2), power(0.9), momentum(0.9), weight_decay(1e-4).
Test Parameters: ignore_label(255).
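As a reference for how mIoU and mAcc are commonly computed, the following sketch derives both from a confusion matrix and skips pixels labelled with `ignore_label`. It is a plain-Python illustration under the assumption that classes absent from the ground truth are left out of the mean; the evaluation code in this repository may handle that case differently.

```python
def confusion_matrix(targets, predictions, num_classes, ignore_label=255):
    """Count (target, prediction) pairs, skipping ignored pixels."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for target, prediction in zip(targets, predictions):
        if target == ignore_label:
            continue
        matrix[target][prediction] += 1
    return matrix


def mean_iou_and_accuracy(matrix):
    """Return (mIoU, mAcc) averaged over classes present in the ground truth."""
    ious, accuracies = [], []
    for c in range(len(matrix)):
        true_positive = matrix[c][c]
        ground_truth = sum(matrix[c])
        predicted = sum(row[c] for row in matrix)
        if ground_truth == 0:
            continue  # assumption: absent classes do not enter the mean
        union = ground_truth + predicted - true_positive
        ious.append(true_positive / union)
        accuracies.append(true_positive / ground_truth)
    if not ious:
        raise ValueError("no labelled pixels to evaluate")
    return sum(ious) / len(ious), sum(accuracies) / len(accuracies)


# Toy example with two classes; the last pixel is ignored.
targets = [0, 0, 1, 1, 255]
predictions = [0, 1, 1, 1, 0]
miou, macc = mean_iou_and_accuracy(confusion_matrix(targets, predictions, 2))
```

In the toy example, class 0 has IoU 1/2 and accuracy 1/2, class 1 has IoU 2/3 and accuracy 1, so mIoU is 7/12 and mAcc is 3/4.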

ADE20K: Train Parameters: classes(150), train_h(473), train_w(473), epochs(100). Test Parameters: classes(150), test_h(473), test_w(473), base_size(512).
- Setting: train on train (20210 images) set and test on val (2000 images) set.

Network	mIoU/mAcc
PSPNet50	41.42/51.48
PSPNet50 @ ACoSP-2	38.97/49.56
PSPNet50 @ ACoSP-4	33.67/43.17
PSPNet50 @ ACoSP-8	28.04/35.60
PSPNet50 @ ACoSP-16	19.39/25.52

PASCAL VOC 2012: Train Parameters: classes(21), train_h(473), train_w(473), epochs(50). Test Parameters: classes(21), test_h(473), test_w(473), base_size(512).
- Setting: train on train_aug (10582 images) set and test on val (1449 images) set.

Network	mIoU/mAcc
PSPNet50	77.30/85.27
PSPNet50 @ ACoSP-2	72.71/81.87
PSPNet50 @ ACoSP-4	65.84/77.12
PSPNet50 @ ACoSP-8	58.26/69.65
PSPNet50 @ ACoSP-16	48.06/58.83

Cityscapes: Train Parameters: classes(19), train_h(713/512 -PSP/SegNet), train_w(713/1024 -PSP/SegNet), epochs(200). Test Parameters: classes(19), test_h(713/512 -PSP/SegNet), test_w(713/1024 -PSP/SegNet), base_size(2048).

Setting: train on fine_train (2975 images) set and test on fine_val (500 images) set.

Network	mIoU/mAcc
PSPNet50	77.35/84.27
PSPNet50 @ ACoSP-2	74.11/81.73
PSPNet50 @ ACoSP-4	71.50/79.40
PSPNet50 @ ACoSP-8	66.06/74.33
PSPNet50 @ ACoSP-16	59.49/67.74
SegNet	65.12/73.85
SegNet @ ACoSP-2	64.62/73.19
SegNet @ ACoSP-4	60.77/69.57
SegNet @ ACoSP-8	54.34/62.48
SegNet @ ACoSP-16	44.12/50.87

CamVid: Train Parameters: classes(11), train_h(360), train_w(720), epochs(450). Test Parameters: classes(11), test_h(360), test_w(720), base_size(360).

Setting: train on train (367 images) set and test on test (233 images) set.

Network	mIoU/mAcc
SegNet	55.49±0.85/65.44±1.01
SegNet @ ACoSP-2	51.85±0.83/61.86±0.85
SegNet @ ACoSP-4	50.10±1.11/59.79±1.49
SegNet @ ACoSP-8	47.25±1.18/56.87±1.10
SegNet @ ACoSP-16	42.27±1.95/51.25±2.02

Cifar10: Train Parameters: classes(10), train_h(32), train_w(32), epochs(50). Test Parameters: classes(10), test_h(32), test_w(32), base_size(32).
- Setting: train on train (50000 images) set and test on test (10000 images) set.

Network	mAcc
ResNet18	89.68
ResNet18 @ ACoSP-2	88.50
ResNet18 @ ACoSP-4	86.21
ResNet18 @ ACoSP-8	81.06
ResNet18 @ ACoSP-16	76.81

Citation

If you find the acosp/ code or trained models useful, please consider citing:

For the general training code, please also consider referencing hszhao/semseg.

Question

Some FAQs are collected in FAQ.md. You are welcome to send pull requests or give advice. Contact information: at.

Owner

Merantix