Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, CVPR 2018

Last update: Dec 15, 2022

Overview

Learning Pixel-level Semantic Affinity with Image-level Supervision

This code is deprecated. Please see https://github.com/jiwoon-ahn/irn instead.

Introduction

The code and trained models of:

Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, Jiwoon Ahn and Suha Kwak, CVPR 2018 [Paper]

We have developed a framework based on AffinityNet that generates accurate segmentation labels for training images given only their image-level class labels. A segmentation network trained with our synthesized labels outperforms previous state-of-the-art methods by large margins on the PASCAL VOC 2012 benchmark.

*Our code was first implemented in TensorFlow at the time of the CVPR 2018 submission and was later migrated to PyTorch. Some minor details (optimizer, channel size, etc.) have been changed.

Citation

If you find the code useful, please consider citing our paper using the following BibTeX entry.

@InProceedings{Ahn_2018_CVPR,
author = {Ahn, Jiwoon and Kwak, Suha},
title = {Learning Pixel-Level Semantic Affinity With Image-Level Supervision for Weakly Supervised Semantic Segmentation},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}

Prerequisite

Tested on Ubuntu 16.04 with Python 3.5, PyTorch 0.4, Torchvision 0.2.1, CUDA 9.0, and 1x NVIDIA TITAN X (Pascal).
The PASCAL VOC 2012 development kit: you also need to specify the path ('voc12_root') of your downloaded dev kit.
(Optional) To try the VGG-16 based network: PyCaffe and the VGG-16 ImageNet pretrained weights [vgg16_20M.caffemodel].
(Optional) To try the ResNet-38 based network: MXNet and the ResNet-38 pretrained weights [ilsvrc-cls_rna-a1_cls1000_ep-0001.params].

Usage

1. Train a classification network to get CAMs.

python3 train_cls.py --lr 0.1 --batch_size 16 --max_epoches 15 --crop_size 448 --network [network.vgg16_cls | network.resnet38_cls] --voc12_root [your_voc12_root_folder] --weights [your_weights_file] --wt_dec 5e-4

2. Generate labels for AffinityNet by applying dCRF on CAMs.

python3 infer_cls.py --infer_list voc12/train_aug.txt --voc12_root [your_voc12_root_folder] --network [network.vgg16_cls | network.resnet38_cls] --weights [your_weights_file] --out_cam [desired_folder] --out_la_crf [desired_folder] --out_ha_crf [desired_folder]
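The two CRF output folders (--out_la_crf and --out_ha_crf) presumably hold labels produced with a low and a high background exponent alpha. Below is a minimal NumPy sketch of how such an exponent could turn CAMs into per-pixel labels, assuming the background score is (1 - max_c CAM_c)^alpha as described in the paper. The function names and toy values are hypothetical, the alpha values 4 and 32 are borrowed from the results table, and the dCRF refinement itself is omitted.

```python
import numpy as np

def add_background(cam, alpha):
    """Prepend a background score map to normalized CAMs.

    cam: float array of shape (C, H, W) with values in [0, 1].
    Returns an array of shape (C + 1, H, W); index 0 is background.
    Assumed form of the background score: (1 - max_c cam_c) ** alpha.
    """
    bg = np.power(1.0 - cam.max(axis=0, keepdims=True), alpha)
    return np.concatenate([bg, cam], axis=0)

def labels_from_cam(cam, alpha):
    """Per-pixel argmax over background + classes (0 = background)."""
    return add_background(cam, alpha).argmax(axis=0)

# Toy example: one class, two pixels with a weak and a strong activation.
cam = np.array([[[0.2, 0.9]]])
low_alpha_labels = labels_from_cam(cam, alpha=4)    # larger background score
high_alpha_labels = labels_from_cam(cam, alpha=32)  # smaller background score
```

Because the base (1 - max) lies in [0, 1], a smaller alpha raises the background score, so weakly activated pixels fall to background and only confident foreground remains. A larger alpha does the opposite and leaves only confident background.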

(Optional) Check the accuracy of CAMs.

python3 infer_cls.py --infer_list voc12/val.txt --voc12_root [your_voc12_root_folder] --network network.resnet38_cls --weights res38_cls.pth --out_cam_pred [desired_folder]
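Accuracy here is measured as mean intersection-over-union (mIoU), the metric reported in the results tables. The following is a minimal sketch of an mIoU computation over predicted and ground-truth label maps, not the evaluation script of this repository. It assumes the usual PASCAL VOC conventions (21 classes including background, label 255 ignored); the function name mean_iou is hypothetical, and classes that never occur in either map are skipped to avoid division by zero, which may differ from the official evaluation.

```python
import numpy as np

def mean_iou(preds, gts, num_classes=21, ignore_index=255):
    """Mean IoU from paired integer label maps.

    preds, gts: iterables of arrays with identical shapes per pair.
    Pixels whose ground truth equals ignore_index are excluded.
    """
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for pred, gt in zip(preds, gts):
        pred = np.asarray(pred).ravel().astype(np.int64)
        gt = np.asarray(gt).ravel().astype(np.int64)
        valid = gt != ignore_index
        # Encode each (gt, pred) pair as one index into the confusion matrix.
        idx = gt[valid] * num_classes + pred[valid]
        conf += np.bincount(idx, minlength=num_classes ** 2).reshape(
            num_classes, num_classes)
    tp = np.diag(conf)                          # correctly labeled pixels
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0                         # skip classes that never occur
    return float((tp[present] / union[present]).mean())

# Toy example with two classes and one ignored pixel.
score = mean_iou(preds=[[[0, 1], [1, 1]]],
                 gts=[[[0, 1], [0, 255]]],
                 num_classes=2)
```

In the toy example each class has one correctly labeled pixel and a union of two pixels, so both per-class IoUs are 0.5.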

3. Train AffinityNet with the labels.

python3 train_aff.py --lr 0.1 --batch_size 8 --max_epoches 8 --crop_size 448 --voc12_root [your_voc12_root_folder] --network [network.vgg16_aff | network.resnet38_aff] --weights [your_weights_file] --wt_dec 5e-4 --la_crf_dir [your_output_folder] --ha_crf_dir [your_output_folder]

4. Perform Random Walks on CAMs.

python3 infer_aff.py --infer_list [voc12/val.txt | voc12/train.txt] --voc12_root [your_voc12_root_folder] --network [network.vgg16_aff | network.resnet38_aff] --weights [your_weights_file] --cam_dir [your_output_folder] --out_rw [desired_folder]
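The random walk step can be illustrated with a small NumPy sketch. It assumes the formulation from the paper: the transition matrix is T = D^-1 W^(beta), where W holds pairwise affinities, the power is element-wise, and D normalizes each row; the CAMs are then multiplied by T raised to the t-th power. The function name random_walk and the dense 3-pixel affinity matrix are hypothetical; infer_aff.py may organize this computation differently (for example with sparse, radius-limited affinities).

```python
import numpy as np

def random_walk(cam, affinity, beta=8, t=256):
    """Propagate CAM scores along pixel affinities.

    cam:      (C, N) class scores for N pixels.
    affinity: (N, N) symmetric, non-negative affinities with non-zero rows.
    """
    w = np.power(affinity, beta)               # element-wise (Hadamard) power
    trans = w / w.sum(axis=1, keepdims=True)   # row-normalize: T = D^-1 W^beta
    trans_t = np.linalg.matrix_power(trans, t) # t steps of the walk
    # Refined score of pixel i is sum_j T^t[i, j] * cam[:, j].
    return cam @ trans_t.T

# Toy example: pixels 0 and 1 are strongly connected, pixel 2 is isolated.
affinity = np.array([[1.0, 0.9, 0.1],
                     [0.9, 1.0, 0.1],
                     [0.1, 0.1, 1.0]])
cam = np.array([[1.0, 0.0, 0.0]])              # only pixel 0 is activated
refined = random_walk(cam, affinity)
```

In this toy case the activation of pixel 0 spreads to pixel 1 but barely reaches pixel 2, because raising the affinities to the power beta suppresses weak connections.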

Results and Trained Models

Class Activation Map

Model	Train (mIoU)	Val (mIoU)	Weights / Notes
VGG-16	48.9	46.6	[Weights]
ResNet-38	47.7	47.2	[Weights]
ResNet-38	48.0	46.8	CVPR submission

Random Walk with AffinityNet

Model	alpha	Train (mIoU)	Val (mIoU)	Weights / Notes
VGG-16	4/16/32	59.6	54.0	[Weights]
ResNet-38	4/16/32	61.0	60.2	[Weights]
ResNet-38	4/16/24	58.1	57.0	CVPR submission

*beta=8, gamma=5, t=256 for all settings
