Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation

Overview

Auto-Seg-Loss

By Hao Li, Chenxin Tao, Xizhou Zhu, Xiaogang Wang, Gao Huang, Jifeng Dai

This is the official implementation of the ICLR 2021 paper Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation.

Introduction

TL; DR.

Auto Seg-Loss is the first general framework for searching surrogate losses for mainstream semantic segmentation metrics.

Abstract.

Designing proper loss functions is essential in training deep networks. Especially in the field of semantic segmentation, various evaluation metrics have been proposed for diverse scenarios. Despite the success of the widely adopted cross-entropy loss and its variants, the mis-alignment between the loss functions and evaluation metrics degrades the network performance. Meanwhile, manually designing loss functions for each specific metric requires expertise and significant manpower. In this paper, we propose to automate the design of metric-specific loss functions by searching differentiable surrogate losses for each metric. We substitute the non-differentiable operations in the metrics with parameterized functions, and conduct parameter search to optimize the shape of loss surfaces. Two constraints are introduced to regularize the search space and make the search efficient. Extensive experiments on PASCAL VOC and Cityscapes demonstrate that the searched surrogate losses outperform the manually designed loss functions consistently. The searched losses can generalize well to other datasets and networks.

ASL-overview ASL-results

License

This project is released under the Apache 2.0 license.

Citing Auto Seg-Loss

If you find Auto Seg-Loss useful in your research, please consider citing:

@inproceedings{li2020auto,
  title={Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation},
  author={Li, Hao and Tao, Chenxin and Zhu, Xizhou and Wang, Xiaogang and Huang, Gao and Dai, Jifeng},
  booktitle={ICLR},
  year={2021}
}

Configs

PASCAL VOC Search experiments

Target Metric mIoU FWIoU mAcc gAcc BIoU BF1
Parameterization bezier bezier bezier bezier bezier bezier
URL config config config config config config

PASCAL VOC Re-training experiments

Target Metric mIoU FWIoU mAcc gAcc BIoU BF1
Cross Entropy 78.69 91.31 87.31 95.17 70.61 65.30
ASL 80.97 91.93 92.95 95.22 79.27 74.83
URL config
log
config
log
config
log
config
log
config
log
config
log

Note:

1. The search experiments are conducted with R50-DeepLabV3+.

2. The re-training experiments are conducted with R101-DeeplabV3+.

Installation

Our implementation is based on MMSegmentation.

Prerequisites

  • Python>=3.7

    We recommend you to use Anaconda to create a conda environment:

    conda create -n auto_segloss python=3.8 -y

    Then, activate the environment:

    conda activate auto_segloss
  • PyTorch>=1.7.0, torchvision>=0.8.0 (following official instructions).

    For example, if your CUDA version is 10.1, you could install pytorch and torchvision as follows:

    conda install pytorch=1.8.0 torchvision=0.9.0 cudatoolkit=10.1 -c pytorch
  • MMCV>=1.3.0 (following official instruction).

    We recommend installing the pre-built mmcv-full. For example, if your CUDA version is 10.1 and pytorch version is 1.8.0, you could run:

    pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.8.0/index.html

Installing the modified mmsegmentation

git clone https://github.com/fundamentalvision/Auto-Seg-Loss.git
cd Auto-Seg-Loss
pip install -e .

Usage

Dataset preparation

Please follow the official guide of MMSegmentation to organize the datasets. It's highly recommended to symlink the dataset root to Auto-Seg-Loss/data. The recommended data structure is as follows:

Auto-Seg-Loss
├── mmseg
├── ASL_search
└── data
    └── VOCdevkit
        ├── VOC2012
        └── VOCaug

Training models with the provided parameters

The re-training command format is

./ASL_retrain.sh {CONFIG_NAME} [{NUM_GPUS}] [{SEED}]

For example, the command for training a ResNet-101 DeepLabV3+ with 4 GPUs for mIoU is as follows:

./ASL_retrain.sh miou_bezier_10k.py 4

You can also follow the provided configs to modify the mmsegmentation configs, and use Auto Seg-Loss for training other models on other datasets.

Searching for semantic segmentation metrics

The search command format is

./ASL_search.sh {CONFIG_NAME} [{NUM_GPUS}] [{SEED}]

For example, the command for searching for surrogate loss functions for mIoU with 8 GPUs is as follows:

./ASL_search.sh miou_bezier_lr=0.2_eps=0.2.py 8
Res2Net for Instance segmentation and Object detection using MaskRCNN

Res2Net for Instance segmentation and Object detection using MaskRCNN Since the MaskRCNN-benchmark of facebook is deprecated, we suggest to use our mm

Res2Net Applications 55 Oct 30, 2022
Implicit Deep Adaptive Design (iDAD)

Implicit Deep Adaptive Design (iDAD) This code supports the NeurIPS paper 'Implicit Deep Adaptive Design: Policy-Based Experimental Design without Lik

Desi 12 Aug 14, 2022
This is a Deep Leaning API for classifying emotions from human face and human audios.

Emotion AI This is a Deep Leaning API for classifying emotions from human face and human audios. Starting the server To start the server first you nee

crispengari 5 Oct 02, 2022
This is the face keypoint train code of project face-detection-project

face-key-point-pytorch 1. Data structure The structure of landmarks_jpg is like below: |--landmarks_jpg |----AFW |------AFW_134212_1_0.jpg |------AFW_

I‘m X 3 Nov 27, 2022
JUSTICE: A Benchmark Dataset for Supreme Court’s Judgment Prediction

JUSTICE: A Benchmark Dataset for Supreme Court’s Judgment Prediction CSCI 544 Final Project done by: Mohammed Alsayed, Shaayan Syed, Mohammad Alali, S

Smit Patel 3 Dec 28, 2022
Revisiting, benchmarking, and refining Heterogeneous Graph Neural Networks.

Heterogeneous Graph Benchmark Revisiting, benchmarking, and refining Heterogeneous Graph Neural Networks. Roadmap We organize our repo by task, and on

THUDM 176 Dec 17, 2022
A library that can print Python objects in human readable format

objprint A library that can print Python objects in human readable format Install pip install objprint Usage op Use op() (or objprint()) to print obj

319 Dec 25, 2022
Task-related Saliency Network For Few-shot learning

Task-related Saliency Network For Few-shot learning This is an official implementation in Tensorflow of TRSN. Abstract An essential cue of human wisdo

1 Nov 18, 2021
AI-Fitness-Tracker - AI Fitness Tracker With Python

AI-Fitness-Tracker We have build a AI based Fitness Tracker using OpenCV and Pyt

Sharvari Mangale 5 Feb 09, 2022
RaceBERT -- A transformer based model to predict race and ethnicty from names

RaceBERT -- A transformer based model to predict race and ethnicty from names Installation pip install racebert Using a virtual environment is highly

Prasanna Parasurama 3 Nov 02, 2022
PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

PyTorch implementation of Conformer: Convolution-augmented Transformer for Speech Recognition. Transformer models are good at capturing content-based

Soohwan Kim 565 Jan 04, 2023
This is the dataset and code release of the OpenRooms Dataset.

This is the dataset and code release of the OpenRooms Dataset.

Visual Intelligence Lab of UCSD 95 Jan 08, 2023
Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode

🤗 Transformers Wav2Vec2 + PyCTCDecode Introduction This repo shows how 🤗 Transformers can be used in combination with kensho-technologies's PyCTCDec

Patrick von Platen 102 Oct 22, 2022
Deep Learning for 3D Point Clouds: A Survey (IEEE TPAMI, 2020)

🔥Deep Learning for 3D Point Clouds (IEEE TPAMI, 2020)

Qingyong 1.4k Jan 08, 2023
Code of Classification Saliency-Based Rule for Visible and Infrared Image Fusion

CSF Code of Classification Saliency-Based Rule for Visible and Infrared Image Fusion Tips: For testing: CUDA_VISIBLE_DEVICES=0 python main.py For trai

Han Xu 14 Oct 31, 2022
A robust pointcloud registration pipeline based on correlation.

PHASER: A Robust and Correspondence-Free Global Pointcloud Registration Ubuntu 18.04+ROS Melodic: Overview Pointcloud registration using correspondenc

ETHZ ASL 101 Dec 01, 2022
Course materials for Fall 2021 "CIS6930 Topics in Computing for Data Science" at New College of Florida

Fall 2021 CIS6930 Topics in Computing for Data Science This repository hosts course materials used for a 13-week course "CIS6930 Topics in Computing f

Yoshi Suhara 101 Nov 30, 2022
Dynamic Attentive Graph Learning for Image Restoration, ICCV2021 [PyTorch Code]

Dynamic Attentive Graph Learning for Image Restoration This repository is for GATIR introduced in the following paper: Chong Mou, Jian Zhang, Zhuoyuan

Jian Zhang 84 Dec 09, 2022
Offical implementation of Shunted Self-Attention via Multi-Scale Token Aggregation

Shunted Transformer This is the offical implementation of Shunted Self-Attention via Multi-Scale Token Aggregation by Sucheng Ren, Daquan Zhou, Shengf

156 Dec 27, 2022
MDMM - Learning multi-domain multi-modality I2I translation

Multi-Domain Multi-Modality I2I translation Pytorch implementation of multi-modality I2I translation for multi-domains. The project is an extension to

Hsin-Ying Lee 107 Nov 04, 2022