[ICCV 2021 Oral] Mining Latent Classes for Few-shot Segmentation

Last update: Nov 29, 2022

Overview

Lihe Yang, Wei Zhuo, Lei Qi, Yinghuan Shi, Yang Gao.

This codebase contains the baseline of our paper Mining Latent Classes for Few-shot Segmentation, ICCV 2021 Oral.

We make several key modifications to a simple yet effective metric learning framework:

Remove the final residual stage of ResNet for stronger generalization
Remove the final ReLU before feature matching
Freeze all BatchNorm layers of the ImageNet-pretrained model
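
As a rough illustration of the last two modifications, the sketch below freezes BatchNorm layers and ends the encoder without a ReLU. It is not the code from this repository: `ToyEncoder` and `freeze_batchnorm` are hypothetical names, and the tiny two-convolution network only stands in for a truncated ResNet (dropping the final residual stage is not shown).

```python
import torch
from torch import nn

BN_TYPES = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)


def freeze_batchnorm(model: nn.Module) -> nn.Module:
    """Keep every BatchNorm layer on its stored statistics and fixed affine parameters."""
    for module in model.modules():
        if isinstance(module, BN_TYPES):
            module.eval()  # use running mean/var and stop updating them
            for param in module.parameters():
                param.requires_grad = False  # no gradient for weight/bias
    return model


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for a backbone; the last layer is BatchNorm, not ReLU."""

    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.BatchNorm2d(8),
            nn.ReLU(inplace=True),
            nn.Conv2d(8, 8, kernel_size=3, padding=1),
            nn.BatchNorm2d(8),  # no final ReLU, so features keep negative values
        )
        freeze_batchnorm(self)

    def train(self, mode: bool = True):
        # model.train() would switch BatchNorm back to training mode,
        # so the freeze is re-applied after every mode change.
        super().train(mode)
        freeze_batchnorm(self)
        return self

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)
```

Overriding `train()` is one common way to keep the layers frozen; without it, a later call to `model.train()` would resume updating the running statistics.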

Environment

This codebase was tested with the following environment configurations.

Ubuntu 18.04
CUDA 11.2
Python 3.7.4
PyTorch 1.6.0
Pillow, numpy, torchvision, tqdm
Two NVIDIA V100 GPUs

Getting Started

Data Preparation

Pretrained model: ResNet-50 | ResNet-101

Dataset: Pascal JPEGImages | SegmentationClass | ImageSets

File Organization

├── ./pretrained
    ├── resnet50.pth
    └── resnet101.pth
    
├── [Your Pascal Path]
    ├── JPEGImages
    │   ├── 2007_000032.jpg
    │   └── ...
    │
    ├── SegmentationClass
    │   ├── 2007_000032.png
    │   └── ...
    │
    └── ImageSets
        ├── train.txt
        └── val.txt
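
Before training, it can help to confirm that the layout above is in place. The helper below is a minimal sketch and not part of this codebase: `missing_entries` is a hypothetical name, and the expected paths are copied from the tree above.

```python
from pathlib import Path

# Paths taken from the tree above, relative to the dataset root.
EXPECTED_DATASET_ENTRIES = [
    "JPEGImages",
    "SegmentationClass",
    "ImageSets/train.txt",
    "ImageSets/val.txt",
]

# Files expected inside ./pretrained.
EXPECTED_PRETRAINED = ["resnet50.pth", "resnet101.pth"]


def missing_entries(root, expected):
    """Return the expected files or directories that do not exist under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]
```

For example, `missing_entries("./pretrained", EXPECTED_PRETRAINED)` returns an empty list once both checkpoints have been downloaded.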

Run the Code

CUDA_VISIBLE_DEVICES=0,1 python -W ignore main.py \
  --dataset pascal --data-root [Your Pascal Path] \
  --backbone resnet50 --fold 0 --shot 1

You may change the backbone from resnet50 to resnet101, change the fold from 0 to 1/2/3, or change the shot from 1 to 5 for other settings.
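
To cover all four folds for one setting, the command can be generated in a loop. The script below is a sketch and not part of this codebase: `build_command` and `all_fold_commands` are hypothetical helpers, and they use only the flags and values shown in the command above.

```python
import shlex

BACKBONES = ("resnet50", "resnet101")
FOLDS = (0, 1, 2, 3)
SHOTS = (1, 5)


def build_command(data_root, backbone="resnet50", fold=0, shot=1):
    """Build the argument list for one run, using only the flags shown above."""
    if backbone not in BACKBONES:
        raise ValueError(f"backbone must be one of {BACKBONES}")
    if fold not in FOLDS:
        raise ValueError(f"fold must be one of {FOLDS}")
    if shot not in SHOTS:
        raise ValueError(f"shot must be one of {SHOTS}")
    return [
        "python", "-W", "ignore", "main.py",
        "--dataset", "pascal", "--data-root", data_root,
        "--backbone", backbone, "--fold", str(fold), "--shot", str(shot),
    ]


def all_fold_commands(data_root, backbone="resnet50", shot=1):
    """One command per fold for a given backbone and shot setting."""
    return [build_command(data_root, backbone, fold, shot) for fold in FOLDS]


if __name__ == "__main__":
    # "/path/to/pascal" is a placeholder for [Your Pascal Path].
    for cmd in all_fold_commands("/path/to/pascal"):
        print("CUDA_VISIBLE_DEVICES=0,1 " + " ".join(shlex.quote(a) for a in cmd))
```

The script only prints the commands, so they can be reviewed or pasted into a shell one at a time.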

Performance and Trained Models

Here we report the performance of our modified baseline on Pascal. You can click on the numbers to download the corresponding trained models.

The training time is measured on two V100 GPUs. Compared with other works, our method is efficient to train.

Setting	Backbone	Training time / fold	Fold 0	Fold 1	Fold 2	Fold 3	Mean
1-shot	ResNet-50	40 minutes	54.9	66.5	61.7	48.3	57.9
1-shot	ResNet-101	1.1 hours	57.2	68.5	61.3	53.3	60.1
5-shot	ResNet-50	2.3 hours	61.6	70.3	70.5	56.4	64.7
5-shot	ResNet-101	3.5 hours	64.2	74.0	71.5	61.3	67.8
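
The Mean column appears to be the arithmetic mean of the four fold scores, rounded to one decimal place. The short check below recomputes it from the numbers in the table; `REPORTED` and `fold_mean` are names made up for this sketch.

```python
# (setting, backbone) -> (per-fold scores, reported mean), copied from the table above.
REPORTED = {
    ("1-shot", "ResNet-50"): ([54.9, 66.5, 61.7, 48.3], 57.9),
    ("1-shot", "ResNet-101"): ([57.2, 68.5, 61.3, 53.3], 60.1),
    ("5-shot", "ResNet-50"): ([61.6, 70.3, 70.5, 56.4], 64.7),
    ("5-shot", "ResNet-101"): ([64.2, 74.0, 71.5, 61.3], 67.8),
}


def fold_mean(scores):
    """Arithmetic mean over the folds."""
    return sum(scores) / len(scores)


for (setting, backbone), (folds, reported) in REPORTED.items():
    computed = fold_mean(folds)
    print(f"{setting} {backbone}: computed {computed:.3f}, reported {reported}")
```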

Acknowledgement

We thank PANet, PPNet, PFENet and other FSS works for their great contributions.

Citation

If you find this project useful for your research, please consider citing:

@inproceedings{yang2021mining,
  title={Mining Latent Classes for Few-shot Segmentation},
  author={Yang, Lihe and Zhuo, Wei and Qi, Lei and Shi, Yinghuan and Gao, Yang},
  booktitle={ICCV},
  year={2021}
}
