[ICCV 2021] FaPN: Feature-aligned Pyramid Network for Dense Image Prediction

Overview

FaPN: Feature-aligned Pyramid Network for Dense Image Prediction [arXiv] [Project Page]

@inproceedings{
  huang2021fapn,
  title={{FaPN}: Feature-aligned Pyramid Network for Dense Image Prediction},
  author={Shihua Huang and Zhichao Lu and Ran Cheng and Cheng He},
  booktitle={International Conference on Computer Vision (ICCV)},
  year={2021}
}

Overview

FaPN vs. FPN Before vs. After Alignment

This project provides the official implementation for our ICCV2021 paper "FaPN: Feature-aligned Pyramid Network for Dense Image Prediction" based on Detectron2. FaPN is a simple yet effective top-down pyramidal architecture to generate multi-scale features for dense image prediction. Comprised of a feature alignment module (FAM) and a feature selection module (FSM), FaPN addresses the issue of feature alignment in the original FPN, leading to substaintial improvements on various dense prediction tasks, such as object detection, semantic, instance, panoptic segmentation, etc.

Installation

This project is based on Detectron2, which can be constructed as follows.

Training

To train a model with 8 GPUs, run:

cd /path/to/detectron2/tools
python3 train_net.py --config-file <config.yaml> --num-gpus 8

For example, to launch Faster R-CNN training (1x schedule) with ResNet-50 backbone on 8 GPUs, one should execute:

cd /path/to/detectron2/tools
python3 train_net.py --config-file ../configs\COCO-Detection\faster_rcnn_R_50_FAN_1x.yaml --num-gpus 8

Evaluation

To evaluate a pre-trained model with 8 GPUs, run:

cd /path/to/detectron2/tools
python3 train_net.py --config-file <config.yaml> --num-gpus 8 --eval-only MODEL.WEIGHTS /path/to/model_checkpoint

Results

COCO Object Detection

Faster R-CNN + FaPN:

Name lr
sched
box
AP
box
APs
box
APm
box
APl
download
R50 1x 39.2 24.5 43.3 49.1 model |  log
R101 3x 42.8 27.0 46.2 54.9 model |  log

Cityscapes Semantic Segmentation

PointRend + FaPN:

Name lr
sched
mask
mIoU
mask
i_IoU
mask
IoU_sup
mask
iIoU_sup
download
R50 1x 80.0 61.3 90.6 78.5 model |  log
R101 1x 80.1 62.2 90.8 78.6 model |  log

COCO Instance Segmentation

Mask R-CNN + FaPN:

Name lr
sched
mask
AP
mask
APs
box
AP
box
APs
download
R50 1x 36.4 18.1 39.8 24.3 model |  log
R101 3x 39.4 20.9 43.8 27.4 model |  log

PointRend + FaPN:

Name lr
sched
mask
AP
mask
APs
box
AP
box
APs
download
R50 1x 37.6 18.6 39.4 24.2 model |  log

COCO Panoptic Segmentation

PanopticFPN + FaPN:

Name lr
sched
PQ mask
mIoU
St
PQ
box
AP
Th
PQ
download
R50 1x 41.1 43.4 32.5 38.7 46.9 model |  log
R101 3x 44.2 45.7 35.0 43.0 53.3 model |  log
Owner
EMI-Group
The Evolving Machine Intelligence (EMI) Group, established in 2018, is motivated to understand how evolution generates complexity, diversity and intelligence.
EMI-Group
A scientific and useful toolbox, which contains practical and effective long-tail related tricks with extensive experimental results

Bag of tricks for long-tailed visual recognition with deep convolutional neural networks This repository is the official PyTorch implementation of AAA

Yong-Shun Zhang 181 Dec 28, 2022
Data Preparation, Processing, and Visualization for MoVi Data

MoVi-Toolbox Data Preparation, Processing, and Visualization for MoVi Data, https://www.biomotionlab.ca/movi/ MoVi is a large multipurpose dataset of

Saeed Ghorbani 51 Nov 27, 2022
Text Extraction Formulation + Feedback Loop for state-of-the-art WSD (EMNLP 2021)

ConSeC is a novel approach to Word Sense Disambiguation (WSD), accepted at EMNLP 2021. It frames WSD as a text extraction task and features a feedback loop strategy that allows the disambiguation of

Sapienza NLP group 36 Dec 13, 2022
Code release to accompany paper "Geometry-Aware Gradient Algorithms for Neural Architecture Search."

Geometry-Aware Gradient Algorithms for Neural Architecture Search This repository contains the code required to run the experiments for the DARTS sear

18 May 27, 2022
[RSS 2021] An End-to-End Differentiable Framework for Contact-Aware Robot Design

DiffHand This repository contains the implementation for the paper An End-to-End Differentiable Framework for Contact-Aware Robot Design (RSS 2021). I

Jie Xu 60 Jan 04, 2023
Provide baselines and evaluation metrics of the task: traffic flow prediction

Note: This repo is adpoted from https://github.com/UNIMIBInside/Smart-Mobility-Prediction. Due to technical reasons, I did not fork their code. Introd

Zhangzhi Peng 11 Nov 02, 2022
Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples

Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples (WACV 2022) and Beyond Simple Meta-Learning: Multi-Purpose Model

PLAI Group at UBC 42 Dec 06, 2022
Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research

Welcome to AirSim AirSim is a simulator for drones, cars and more, built on Unreal Engine (we now also have an experimental Unity release). It is open

Microsoft 13.8k Jan 03, 2023
Official and maintained implementation of the paper "OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data" [BMVC 2021].

OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data Christoph Reich, Tim Prangemeier, Ă–zdemir Cetin & Heinz Koeppl | Pr

Christoph Reich 23 Sep 21, 2022
ReferFormer - Official Implementation of ReferFormer

The official implementation of the paper: Language as Queries for Referring Video Object Segmentation Language as Queries for Referring Video Object S

Jonas Wu 232 Dec 29, 2022
Cross-Modal Contrastive Learning for Text-to-Image Generation

Cross-Modal Contrastive Learning for Text-to-Image Generation This repository hosts the open source JAX implementation of XMC-GAN. Setup instructions

Google Research 94 Nov 12, 2022
Image Segmentation with U-Net Algorithm on Carvana Dataset using AWS Sagemaker

Image Segmentation with U-Net Algorithm on Carvana Dataset using AWS Sagemaker This is a full project of image segmentation using the model built with

Htin Aung Lu 1 Jan 04, 2022
Official code for "EagerMOT: 3D Multi-Object Tracking via Sensor Fusion" [ICRA 2021]

EagerMOT: 3D Multi-Object Tracking via Sensor Fusion Read our ICRA 2021 paper here. Check out the 3 minute video for the quick intro or the full prese

Aleksandr Kim 276 Dec 30, 2022
NLG evaluation via Statistical Measures of Similarity: BaryScore, DepthScore, InfoLM

NLG evaluation via Statistical Measures of Similarity: BaryScore, DepthScore, InfoLM Automatic Evaluation Metric described in the papers BaryScore (EM

Pierre Colombo 28 Dec 28, 2022
Code to reproduce the results in the paper "Tensor Component Analysis for Interpreting the Latent Space of GANs".

Tensor Component Analysis for Interpreting the Latent Space of GANs [ paper | project page ] Code to reproduce the results in the paper "Tensor Compon

James Oldfield 4 Jun 17, 2022
High accurate tool for automatic faces detection with landmarks

faces_detanator High accurate tool for automatic faces detection with landmarks. The library is based on public detectors with high accuracy (TinaFace

Ihar 7 May 10, 2022
Run containerized, rootless applications with podman

Why? restrict scope of file system access run any application without root privileges creates usable "Desktop applications" to integrate into your nor

119 Dec 27, 2022
Image augmentation library in Python for machine learning.

Augmentor is an image augmentation library in Python for machine learning. It aims to be a standalone library that is platform and framework independe

Marcus D. Bloice 4.8k Jan 07, 2023
Sequence-tagging using deep learning

Classification using Deep Learning Requirements PyTorch version = 1.9.1+cu111 Python version = 3.8.10 PyTorch-Lightning version = 1.4.9 Huggingface

Vineet Kumar 2 Dec 20, 2022
Code for paper "ASAP-Net: Attention and Structure Aware Point Cloud Sequence Segmentation"

ASAP-Net This project implements ASAP-Net of paper ASAP-Net: Attention and Structure Aware Point Cloud Sequence Segmentation (BMVC2020). Overview We i

Hanwen Cao 26 Aug 25, 2022