Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"

Last update: Jan 02, 2023

Overview

Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation

Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar

[arXiv] [Project] [BibTeX]

Features

A single architecture for panoptic, instance and semantic segmentation.
Supports major segmentation datasets: ADE20K, Cityscapes, COCO, Mapillary Vistas.
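
To make the "single architecture" point concrete, here is a minimal NumPy sketch of how one set of per-query class and mask predictions could be read out as either a semantic map or a list of instance masks. It loosely follows the mask-classification inference described in the MaskFormer paper and is not code from this repository: the function names, tensor shapes, the "no object" column being last, and the thresholds are all assumptions for illustration, and the instance readout is deliberately simplified.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def semantic_from_queries(class_logits, mask_logits):
    """Hypothetical helper: per-pixel class map from per-query predictions.

    class_logits: (Q, K + 1), last column assumed to be "no object".
    mask_logits:  (Q, H, W).
    """
    class_probs = softmax(class_logits, axis=-1)[:, :-1]  # (Q, K), drop "no object"
    mask_probs = sigmoid(mask_logits)                      # (Q, H, W)
    # Weight every query's mask by its class probabilities and sum over queries.
    scores = np.einsum("qk,qhw->khw", class_probs, mask_probs)
    return scores.argmax(axis=0)                           # (H, W) class indices


def instances_from_queries(class_logits, mask_logits, score_threshold=0.5):
    """Hypothetical, simplified helper: one binary mask per confident query."""
    class_probs = softmax(class_logits, axis=-1)[:, :-1]
    labels = class_probs.argmax(axis=1)
    scores = class_probs.max(axis=1)
    keep = scores > score_threshold
    masks = sigmoid(mask_logits) > 0.5
    return labels[keep], scores[keep], masks[keep]
```

Both readouts consume the same `(class_logits, mask_logits)` pair, which is the sense in which one model can serve several segmentation tasks; a panoptic readout would additionally resolve overlaps between the kept masks.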

Installation

See installation instructions.

Getting Started

See Preparing Datasets for Mask2Former.

See Getting Started with Mask2Former.

Advanced usage

See Advanced Usage of Mask2Former.

Model Zoo and Baselines

We provide a large set of baseline results and trained models available for download in the Mask2Former Model Zoo.

License

The majority of Mask2Former is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

However, portions of the project are available under separate license terms: Swin-Transformer-Semantic-Segmentation is licensed under the MIT license, and Deformable-DETR is licensed under the Apache-2.0 License.

Citing Mask2Former

If you use Mask2Former in your research or wish to refer to the baseline results published in the Model Zoo, please use the following BibTeX entry.

@article{cheng2021mask2former,
  title={Masked-attention Mask Transformer for Universal Image Segmentation},
  author={Bowen Cheng and Ishan Misra and Alexander G. Schwing and Alexander Kirillov and Rohit Girdhar},
  journal={arXiv},
  year={2021}
}

If you find the code useful, please also consider the following BibTeX entry.

@inproceedings{cheng2021maskformer,
  title={Per-Pixel Classification is Not All You Need for Semantic Segmentation},
  author={Bowen Cheng and Alexander G. Schwing and Alexander Kirillov},
  booktitle={NeurIPS},
  year={2021}
}

Acknowledgement

Code is largely based on MaskFormer (https://github.com/facebookresearch/MaskFormer).

Owner

Meta Research
