RTSeg: Real-time Semantic Segmentation Comparative Study

Overview

Real-time Semantic Segmentation Comparative Study

The repository contains the official TensorFlow code used in our papers:

Description

Semantic segmentation benefits robotics related applications especially autonomous driving. Most of the research on semantic segmentation is only on increasing the accuracy of segmentation models with little attention to computationally efficient solutions. The few work conducted in this direction does not provide principled methods to evaluate the     different design choices for segmentation. In RTSeg, we address this gap by presenting a real-time semantic segmentation benchmarking framework with a decoupled design for feature extraction and decoding methods. The code and the experimental results are presented on the CityScapes dataset for urban scenes.



Models

Encoder Skip U-Net DilationV1 DilationV2
VGG-16 Yes Yes Yes No
ResNet-18 Yes Yes Yes No
MobileNet Yes Yes Yes Yes
ShuffleNet Yes Yes Yes Yes

NOTE: The rest of the pretrained weights for all the implemented models will be released soon. Stay in touch for the updates.

Reported Results

Test Set

Model GFLOPs Class IoU Class iIoU Category IoU Category iIoU
SegNet 286.03 56.1 34.2 79.8 66.4
ENet 3.83 58.3 24.4 80.4 64.0
DeepLab - 70.4 42.6 86.4 67.7
SkipNet-VGG16 - 65.3 41.7 85.7 70.1
ShuffleSeg 2.0 58.3 32.4 80.2 62.2
SkipNet-MobileNet 6.2 61.5 35.2 82.0 63.0

Validation Set

Encoder Decoder Coarse mIoU
MobileNet SkipNet No 61.3
ShuffleNet SkipNet No 55.5
ResNet-18 UNet No 57.9
MobileNet UNet No 61.0
ShuffleNet UNet No 57.0
MobileNet Dilation No 57.8
ShuffleNet Dilation No 53.9
MobileNet SkipNet Yes 62.4
ShuffleNet SkipNet Yes 59.3

** GFLOPs is computed on image resolution 360x640. However, the mIOU(s) are computed on the official image resolution required by CityScapes evaluation script 1024x2048.**

** Regarding Inference time, issue is reported here. We were not able to outperform the reported inference time from ENet architecture it could be due to discrepencies in the optimization we perform. People are welcome to improve on the optimization method we're using.

Usage

  1. Download the weights, processed data, and trained meta graphs from here
  2. Extract pretrained_weights.zip
  3. Extract full_cityscapes_res.zip under data/
  4. Extract unet_resnet18.zip under experiments/

Run

The file named run.sh provide a good example for running different architectures. Have a look at this file.

Examples to the running command in run.sh file:

python3 main.py --load_config=[config_file_name].yaml [train/test] [Trainer Class Name] [Model Class Name]
  • Remove comment from run.sh for running fcn8s_mobilenet on the validation set of cityscapes to get its mIoU. Our framework evaluation will produce results lower than the cityscapes evaluation script by small difference, for the final evaluation we use the cityscapes evaluation script. UNet ResNet18 should have 56% on validation set, but with cityscapes script we got 57.9%. The results on the test set for SkipNet-MobileNet and SkipNet-ShuffleNet are publicly available on the Cityscapes Benchmark.
python3 main.py --load_config=unet_resnet18_test.yaml test Train LinkNET
  • To measure running time, run in inference mode.
python3 main.py --load_config=unet_resnet18_test.yaml inference Train LinkNET
  • To run on different dataset or model, take one of the configuration files such as: config/experiments_config/unet_resnet18_test.yaml and modify it or create another .yaml configuration file depending on your needs.

NOTE: The current code does not contain the optimized code for measuring inference time, the final code will be released soon.

Main Dependencies

Python 3 and above
tensorflow 1.3.0/1.4.0
numpy 1.13.1
tqdm 4.15.0
matplotlib 2.0.2
pillow 4.2.1
PyYAML 3.12

All Dependencies

pip install -r [requirements_gpu.txt] or [requirements.txt]

Citation

If you find RTSeg useful in your research, please consider citing our work:

@ARTICLE{2018arXiv180302758S,
   author = {{Siam}, M. and {Gamal}, M. and {Abdel-Razek}, M. and {Yogamani}, S. and
    {Jagersand}, M.},
    title = "{RTSeg: Real-time Semantic Segmentation Comparative Study}",
  journal = {ArXiv e-prints},
archivePrefix = "arXiv",
   eprint = {1803.02758},
 primaryClass = "cs.CV",
 keywords = {Computer Science - Computer Vision and Pattern Recognition},
     year = 2018,
    month = mar,
   adsurl = {http://adsabs.harvard.edu/abs/2018arXiv180302758S},
  adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}

If you find ShuffleSeg useful in your research, please consider citing it as well:

@ARTICLE{2018arXiv180303816G,
   author = {{Gamal}, M. and {Siam}, M. and {Abdel-Razek}, M.},
    title = "{ShuffleSeg: Real-time Semantic Segmentation Network}",
  journal = {ArXiv e-prints},
archivePrefix = "arXiv",
   eprint = {1803.03816},
 primaryClass = "cs.CV",
 keywords = {Computer Science - Computer Vision and Pattern Recognition},
     year = 2018,
    month = mar,
   adsurl = {http://adsabs.harvard.edu/abs/2018arXiv180303816G},
  adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Related Project

Real-time Motion Segmentation using 2-stream shuffleseg Code

Owner
Mennatullah Siam
PhD Student
Mennatullah Siam
Fine-tuning StyleGAN2 for Cartoon Face Generation

Cartoon-StyleGAN 🙃 : Fine-tuning StyleGAN2 for Cartoon Face Generation Abstract Recent studies have shown remarkable success in the unsupervised imag

Jihye Back 520 Jan 04, 2023
Elegy is a framework-agnostic Trainer interface for the Jax ecosystem.

Elegy Elegy is a framework-agnostic Trainer interface for the Jax ecosystem. Main Features Easy-to-use: Elegy provides a Keras-like high-level API tha

435 Dec 30, 2022
MIMIC Code Repository: Code shared by the research community for the MIMIC-III database

MIMIC Code Repository The MIMIC Code Repository is intended to be a central hub for sharing, refining, and reusing code used for analysis of the MIMIC

MIT Laboratory for Computational Physiology 1.8k Dec 26, 2022
Devkit for 3D -- Some utils for 3D object detection based on Numpy and Pytorch

D3D Devkit for 3D: Some utils for 3D object detection and tracking based on Numpy and Pytorch Please consider siting my work if you find this library

Jacob Zhong 27 Jul 07, 2022
CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability

This is the official repository of the paper: CR-FIQA: Face Image Quality Assessment by Learning Sample Relative Classifiability A private copy of the

Fadi Boutros 33 Dec 31, 2022
The audio-video synchronization of MKV Container Format is exploited to achieve data hiding

The audio-video synchronization of MKV Container Format is exploited to achieve data hiding, where the hidden data can be utilized for various management purposes, including hyper-linking, annotation

Maxim Zaika 1 Nov 17, 2021
SANet: A Slice-Aware Network for Pulmonary Nodule Detection

SANet: A Slice-Aware Network for Pulmonary Nodule Detection This paper (SANet) has been accepted and early accessed in IEEE TPAMI 2021. This code and

Jie Mei 39 Dec 17, 2022
Duke Machine Learning Winter School: Computer Vision 2022

mlwscv2002 Welcome to the Duke Machine Learning Winter School: Computer Vision 2022! The MLWS-CV includes 3 hands-on training sessions on implementing

Duke + Data Science (+DS) 9 May 25, 2022
This is the 3D Implementation of 《Inconsistency-aware Uncertainty Estimation for Semi-supervised Medical Image Segmentation》

CoraNet This is the 3D Implementation of 《Inconsistency-aware Uncertainty Estimation for Semi-supervised Medical Image Segmentation》 Environment pytor

25 Nov 08, 2022
Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper

Ponder(ing) Transformer Implementation of a Transformer that learns to adapt the number of computational steps it takes depending on the difficulty of

Phil Wang 65 Oct 04, 2022
Pytorch modules for paralel models with same architecture. Ideal for multi agent-based systems

WideLinears Pytorch parallel Neural Networks A package of pytorch modules for fast paralellization of separate deep neural networks. Ideal for agent-b

1 Dec 17, 2021
ELSED: Enhanced Line SEgment Drawing

ELSED: Enhanced Line SEgment Drawing This repository contains the source code of ELSED: Enhanced Line SEgment Drawing the fastest line segment detecto

Iago Suárez 125 Dec 31, 2022
Official PyTorch Implementation of GAN-Supervised Dense Visual Alignment

GAN-Supervised Dense Visual Alignment — Official PyTorch Implementation Paper | Project Page | Video This repo contains training, evaluation and visua

944 Jan 07, 2023
Lex Rosetta: Transfer of Predictive Models Across Languages, Jurisdictions, and Legal Domains

Lex Rosetta: Transfer of Predictive Models Across Languages, Jurisdictions, and Legal Domains This is an accompanying repository to the ICAIL 2021 pap

4 Dec 16, 2021
Fuzzer for Linux Kernel Drivers

difuze: Fuzzer for Linux Kernel Drivers This repo contains all the sources (including setup scripts), you need to get difuze up and running. Tested on

seclab 344 Dec 27, 2022
Implementation for Stankevičiūtė et al. "Conformal time-series forecasting", NeurIPS 2021.

Conformal time-series forecasting Implementation for Stankevičiūtė et al. "Conformal time-series forecasting", NeurIPS 2021. If you use our code in yo

Kamilė Stankevičiūtė 36 Nov 21, 2022
Omniverse sample scripts - A guide for developing with Python scripts on NVIDIA Ominverse

Omniverse sample scripts ここでは、NVIDIA Omniverse ( https://www.nvidia.com/ja-jp/om

ft-lab (Yutaka Yoshisaka) 37 Nov 17, 2022
CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection

CLOCs is a novel Camera-LiDAR Object Candidates fusion network. It provides a low-complexity multi-modal fusion framework that improves the performance of single-modality detectors. CLOCs operates on

Su Pang 254 Dec 16, 2022
Collection of in-progress libraries for entity neural networks.

ENN Incubator Collection of in-progress libraries for entity neural networks: Neural Network Architectures for Structured State Entity Gym: Abstractio

25 Dec 01, 2022
Pytorch implementation for "Density-aware Chamfer Distance as a Comprehensive Metric for Point Cloud Completion" (NeurIPS 2021)

Density-aware Chamfer Distance This repository contains the official PyTorch implementation of our paper: Density-aware Chamfer Distance as a Comprehe

Tong WU 93 Dec 15, 2022