PyTorch implementation of ShapeConv: Shape-aware Convolutional Layer for RGB-D Indoor Semantic Segmentation.

Last update: Dec 29, 2022

Shape-aware Convolutional Layer (ShapeConv)

Introduction

We design a Shape-aware Convolutional (ShapeConv) layer that explicitly models shape information to improve RGB-D semantic segmentation accuracy. Specifically, we decompose the depth feature into a shape component and a value component, and then introduce two learnable weights that handle the shape and the value separately. Extensive experiments on three challenging indoor RGB-D semantic segmentation benchmarks, i.e., NYU-Dv2 (-13, -40), SUN RGB-D, and SID, demonstrate the effectiveness of ShapeConv when it is applied to five popular architectures.
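
To make the decomposition concrete, the toy sketch below splits a small depth patch into a value component and a shape component and then recombines them with two separate weights. It assumes the value component is the patch mean and the shape component is the residual after subtracting that mean. The function names `decompose` and `recombine` and the use of scalar weights are illustrative assumptions, not the repository's layer, which learns its weights inside the convolution.

```python
from typing import List, Tuple

Patch = List[List[float]]


def decompose(patch: Patch) -> Tuple[float, Patch]:
    """Split a patch into its mean (value) and mean-free residual (shape)."""
    values = [v for row in patch for v in row]
    mean = sum(values) / len(values)
    shape = [[v - mean for v in row] for row in patch]
    return mean, shape


def recombine(value: float, shape: Patch, w_value: float, w_shape: float) -> Patch:
    """Weight the two components separately and add them back together."""
    return [[w_value * value + w_shape * s for s in row] for row in shape]


# Two patches with the same relative shape but different absolute depth.
near = [[1.0, 2.0], [3.0, 4.0]]
far = [[11.0, 12.0], [13.0, 14.0]]

near_value, near_shape = decompose(near)
far_value, far_shape = decompose(far)

# Compare the shape components of the two patches.
print(near_shape == far_shape)
# Check that unit weights reproduce the original patch.
print(recombine(near_value, near_shape, 1.0, 1.0) == near)
```

In this toy setting the two patches share one shape component while their value components differ, which is the kind of separation the two weights are meant to exploit.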

Usage

Installation

Requirements

Linux
Python 3.6+
PyTorch 1.7.0 or higher
CUDA 10.0 or higher

We have tested the following OS and software versions:

OS: Ubuntu 16.04.6 LTS
CUDA: 10.0
PyTorch 1.7.0
Python 3.6.9

Install dependencies.

pip install -r requirements.txt
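
After installation, the minimum versions listed under Requirements can be checked with a short script. This is a minimal sketch rather than part of the repository: `parse_version` and `meets_minimum` are hypothetical helpers, and the check treats a missing PyTorch install as a reportable condition rather than an error.

```python
import re
import sys
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a string such as '1.7.0+cu110' into (1, 7, 0), ignoring any suffix."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def meets_minimum(found: str, minimum: str) -> bool:
    """Return True if the found version is at least the minimum version."""
    return parse_version(found) >= parse_version(minimum)


print("Python 3.6+:", sys.version_info >= (3, 6))

try:
    import torch
except ImportError:
    print("PyTorch is not installed")
else:
    print("PyTorch 1.7.0+:", meets_minimum(torch.__version__, "1.7.0"))
    # torch.version.cuda is None for CPU-only builds.
    cuda = torch.version.cuda
    print("CUDA 10.0+:", cuda is not None and meets_minimum(cuda, "10.0"))
```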

Dataset

Download the official dataset and convert it to a format appropriate for this project. See here.

Or download the converted dataset:

Evaluation

Model

Download the trained model and put it in the ./model_zoo folder. See all trained models here.

Config

Edit the config file in ./config. The config files in ./config correspond to the model files in ./models.
1. Set inference.gpu_id = CUDA_VISIBLE_DEVICES. CUDA_VISIBLE_DEVICES specifies which GPUs are visible to a CUDA application, e.g., inference.gpu_id = "0,1,2,3".
2. Set dataset_root = path_to_dataset, where path_to_dataset is the path to the dataset, e.g., dataset_root = "/home/shape_conv/nyu_v2".
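
As a rough illustration, the two settings might look like the excerpt below inside a config file. The config files are Python modules, but the exact nesting of `inference` shown here is an assumption; check the file you are editing for its actual layout.

```python
# Hypothetical excerpt of a config file; all other keys are omitted.

# Root directory of the converted dataset.
dataset_root = "/home/shape_conv/nyu_v2"

# Comma-separated GPU IDs, in the same format as CUDA_VISIBLE_DEVICES.
inference = dict(
    gpu_id="0,1,2,3",
)
```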

Run

1. For distributed evaluation, run:
```
./tools/dist_test.sh config_path checkpoint_path gpu_num
```
- config_path is the path of the config file;
- checkpoint_path is the path of the model file;
- gpu_num is the number of GPUs used; note that gpu_num cannot exceed the number of GPUs listed in inference.gpu_id.
E.g., to evaluate the shape-conv model on NYU-V2 (40 categories), run:
```
./tools/dist_test.sh configs/nyu/nyu40_deeplabv3plus_resnext101_shape.py model_zoo/nyu40_deeplabv3plus_resnext101_shape.pth 4
```
2. For non-distributed evaluation, run:
```
python tools/test.py config_path checkpoint_path
```
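
The constraint on gpu_num can be checked before launching a distributed run. The sketch below reads the constraint as "gpu_num must not exceed the number of IDs listed in inference.gpu_id", which is an interpretation, since the length of the raw string would also count the commas. `visible_gpus` and `check_gpu_num` are hypothetical helpers, not part of the repository.

```python
from typing import List


def visible_gpus(gpu_id: str) -> List[str]:
    """Split a string such as "0,1,2,3" into individual GPU IDs."""
    return [part.strip() for part in gpu_id.split(",") if part.strip()]


def check_gpu_num(gpu_id: str, gpu_num: int) -> None:
    """Raise ValueError unless 1 <= gpu_num <= number of listed GPU IDs."""
    available = len(visible_gpus(gpu_id))
    if not 1 <= gpu_num <= available:
        raise ValueError(
            f"gpu_num={gpu_num} but inference.gpu_id lists {available} GPU(s)"
        )


# Four IDs are listed, so requesting four processes is accepted.
check_gpu_num("0,1,2,3", 4)
```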

Train

Config

Edit the config file in ./config.
1. Set inference.gpu_id = CUDA_VISIBLE_DEVICES.
  
  E.g., inference.gpu_id = "0,1,2,3".
2. Set dataset_root = path_to_dataset.
  
  E.g., dataset_root = "/home/shape_conv/nyu_v2".

Run

Distributed training

./tools/dist_train.sh config_path gpu_num

E.g., to train the shape-conv model on NYU-V2 (40 categories) with 4 GPUs, run:

./tools/dist_train.sh configs/nyu/nyu40_deeplabv3plus_resnext101_shape.py 4

Non-distributed training

python tools/train.py config_path
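
When training is launched from another script, the two invocations above can be selected programmatically. The following is a minimal sketch: `build_train_command` is a hypothetical helper, and falling back to the non-distributed script when only one GPU is requested is an assumption made here, not a rule stated by the repository.

```python
from typing import List


def build_train_command(config_path: str, gpu_num: int = 1) -> List[str]:
    """Return the argument list for distributed or non-distributed training."""
    if gpu_num < 1:
        raise ValueError("gpu_num must be at least 1")
    if gpu_num == 1:
        return ["python", "tools/train.py", config_path]
    return ["./tools/dist_train.sh", config_path, str(gpu_num)]


config = "configs/nyu/nyu40_deeplabv3plus_resnext101_shape.py"
print(" ".join(build_train_command(config, 4)))
print(" ".join(build_train_command(config)))
```

The returned list can be passed to `subprocess.run(command, check=True)` from the repository root.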

Result

For more results, please see the model zoo.

NYU-V2 (40 categories)

| Architecture | Backbone | MS & Flip | Shape Conv | mIOU |
| --- | --- | --- | --- | --- |
| DeepLabv3plus | ResNeXt-101 | False | False | 48.9% |
| DeepLabv3plus | ResNeXt-101 | False | True | 50.2% |
| DeepLabv3plus | ResNeXt-101 | True | False | 50.3% |
| DeepLabv3plus | ResNeXt-101 | True | True | 51.3% |

SUN-RGBD

| Architecture | Backbone | MS & Flip | Shape Conv | mIOU |
| --- | --- | --- | --- | --- |
| DeepLabv3plus | ResNet-101 | False | False | 46.9% |
| DeepLabv3plus | ResNet-101 | False | True | 47.6% |
| DeepLabv3plus | ResNet-101 | True | False | 47.6% |
| DeepLabv3plus | ResNet-101 | True | True | 48.6% |

SID (Stanford Indoor Dataset)

| Architecture | Backbone | MS & Flip | Shape Conv | mIOU |
| --- | --- | --- | --- | --- |
| DeepLabv3plus | ResNet-101 | False | False | 54.55% |
| DeepLabv3plus | ResNet-101 | False | True | 60.6% |
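
For a quick comparison across the three datasets, the gain from enabling Shape Conv can be computed directly from the single-scale rows (MS & Flip = False) of the tables above. The snippet only redoes the arithmetic on the reported numbers; the dictionary layout is an illustrative choice.

```python
# mIOU values (percent) copied from the single-scale rows of the result tables.
results = {
    "NYU-V2 (40 categories)": {"baseline": 48.9, "shape_conv": 50.2},
    "SUN-RGBD": {"baseline": 46.9, "shape_conv": 47.6},
    "SID": {"baseline": 54.55, "shape_conv": 60.6},
}

# Absolute gain in mIOU percentage points, rounded to avoid float noise.
gains = {
    name: round(row["shape_conv"] - row["baseline"], 2)
    for name, row in results.items()
}

for name, gain in gains.items():
    print(f"{name}: +{gain} mIOU points")
```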

Acknowledgments

This repo was developed based on vedaseg.


Owner

Hanchao Leng
