PyTorch implementation of ShapeConv: Shape-aware Convolutional Layer for RGB-D Indoor Semantic Segmentation.

Overview

Shape-aware Convolutional Layer (ShapeConv)

PyTorch implementation of ShapeConv: Shape-aware Convolutional Layer for RGB-D Indoor Semantic Segmentation.

Introduction

We design a Shape-aware Convolutional(ShapeConv) layer to explicitly model the shape information for enhancing the RGB-D semantic segmentation accuracy. Specifically, we decompose the depth feature into a shape-component and a value component, after which two learnable weights are introduced to handle the shape and value with differentiation. Extensive experiments on three challenging indoor RGB-D semantic segmentation benchmarks, i.e., NYU-Dv2(-13,-40), SUN RGB-D, and SID, demonstrate the effectiveness of our ShapeConv when employing it over five popular architectures.

image

Usage

Installation

  1. Requirements
  • Linux
  • Python 3.6+
  • PyTorch 1.7.0 or higher
  • CUDA 10.0 or higher

We have tested the following versions of OS and softwares:

  • OS: Ubuntu 16.04.6 LTS
  • CUDA: 10.0
  • PyTorch 1.7.0
  • Python 3.6.9
  1. Install dependencies.
pip install -r requirements.txt

Dataset

Download the offical dataset and convert to a format appropriate for this project. See here.

Or download the converted dataset:

Evaluation

  1. Model

    Download trained model and put it in folder ./model_zoo. See all trained models here.

  2. Config

    Edit config file in ./config. The config files in ./config correspond to the model files in ./models.

    1. Set inference.gpu_id = CUDA_VISIBLE_DEVICES. CUDA_VISIBLE_DEVICES is used to specify which GPUs should be visible to a CUDA application, e.g., inference.gpu_id = "0,1,2,3".
    2. Set dataset_root = path_to_dataset. path_to_dataset represents the path of dataset. e.g.,dataset_root = "/home/shape_conv/nyu_v2".
  3. Run

    1. Ditributed evaluation, please run:
    ./tools/dist_test.sh config_path checkpoint_path gpu_num
    • config_path is path of config file;
    • checkpoint_pathis path of model file;
    • gpu_num is the number of GPUs used, note that gpu_num <= len(inference.gpu_id).

    E.g., evaluate shape-conv model on NYU-V2(40 categories), please run:

    ./tools/dist_test.sh configs/nyu/nyu40_deeplabv3plus_resnext101_shape.py model_zoo/nyu40_deeplabv3plus_resnext101_shape.pth 4
    1. Non-distributed evaluation
    python tools/test.py config_path checkpoint_path

Train

  1. Config

    Edit config file in ./config.

    1. Set inference.gpu_id = CUDA_VISIBLE_DEVICES.

      E.g.,inference.gpu_id = "0,1,2,3".

    2. Set dataset_root = path_to_dataset.

      E.g.,dataset_root = "/home/shape_conv/nyu_v2".

  2. Run

    1. Ditributed training
    ./tools/dist_train.sh config_path gpu_num

    E.g., train shape-conv model on NYU-V2(40 categories) with 4 GPUs, please run:

    ./tools/dist_train.sh configs/nyu/nyu40_deeplabv3plus_resnext101_shape.py 4
    1. Non-distributed training
    python tools/train.py config_path

Result

For more result, please see model zoo.

NYU-V2(40 categories)

Architecture Backbone MS & Flip Shape Conv mIOU
DeepLabv3plus ResNeXt-101 False False 48.9%
DeepLabv3plus ResNeXt-101 False True 50.2%
DeepLabv3plus ResNeXt-101 True False 50.3%
DeepLabv3plus ResNeXt-101 True True 51.3%

SUN-RGBD

Architecture Backbone MS & Flip Shape Conv mIOU
DeepLabv3plus ResNet-101 False False 46.9%
DeepLabv3plus ResNet-101 False True 47.6%
DeepLabv3plus ResNet-101 True False 47.6%
DeepLabv3plus ResNet-101 True True 48.6%

SID(Stanford Indoor Dataset)

Architecture Backbone MS & Flip Shape Conv mIOU
DeepLabv3plus ResNet-101 False False 54.55%
DeepLabv3plus ResNet-101 False True 60.6%

Acknowledgments

This repo was developed based on vedaseg.

Owner
Hanchao Leng
Hanchao Leng
PyTorch code for 'Efficient Single Image Super-Resolution Using Dual Path Connections with Multiple Scale Learning'

Efficient Single Image Super-Resolution Using Dual Path Connections with Multiple Scale Learning This repository is for EMSRDPN introduced in the foll

7 Feb 10, 2022
Demonstrates how to divide a DL model into multiple IR model files (division) and introduce a simplest way to implement a custom layer works with OpenVINO IR models.

Demonstration of OpenVINO techniques - Model-division and a simplest-way to support custom layers Description: Model Optimizer in Intel(r) OpenVINO(tm

Yasunori Shimura 12 Nov 09, 2022
Kinetics-Data-Preprocessing

Kinetics-Data-Preprocessing Kinetics-400 and Kinetics-600 are common video recognition datasets used by popular video understanding projects like Slow

Kaihua Tang 7 Oct 27, 2022
《Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement》(ECCV 2020) GitHub: [fig9]

Unsupervised 3D Human Pose Representation [Paper] The implementation of our paper Unsupervised 3D Human Pose Representation with Viewpoint and Pose Di

42 Nov 24, 2022
Large dataset storage format for Pytorch

H5Record Large dataset ( 100G, = 1T) storage format for Pytorch (wip) Support python 3 pip install h5record Why? Writing large dataset is still a

theblackcat102 43 Oct 22, 2022
The official repository for BaMBNet

BaMBNet-Pytorch Paper

Junjun Jiang 18 Dec 04, 2022
Simulations for Turring patterns on an apically expanding domain. T

Turing patterns on expanding domain Simulations for Turring patterns on an apically expanding domain. The details about the models and numerical imple

Yue Liu 0 Aug 03, 2021
StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators

StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators [Project Website] [Replicate.ai Project] StyleGAN-NADA: CLIP-Guided Domain Adaptation

992 Dec 30, 2022
Efficient and Scalable Physics-Informed Deep Learning and Scientific Machine Learning on top of Tensorflow for multi-worker distributed computing

Notice: Support for Python 3.6 will be dropped in v.0.2.1, please plan accordingly! Efficient and Scalable Physics-Informed Deep Learning Collocation-

tensordiffeq 74 Dec 09, 2022
Code to use Augmented Shapiro Wilks Stopping, as well as code for the paper "Statistically Signifigant Stopping of Neural Network Training"

This codebase is being actively maintained, please create and issue if you have issues using it Basics All data files are included under losses and ea

J K Terry 32 Nov 09, 2021
The codes of paper 'Active-LATHE: An Active Learning Algorithm for Boosting the Error exponent for Learning Homogeneous Ising Trees'

Active-LATHE: An Active Learning Algorithm for Boosting the Error exponent for Learning Homogeneous Ising Trees This project contains the codes of pap

0 Apr 20, 2022
The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization".

Codebase for learning control flow in transformers The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformer

Csordás Róbert 24 Oct 15, 2022
Unsupervised Learning of Multi-Frame Optical Flow with Occlusions

This is a Pytorch implementation of Janai, J., Güney, F., Ranjan, A., Black, M. and Geiger, A., Unsupervised Learning of Multi-Frame Optical Flow with

Anurag Ranjan 110 Nov 02, 2022
The code for two papers: Feedback Transformer and Expire-Span.

transformer-sequential This repo contains the code for two papers: Feedback Transformer Expire-Span The training code is structured for long sequentia

Facebook Research 125 Dec 25, 2022
Code to reproduce the results for Compositional Attention

Compositional-Attention This repository contains the official implementation for the paper Compositional Attention: Disentangling Search and Retrieval

Sarthak Mittal 58 Nov 30, 2022
An end-to-end machine learning web app to predict rugby scores (Pandas, SQLite, Keras, Flask, Docker)

Rugby score prediction An end-to-end machine learning web app to predict rugby scores Overview An demo project to provide a high-level overview of the

34 May 24, 2022
YOLTv5 rapidly detects objects in arbitrarily large aerial or satellite images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks

YOLTv5 rapidly detects objects in arbitrarily large aerial or satellite images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks.

Adam Van Etten 145 Jan 01, 2023
Boosted CVaR Classification (NeurIPS 2021)

Boosted CVaR Classification Runtian Zhai, Chen Dan, Arun Sai Suggala, Zico Kolter, Pradeep Ravikumar NeurIPS 2021 Table of Contents Quick Start Train

Runtian Zhai 4 Feb 15, 2022
Feature board for ERPNext

ERPNext Feature Board Feature board for ERPNext Development Prerequisites k3d kubectl helm bench Install K3d Cluster # export K3D_FIX_CGROUPV2=1 # use

Revant Nandgaonkar 16 Nov 09, 2022
PyTorch implementation of "Dataset Knowledge Transfer for Class-Incremental Learning Without Memory" (WACV2022)

Dataset Knowledge Transfer for Class-Incremental Learning Without Memory [Paper] [Slides] Summary Introduction Installation Reproducing results Citati

Habib Slim 5 Dec 05, 2022