Official implementation of YOGO for Point-Cloud Processing

Last update: Dec 20, 2022

Overview

You Only Group Once: Efficient Point-Cloud Processing with Token Representation and Relation Inference Module

By Chenfeng Xu, Bohan Zhai, Bichen Wu, Tian Li, Wei Zhan, Peter Vajda, Kurt Keutzer, and Masayoshi Tomizuka.

This repository contains a PyTorch implementation of YOGO, a new, simple, and elegant model for point-cloud processing. The framework of YOGO is shown below:

Selected quantitative results of different approaches on the ShapeNet and S3DIS datasets.

ShapeNet part segmentation:

Method	mIoU	Latency (ms)	GPU Memory (GB)
PointNet	83.7	21.4	1.5
RSNet	84.9	73.8	0.8
PointNet++	85.1	77.7	2.0
DGCNN	85.1	86.7	2.4
PointCNN	86.1	134.2	2.5
YOGO(KNN)	85.2	25.6	0.9
YOGO(Ball query)	85.1	21.3	1.0

S3DIS scene parsing:

Method	mIoU	Latency (ms)	GPU Memory (GB)
PointNet	42.9	24.8	1.0
RSNet	51.9	111.5	1.1
PointNet++*	50.7	501.5	1.6
DGCNN	47.9	174.3	2.4
PointCNN	57.2	282.4	4.6
YOGO(KNN)	54.0	27.7	2.0
YOGO(Ball query)	53.8	24.0	2.0
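
To make the latency comparison concrete, the short Python sketch below computes how many times slower each baseline is than YOGO (ball query). It uses only the latency figures from the two tables above. The dictionary and function names are illustrative and not part of the repository.

```python
# Latency (ms) copied from the tables above; all names here are illustrative.
SHAPENET_LATENCY_MS = {
    "PointNet": 21.4, "RSNet": 73.8, "PointNet++": 77.7,
    "DGCNN": 86.7, "PointCNN": 134.2,
}
S3DIS_LATENCY_MS = {
    "PointNet": 24.8, "RSNet": 111.5, "PointNet++": 501.5,
    "DGCNN": 174.3, "PointCNN": 282.4,
}
YOGO_BALL_QUERY_MS = {"ShapeNet": 21.3, "S3DIS": 24.0}


def speedup(baseline_ms, yogo_ms):
    """Return baseline latency divided by YOGO latency for each method."""
    return {name: ms / yogo_ms for name, ms in baseline_ms.items()}


if __name__ == "__main__":
    tables = (("ShapeNet", SHAPENET_LATENCY_MS), ("S3DIS", S3DIS_LATENCY_MS))
    for dataset, table in tables:
        ratios = speedup(table, YOGO_BALL_QUERY_MS[dataset])
        for name, ratio in ratios.items():
            print(f"{dataset:8s} {name:10s} {ratio:5.1f}x")
```

On S3DIS, for instance, PointNet++ at 501.5 ms against YOGO (ball query) at 24.0 ms is roughly a 21x gap in latency.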

For more details, please refer to our paper: YOGO. This work is a follow-up to SqueezeSegV3 and Visual Transformers. If you find this work useful for your research, please consider citing:

@misc{xu2021group,
      title={You Only Group Once: Efficient Point-Cloud Processing with Token Representation and Relation Inference Module}, 
      author={Chenfeng Xu and Bohan Zhai and Bichen Wu and Tian Li and Wei Zhan and Peter Vajda and Kurt Keutzer and Masayoshi Tomizuka},
      year={2021},
      eprint={2103.09975},
      archivePrefix={arXiv},
      primaryClass={cs.RO}
}

Related works:

@inproceedings{xu2020squeezesegv3,
  title={Squeezesegv3: Spatially-adaptive convolution for efficient point-cloud segmentation},
  author={Xu, Chenfeng and Wu, Bichen and Wang, Zining and Zhan, Wei and Vajda, Peter and Keutzer, Kurt and Tomizuka, Masayoshi},
  booktitle={European Conference on Computer Vision},
  pages={1--19},
  year={2020},
  organization={Springer}
}

@misc{wu2020visual,
      title={Visual Transformers: Token-based Image Representation and Processing for Computer Vision}, 
      author={Bichen Wu and Chenfeng Xu and Xiaoliang Dai and Alvin Wan and Peizhao Zhang and Zhicheng Yan and Masayoshi Tomizuka and Joseph Gonzalez and Kurt Keutzer and Peter Vajda},
      year={2020},
      eprint={2006.03677},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

License

YOGO is released under the BSD license (see LICENSE for details).

Installation

The instructions were tested on Ubuntu 16.04 with Python 3.6 and PyTorch 1.5 with GPU support.

Clone the YOGO repository:

git clone https://github.com/chenfengxu714/YOGO.git

Use pip to install required Python packages:

pip install -r requirements.txt

Install the KNN library:

cd convpoint/knn/
python setup.py install --home='.'
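
After installation, a quick sanity check of the environment can save a failed run. The sketch below is not part of the repository. It reports the interpreter version and whether a package can be found, using only the standard library. The function name and the default package list are assumptions; extend the list to match what `requirements.txt` actually installs.

```python
import importlib.util
import sys


def check_environment(packages=("torch",)):
    """Report the Python version and whether each package is importable.

    The default package list is an assumption, not taken from requirements.txt.
    """
    report = {"python": "{}.{}".format(*sys.version_info[:2])}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

The check only locates packages without importing them, so it stays fast and does not fail if a package is missing.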

Download the ShapeNet and S3DIS datasets.

Pre-trained Models

The pre-trained YOGO models are available on Google Drive and can be downloaded directly.

Inference

To infer the predictions for the entire dataset:

python train.py [config-file] --devices [gpu-ids] --evaluate --configs.evaluate.best_checkpoint_path [path to the model checkpoint]

For example, run the following command for ShapeNet inference:

python train.py configs/shapenet/yogo/yogo.py --devices 0 --evaluate --configs.evaluate.best_checkpoint_path ./runs/shapenet/best.pth

Training:

To train the model:

python train.py [config-file] --devices [gpu-ids]

For example, run the following command for ShapeNet training:

python train.py configs/shapenet/yogo/yogo.py --devices 0

For multi-GPU training, run:

python train.py configs/shapenet/yogo/yogo.py --devices 0,1,2,3
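
When launching many runs from a script, it can help to assemble these command lines programmatically. The sketch below is a hypothetical helper, not part of the repository. It uses only the flags shown in the examples above (`--devices`, `--evaluate`, `--configs.evaluate.best_checkpoint_path`) and assumes it is run from the repository root.

```python
def build_command(config, devices=(0,), checkpoint=None):
    """Assemble a train.py argument list from the flags documented above.

    Passing `checkpoint` adds the evaluation flags; omitting it trains.
    """
    cmd = ["python", "train.py", config,
           "--devices", ",".join(str(d) for d in devices)]
    if checkpoint is not None:
        cmd += ["--evaluate",
                "--configs.evaluate.best_checkpoint_path", checkpoint]
    return cmd


if __name__ == "__main__":
    config = "configs/shapenet/yogo/yogo.py"
    train = build_command(config, devices=(0, 1, 2, 3))
    evaluate = build_command(config, checkpoint="./runs/shapenet/best.pth")
    print(" ".join(train))
    print(" ".join(evaluate))
    # To launch a run: import subprocess; subprocess.run(train, check=True)
```

Returning a list rather than a single string lets the result be passed straight to `subprocess.run` without shell quoting.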

Note that we trained on Titan RTX hardware. You can modify the batch size according to your GPU memory, but the resulting performance may differ slightly.

Acknowledgement:

The code is modified from PVCNN and the code for KNN is from Pointconv.
