[arXiv'22] Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation

Overview

Panoptic NeRF

Project Page | Paper | Dataset


Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation
Xiao Fu*, Shangzhan zhang*, Tianrun Chen, Yichong Lu, Lanyun Zhu, Xiaowei Zhou, Andreas Geiger, Yiyi Liao
arXiv 2022

image

Installation

  1. Create a virtual environment via conda.
    conda create -n panopticnerf python=3.7
    conda activate panopticnerf
    
  2. Install torch and torchvision.
    conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
    
  3. Install requirements.
    pip install -r requirements.txt
    

Data Preparation

  1. We evaluate our model on KITTI-360. Here we show the structure of a test dataset as follow. You can download it from here and then put it into $ROOT (RGBs should query the KITTI-360 website).

    ├── KITTI-360
      ├── 2013_05_28_drive_0000_sync
        ├── image_00
        ├── image_01
      ├── bbx_intersection
        ├── *_00.npz
        ├── *_01.npz
      ├── calibration
        ├── calib_cam_to_pose.txt
        ├── perspective.txt
      ├── data_3d_bboxes
      ├── data_poses
        ├── cam0_to_world.txt
        ├── poses.txt
      ├── pspnet
      ├── sgm
      ├── visible_id
    
    file Intro
    image_00/01 stereo RGB images
    pspnet 2D pseudo ground truth
    sgm weak stereo depth supervision
    visible_id per-frame bounding primitive IDs
    data_poses system poses in a global Euclidean coordinate
    calibration extrinsics and intrinsics of the perspective cameras
    bbx_intersection ray-mesh intersections, containing depths between hitting points and camera origin, semantic label IDs and bounding primitive IDs
  2. Generate ray-mesh intersections (bbx_intersection/*.npz). The red dots and blue dots indicate where the rays hit into and out of the meshes, respectively. For the given test scene, START=3353, NUM=64.

    # image_00
    python mesh_intersection.py intersection_start_frame ${START} intersection_frames ${NUM} use_stereo False
    # image_01
    python mesh_intersection.py intersection_start_frame ${START} intersection_frames ${NUM} use_stereo True
    

  1. Evaluate the origin of a scene (center_pose) and the distance from the origin to the furthest bounding primitive (dist_min). Then accordingly modify the .yaml file.
    python recenter_pose.py recenter_start_frame ${START} recenter_frames ${NUM}
    

Training and Visualization

  1. We provide the training code. Replace resume False with resume True to load the pretained model.

    python train_net.py --cfg_file configs/panopticnerf_test.yaml pretrain nerf gpus '1,' use_stereo True use_pspnet True use_depth True pseudo_filter True weight_th 0.05 resume False
    
  2. Render semantic map, panoptic map and depth map in a single forward pass, which takes around 10s per-frame on a single 3090 GPU. Please make sure to maximize the GPU memory utilization by increasing the size of the chunk to reduce inference time. Replace use_stereo False with use_stereo True to render the right views.

    python run.py --type visualize --cfg_file configs/panopticnerf_test.yaml use_stereo False
    
  3. Visualize novel view appearance & label synthesis. Before rendering, select a frame and generate corresponding ray-mesh intersections with respect to its novel spiral poses by enabling spiral poses==True in lib.datasets.kitti360.panopticnerf.py.

    monocular

Evaluation

├── KITTI-360
  ├── gt_2d_semantics
  ├── gt_2d_panoptics
  ├── lidar_depth
  1. Download the corresponding pretrained model and put it to $ROOT/data/trained_model/panopticnerf/panopticnerf_test/latest.pth.

  2. We provide some semantic & panoptic GTs and LiDAR point clouds for evaluation. The details of evaluation metrics can be found in the paper.

  3. Eval mean intersection-over-union (mIoU)

python run.py --type eval_miou --cfg_file configs/panopticnerf_test.yaml use_stereo False
  1. Eval panoptic quality (PQ)
sh eval_pq_test.sh
  1. Eval depth with 0-100m LiDAR point clouds, where the far depth can be adjusted to evaluate the closer scene.
python run.py --type eval_depth --cfg_file configs/panopticnerf_test.yaml use_stereo False max_depth 100.
  1. Eval Multi-view Consistency (MC)
python eval_consistency.py --cfg_file configs/panopticnerf_test.yaml use_stereo False consistency_thres 0.1

News

  • 12/04/2022 Code released.
  • 29/03/2022 Repo created. Code will come soon.

Citation

@article{fu2022panoptic,
  title={Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation},
  author={Fu, Xiao and Zhang, Shangzhan and Chen, Tianrun and Lu, Yichong and Zhu, Lanyun and Zhou, Xiaowei and Geiger, Andreas and Liao, Yiyi},
  journal={arXiv preprint arXiv:2203.15224},
  year={2022}
}

Copyright © 2022, Zhejiang University. All rights reserved. We favor any positive inquiry, please contact [email protected].

Owner
Xiao Fu
Don’t regret anything in life.
Xiao Fu
This code is a toolbox that uses Torch library for training and evaluating the ERFNet architecture for semantic segmentation.

ERFNet This code is a toolbox that uses Torch library for training and evaluating the ERFNet architecture for semantic segmentation. NEW!! New PyTorch

Edu 104 Jan 05, 2023
Pytorch implementation of Zero-DCE++

Zero-DCE++ You can find more details here: https://li-chongyi.github.io/Proj_Zero-DCE++.html. You can find the details of our CVPR version: https://li

Chongyi Li 157 Dec 23, 2022
noisy labels; missing labels; semi-supervised learning; entropy; uncertainty; robustness and generalisation.

ProSelfLC: CVPR 2021 ProSelfLC: Progressive Self Label Correction for Training Robust Deep Neural Networks For any specific discussion or potential fu

amos_xwang 57 Dec 04, 2022
[CVPR 2020] Transform and Tell: Entity-Aware News Image Captioning

Transform and Tell: Entity-Aware News Image Captioning This repository contains the code to reproduce the results in our CVPR 2020 paper Transform and

Alasdair Tran 85 Dec 13, 2022
DANA paper supplementary materials

DANA Supplements This repository stores the data, results, and R scripts to generate these reuslts and figures for the corresponding paper Depth Norma

0 Dec 17, 2021
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning

This is a release of our VIMPAC paper to illustrate the implementations. The pretrained checkpoints and scripts will be soon open-sourced in HuggingFace transformers.

Hao Tan 74 Dec 03, 2022
Predicting the duration of arrival delays for commercial flights.

Flight Delay Prediction Our objective is to predict arrival delays of commercial flights. According to the US Department of Transportation, about 21%

Jordan Silke 1 Jan 11, 2022
Adaptation through prediction: multisensory active inference torque control

Adaptation through prediction: multisensory active inference torque control Submitted to IEEE Transactions on Cognitive and Developmental Systems Abst

Cristian Meo 1 Nov 07, 2022
Franka Emika Panda manipulator kinematics&dynamics simulation

pybullet_sim_panda Pybullet simulation environment for Franka Emika Panda Dependency pybullet, numpy, spatial_math_mini Simple example (please check s

0 Jan 20, 2022
Commonality in Natural Images Rescues GANs: Pretraining GANs with Generic and Privacy-free Synthetic Data - Official PyTorch Implementation (CVPR 2022)

Commonality in Natural Images Rescues GANs: Pretraining GANs with Generic and Privacy-free Synthetic Data (CVPR 2022) Potentials of primitive shapes f

31 Sep 27, 2022
This repository contains tutorials for the py4DSTEM Python package

py4DSTEM Tutorials This repository contains tutorials for the py4DSTEM Python package. For more information about py4DSTEM, including installation ins

11 Dec 23, 2022
Modeling Category-Selective Cortical Regions with Topographic Variational Autoencoders

Modeling Category-Selective Cortical Regions with Topographic Variational Autoencoders

1 Oct 11, 2021
Official repository of the paper 'Essentials for Class Incremental Learning'

Essentials for Class Incremental Learning Official repository of the paper 'Essentials for Class Incremental Learning' This Pytorch repository contain

33 Nov 27, 2022
[ACL-IJCNLP 2021] "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets"

EarlyBERT This is the official implementation for the paper in ACL-IJCNLP 2021 "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets" by

VITA 13 May 11, 2022
Flaxformer: transformer architectures in JAX/Flax

Flaxformer is a transformer library for primarily NLP and multimodal research at Google.

Google 116 Jan 05, 2023
Attentive Implicit Representation Networks (AIR-Nets)

Attentive Implicit Representation Networks (AIR-Nets) Preprint | Supplementary | Accepted at the International Conference on 3D Vision (3DV) teaser.mo

29 Dec 07, 2022
[CVPR 2021] Forecasting the panoptic segmentation of future video frames

Panoptic Segmentation Forecasting Colin Graber, Grace Tsai, Michael Firman, Gabriel Brostow, Alexander Schwing - CVPR 2021 [Link to paper] We propose

Niantic Labs 44 Nov 29, 2022
Materials for upcoming beginner-friendly PyTorch course (work in progress).

Learn PyTorch for Deep Learning (work in progress) I'd like to learn PyTorch. So I'm going to use this repo to: Add what I've learned. Teach others in

Daniel Bourke 2.3k Dec 29, 2022
Knowledge Management for Humans using Machine Learning & Tags

HyperTag HyperTag helps humans intuitively express how they think about their files using tags and machine learning.

Ravn Tech, Inc. 165 Nov 04, 2022
Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency[ECCV 2020]

Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency(ECCV 2020) This is an official python implementati

304 Jan 03, 2023