Cross-view Transformers for real-time Map-view Semantic Segmentation (CVPR 2022 Oral)

Overview

Cross View Transformers


This repository contains the source code and data for our paper:

Cross-view Transformers for real-time Map-view Semantic Segmentation
Brady Zhou, Philipp Krähenbühl
CVPR 2022

Demos


Map-view Segmentation: The model uses multi-view images to produce a map-view segmentation at 45 FPS

Map Making: With vehicle pose, we can construct a map by fusing model predictions over time

Cross-view Attention: For a given map-view location, we show which image patches are being attended to

Installation

# Clone repo
git clone https://github.com/bradyz/cross_view_transformers.git

cd cross_view_transformers

# Setup conda environment
conda create -y --name cvt python=3.8

conda activate cvt
conda install -y pytorch torchvision cudatoolkit=11.3 -c pytorch

# Install dependencies
pip install -r requirements.txt
pip install -e .

Data


Documentation:


Download the original datasets and our generated map-view labels

Dataset Labels
nuScenes keyframes + map expansion (60 GB) cvt_labels_nuscenes.tar.gz (361 MB)
Argoverse 1.1 3D tracking coming soon™

The structure of the extracted data should look like the following

/datasets/
├─ nuscenes/
│  ├─ v1.0-trainval/
│  ├─ v1.0-mini/
│  ├─ samples/
│  ├─ sweeps/
│  └─ maps/
│     ├─ basemap/
│     └─ expansion/
└─ cvt_labels_nuscenes/
   ├─ scene-0001/
   ├─ scene-0001.json
   ├─ ...
   ├─ scene-1000/
   └─ scene-1000.json

When everything is setup correctly, check out the dataset with

python3 scripts/view_data.py \
  data=nuscenes \
  data.dataset_dir=/media/datasets/nuscenes \
  data.labels_dir=/media/datasets/cvt_labels_nuscenes \
  data.version=v1.0-mini \
  visualization=nuscenes_viz \
  +split=val

Training

             

An average job of 50k training iterations takes ~8 hours.
Our models were trained using 4 GPU jobs, but also can be trained on single GPU.

To train a model,

python3 scripts/train.py \
  +experiment=cvt_nuscenes_vehicle
  data.dataset_dir=/media/datasets/nuscenes \
  data.labels_dir=/media/datasets/cvt_labels_nuscenes

For more information, see

  • config/config.yaml - base config
  • config/model/cvt.yaml - model architecture
  • config/experiment/cvt_nuscenes_vehicle.yaml - additional overrides

Additional Information

Awesome Related Repos

License

This project is released under the MIT license

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@inproceedings{zhou2022cross,
    title={Cross-view Transformers for real-time Map-view Semantic Segmentation},
    author={Zhou, Brady and Kr{\"a}henb{\"u}hl, Philipp},
    booktitle={CVPR},
    year={2022}
}
Owner
Brady Zhou
hey
Brady Zhou
Simple torch.nn.module implementation of Alias-Free-GAN style filter and resample

Alias-Free-Torch Simple torch module implementation of Alias-Free GAN. This repository including Alias-Free GAN style lowpass sinc filter @filter.py A

이준혁(Junhyeok Lee) 64 Dec 22, 2022
SlideGraph+: Whole Slide Image Level Graphs to Predict HER2 Status in Breast Cancer

SlideGraph+: Whole Slide Image Level Graphs to Predict HER2 Status in Breast Cancer A novel graph neural network (GNN) based model (termed SlideGraph+

28 Dec 24, 2022
Let's Git - Versionsverwaltung & Open Source Hausaufgabe

Let's Git - Versionsverwaltung & Open Source Hausaufgabe Herzlich Willkommen zu dieser Hausaufgabe für unseren MOOC: Let's Git! Wir hoffen, dass Du vi

1 Dec 13, 2021
Depth image based mouse cursor visual haptic

Depth image based mouse cursor visual haptic How to run it. Install pyqt5. Install python modules pip install Pillow pip install numpy For illustrati

Xiong Jie 17 Dec 20, 2022
Display, filter and search log messages in your terminal

Textualog Display, filter and search logging messages in the terminal. This project is powered by rich and textual. Some of the ideas and code in this

Rik Huygen 24 Dec 10, 2022
Learning to Estimate Hidden Motions with Global Motion Aggregation

Learning to Estimate Hidden Motions with Global Motion Aggregation (GMA) This repository contains the source code for our paper: Learning to Estimate

Shihao Jiang (Zac) 221 Dec 18, 2022
Multiband spectro-radiometric satellite image analysis with K-means cluster algorithm

Multi-band Spectro Radiomertric Image Analysis with K-means Cluster Algorithm Overview Multi-band Spectro Radiomertric images are images comprising of

Chibueze Henry 6 Mar 16, 2022
[AAAI-2021] Visual Boundary Knowledge Translation for Foreground Segmentation

Trans-Net Code for (Visual Boundary Knowledge Translation for Foreground Segmentation, AAAI2021). [https://ojs.aaai.org/index.php/AAAI/article/view/16

ZJU-VIPA 2 Mar 04, 2022
A Collection of Papers and Codes for ICCV2021 Low Level Vision and Image Generation

A Collection of Papers and Codes for ICCV2021 Low Level Vision and Image Generation

196 Jan 05, 2023
A simple and lightweight genetic algorithm for optimization of any machine learning model

geneticml This package contains a simple and lightweight genetic algorithm for optimization of any machine learning model. Installation Use pip to ins

Allan Barcelos 8 Aug 10, 2022
Introducing neural networks to predict stock prices

IntroNeuralNetworks in Python: A Template Project IntroNeuralNetworks is a project that introduces neural networks and illustrates an example of how o

Vivek Palaniappan 637 Jan 04, 2023
Campsite Reservation Finder

yellowstone-camping UPDATE: yellowstone-camping is being expanded and renamed to camply. The updated tool now interfaces with the Recreation.gov API a

Justin Flannery 233 Jan 08, 2023
Official PyTorch(Geometric) implementation of DPGNN(DPGCN) in "Distance-wise Prototypical Graph Neural Network for Node Imbalance Classification"

DPGNN This repository is an official PyTorch(Geometric) implementation of DPGNN(DPGCN) in "Distance-wise Prototypical Graph Neural Network for Node Im

Yu Wang (Jack) 18 Oct 12, 2022
LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection

LiDAR Distillation Paper | Model LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection Yi Wei, Zibu Wei, Yongming Rao, Jiax

Yi Wei 75 Dec 22, 2022
Code for the paper "Controllable Video Captioning with an Exemplar Sentence"

SMCG Code for the paper "Controllable Video Captioning with an Exemplar Sentence" Introduction We investigate a novel and challenging task, namely con

10 Dec 04, 2022
The project page of paper: Architecture disentanglement for deep neural networks [ICCV 2021, oral]

This is the project page for the paper: Architecture Disentanglement for Deep Neural Networks, Jie Hu, Liujuan Cao, Tong Tong, Ye Qixiang, ShengChuan

Jie Hu 15 Aug 30, 2022
An index of algorithms for learning causality with data

awesome-causality-algorithms An index of algorithms for learning causality with data. Please cite our survey paper if this index is helpful. @article{

Ruocheng Guo 2.3k Jan 08, 2023
This project aim to create multi-label classification annotation tool to boost annotation speed and make it more easier.

This project aim to create multi-label classification annotation tool to boost annotation speed and make it more easier.

4 Aug 02, 2022
LSTM-VAE Implementation and Relevant Evaluations

LSTM-VAE Implementation and Relevant Evaluations Before using any file in this repository, please create two directories under the root directory name

Lan Zhang 5 Oct 08, 2022
This is the pytorch implementation for the paper: Generalizable Mixed-Precision Quantization via Attribution Rank Preservation, which is accepted to ICCV2021.

GMPQ: Generalizable Mixed-Precision Quantization via Attribution Rank Preservation This is the pytorch implementation for the paper: Generalizable Mix

18 Sep 02, 2022