Code release for NeX: Real-time View Synthesis with Neural Basis Expansion

Overview

NeX: Real-time View Synthesis with Neural Basis Expansion

Project Page | Video | Paper | COLAB | Shiny Dataset

Open NeX in Colab

NeX

We present NeX, a new approach to novel view synthesis based on enhancements of multiplane image (MPI) that can reproduce NeXt-level view-dependent effects---in real time. Unlike traditional MPI that uses a set of simple RGBα planes, our technique models view-dependent effects by instead parameterizing each pixel as a linear combination of basis functions learned from a neural network. Moreover, we propose a hybrid implicit-explicit modeling strategy that improves upon fine detail and produces state-of-the-art results. Our method is evaluated on benchmark forward-facing datasets as well as our newly-introduced dataset designed to test the limit of view-dependent modeling with significantly more challenging effects such as the rainbow reflections on a CD. Our method achieves the best overall scores across all major metrics on these datasets with more than 1000× faster rendering time than the state of the art.

Table of contents



Getting started

conda env create -f environment.yml
./download_demo_data.sh
conda activate nex
python train.py -scene data/crest_demo -model_dir crest -http
tensorboard --logdir runs/

Installation

We provide environment.yml to help you setup a conda environment.

conda env create -f environment.yml

Dataset

Shiny dataset

Download: Shiny dataset.

We provide 2 directories named shiny and shiny_extended.

  • shiny contains benchmark scenes used to report the scores in our paper.
  • shiny_extended contains additional challenging scenes used on our website project page and video

NeRF's real forward-facing dataset

Download: Undistorted front facing dataset

For real forward-facing dataset, NeRF is trained with the raw images, which may contain lens distortion. But we use the undistorted images provided by COLMAP.

However, you can try running other scenes from Local lightfield fusion (Eg. airplant) without any changes in the dataset files. In this case, the images are not automatically undistorted.

Deepview's spaces dataset

Download: Modified spaces dataset

We slightly modified the file structure of Spaces dataset in order to determine the plane placement and split train/test sets.

Using your own images.

Running NeX on your own images. You need to install COLMAP on your machine.

Then, put your images into a directory following this structure

<scene_name>
|-- images
     | -- image_name1.jpg
     | -- image_name2.jpg
     ...

The training code will automatically prepare a scene for you. You may have to tune planes.txt to get better reconstruction (see dataset explaination)

Training

Run with the paper's config

python train.py -scene ${PATH_TO_SCENE} -model_dir ${MODEL_TO_SAVE_CHECKPOINT} -http

This implementation uses scikit-image to resize images during training by default. The results and scores in the paper are generated using OpenCV's resize function. If you want the same behavior, please add -cv2resize argument.

Note that this code is tested on an Nvidia V100 32GB and 4x RTX 2080Ti GPU.

For a GPU/GPUs with less memory (e.g., a single RTX 2080Ti), you can run using the following command:

python train.py -scene ${PATH_TO_SCENE} -model_dir ${MODEL_TO_SAVE_CHECKPOINT} -http -layers 12 -sublayers 6 -hidden 256

Note that when your GPU runs ouut of memeory, you can try reducing the number of layers, sublayers, and sampled rays.

Rendering

To generate a WebGL viewer and a video result.

python train.py -scene ${scene} -model_dir ${MODEL_TO_SAVE_CHECKPOINT} -predict -http

Video rendering

To generate a video that matches the real forward-facing rendering path, add -nice_llff argument, or -nice_shiny for shiny dataset

Citation

@inproceedings{Wizadwongsa2021NeX,
    author = {Wizadwongsa, Suttisak and Phongthawee, Pakkapon and Yenphraphai, Jiraphon and Suwajanakorn, Supasorn},
    title = {NeX: Real-time View Synthesis with Neural Basis Expansion},
    booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, 
    year = {2021},
}

Visit us 🦉

Vision & Learning Laboratory VISTEC - Vidyasirimedhi Institute of Science and Technology

⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x using fastT5.

Reduce T5 model size by 3X and increase the inference speed up to 5X. Install Usage Details Functionalities Benchmarks Onnx model Quantized onnx model

Kiran R 399 Jan 05, 2023
KLUE-baseline contains the baseline code for the Korean Language Understanding Evaluation (KLUE) benchmark.

KLUE Baseline Korean(한국어) KLUE-baseline contains the baseline code for the Korean Language Understanding Evaluation (KLUE) benchmark. See our paper fo

74 Dec 13, 2022
Text classification is one of the popular tasks in NLP that allows a program to classify free-text documents based on pre-defined classes.

Deep-Learning-for-Text-Document-Classification Text classification is one of the popular tasks in NLP that allows a program to classify free-text docu

Happy N. Monday 2 Mar 17, 2022
Translation for Trilium Notes. Trilium Notes 中文版.

Trilium Translation 中文说明 This repo provides a translation for the awesome Trilium Notes. Currently, I have translated Trilium Notes into Chinese. Test

743 Jan 08, 2023
Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written form.

Neural G2P to portuguese language Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written for

fluz 11 Nov 16, 2022
Nmt - TensorFlow Neural Machine Translation Tutorial

Neural Machine Translation (seq2seq) Tutorial Authors: Thang Luong, Eugene Brevdo, Rui Zhao (Google Research Blogpost, Github) This version of the tut

6.1k Dec 29, 2022
Textpipe: clean and extract metadata from text

textpipe: clean and extract metadata from text textpipe is a Python package for converting raw text in to clean, readable text and extracting metadata

Textpipe 298 Nov 21, 2022
Using context-free grammar formalism to parse English sentences to determine their structure to help computer to better understand the meaning of the sentence.

Sentance Parser Executing the Program Make sure Python 3.6+ is installed. Install requirements $ pip install requirements.txt Run the program:

Vaibhaw 12 Sep 28, 2022
Facilitating the design, comparison and sharing of deep text matching models.

MatchZoo Facilitating the design, comparison and sharing of deep text matching models. MatchZoo 是一个通用的文本匹配工具包,它旨在方便大家快速的实现、比较、以及分享最新的深度文本匹配模型。 🔥 News

Neural Text Matching Community 3.7k Jan 02, 2023
Some embedding layer implementation using ivy library

ivy-manual-embeddings Some embedding layer implementation using ivy library. Just for fun. It is based on NYCTaxiFare dataset from kaggle (cut down to

Ishtiaq Hussain 2 Feb 10, 2022
A design of MIDI language for music generation task, specifically for Natural Language Processing (NLP) models.

MIDI Language Introduction Reference Paper: Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions: code This

Robert Bogan Kang 3 May 25, 2022
Repository for the paper: VoiceMe: Personalized voice generation in TTS

🗣 VoiceMe: Personalized voice generation in TTS Abstract Novel text-to-speech systems can generate entirely new voices that were not seen during trai

Pol van Rijn 80 Dec 29, 2022
Задания КЕГЭ по информатике 2021 на Python

КЕГЭ 2021 на Python В этом репозитории мои решения типовых заданий КЕГЭ по информатике в 2021 году, БЕСПЛАТНО! Задания Взяты с https://inf-ege.sdamgia

8 Oct 13, 2022
Smart discord chatbot integrated with Dialogflow to manage different classrooms and assist in teaching!

smart-school-chatbot Smart discord chatbot integrated with Dialogflow to interact with students naturally and manage different classes in a school. De

Tom Huynh 5 Oct 24, 2022
a test times augmentation toolkit based on paddle2.0.

Patta Image Test Time Augmentation with Paddle2.0! Input | # input batch of images / / /|\ \ \ # apply

AgentMaker 110 Dec 03, 2022
Code and dataset for the EMNLP 2021 Finding paper "Can NLI Models Verify QA Systems’ Predictions?"

Code and dataset for the EMNLP 2021 Finding paper "Can NLI Models Verify QA Systems’ Predictions?"

Jifan Chen 22 Oct 21, 2022
A method to generate speech across multiple speakers

VoiceLoop PyTorch implementation of the method described in the paper VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop. VoiceLoop is a n

Facebook Archive 873 Dec 15, 2022
Spert NLP Relation Extraction API deployed with torchserve for inference

URLMask Python program for Linux users to change a URL to ANY domain. A program than can take any url and mask it to any domain name you like. E.g. ne

Zichu Chen 1 Nov 24, 2021
The FinQA dataset from paper: FinQA: A Dataset of Numerical Reasoning over Financial Data

Data and code for EMNLP 2021 paper "FinQA: A Dataset of Numerical Reasoning over Financial Data"

Zhiyu Chen 114 Dec 29, 2022
Train BPE with fastBPE, and load to Huggingface Tokenizer.

BPEer Train BPE with fastBPE, and load to Huggingface Tokenizer. Description The BPETrainer of Huggingface consumes a lot of memory when I am training

Lizhuo 1 Dec 23, 2021