ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction

Last update: Nov 25, 2022

ViSER

Installation with conda

conda env create -f viser.yml
conda activate viser-release
# install softras
cd third_party/softras; python setup.py install; cd -;
# install manifold remeshing
git clone --recursive https://github.com/hjwdzh/Manifold; cd Manifold; mkdir build; cd build; cmake .. -DCMAKE_BUILD_TYPE=Release; make -j8; cd ../../

Data preparation

Create folders to store intermediate data and training logs

mkdir log; mkdir tmp;

Download the pre-processed data (RGB, mask, flow) from the link here and unzip it under ./database/DAVIS/. The dataset is organized as:

DAVIS/
    Annotations/
        Full-Resolution/
            sequence-name/
                {%05d}.png
    JPEGImages/
        Full-Resolution/
            sequence-name/
                {%05d}.jpg
    FlowBW/ and FlowFw/
        Full-Resolution/
            sequence-name/ and optionally sequence-name_{%02d}/ (frame interval)
                flo-{%05d}.pfm
                occ-{%05d}.pfm
                visflo-{%05d}.jpg
                warp-{%05d}.jpg
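To check a download against this layout, a small helper can build the paths a given frame should have. This is a minimal sketch, not part of the ViSER codebase: the function names `expected_files` and `missing_files` are hypothetical, the `_{%02d}` suffix is assumed to be the zero-padded frame interval, and it assumes every frame has `flo` and `occ` files in both flow directions, which may not hold for the first or last frame of a sequence.

```python
from pathlib import Path


def expected_files(root, seq, frame, interval=None):
    """Build the per-frame paths implied by the DAVIS-style layout above.

    Hypothetical helper: names and the interval convention are assumptions.
    """
    root = Path(root)
    res = "Full-Resolution"
    # Flow folders are named `seq` or, optionally, `seq_{interval:02d}`.
    flow_seq = seq if interval is None else f"{seq}_{interval:02d}"
    name = f"{frame:05d}"
    files = {
        "mask": root / "Annotations" / res / seq / f"{name}.png",
        "rgb": root / "JPEGImages" / res / seq / f"{name}.jpg",
    }
    for direction in ("FlowFw", "FlowBW"):
        for prefix in ("flo", "occ"):
            key = f"{direction}/{prefix}"
            files[key] = root / direction / res / flow_seq / f"{prefix}-{name}.pfm"
    return files


def missing_files(root, seq, frame, interval=None):
    """Return the expected paths that do not exist on disk."""
    paths = expected_files(root, seq, frame, interval).values()
    return [str(p) for p in paths if not p.exists()]
```

Calling `missing_files("database/DAVIS", "breakdance-flare", 0)` after unzipping returns an empty list when the mask, image, and both flow directions are present for frame 0.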

To run preprocessing scripts on other videos, see install.md.

Example: breakdance-flare

Run

bash scripts/template.sh breakdance-flare

To monitor optimization, run

tensorboard --logdir log/

To render optimized breakdance-flare

bash scripts/render_result.sh breakdance-flare log/breakdance-flare-1003-ft2/pred_net_20.pth 36
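The render command takes an explicit checkpoint path, so it can help to look up the most recent one in a log folder first. The sketch below assumes checkpoints are saved as `pred_net_<epoch>.pth`, a pattern inferred only from the example paths in this README; `latest_checkpoint` is a hypothetical helper, not a function from the repository.

```python
import re
from pathlib import Path


def latest_checkpoint(log_dir):
    """Return the pred_net_<N>.pth path with the largest N, or None.

    Assumes the naming pattern seen in the example paths of this README.
    """
    pattern = re.compile(r"pred_net_(\d+)\.pth")
    best = None
    for path in Path(log_dir).glob("pred_net_*.pth"):
        match = pattern.fullmatch(path.name)
        if match is None:
            continue  # skip files such as pred_net_latest.pth
        epoch = int(match.group(1))
        # Compare numerically so that epoch 10 ranks above epoch 2.
        if best is None or epoch > best[0]:
            best = (epoch, path)
    return None if best is None else best[1]
```

The returned path can then be passed as the second argument to `scripts/render_result.sh`.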

Example: elephants

Run

bash scripts/relephant-walk.sh

To monitor optimization, run

tensorboard --logdir log/

To render optimized elephants

bash scripts/render_elephants.sh log/elephant-walk-1003-6/pred_net_10.pth

Additional Notes

Distributed training

The current codebase supports single-node multi-GPU training with PyTorch DistributedDataParallel. Please modify dev and ngpu in scripts/template.sh to select devices.

Potential bugs

When setting batch_size to 3, rendered flow may become constant values.

Acknowledgement

The code borrows the skeleton of CMR.

External repos:

Citation

To cite our paper

@inproceedings{yang2021viser,
  title={ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction},
  author={Yang, Gengshan 
      and Sun, Deqing
      and Jampani, Varun
      and Vlasic, Daniel
      and Cole, Forrester
      and Liu, Ce
      and Ramanan, Deva},
  booktitle = {NeurIPS},
  year={2021}
}

@inproceedings{yang2021lasr,
  title={LASR: Learning Articulated Shape Reconstruction from a Monocular Video},
  author={Yang, Gengshan 
      and Sun, Deqing
      and Jampani, Varun
      and Vlasic, Daniel
      and Cole, Forrester
      and Chang, Huiwen
      and Ramanan, Deva
      and Freeman, William T
      and Liu, Ce},
  booktitle={CVPR},
  year={2021}
}

TODO

data pre-processing scripts
evaluation data and scripts
code clean up

Owner

Gengshan Yang
