Learning Generative Models of Textured 3D Meshes from Real-World Images, ICCV 2021

Last update: Jan 07, 2023

Related tags

Overview

Learning Generative Models of Textured 3D Meshes from Real-World Images

This is the reference implementation of "Learning Generative Models of Textured 3D Meshes from Real-World Images", accepted at ICCV 2021.

Dario Pavllo, Jonas Kohler, Thomas Hofmann, Aurelien Lucchi. Learning Generative Models of Textured 3D Meshes from Real-World Images. In IEEE/CVF International Conference on Computer Vision (ICCV), 2021.

This work is a follow-up of Convolutional Generation of Textured 3D Meshes, in which we learn a GAN for generating 3D triangle meshes and the corresponding texture maps using 2D supervision. In this work, we relax the requirement for keypoints in the pose estimation step, and generalize the approach to unannotated collections of images and new categories/datasets such as ImageNet.

Setup

Instructions on how to set up dependencies, datasets, and pretrained models can be found in SETUP.md

Quick start

In order to test our pretrained models, the minimal setup described in SETUP.md is sufficient. No dataset setup is required. We provide an interface for evaluating FID scores, as well as an interface for exporting a sample of generated 3D meshes (both as a grid of renderings and as .obj meshes).

Exporting a sample

You can export a sample of generated meshes using --export-sample. Here are some examples:

python run_generation.py --name pretrained_imagenet_car_singletpl --dataset imagenet_car --gpu_ids 0 --batch_size 10 --export_sample --how_many 40
python run_generation.py --name pretrained_imagenet_airplane_singletpl --dataset imagenet_airplane --gpu_ids 0 --batch_size 10 --export_sample --how_many 40
python run_generation.py --name pretrained_imagenet_elephant_singletpl --dataset imagenet_elephant --gpu_ids 0 --batch_size 10 --export_sample --how_many 40
python run_generation.py --name pretrained_cub_singletpl --dataset cub --gpu_ids 0 --batch_size 10 --export_sample --how_many 40
python run_generation.py --name pretrained_all_singletpl --dataset all --conditional_class --gpu_ids 0 --batch_size 10 --export_sample --how_many 40

This will generate a sample of 40 meshes, render them from random viewpoints, and export the final result to the output directory as a png image. In addition, the script will export the meshes as .obj files (along with material and texture). These can be imported into Blender or other modeling tools. You can switch between the single-template and multi-template settings by appending either _singletpl or _multitpl to the experiment name.

Evaluating FID on pretrained models

You can evaluate the FID of a model by specifying --evaluate. For the models trained to generate a single category (setting A):

python run_generation.py --name pretrained_cub_singletpl --dataset cub --gpu_ids 0,1,2,3 --batch_size 64 --evaluate
python run_generation.py --name pretrained_p3d_car_singletpl --dataset p3d_car --gpu_ids 0,1,2,3 --batch_size 64 --evaluate
python run_generation.py --name pretrained_imagenet_zebra --dataset imagenet_zebra_singletpl --gpu_ids 0,1,2,3 --batch_size 64 --evaluate

For the conditional models trained to generate all classes (setting B), you can specify the category to evaluate (e.g. motorcycle):

python run_generation.py --name pretrained_all_singletpl --dataset all --conditional_class --gpu_ids 0,1,2,3 --batch_size 64 --evaluate --filter_class motorcycle

As before, you can switch between the single-template and multi-template settings by appending either _singletpl or _multitpl to the experiment name. You can of course also adjust the number of GPUs and batch size to suit your computational resources. For evaluation, 16 elements per GPU is a sensible choice. You can also tune the number of data-loading threads using the --num_workers argument (default: 4 threads). Note that the FID will exhibit a small variance depending on the chosen batch size.

Training

See TRAINING.md for the instructions on how to generate the pseudo-ground-truth dataset and train a new model from scratch. The documentation also provides instructions on how to run the pose estimation steps and run the pipeline from scratch on a custom dataset.

Citation

If you use this work in your research, please consider citing our paper(s):

@inproceedings{pavllo2021textured3dgan,
  title={Learning Generative Models of Textured 3D Meshes from Real-World Images},
  author={Pavllo, Dario and Kohler, Jonas and Hofmann, Thomas and Lucchi, Aurelien},
  booktitle={IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021}
}

@inproceedings{pavllo2020convmesh,
  title={Convolutional Generation of Textured 3D Meshes},
  author={Pavllo, Dario and Spinks, Graham and Hofmann, Thomas and Moens, Marie-Francine and Lucchi, Aurelien},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2020}
}

License and Acknowledgments

Our work is licensed under the MIT license. For more details, see LICENSE. This repository builds upon convmesh and includes third-party libraries which may be subject to their respective licenses: Synchronized-BatchNorm-PyTorch, the data loader from CMR, and FID evaluation code from pytorch-fid.

Comments

CVE-2007-4559 Patch

Patching CVE-2007-4559

Hi, we are security researchers from the Advanced Research Center at Trellix. We have began a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15 year old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsantized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks to see if all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

If you have further questions you may contact us through this projects lead researcher Kasimir Schulz.

opened by TrellixVulnTeam 0
how to test with the picture

I am very appreciated with your work.But I am wondering how can I test with my own picture. For example,I input an image of a car,and directly get the .obj and .png

opened by lisentao 1
caffe2 error for detectron

Hi,

I am trying to test the code on a custom dataset. I downloaded seg_every_thing in the root, copied detections_vg3k.py to tools of the former. Built detectron from scratch, but still it gives me: AssertionError: Detectron ops lib not found; make sure that your Caffe2 version includes Detectron module There is no make file in the Ops lib of detectron. How can I fix this?

opened by sinAshish 2
Person mesh and reconstruction reconstructing texture

Thanks for your great work ... Wanna work on person class to create mesh as well as corresponding texture. can you refer dataset and steps to reach out..?

opened by sharoseali 0
training on custom dataset

Thank you for your great work! currently, I'm following your work and trying to train on custom datasets. When I move on the data preparation part, I found the model weights in seg_every_thing repo are no long avaiable. I wonder is it possible for you to share the weights ('lib/datasets/data/trained_models/33219850_model_final_coco2vg3k_seg.pkl') used in tools/detection_tool_vg3k.py with us? Looking forward to your reply! Thanks~

opened by pingping-lu 1

Releases(v1.0)

v1.0(Aug 17, 2021)
Pretrained models

Cache directory with mesh templates, detections, poses, metadata, and precomputed FID statistics

The archive poses_perspective.zip is used for follow-up experimentation and is not required.

The archives must be extracted to the root directory of the cloned repository. For the minimal setup (evaluating FID and exporting samples) you do not need to download checkpoints_reconstruction.zip.

Update 11/2021: we have also uploaded the (optional) model weights for the seg_every_thing model 33219850_model_final_coco2vg3k_seg, since the original link is currently broken. You only need to download this if you want to infer part segmentations from scratch on a new dataset.
Source code(tar.gz)
Source code(zip)
33219850_model_final_coco2vg3k_seg.zip(1829.90 MB)
cache_datasets_templates.zip(511.50 MB)
checkpoints_gan.zip(1322.91 MB)
checkpoints_reconstruction.zip(1714.82 MB)
poses_perspective.zip(27.91 MB)

Owner

Dario Pavllo

PhD Student @ ETH Zurich

GitHub Repository

Repository of best practices for deep learning in Julia, inspired by fastai

FastAI Docs: Stable | Dev FastAI.jl is inspired by fastai, and is a repository of best practices for deep learning in Julia. Its goal is to easily ena

532 Jan 02, 2023

Baseline for the Spoofing-aware Speaker Verification Challenge 2022

Introduction This repository contains several materials that supplements the Spoofing-Aware Speaker Verification (SASV) Challenge 2022 including: calc

40 Dec 28, 2022

Autoregressive Predictive Coding: An unsupervised autoregressive model for speech representation learning

Autoregressive Predictive Coding This repository contains the official implementation (in PyTorch) of Autoregressive Predictive Coding (APC) proposed

173 Dec 18, 2022

Official PyTorch implementation of "Physics-aware Difference Graph Networks for Sparsely-Observed Dynamics".

Physics-aware Difference Graph Networks for Sparsely-Observed Dynamics This repository is the official PyTorch implementation of "Physics-aware Differ

46 Nov 20, 2022

Sky Computing: Accelerating Geo-distributed Computing in Federated Learning

Sky Computing Introduction Sky Computing is a load-balanced framework for federated learning model parallelism. It adaptively allocate model layers to

72 Dec 27, 2022

Code release for General Greedy De-bias Learning

General Greedy De-bias for Dataset Biases This is an extention of "Greedy Gradient Ensemble for Robust Visual Question Answering" (ICCV 2021, Oral). T

4 Mar 15, 2022

A hobby project which includes a hand-gesture based virtual piano using a mobile phone camera and OpenCV library functions

Overview This is a hobby project which includes a hand-gesture controlled virtual piano using an android phone camera and some OpenCV library. My moti

1 Nov 19, 2021

MT-GAN-PyTorch - PyTorch Implementation of Learning to Transfer: Unsupervised Domain Translation via Meta-Learning

MT-GAN-PyTorch PyTorch Implementation of AAAI-2020 Paper "Learning to Transfer: Unsupervised Domain Translation via Meta-Learning" Dependency: Python

29 Oct 19, 2022

Code accompanying the NeurIPS 2021 paper "Generating High-Quality Explanations for Navigation in Partially-Revealed Environments"

Generating High-Quality Explanations for Navigation in Partially-Revealed Environments This work presents an approach to explainable navigation under

1 Oct 28, 2022

This is an unofficial implementation of the paper “Student-Teacher Feature Pyramid Matching for Unsupervised Anomaly Detection”.

32 Oct 26, 2022

public repo for ESTER dataset and modeling (EMNLP'21)

Project / Paper Introduction This is the project repo for our EMNLP'21 paper: https://arxiv.org/abs/2104.08350 Here, we provide brief descriptions of

19 Oct 27, 2022

Learning Generative Models of Textured 3D Meshes from Real-World Images, ICCV 2021

Related tags

Overview

Learning Generative Models of Textured 3D Meshes from Real-World Images

Setup

Quick start

Exporting a sample

Evaluating FID on pretrained models

Training

Citation

License and Acknowledgments

Comments

CVE-2007-4559 Patch

Patching CVE-2007-4559

how to test with the picture

caffe2 error for detectron

Person mesh and reconstruction reconstructing texture

training on custom dataset

Releases(v1.0)

v1.0(Aug 17, 2021)

Owner

Dario Pavllo

Repository of best practices for deep learning in Julia, inspired by fastai

Baseline for the Spoofing-aware Speaker Verification Challenge 2022

Autoregressive Predictive Coding: An unsupervised autoregressive model for speech representation learning

Official PyTorch implementation of "Physics-aware Difference Graph Networks for Sparsely-Observed Dynamics".

Sky Computing: Accelerating Geo-distributed Computing in Federated Learning

Code release for General Greedy De-bias Learning

A hobby project which includes a hand-gesture based virtual piano using a mobile phone camera and OpenCV library functions

MT-GAN-PyTorch - PyTorch Implementation of Learning to Transfer: Unsupervised Domain Translation via Meta-Learning

Code accompanying the NeurIPS 2021 paper "Generating High-Quality Explanations for Navigation in Partially-Revealed Environments"

Multi-query Video Retreival

Code release for Convolutional Two-Stream Network Fusion for Video Action Recognition

A simple approach to emable dense segmentation with ViT.

Implementation of the state of the art beat-detection, downbeat-detection and tempo-estimation model

Implementation of ICCV2021(Oral) paper - VMNet: Voxel-Mesh Network for Geodesic-aware 3D Semantic Segmentation

Source code for "Pack Together: Entity and Relation Extraction with Levitated Marker"

Rainbow DQN implementation that outperforms the paper's results on 40% of games using 20x less data 🌈

A Semantic Segmentation Network for Urban-Scale Building Footprint Extraction Using RGB Satellite Imagery

Educational 2D SLAM implementation based on ICP and Pose Graph

This is an unofficial implementation of the paper “Student-Teacher Feature Pyramid Matching for Unsupervised Anomaly Detection”.

public repo for ESTER dataset and modeling (EMNLP'21)