PyTorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

Last update: Dec 16, 2022

StackGAN-v2

PyTorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks by Han Zhang*, Tao Xu*, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas.

Dependencies

Python 2.7

PyTorch

In addition, please add the project folder to PYTHONPATH and pip install the following packages:

tensorboard
python-dateutil
easydict
pandas
torchfile
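
As a quick sanity check after installing, the following minimal sketch reports which of the dependencies cannot be imported. The import names are assumptions based on common usage (for example, `python-dateutil` is imported as `dateutil` and PyTorch as `torch`); the helper `missing_modules` is hypothetical and not part of this repository.

```python
import importlib

# Assumed import names for the packages listed above.
# "python-dateutil" is imported as "dateutil"; PyTorch is imported as "torch".
REQUIRED_MODULES = ["torch", "tensorboard", "dateutil", "easydict", "pandas", "torchfile"]


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling `missing_modules(REQUIRED_MODULES)` returns an empty list when every dependency is importable, and the names of the missing ones otherwise.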

Data

Download our preprocessed char-CNN-RNN text embeddings for birds and save them to data/

[Optional] Follow the instructions in reedscot/icml2016 to download the pretrained char-CNN-RNN text encoders and extract text embeddings.

Download the birds image data and extract it to data/birds/
Download the ImageNet dataset and extract the images to data/imagenet/
Download the LSUN dataset and save the images to data/lsun/
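
The directory layout described above can be verified before training with a small check like the one below. This is only a sketch: the helper `missing_data_dirs` is hypothetical, and it checks only that the directories named above exist, not their contents.

```python
import os

# Directories named in the steps above, relative to the project root.
EXPECTED_DIRS = ["data", "data/birds", "data/imagenet", "data/lsun"]


def missing_data_dirs(project_root, expected=EXPECTED_DIRS):
    """Return the expected data directories that do not exist under project_root."""
    return [d for d in expected
            if not os.path.isdir(os.path.join(project_root, d))]
```

For example, `missing_data_dirs(".")` run from the project folder lists any directory that still needs to be created. If you only train on birds, the ImageNet and LSUN entries can be ignored.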

Training

Train a StackGAN-v2 model on the bird (CUB) dataset using our preprocessed embeddings:
- python main.py --cfg cfg/birds_3stages.yml --gpu 0
Train a StackGAN-v2 model on the ImageNet dog subset:
- python main.py --cfg cfg/dog_3stages_color.yml --gpu 0
Train a StackGAN-v2 model on the ImageNet cat subset:
- python main.py --cfg cfg/cat_3stages_color.yml --gpu 0
Train a StackGAN-v2 model on the LSUN bedroom subset:
- python main.py --cfg cfg/bedroom_3stages_color.yml --gpu 0
Train a StackGAN-v2 model on the LSUN church subset:
- python main.py --cfg cfg/church_3stages_color.yml --gpu 0
The *.yml files are example configuration files for training and evaluating our models.
If you want to try your own datasets, here are some good tips about how to train GANs. We also encourage you to try different hyper-parameters and architectures, especially for more complex datasets.
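
To run several of the training jobs above one after another, a small launcher along these lines may be convenient. It is a sketch rather than part of the repository: `build_command` and `train_all` are hypothetical helpers, and the config names are simply those used in the commands above.

```python
import subprocess

# Config names taken from the training commands listed above.
TRAIN_CONFIGS = [
    "birds_3stages",
    "dog_3stages_color",
    "cat_3stages_color",
    "bedroom_3stages_color",
    "church_3stages_color",
]


def build_command(cfg_name, gpu_id=0):
    """Build the argument list for one run of main.py."""
    return ["python", "main.py", "--cfg", "cfg/%s.yml" % cfg_name,
            "--gpu", str(gpu_id)]


def train_all(configs=TRAIN_CONFIGS, gpu_id=0):
    """Run each training job in turn, stopping at the first failure."""
    for cfg_name in configs:
        subprocess.check_call(build_command(cfg_name, gpu_id))
```

`train_all(["birds_3stages"])` is equivalent to the first command above. Running every config back to back can take a long time, so passing a subset is usually more practical.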

Pretrained Model

StackGAN-v2 for bird. Download and save it to models/ (the inception score for this model is 4.04±0.05)
StackGAN-v2 for dog. Download and save it to models/ (the inception score for this model is 9.55±0.11)
StackGAN-v2 for cat. Download and save it to models/
StackGAN-v2 for bedroom. Download and save it to models/
StackGAN-v2 for church. Download and save it to models/
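
For readers unfamiliar with the metric quoted above, the inception score is commonly defined as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's class distribution for a generated image and p(y) is the average of those distributions. The sketch below computes that quantity from given class probabilities. It illustrates the standard definition only; it is not the evaluation code used to obtain the numbers above, and the function name is hypothetical.

```python
import math


def inception_score(probs):
    """Compute exp(mean KL(p(y|x) || p(y))) for a list of class distributions.

    probs: list of per-image class-probability lists, each summing to 1.
    """
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal p(y): average of the conditional distributions over all images.
    marginal = [sum(p[k] for p in probs) / float(n) for k in range(num_classes)]
    # Sum of KL(p(y|x) || p(y)) over images; zero-probability terms contribute 0.
    total_kl = 0.0
    for p in probs:
        total_kl += sum(pk * math.log(pk / marginal[k])
                        for k, pk in enumerate(p) if pk > 0)
    return math.exp(total_kl / n)
```

Identical predictions for every image give the minimum score of 1, while confident predictions spread evenly over the classes give a score equal to the number of classes. A value reported as mean ± deviation is typically obtained by computing the score on several splits of the generated images.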

Evaluating

Run python main.py --cfg cfg/eval_birds.yml --gpu 1 to generate samples from captions in birds validation set.
Change the eval_*.yml files to generate images from other pre-trained models.

Examples generated by StackGAN-v2

t-SNE visualization of randomly generated birds, dogs, cats, churches and bedrooms

Citing StackGAN++

If you find StackGAN useful in your research, please consider citing:

@article{Han17stackgan2,
  author    = {Han Zhang and Tao Xu and Hongsheng Li and Shaoting Zhang and Xiaogang Wang and Xiaolei Huang and Dimitris Metaxas},
  title     = {StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks},
  journal   = {arXiv: 1710.10916},
  year      = {2017},
}

@inproceedings{han2017stackgan,
  author    = {Han Zhang and Tao Xu and Hongsheng Li and Shaoting Zhang and Xiaogang Wang and Xiaolei Huang and Dimitris Metaxas},
  title     = {StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks},
  year      = {2017},
  booktitle = {{ICCV}},
}

Our follow-up work

AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks [Supplementary][code]

References

Generative Adversarial Text-to-Image Synthesis [Paper][Code]
Learning Deep Representations of Fine-grained Visual Descriptions [Paper][Code]

Owner

Han Zhang