This is a re-implementation of TransGAN: Two Pure Transformers Can Make One Strong GAN (CVPR 2021) in PyTorch.

Last update: Jan 05, 2023

Related tags

Overview

TransGAN: Two Transformers Can Make One Strong GAN [YouTube Video]

Paper Authors: Yifan Jiang, Shiyu Chang, Zhangyang Wang

CVPR 2021

This is re-implementation of TransGAN: Two Transformers Can Make One Strong GAN, and That Can Scale Up, CVPR 2021 in PyTorch.

Generative Adversarial Networks-GAN builded completely free of Convolutions and used Transformers architectures which became popular since Vision Transformers-ViT. In this implementation, CIFAR-10 dataset was used.

0 Epoch	40 Epoch	100 Epoch	200 Epoch

Related Work - Vision Transformers (ViT)

In this implementation, as a discriminator, Vision Transformer(ViT) Block was used. In order to get more info about ViT, you can look at the original paper here

Credits for illustration of ViT: @lucidrains

Installation

Before running train.py, check whether you have libraries in requirements.txt! Also, create ./fid_stat folder and download the fid_stats_cifar10_train.npz file in this folder. To save your model during training, create ./checkpoint folder using mkdir checkpoint.

Training

python train.py

Pretrained Model

You can find pretrained model here. You can download using:

wget https://drive.google.com/file/d/134GJRMxXFEaZA0dF-aPpDS84YjjeXPdE/view

curl gdrive.sh | bash -s https://drive.google.com/file/d/134GJRMxXFEaZA0dF-aPpDS84YjjeXPdE/view

License

MIT

Citation

@article{jiang2021transgan,
  title={TransGAN: Two Transformers Can Make One Strong GAN},
  author={Jiang, Yifan and Chang, Shiyu and Wang, Zhangyang},
  journal={arXiv preprint arXiv:2102.07074},
  year={2021}
}

@article{dosovitskiy2020,
  title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
  author={Dosovitskiy, Alexey and Beyer, Lucas and Kolesnikov, Alexander and Weissenborn, Dirk and Zhai, Xiaohua and Unterthiner, Thomas and  Dehghani, Mostafa and Minderer, Matthias and Heigold, Georg and Gelly, Sylvain and Uszkoreit, Jakob and Houlsby, Neil},
  journal={arXiv preprint arXiv:2010.11929},
  year={2020}
}

@inproceedings{zhao2020diffaugment,
  title={Differentiable Augmentation for Data-Efficient GAN Training},
  author={Zhao, Shengyu and Liu, Zhijian and Lin, Ji and Zhu, Jun-Yan and Han, Song},
  booktitle={Conference on Neural Information Processing Systems (NeurIPS)},
  year={2020}
}

Comments

GPU memory, Modifying batch size
Hello,

I saw your comment in VITA-Group's implementation of TransGAN and started looking at your implementation here.

Without modifying anything and attempting to run "python train.py" results in CUDA out of memory; I believe the GPU I'm using cannot handle the model size/training images that you've specified. I tried editing the batch size on lines 35 and 36 of train.py (--gener_batch_size, changing default from 64 to 32, etc.), but I get a RuntimeError of:

Output 0 of UnbindBackward is a view and is being modified inplace. This view is the output of a function that returns multiple views. Such fuctions do not allow the otutput views to be modified inplace. You should replace the inplace operation by an out-of-place one.

My two questions are:

How would you suggest modifying the training parameters to deal with GPU running out of memory? and,

Is there a better way to edit the batch size, and what else do I need to change in order for the code to not break when the batch size is changed?

Thanks!
opened by Andrew-X-Wang 10
Create your own FID stats file

Hello and thanks for the implementation. I'm trying to train this model on a different datset, but to do so I need a custom fid_stats file for my dataset. How can I create it ?

opened by IlyasMoutawwakil 2
FID score: nan

Thank you for your contribution. But in the training processing, FID score is Nan. I want to known whether it is appropriate. Should I make some chance to solve this problem?

opened by Jamie-Cheung 1
TransGAN fid problem

hello,I would like to humbly ask you what is the difference beetween TransGAN-main and TransGAN-master?can Trans-main reproduce similar results of the original paper? The results obtained by using CIFAR in TransGAN-main are quite different from those in the paper,and WGAN-EP loss concussion,so I want to ask you.

opened by Stephenlove 1
How do you test on your own dataset with the checkpoint.pth generated?

I want to use the checkpoint saved to generate my own results from a testing dataset and use those images later to calculate my own evaluation metrics. Please help

opened by meh-naz 0

Releases(v2.0)

v2.0(Jul 6, 2021)

More qualified generated images with TransGAN on CIFAR10 dataset.
Source code(tar.gz)
Source code(zip)
v1.0(May 31, 2021)

In this version of re-implementation, MNIST and CIFAR-10 datasets were used for TransGAN-S.
Source code(tar.gz)
Source code(zip)

Owner

Ahmet Sarigun

Yet, another human being!

GitHub Repository https://arxiv.org/abs/2102.07074

JupyterNotebook - C/C++, Javascript, HTML, LaTex, Shell scripts in Jupyter Notebook Also run them on remote computer

JupyterNotebook Read, write and execute C, C++, Javascript, Shell scripts, HTML, LaTex in jupyter notebook, And also execute them on remote computer R

1 Jan 09, 2022

Extract MNIST handwritten digits dataset binary file into bmp images

MNIST-dataset-extractor Extract MNIST handwritten digits dataset binary file into bmp images More info at http://yann.lecun.com/exdb/mnist/ Dependenci

6 May 24, 2021

A copy of Ares that costs 30 fucking dollars.

Finalement, j'ai décidé d'abandonner cette idée, je me suis comporté comme un enfant qui été en colère. Comme m'ont dit certaines personnes j'ai des c

24 Apr 14, 2022

Crab is a ﬂexible, fast recommender engine for Python that integrates classic information ﬁltering recommendation algorithms in the world of scientiﬁc Python packages (numpy, scipy, matplotlib).

Crab - A Recommendation Engine library for Python Crab is a ﬂexible, fast recommender engine for Python that integrates classic information ﬁltering r

1.2k Dec 21, 2022

Lua-parser-lark - An out-of-box Lua parser written in Lark

An out-of-box Lua parser written in Lark Such parser handles a relaxed version o

2 Jul 19, 2022

A library for finding knowledge neurons in pretrained transformer models.

knowledge-neurons An open source repository replicating the 2021 paper Knowledge Neurons in Pretrained Transformers by Dai et al., and extending the t

96 Dec 21, 2022

Pytorch implementation of our paper under review — Lottery Jackpots Exist in Pre-trained Models

Lottery Jackpots Exist in Pre-trained Models (Paper Link) Requirements Python = 3.7.4 Pytorch = 1.6.1 Torchvision = 0.4.1 Reproduce the Experiment

27 Jun 28, 2022

Unofficial PyTorch Implementation of UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation

UnivNet UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation This is an unofficial PyTorch

170 Jan 04, 2023

Code for Multinomial Diffusion

Code for Multinomial Diffusion Abstract Generative flows and diffusion models have been predominantly trained on ordinal data, for example natural ima

104 Jan 04, 2023

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

1.3k Dec 31, 2022

Open AI's Python library

OpenAI Python Library The OpenAI Python library provides convenient access to the OpenAI API from applications written in the Python language. It incl

3 Jul 10, 2022

Implementation of Perceiver, General Perception with Iterative Attention in TensorFlow

Perceiver This Python package implements Perceiver: General Perception with Iterative Attention by Andrew Jaegle in TensorFlow. This model builds on t

84 Oct 15, 2022

Pytorch cuda extension of grid_sample1d

Grid Sample 1d pytorch cuda extension of grid sample 1d. Since pytorch only supports grid sample 2d/3d, I extend the 1d version for efficiency. The fo

24 Dec 03, 2022

Awesome-AI-books - Some awesome AI related books and pdfs for learning and downloading

Awesome AI books Some awesome AI related books and pdfs for downloading and learning. Preface This repo only used for learning, do not use in business

1k Jan 01, 2023

PyTorch wrapper for Taichi data-oriented class

Stannum PyTorch wrapper for Taichi data-oriented class PRs are welcomed, please see TODOs. Usage from stannum import Tin import torch data_oriented =

86 Dec 23, 2022

A facial recognition doorbell system using a Raspberry Pi

Facial Recognition Doorbell This project expands on the person-detecting doorbell system to allow it to identify faces, and announce names accordingly

22 Apr 15, 2022

Machine Learning automation and tracking

The Open-Source MLOps Orchestration Framework MLRun is an open-source MLOps framework that offers an integrative approach to managing your machine-lea

873 Jan 04, 2023

Logistic Bandit experiments. Official code for the paper "Jointly Efficient and Optimal Algorithms for Logistic Bandits".

Code for the paper Jointly Efficient and Optimal Algorithms for Logistic Bandits, by Louis Faury, Marc Abeille, Clément Calauzènes and Kwang-Sun Jun.

1 Jan 22, 2022

[NeurIPS 2021] Shape from Blur: Recovering Textured 3D Shape and Motion of Fast Moving Objects

[NeurIPS 2021] Shape from Blur: Recovering Textured 3D Shape and Motion of Fast Moving Objects YouTube | arXiv Prerequisites Kaolin is available here:

107 Dec 26, 2022

Tiny Object Detection in Aerial Images.

AI-TOD AI-TOD is a dataset for tiny object detection in aerial images. [Paper] [Dataset] Description AI-TOD comes with 700,621 object instances for ei

116 Dec 30, 2022