Styleformer - Official PyTorch Implementation

Last update: Dec 12, 2022

Styleformer: Transformer based Generative Adversarial Networks with Style Vector (https://arxiv.org/abs/2106.07023)

Requirements

All development and testing was done on 4 Titan RTX GPUs with 24 GB of memory each.
64-bit Python 3.7 and PyTorch 1.7.1.
Python libraries: pip install click requests tqdm pyspng ninja imageio-ffmpeg==0.4.3. We use the Anaconda3 2020.11 distribution, which installs most of these by default.
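
To see whether a local environment is close to the tested configuration, the version strings can be compared directly. The sketch below is not part of the repository: `parse_version` and `same_major_minor` are hypothetical helper names, and comparing on major.minor only is an assumption about what counts as close enough.

```python
import re
import sys

TESTED_PYTHON = (3, 7)   # Python 3.7, from the requirements above
TESTED_TORCH = (1, 7)    # PyTorch 1.7.1, compared on major.minor only

def parse_version(text):
    """Turn a string such as '1.7.1+cu110' into (1, 7, 1)."""
    parts = []
    # Drop any local build suffix after '+', then read leading digits per field.
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)

def same_major_minor(found, tested):
    """True when the first two version fields of `found` equal `tested`."""
    return parse_version(found)[:2] == tuple(tested[:2])

if __name__ == "__main__":
    py = "{}.{}".format(*sys.version_info[:2])
    print("Python", py, "matches tested:", same_major_minor(py, TESTED_PYTHON))
    try:
        import torch
        print("PyTorch", torch.__version__, "matches tested:",
              same_major_minor(torch.__version__, TESTED_TORCH))
    except ImportError:
        print("PyTorch is not installed")
```

A mismatch does not mean the code will fail; it only means the setup differs from the one the authors tested.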

Pretrained pickle

CIFAR-10 Styleformer-Large (FID 2.82, IS 9.94)

STL-10 Styleformer-Medium (FID 20.11, IS 10.16)

CelebA Styleformer-Linformer (FID 3.66)

LSUN-Church Styleformer-Linformer (FID 7.99)

Generating images

Pre-trained networks are stored as *.pkl files that can be referenced using local filenames:

# Generate images using a pretrained network pickle
python generate.py --outdir=out --seeds=100-105 \
    --network=path_to_pkl_file

Output from the above command is placed under out/*.png; the directory is controlled by --outdir. Downloaded network pickles are cached under $HOME/.cache/dnnlib, which can be overridden by setting the DNNLIB_CACHE_DIR environment variable. The default PyTorch extension build directory is $HOME/.cache/torch_extensions, which can be overridden by setting TORCH_EXTENSIONS_DIR.
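
The override behaviour described above, and the meaning of a seed range such as 100-105, can be illustrated with a short sketch. This is not the repository's code: `resolve_dir` and `expand_seeds` are hypothetical helpers, and treating the seed range as inclusive on both ends is an assumption carried over from the stylegan2-ada-pytorch convention.

```python
import os
import re

def resolve_dir(env_var, *default_parts):
    """Return the directory named by env_var if set, else $HOME/<default_parts>."""
    override = os.environ.get(env_var)
    if override:
        return override
    return os.path.join(os.path.expanduser("~"), *default_parts)

def expand_seeds(spec):
    """Expand '100-105' to [100, ..., 105], or '1,4,9' to [1, 4, 9]."""
    match = re.fullmatch(r"(\d+)-(\d+)", spec)
    if match:
        first, last = int(match.group(1)), int(match.group(2))
        return list(range(first, last + 1))  # assumed inclusive on both ends
    return [int(item) for item in spec.split(",")]

print(resolve_dir("DNNLIB_CACHE_DIR", ".cache", "dnnlib"))
print(resolve_dir("TORCH_EXTENSIONS_DIR", ".cache", "torch_extensions"))
print(expand_seeds("100-105"))
```

Under that assumption, --seeds=100-105 corresponds to six seeds.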

Preparing datasets

CIFAR-10: Download the Python version of CIFAR-10 and convert it to a ZIP archive:

python dataset_tool.py --source=~/downloads/cifar-10-python.tar.gz --dest=~/datasets/cifar10.zip

STL-10: Download the STL-10 dataset (5k training and 100k unlabeled images) from the STL-10 dataset page and convert it to a ZIP archive:

python dataset_tool.py --source=path_to_stl10_images --dest=~/datasets/stl10.zip \
    --width=48 --height=48

CelebA: Download the CelebA dataset (Aligned&Cropped Images) from the CelebA dataset page and convert it to a ZIP archive:

python dataset_tool.py --source=path_to_celeba_images --dest=~/datasets/celeba.zip \
    --width=64 --height=64

LSUN Church: Download the desired category (church) from the LSUN project page and convert it to a ZIP archive:

python dataset_tool.py --source=~/downloads/lsun/raw/church_lmdb --dest=~/datasets/lsunchurch.zip \
    --width=128 --height=128
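
After conversion it can be useful to confirm that an archive contains what you expect before starting a long training run. The sketch below only lists entries with the standard `zipfile` module. `summarize_archive` is a hypothetical helper, and the layout it looks for (image files plus an optional dataset.json) is an assumption based on the stylegan2-ada-pytorch dataset format, not something this README states.

```python
import io
import zipfile

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")

def summarize_archive(path_or_file):
    """Count image entries and report whether a dataset.json is present."""
    with zipfile.ZipFile(path_or_file) as archive:
        names = archive.namelist()
    images = [n for n in names if n.lower().endswith(IMAGE_EXTENSIONS)]
    return {"images": len(images), "has_dataset_json": "dataset.json" in names}

# Demo on a tiny in-memory archive (placeholder bytes, not real images).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("00000/img00000000.png", b"placeholder")
    archive.writestr("00000/img00000001.png", b"placeholder")
    archive.writestr("dataset.json", "{}")
buffer.seek(0)
print(summarize_archive(buffer))
```

For a real archive, pass an expanded path, for example `summarize_archive(os.path.expanduser("~/datasets/cifar10.zip"))` after `import os`, since `zipfile` does not expand `~` itself.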

Training new networks

In its most basic form, training new networks boils down to:

python train.py --outdir=~/training-runs --data=~/mydataset.zip --gpus=1 --batch=32 --cfg=cifar --g_dict=256,64,16 \
    --num_layers=1,2,2 --depth=32

--g_dict= sets the 'Hidden size' from the paper; it must match the image resolution.
--num_layers= sets the 'Layers' from the paper; it must match the image resolution.
--depth=32 sets the minimum required depth to 32, as described in Section 2 of the paper.
--linformer=1 applies Linformer to Styleformer.

Please refer to python train.py --help for the full list of options. To train on the STL-10 dataset with the same settings as in the paper, change the starting resolution from 8x8 to 12x12 in training/networks_Generator.py.
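
The requirement that --g_dict and --num_layers match the image resolution can be checked before launching a run. The sketch below encodes one reading of that requirement, assumed here rather than taken from the code: the generator starts at a fixed resolution (8x8 assumed by default) and doubles it once per stage, so both lists need one entry per stage. `stage_resolutions` and `check_config` are hypothetical helpers.

```python
def stage_resolutions(image_res, start_res=8):
    """List the resolutions from start_res up to image_res, doubling each stage."""
    stages = [start_res]
    while stages[-1] < image_res:
        stages.append(stages[-1] * 2)
    if stages[-1] != image_res:
        raise ValueError(
            "%d is not reachable from %d by doubling" % (image_res, start_res))
    return stages

def check_config(image_res, g_dict, num_layers, start_res=8):
    """Pair each stage with its hidden size and layer count, or raise ValueError."""
    hidden = [int(x) for x in g_dict.split(",")]
    layers = [int(x) for x in num_layers.split(",")]
    stages = stage_resolutions(image_res, start_res)
    if not (len(hidden) == len(layers) == len(stages)):
        raise ValueError(
            "expected %d entries per option, got %d hidden sizes and %d layer counts"
            % (len(stages), len(hidden), len(layers)))
    return list(zip(stages, hidden, layers))

# The CIFAR-10 example from the training command: 32x32 images, three stages.
for res, size, count in check_config(32, "256,64,16", "1,2,2"):
    print("%dx%d: hidden size %d, %d layer(s)" % (res, res, size, count))
```

Under the same assumption, 48x48 STL-10 images are not reachable from 8x8 by doubling, which is consistent with the note about changing the starting resolution to 12x12.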

Quality metrics

Quality metrics can be computed after the training:

# Pre-trained network pickle: specify dataset explicitly, print result to stdout.
python calc_metrics.py --metrics=fid50k_full --data=~/datasets/lsunchurch.zip \
    --network=path_to_pretrained_lsunchurch_pkl_file
    
python calc_metrics.py --metrics=is50k --data=~/datasets/lsunchurch.zip \
    --network=path_to_pretrained_lsunchurch_pkl_file
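
When several metrics are needed for the same pickle, the two commands above can be driven from a small script. This is a convenience sketch, not part of the repository: `metric_command` and `run_metrics` are hypothetical helpers that only assemble the flags shown above, and the paths in the usage line are placeholders.

```python
import os
import subprocess

def metric_command(metric, data, network):
    """Build the argument list for one calc_metrics.py invocation."""
    return [
        "python", "calc_metrics.py",
        "--metrics=" + metric,
        "--data=" + data,
        "--network=" + network,
    ]

def run_metrics(metrics, data, network, dry_run=True):
    """Print each command (dry run) or execute it; return the argument lists."""
    commands = [metric_command(m, data, network) for m in metrics]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # raises if the command fails
    return commands

# Dry run with placeholder paths; expanduser is needed because no shell is involved.
run_metrics(
    ["fid50k_full", "is50k"],
    data=os.path.expanduser("~/datasets/lsunchurch.zip"),
    network="path_to_pretrained_lsunchurch_pkl_file",
)
```

Set `dry_run=False` to execute the commands from the repository root.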

Citation

If you find our work useful, please cite:

@misc{park2021styleformer,
      title={Styleformer: Transformer based Generative Adversarial Networks with Style Vector}, 
      author={Jeeseung Park and Younggeun Kim},
      year={2021},
      eprint={2106.07023},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

The code is heavily based on the stylegan2-ada-pytorch implementation.

Owner

Jeeseung Park
