StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation

Demo video: CVPR 2021 Oral:

Single-channel manipulation: Localized or attribute-specific manipulation:

Zongze Wu, Dani Lischinski, Eli Shechtman
paper (CVPR 2021 Oral) video

Abstract: We explore and analyze the latent style space of StyleGAN2, a state-of-the-art architecture for image generation, using models pretrained on several different datasets. We first show that StyleSpace, the space of channel-wise style parameters, is significantly more disentangled than the other intermediate latent spaces explored by previous works. Next, we describe a method for discovering a large collection of style channels, each of which is shown to control a distinct visual attribute in a highly localized and disentangled manner. Third, we propose a simple method for identifying style channels that control a specific attribute, using a pretrained classifier or a small number of example images. Manipulation of visual attributes via these StyleSpace controls is shown to be better disentangled than via those proposed in previous works. To show this, we make use of a newly proposed Attribute Dependency metric. Finally, we demonstrate the applicability of StyleSpace controls to the manipulation of real images. Our findings pave the way to semantically meaningful and well-disentangled image manipulations via simple and intuitive interfaces.
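The idea of finding attribute-specific channels from a few example images, then editing a single channel, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the scoring rule (mean over standard deviation of normalized deviations), and the synthetic data standing in for StyleGAN2 style vectors are all assumptions made for illustration.

```python
import numpy as np


def rank_channels_by_examples(exemplar_styles, population_styles, eps=1e-8):
    """Rank style channels by how consistently the exemplars deviate
    from the population (hypothetical scoring rule, for illustration)."""
    pop_mean = population_styles.mean(axis=0)
    pop_std = population_styles.std(axis=0) + eps
    # Deviation of each exemplar from the population mean, in units of
    # the population standard deviation of each channel.
    delta = (exemplar_styles - pop_mean) / pop_std
    # A channel scores high when the exemplars shift it far and agree
    # on the direction and size of the shift.
    relevance = np.abs(delta.mean(axis=0)) / (delta.std(axis=0) + eps)
    return np.argsort(-relevance), relevance


def manipulate_channel(style, channel, alpha, pop_std):
    """Return a copy of `style` with one channel shifted by `alpha`
    population standard deviations; all other channels are unchanged."""
    edited = style.copy()
    edited[channel] += alpha * pop_std[channel]
    return edited


# Synthetic stand-in for style vectors (assumption: 64 channels).
rng = np.random.default_rng(0)
num_channels = 64
population = rng.normal(size=(5000, num_channels))

# Exemplars that share an attribute; here channel 7 is shifted by hand
# to play the role of the channel controlling that attribute.
exemplars = rng.normal(size=(20, num_channels))
exemplars[:, 7] += 5.0

order, relevance = rank_channels_by_examples(exemplars, population)
top_channel = int(order[0])

edited = manipulate_channel(
    population[0], top_channel, alpha=3.0, pop_std=population.std(axis=0)
)
print("most relevant channel:", top_channel)
```

With a real model, `population` would hold style vectors of many generated images and `exemplars` those of images showing the target attribute; the edited style vector would then be passed back through the generator's synthesis network.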

generated face manipulation

generated car and bedroom manipulation

real face manipulation
