A PyTorch implementation of the paper "Semantic Image Synthesis via Adversarial Learning" in ICCV 2017

Last update: Nov 25, 2022

Overview

Semantic Image Synthesis via Adversarial Learning

This is a PyTorch implementation of the paper Semantic Image Synthesis via Adversarial Learning.

Requirements

PyTorch 0.2
Torchvision
Pillow
fastText.py (Note: if you have problems loading a pretrained model, try my fixed code)
NLTK

Pretrained word vectors for fastText

Download pretrained English word vectors. The list of available pretrained vectors is on this page.
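To check a downloaded file before training, the sketch below parses the plain-text `.vec` format that fastText publishes alongside its binary models: a header line with the vocabulary size and vector dimension, then one word per line followed by space-separated floats. It assumes you have the `.vec` variant. This repository loads vectors through fastText.py, which the sketch does not use. The function name `load_vec` and the sample data are illustrative only.

```python
import io


def load_vec(stream, limit=None):
    """Parse fastText text-format vectors from an open text stream.

    Returns (dim, vectors), where vectors maps each word to a list of floats.
    `limit` optionally caps the number of words read.
    """
    header = stream.readline().split()
    dim = int(header[1])  # header is "<vocab size> <dimension>"
    vectors = {}
    for line in stream:
        if limit is not None and len(vectors) >= limit:
            break
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if len(values) != dim:
            continue  # skip lines that do not match the declared dimension
        vectors[word] = [float(v) for v in values]
    return dim, vectors


# Tiny in-memory stand-in for a real .vec file (made-up values).
sample = io.StringIO("2 3\nflower 0.1 0.2 0.3\nbird -0.4 0.5 0.6\n")
dim, vectors = load_vec(sample)
print(dim, sorted(vectors))
```

For a real file, pass `open(path, encoding="utf-8")` as the stream and set `limit` to keep memory use down while inspecting it.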

Datasets

Oxford-102 flowers: images and captions
Caltech-200 birds: images and captions

The caption data is from this repository. After downloading, modify the CONFIG file so that every dataset path points to the data you downloaded.
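A wrong path in CONFIG tends to surface only once training starts, so a quick existence check can save a failed run. The sketch below is a generic helper and does not read CONFIG itself, since the file's format is not described here. The labels and directory names in `dataset_paths` are hypothetical placeholders to replace with the values you put in CONFIG.

```python
from pathlib import Path


def missing_paths(paths):
    """Return the labels whose path does not exist on disk."""
    return [label for label, p in paths.items() if not Path(p).exists()]


# Hypothetical labels and locations: substitute the paths you set in CONFIG.
dataset_paths = {
    "flowers_images": "data/flowers/images",
    "flowers_captions": "data/flowers/captions",
    "birds_images": "data/birds/images",
    "birds_captions": "data/birds/captions",
}

for label in missing_paths(dataset_paths):
    print(f"missing: {label} -> {dataset_paths[label]}")
```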

Run

scripts/train_text_embedding_[birds/flowers].sh
Train a visual-semantic embedding model using the method of Kiros et al.
scripts/train_[birds/flowers].sh
Train a GAN using a pretrained text embedding model.
scripts/test_[birds/flowers].sh
Generate some examples using original images and semantically relevant texts.
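The three scripts depend on each other in the order listed: the GAN needs the trained text embedding, and testing needs the trained GAN. The sketch below builds the script names for one dataset from the naming pattern above and runs them in sequence. The helper names `pipeline_scripts` and `run_pipeline` are illustrative and not part of the repository. It assumes the scripts are run with `bash` from the repository root.

```python
import subprocess
from pathlib import Path

DATASETS = ("birds", "flowers")
STAGES = ("train_text_embedding", "train", "test")


def pipeline_scripts(dataset, scripts_dir="scripts"):
    """Return the three stage scripts for `dataset`, in execution order."""
    if dataset not in DATASETS:
        raise ValueError(f"dataset must be one of {DATASETS}, got {dataset!r}")
    return [Path(scripts_dir) / f"{stage}_{dataset}.sh" for stage in STAGES]


def run_pipeline(dataset, dry_run=True):
    """Run each stage in order, stopping at the first failure."""
    for script in pipeline_scripts(dataset):
        if dry_run:
            print("would run:", script)
        else:
            # check=True raises CalledProcessError if a stage exits non-zero.
            subprocess.run(["bash", str(script)], check=True)


run_pipeline("birds")  # dry run: only prints the script paths
```

Call `run_pipeline("flowers", dry_run=False)` to execute the stages for real.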

Results

Acknowledgements

We would like to thank Hao Dong, one of the first authors of the paper Semantic Image Synthesis via Adversarial Learning, for providing helpful advice on the implementation.

Owner

Seonghyeon Nam
