Learning to Draw: Emergent Communication through Sketching

This is the official code for the paper "Learning to Draw: Emergent Communication through Sketching".

About

We demonstrate that a communication channel based on line drawing can emerge between agents playing a visual referential communication game. Furthermore, we show that with a simple additional self-supervised loss, the drawings the agent produces are interpretable by humans.

Getting started

You'll need to install the required dependencies listed in requirements.txt. This includes installing the differentiable rasteriser from the DifferentiableSketching repository, and the source version of https://github.com/pytorchbearer/torchbearer:

pip install git+https://github.com/jonhare/DifferentiableSketching.git
pip install git+https://github.com/pytorchbearer/torchbearer.git
pip install -r requirements.txt

Once the dependencies are installed, you can run the commgame.py script to train and test models:

python commgame.py train [args]
python commgame.py test [args]

For example, to train a pair of agents on the original game using the STL10 dataset (which will be downloaded if required), you would run:

python commgame.py train --dataset STL10 --output stl10-original-model --sigma2 5e-4 --nlines 20 --learning-rate 0.0001 --imagenet-weights --freeze-vgg --imagenet-norm --epochs 250 --invert --batch-size 100

The options --sigma2 and --nlines control the thickness and number of lines respectively. --imagenet-weights loads the standard pretrained ImageNet VGG16 weights (use --sin-weights for Stylized ImageNet weights). --freeze-vgg freezes the backbone CNN, and --imagenet-norm applies ImageNet normalisation to the images (use it whenever you use either ImageNet or Stylized ImageNet weights). Finally, --invert draws black strokes on a white canvas.
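As a rough illustration of what ImageNet normalisation means, the sketch below standardises one RGB pixel using the per-channel mean and standard deviation conventionally used with ImageNet-pretrained models. The function name imagenet_normalise is hypothetical and is not taken from this repository, which may implement the step differently.

```python
# Per-channel statistics conventionally used with ImageNet-pretrained models.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def imagenet_normalise(pixel):
    """Standardise one RGB pixel whose channel values lie in [0, 1].

    Hypothetical helper for illustration only.
    """
    return tuple((v - m) / s for v, m, s in zip(pixel, IMAGENET_MEAN, IMAGENET_STD))


if __name__ == "__main__":
    # A mid-grey pixel ends up slightly above zero in every channel.
    print(imagenet_normalise((0.5, 0.5, 0.5)))
```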

The training scripts compute a running communication rate in addition to the loss, and display it as training progresses. After each epoch a validation pass is performed, and images of the sketches, sender inputs and receiver targets are saved to the output directory along with a model snapshot. The output directory also contains a log file with the per-epoch training and validation statistics.
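For intuition, the communication rate can be read as the fraction of games in which the receiver picks the correct image. The sketch below is a minimal illustration under that assumption; the names count_correct, communication_rate and RunningRate are hypothetical and do not come from this repository.

```python
def count_correct(scores, targets):
    """Count games in which the highest-scoring candidate is the target.

    scores[i][j] is the receiver's score for candidate j in game i;
    targets[i] is the index of the correct candidate in game i.
    """
    correct = 0
    for row, target in zip(scores, targets):
        guess = max(range(len(row)), key=row.__getitem__)
        correct += int(guess == target)
    return correct


def communication_rate(scores, targets):
    """Fraction of games won by the receiver."""
    return count_correct(scores, targets) / len(targets)


class RunningRate:
    """Accumulates the rate across batches (hypothetical helper)."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, scores, targets):
        self.correct += count_correct(scores, targets)
        self.total += len(targets)
        return self.correct / self.total
```

Calling update once per batch gives a running value of the kind a progress display could show during an epoch.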

Example commands to run the experiments in the paper are given in commands.md.

Further details on the command-line arguments are given below.

Game setups

All the setups involve a referential game where the receiver tries to select the "correct" image from a pool on the basis of a "sketch" provided by the sender. The primary measure of success is the communication rate. The command-line arguments that control each game variant are listed in the following subsections:

Havrylov and Titov's Original Game Setup

Sender sees one image; receiver sees many, one of which is exactly the same as the sender's.

The number of receiver images (target + distractors) is controlled by the batch size. The number of sender images per iteration can also be controlled for completeness, but defaults to the batch size (i.e. each forward pass with a batch plays all possible game combinations, using each of the images as a target).

arguments:
--batch-size
[--sender-images-per-iter]
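The "all possible game combinations" idea can be sketched as a score matrix in which row i is the game whose target is image i. This is an illustration only: scoring by dot product, the function name play_original_game, and using the first num_sender_images images as the sender inputs are all assumptions rather than details taken from this repository.

```python
def play_original_game(sketch_features, image_features, num_sender_images=None):
    """Score every sketch against every image in the batch.

    sketch_features[i] is the feature vector of the sketch of image i and
    image_features[j] is the feature vector of image j (plain lists of floats).
    Returns (scores, targets), where the target of game i is index i.
    """
    if num_sender_images is None:
        # Default: every image in the batch is used as a target once.
        num_sender_images = len(image_features)
    scores = []
    for sketch in sketch_features[:num_sender_images]:
        # Assumed scoring rule: dot product between sketch and image features.
        scores.append([sum(s * v for s, v in zip(sketch, image))
                       for image in image_features])
    targets = list(range(len(scores)))
    return scores, targets
```

With a batch of B images this plays B games at once, each with one target and B - 1 distractors drawn from the same batch.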

Object-oriented Game Setup (same)

Sender sees one image; receiver sees many, one of which is exactly the same as the sender's, while the others are all of different classes.

arguments:
--object-oriented same
[--num-targets]
[--sender-images-per-iter]
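One way to picture this variant is as a sampling procedure over class labels. The sketch below is a hypothetical illustration (build_pool_same and pool_size are made-up names) and assumes each distractor comes from a distinct class other than the target's; the repository's own sampling may differ.

```python
import random
from collections import defaultdict


def build_pool_same(labels, target_index, pool_size, rng):
    """Return (pool, position): dataset indices shown to the receiver and
    the position of the target within that pool."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    # Distractor classes exclude the target's class and do not repeat.
    other_classes = [c for c in by_class if c != labels[target_index]]
    distractor_classes = rng.sample(other_classes, pool_size - 1)
    # The receiver sees the sender's exact image plus one image per distractor class.
    pool = [target_index] + [rng.choice(by_class[c]) for c in distractor_classes]
    rng.shuffle(pool)
    return pool, pool.index(target_index)


if __name__ == "__main__":
    labels = [0, 0, 1, 1, 2, 2, 3, 3]
    print(build_pool_same(labels, 0, 3, random.Random(0)))
```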

Object-oriented Game Setup (different)

Sender sees one image; receiver sees many, each of a different class; one of the images is the same class as the sender's, but is a completely different image.

arguments:
--object-oriented different 
[--num-targets]
[--sender-images-per-iter]
[--random-transform-sender]
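The difference from the "same" variant is that the matching image in the receiver's pool shares only its class with the sender's image. A minimal sketch under that reading follows; build_pool_different and pool_size are hypothetical names, and the sampling details are assumptions rather than the repository's implementation.

```python
import random
from collections import defaultdict


def build_pool_different(labels, sender_index, pool_size, rng):
    """Return (pool, position): the receiver's pool never contains the
    sender's own image, only another image of the same class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    sender_class = labels[sender_index]
    # The match is a different image drawn from the sender's class.
    candidates = [i for i in by_class[sender_class] if i != sender_index]
    match = rng.choice(candidates)
    # Distractors come from distinct classes other than the sender's.
    other_classes = [c for c in by_class if c != sender_class]
    distractor_classes = rng.sample(other_classes, pool_size - 1)
    pool = [match] + [rng.choice(by_class[c]) for c in distractor_classes]
    rng.shuffle(pool)
    return pool, pool.index(match)


if __name__ == "__main__":
    labels = [0, 0, 1, 1, 2, 2, 3, 3]
    print(build_pool_different(labels, 0, 3, random.Random(1)))
```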

Model setup

Sender

The "sender" consists of a backbone VGG16 CNN, which translates the input image into a latent vector, and a "decoder" with an MLP that projects the latent representation from the backbone to a set of drawing commands. These commands are differentiably rendered into an image, which is sent to the "receiver".

The backbone can optionally be initialised with pretrained weights and also optionally frozen (except for the final linear projection). The backbone, including the linear projection, can be shared between sender and receiver (default) or kept separate (--separate_encoders).

arguments:
[--freeze-vgg]
[--imagenet-weights --imagenet-norm] 
[--sin-weights --imagenet-norm] 
[--separate_encoders]

Receiver

The "receiver" consists of a backbone CNN which is used to convert visual inputs (both the images in the pool and the sketch) into a latent vector which is then transformed into a different latent representation by an MLP. These projected latent vectors are used for prediction and in the loss as described below.

The actual backbone CNN model architecture will be the same as the sender's. The backbone can optionally share parameters with the "sender" agent. Alternatively it can be initialised with pre-trained weights, and also optionally frozen.

arguments:
[--freeze-vgg]
[--imagenet-weights --imagenet-norm]
[--separate_encoders]

Datasets

  • MNIST
  • CIFAR-10 / CIFAR-100
  • TinyImageNet
  • CelebA (--image-size to control size; default 64px)
  • STL-10
  • Caltech101 (training data is balanced by supersampling with augmentation)

Datasets will be downloaded to the dataset root directory (default ./data) as required.

arguments: 
--dataset {CIFAR10,CelebA,MNIST,STL10,TinyImageNet,Caltech101}  
[--dataset-root]
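Balancing by supersampling, as mentioned for Caltech101, can be pictured as repeating indices from smaller classes until every class matches the largest one. The sketch below shows only that index-level idea; the name supersample_indices is hypothetical, and the augmentation applied to repeated images is not shown.

```python
import random
from collections import defaultdict


def supersample_indices(labels, rng):
    """Return dataset indices in which every class appears as often as the
    largest class, by re-drawing indices from the smaller classes."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    largest = max(len(indices) for indices in by_class.values())
    balanced = []
    for indices in by_class.values():
        balanced.extend(indices)  # keep every original example once
        # Draw extra examples with replacement to reach the largest class size.
        balanced.extend(rng.choices(indices, k=largest - len(indices)))
    return balanced


if __name__ == "__main__":
    labels = ["a"] * 5 + ["b"] * 2 + ["c"]
    print(supersample_indices(labels, random.Random(0)))
```

Because the repeated indices point at the same source images, random augmentation at load time is what keeps the extra samples from being exact duplicates.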

Citation

If you find this repository useful for your research, please cite our paper using the following.

  @inproceedings{
  mihai2021learning,
  title={Learning to Draw: Emergent Communication through Sketching},
  author={Daniela Mihai and Jonathon Hare},
  booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
  year={2021},
  url={https://openreview.net/forum?id=YIyYkoJX2eA}
  }