PyTorch implementation of CoCon: A Self-Supervised Approach for Controlled Text Generation

Overview

COCON_ICLR2021

This is our PyTorch implementation of COCON.

CoCon: A Self-Supervised Approach for Controlled Text Generation (ICLR 2021)
Alvin Chan, Yew-Soon Ong, Bill Pung, Aston Zhang, Jie Fu
https://arxiv.org/abs/2010.02684

TL;DR: We propose CoCon to control the content of text generation from LMs by conditioning on content inputs at an interleave layer.

Requirements

  • Python 3.7.6 on Linux
  • PyTorch 1.4

Dependencies

Install dependencies with:

pip install -r requirements.txt

Dataset

  1. Download COCON's training data from https://github.com/openai/gpt-2-output-dataset
  2. Place the medium-345M-k40.${split}.jsonl files inside the data/gpt2output/ folder
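Before training, it can help to confirm that the data files are where the scripts expect them. The following is a minimal sketch, not part of the repository: the function name `missing_data_files` is hypothetical, and the split names `train`, `valid`, and `test` are an assumption based on the naming used by the gpt-2-output-dataset repository.

```python
from pathlib import Path

# Assumption: ${split} takes these values, following the
# gpt-2-output-dataset naming; adjust if your files differ.
SPLITS = ("train", "valid", "test")


def missing_data_files(data_dir="data/gpt2output", splits=SPLITS):
    """Return the expected medium-345M-k40 files that are not in data_dir."""
    root = Path(data_dir)
    expected = [root / f"medium-345M-k40.{split}.jsonl" for split in splits]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing files:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All expected data files are in place.")
```

Run it from the repository root so that the relative path data/gpt2output/ resolves correctly.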

COCON Training

Train COCON with a GPT-2 language model, with the parameters reported in the paper:

sh train_cocon.sh

After training, the COCON block's weights will be saved as models/COCON/cocon_block_pytorch_model.bin.

Training Key Arguments

--do_train : whether to train COCON or not
--output_dir : directory of COCON weights
--model_name_or_path : type of language model to train COCON with
--output_hidden_for_cocon_after_block_ind : zero-based index of the transformer block whose hidden states are used as input to COCON for content conditioning. The value is 6 for the results reported in the paper, meaning that the output of GPT-2's 7th transformer block is used as the COCON block's input.
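The block index is zero-based, which is why a value of 6 selects the 7th block. The toy sketch below illustrates only that indexing convention. It is not the repository's code: `run_with_tap` and `toy_blocks` are hypothetical names, the blocks are plain functions rather than transformer layers, and the count of 24 blocks assumes the 345M (medium) GPT-2 model.

```python
def run_with_tap(blocks, hidden, tap_index):
    """Run blocks in order and return (final_output, tapped_hidden).

    tapped_hidden is the output of blocks[tap_index] (zero-based).
    """
    tapped = None
    for index, block in enumerate(blocks):
        hidden = block(hidden)
        if index == tap_index:
            tapped = hidden  # the point where a COCON-style block would read
    return hidden, tapped


# Toy stand-ins for transformer blocks: each one adds 1 to its input,
# so the hidden value counts how many blocks have run so far.
toy_blocks = [lambda h: h + 1 for _ in range(24)]

final, tapped = run_with_tap(toy_blocks, hidden=0, tap_index=6)
print(tapped)  # 7: seven blocks (indices 0 to 6) ran before the tap
print(final)   # 24
```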

Pretrained COCON weights

You can download COCON's pretrained weights here and save them in models/COCON/ to start generating with COCON.

COCON Controlled Generation

Sample script for generating COCON sentiment-controlled text:

sh generation/generate_cocon_sentiments.sh

Sample script for generating COCON topic-controlled text:

sh generation/generate_cocon_topics.sh

COCON-generated texts correspond to the cocon_output key in the output .jsonl files and Cocon AR output in the output .txt files.
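To pull the generated texts out of an output .jsonl file, something along these lines should work. This is a sketch under assumptions: it presumes each non-empty line is a JSON object, the helper name `read_cocon_outputs` is hypothetical, and the type of the value stored under cocon_output is not specified here, so the function returns it unchanged.

```python
import json


def read_cocon_outputs(jsonl_path, key="cocon_output"):
    """Return the value stored under `key` for every record that has it."""
    outputs = []
    with open(jsonl_path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            if key in record:
                outputs.append(record[key])
    return outputs
```

Pass it the file given to --cocon_output_filename, for example `read_cocon_outputs("path/to/output.jsonl")`, where the path is a placeholder.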

Generation Key Arguments

--do_cocon_compute : whether to do COCON generation
--output_dir : directory of COCON block's weights
--model_name_or_path : type of language model
--cocon_output_filename : path of saved generation samples
--cocon_compute_history_source_data_file : filename of text file containing prompt texts for generation
--cocon_compute_context_source_data_file : filename of text file containing target content for generation

Summary of Key Folders/Files

  • transformers/: code for models and optimizers
  • transformers/modeling_gpt2.py: code for COCON block and GPT-2 language model
  • BOW/: target content tokens used for COCON topic control
  • attr_markers/: target content tokens used for COCON sentiment control
  • prompts/: prompt text used for text generation

Citation

If you find our repository useful, please consider citing our paper:

@inproceedings{
chan2021cocon,
title={CoCon: A Self-Supervised Approach for Controlled Text Generation},
author={Alvin Chan and Yew-Soon Ong and Bill Pung and Aston Zhang and Jie Fu},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=VD_ozqvBy4W}
}

Acknowledgements

Code is based largely on:
