Official PyTorch implementation of Time-aware Large Kernel (TaLK) Convolutions (ICML 2020)

Overview

Time-aware Large Kernel (TaLK) Convolutions (Lioutas et al., 2020)

This repository contains the source code and pre-trained models, as well as instructions to reproduce the results of our paper Time-aware Large Kernel Convolutions (ICML 2020).

TaLK Convolutions is a sequence modeling method built on an adaptive convolution operation that learns to predict the size of a summation kernel for each timestep, instead of using a fixed-size learnable kernel matrix. It relies on a fast, parallelized implementation of the summed-area table (also known as the integral image) to efficiently compute the convolution output produced by the summation kernel. For each timestep of the input sequence we generate relative offsets, which adaptively expand the size of the summation kernel conditioned on the input. The method has O(n) time complexity, making the sequence encoding process linear in the number of tokens.
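To make the operation concrete, below is a minimal, non-CUDA reference sketch in PyTorch: a 1D summed-area table (prefix sum over time) lets each timestep sum an adaptively sized window with just two lookups. The function and argument names (talk_summation_reference, offsets_left, offsets_right, max_left, max_right) are illustrative assumptions and not the repository's actual API; the CUDA kernels in this repo implement the operation far more efficiently.

import torch

def talk_summation_reference(x, offsets_left, offsets_right, max_left, max_right):
    """x: (B, T, C); offsets_*: (B, T) in [0, 1]. Returns (B, T, C)."""
    B, T, C = x.shape
    # 1D integral image (summed-area table) over time: S[:, t] = sum of x[:, :t].
    S = torch.cat([x.new_zeros(B, 1, C), x.cumsum(dim=1)], dim=1)  # (B, T+1, C)

    t = torch.arange(T, device=x.device).unsqueeze(0)   # (1, T)
    left = (offsets_left * max_left).round().long()     # adaptive left reach per timestep
    right = (offsets_right * max_right).round().long()  # adaptive right reach per timestep
    lo = (t - left).clamp(min=0)                        # window start (inclusive)
    hi = (t + right).clamp(max=T - 1)                   # window end (inclusive)

    # Each window sum is two lookups into the integral image plus a mean-normalization,
    # so the whole sequence is encoded in O(n) regardless of the window sizes.
    sum_hi = S.gather(1, (hi + 1).unsqueeze(-1).expand(B, T, C))
    sum_lo = S.gather(1, lo.unsqueeze(-1).expand(B, T, C))
    return (sum_hi - sum_lo) / (hi - lo + 1).unsqueeze(-1).to(x.dtype)

# Example usage (shapes only; in the paper the offsets come from a small learned network):
x = torch.randn(2, 10, 8)
off_l, off_r = torch.rand(2, 10), torch.rand(2, 10)
y = talk_summation_reference(x, off_l, off_r, max_left=5, max_right=5)  # (2, 10, 8)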

Video Presentation:

Time-aware Large Kernel Convolutions (ICML 2020)

Citation:

@inproceedings{lioutas2020timeaware,
    author={Vasileios Lioutas and Yuhong Guo},
    title={Time-aware Large Kernel Convolutions},
    booktitle={Proceedings of the 37th International Conference on Machine Learning (ICML)},
    year={2020}
}

Setup

Requirements

  • PyTorch version >= 1.3.1
  • fairseq version >= 0.10.1
  • Python version >= 3.6
  • CUDA >= 10.1
  • NVIDIA's apex library (for mixed-precision training)

Clone this repository

git clone https://github.com/lioutasb/TaLKConvolutions.git
cd TaLKConvolutions

Efficient CUDA Kernels

To support the parallelization of TaLK Convolutions, we developed our own CUDA primitives. Install the kernels with the commands below. We tested compilation with CUDA 10.1; if a future CUDA release does not work, please open an issue.

cd talkconv/talkconv_module/
python setup.py install
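Most build problems come from a mismatch between the local CUDA toolkit and the CUDA version PyTorch was compiled against. Before running the setup script, a quick check using only standard PyTorch attributes can save time:

import torch
print(torch.__version__)          # expected >= 1.3.1
print(torch.version.cuda)         # CUDA version PyTorch was built against (we tested 10.1)
print(torch.cuda.is_available())  # a visible GPU is needed to run the compiled kernels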

We welcome contributions from experienced CUDA developers to make the kernels more efficient.

Translation

Pre-trained models

Dataset                | Model          | Prepared test set
IWSLT14 German-English | download (.pt) | IWSLT14 test: download (.zip)
WMT16 English-German   | download (.pt) | newstest2014: download (.zip)
WMT14 English-French   | download (.pt) | newstest2014: download (.zip)

Preprocessing the training datasets

Please follow the instructions at https://github.com/pytorch/fairseq/blob/master/examples/translation/README.md to preprocess the data.

IWSLT14 De-En

Training and evaluating TaLK Convolutions on a single GPU:

# Training
SAVE="checkpoints/talkconv_iwslt_deen"
mkdir -p $SAVE

CUDA_VISIBLE_DEVICES=0 \
fairseq-train data-bin/iwslt14.tokenized.de-en \
    --user-dir talkconv/talkconv_fairseq \
    --arch talkconv_iwslt_de_en \
    --optimizer adam  --fp16 --lr 0.0005 \
    --source-lang de --target-lang en --max-tokens 4000 \
    --min-lr '1e-09' --weight-decay 0.0001 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --lr-scheduler inverse_sqrt \
    --dropout 0.3 --attention-dropout 0.1 --weight-dropout 0.1  \
    --max-update 85000 --warmup-updates 4000 --warmup-init-lr '1e-07' \
    --adam-betas '(0.9, 0.98)' --left-pad-source "False" --max-epoch 52 --seed 1024 \
    --save-dir $SAVE 

# Checkpoint averaging
python utils/average_checkpoints.py --inputs $SAVE \
    --num-epoch-checkpoints 10 --output "${SAVE}/model.pt"

# Evaluation
fairseq-generate data-bin/iwslt14.tokenized.de-en --user-dir talkconv/talkconv_fairseq \
    --path "${SAVE}/model.pt" \
    --batch-size 128 --beam 5 --remove-bpe --lenpen 1.6 --gen-subset test --quiet 

WMT16 En-De

Training and evaluating TaLK Convolutions on WMT16 En-De with a cosine learning rate scheduler on a single machine with 8 NVIDIA GPUs:

# Training
SAVE="checkpoints/talkconv_wmt_ende_big"
mkdir -p $SAVE

python -m torch.distributed.launch --nproc_per_node 8 fairseq-train \
    data-bin/wmt16_en_de_bpe32k --fp16 --log-interval 100 --no-progress-bar --distributed-no-spawn \
    --user-dir talkconv/talkconv_fairseq \
    --max-update 30243 --share-all-embeddings --optimizer adam \
    --adam-betas '(0.9, 0.98)' --clip-norm 0.0 --weight-decay 0.0 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --min-lr 1e-09 --update-freq 16 \
    --ddp-backend=no_c10d --max-tokens 3584 \
    --lr-scheduler cosine --warmup-init-lr 1e-7 --warmup-updates 10000 \
    --lr-shrink 1 --max-lr 0.001 --lr 1e-7 --min-lr 1e-9 --warmup-init-lr 1e-07 \
    --t-mult 1 --lr-period-updates 20000 \
    --arch talkconv_wmt_en_de_big \
    --save-dir $SAVE

# Checkpoint averaging
python utils/average_checkpoints.py --inputs $SAVE \
    --num-epoch-checkpoints 10 --output "${SAVE}/model.pt"

# Evaluation on newstest2014
CUDA_VISIBLE_DEVICES=0 \
fairseq-generate data-bin/wmt16_en_de_bpe32k --user-dir talkconv/talkconv_fairseq \
  --path "${SAVE}/model.pt" \
  --batch-size 128 --beam 4 --remove-bpe --lenpen 0.35 --gen-subset test > wmt14_gen_ende.txt 

bash utils/compound_split_bleu.sh wmt14_gen_ende.txt 

WMT14 En-Fr

Training and evaluating TaLK Convolutions on WMT14 En-Fr with a cosine learning rate scheduler on a single machine with 8 NVIDIA GPUs:

# Training
SAVE="checkpoints/talkconv_wmt_enfr_big"
mkdir -p $SAVE
python -m torch.distributed.launch --nproc_per_node 8 fairseq-train \
    data-bin/wmt14_en_fr --fp16 --log-interval 100 --no-progress-bar --distributed-no-spawn \
    --user-dir talkconv/talkconv_fairseq \
    --max-update 80000 --share-all-embeddings --optimizer adam \
    --adam-betas '(0.9, 0.98)' --clip-norm 0.0 --weight-decay 0.0 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --min-lr 1e-09 --update-freq 32 \
    --ddp-backend=no_c10d --max-tokens 1800 \
    --lr-scheduler cosine --warmup-init-lr 1e-7 --warmup-updates 10000 \
    --lr-shrink 1 --max-lr 0.001 --lr 1e-7 --min-lr 1e-9 --warmup-init-lr 1e-07 \
    --t-mult 1 --lr-period-updates 70000 \
    --arch talkconv_wmt_en_fr_big \
    --save-dir $SAVE

# Checkpoint averaging
python utils/average_checkpoints.py --inputs $SAVE \
    --num-epoch-checkpoints 10 --output "${SAVE}/model.pt"

# Evaluation
CUDA_VISIBLE_DEVICES=0 \
fairseq-generate data-bin/wmt14_en_fr --user-dir talkconv/talkconv_fairseq \
    --path "${SAVE}/model.pt" \
    --batch-size 128 --beam 6 --remove-bpe --lenpen 0.65 --gen-subset test --quiet 

License

This project is MIT-licensed. The license applies to the pre-trained models as well.

Owner
Vasileios Lioutas
PhD student at the University of British Columbia | M.Sc. in CS at Carleton University and ex-Machine Learning Researcher at Huawei Noah's Ark Lab