The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)

Last update: Nov 21, 2022

Overview

Language Models are Few-shot Multilingual Learners

Paper

This is the source code of the paper [arXiv] [ACL Anthology].

This code is written in PyTorch. If you use the source code or datasets included in this toolkit in your work, please cite the following paper:

@inproceedings{winata-etal-2021-language,
    title = "Language Models are Few-shot Multilingual Learners",
    author = "Winata, Genta Indra  and
      Madotto, Andrea  and
      Lin, Zhaojiang  and
      Liu, Rosanne  and
      Yosinski, Jason  and
      Fung, Pascale",
    booktitle = "Proceedings of the 1st Workshop on Multilingual Representation Learning",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.mrl-1.1",
    pages = "1--15",
}

Setup Environment

GPU Machine

pip install -r requirements.txt

GPU Machine for Running GPT-J 6B Model

apt install zstd

# the "slim" version contains only bf16 weights and no optimizer parameters, which minimizes bandwidth and memory
wget -c https://the-eye.eu/public/AI/GPT-J-6B/step_383500_slim.tar.zstd

tar -I zstd -xf step_383500_slim.tar.zstd

pip install -r mesh_transformer_jax/requirements.txt

# jax 0.2.12 is required due to a regression with xmap in 0.2.13
pip install mesh-transformer-jax/ jax==0.2.12

# replace the "cuda101" suffix below with the tag for your CUDA version (cuda101 = CUDA 10.1)
pip install jaxlib==0.1.67+cuda101 -f https://storage.googleapis.com/jax-releases/jax_releases.html
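
The CUDA suffix in the command above follows a simple pattern: the major and minor version digits are joined without the dot. The helper below is a minimal sketch of that mapping. The function name `jaxlib_requirement` is hypothetical and not part of this repository, and the sketch does not check whether a wheel actually exists for a given tag, so verify the result against the jax releases page before installing.

```python
def jaxlib_requirement(cuda_version: str, jaxlib_version: str = "0.1.67") -> str:
    """Build a pip requirement such as 'jaxlib==0.1.67+cuda101' from '10.1'.

    Hypothetical helper: it only formats the string and does not verify
    that a wheel with this tag has been published.
    """
    parts = cuda_version.strip().split(".")
    if len(parts) < 2 or not all(p.isdigit() for p in parts[:2]):
        raise ValueError(f"expected a CUDA version like '10.1', got {cuda_version!r}")
    major, minor = parts[0], parts[1]
    return f"jaxlib=={jaxlib_version}+cuda{major}{minor}"


# CUDA 10.1 reproduces the requirement used in the install command above.
print(jaxlib_requirement("10.1"))
```

The printed string can be passed to `pip install` together with the `-f` index URL shown above.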

How to run

Zero-shot Cross-task

❱❱❱ CUDA_VISIBLE_DEVICES=0 python evaluate.py  --dataset snips --model_checkpoint facebook/bart-large-mnli --cuda --length 5 --label_type value --src_lang en --tgt_lang en --seed 42 --use_log_prob --use_confidence --is_cross_task
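
For readers unfamiliar with the idea, the sketch below shows one common way an NLI checkpoint such as facebook/bart-large-mnli can be used for zero-shot classification: each candidate label is turned into a hypothesis, and the label whose hypothesis is most strongly entailed by the input wins. This is an illustration under stated assumptions, not the logic of evaluate.py. The template wording, the function names, and the word-overlap scorer (a stand-in for a real entailment model so the example runs without downloading weights) are all hypothetical.

```python
from typing import Callable, Dict, Sequence

# Hypothetical template; the wording used by evaluate.py is not documented here.
TEMPLATE = "This example is {}."


def zero_shot_classify(
    text: str,
    labels: Sequence[str],
    entail_score: Callable[[str, str], float],
    template: str = TEMPLATE,
) -> Dict[str, float]:
    """Score each label by how strongly `text` entails the label's hypothesis."""
    if not labels:
        raise ValueError("labels must not be empty")
    return {label: entail_score(text, template.format(label)) for label in labels}


def predict(scores: Dict[str, float]) -> str:
    """Return the label with the highest entailment score."""
    return max(scores, key=scores.get)


def overlap_score(premise: str, hypothesis: str) -> float:
    """Toy stand-in for an NLI model: counts words shared by both strings."""
    premise_words = set(premise.lower().split())
    hypothesis_words = set(hypothesis.lower().rstrip(".").split())
    return float(len(premise_words & hypothesis_words))


labels = ["play music", "get weather", "book restaurant"]
scores = zero_shot_classify("play some jazz music", labels, overlap_score)
print(predict(scores))
```

In practice `overlap_score` would be replaced by a function that returns the entailment probability (or log-probability) from the NLI model for the premise and hypothesis pair.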

Finetune

❱❱❱ CUDA_VISIBLE_DEVICES=0 python finetune.py  --dataset snips --model_checkpoint bert-base-multilingual-uncased --cuda --label_type value --src_lang en --tgt_lang en --seed 42
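
To repeat the fine-tuning run with several seeds, the command above can be generated programmatically. The sketch below only assembles the argument list from the flags shown in this README. The helper name `finetune_command` and the extra seeds 43 and 44 are arbitrary choices for illustration, not settings taken from the paper.

```python
import shlex
from typing import List


def finetune_command(
    seed: int,
    dataset: str = "snips",
    model_checkpoint: str = "bert-base-multilingual-uncased",
    src_lang: str = "en",
    tgt_lang: str = "en",
) -> List[str]:
    """Assemble the finetune.py invocation using only the flags shown above."""
    return [
        "python", "finetune.py",
        "--dataset", dataset,
        "--model_checkpoint", model_checkpoint,
        "--cuda",
        "--label_type", "value",
        "--src_lang", src_lang,
        "--tgt_lang", tgt_lang,
        "--seed", str(seed),
    ]


# Print one command per seed; the seeds other than 42 are illustrative.
for seed in (42, 43, 44):
    print(shlex.join(finetune_command(seed)))
```

Each list can also be passed to `subprocess.run(cmd, check=True)` from the repository root, with `CUDA_VISIBLE_DEVICES` set in the environment to select the GPU.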
