Code for Findings at EMNLP 2021 paper: "Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning"

Related tags

Text Data & NLPCLIF
Overview

Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning

This repo is for Findings at EMNLP 2021 paper: Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning. Code clean-up is still in progress.

Data

Please extract the downloaded data and place it under PROJECT_DIR/datasets. Our training data stream and few-shot datasets are curated from https://github.com/iesl/leopard and https://github.com/INK-USC/CrossFit.

The directory structure is

PROJECT_DIR/datasets/crossfit_data/data/ + 55 classification tasks from the link above, e.g. PROJECT_DIR/datasets/crossfit_data/data/anli
PROJECT_DIR/datasets/leopard/ + 17 tasks from the link above, e.g. PROJECT_DIR/datasets/leopard/airline

Environment

Our code uses PyTorch 1.7.1. To allow fp16 training, you should also install apex.

Running Experiments

Training on CLIF-26

reg=0.01
lr=1e-4
seed=0
python run_model.py --tasks cola sst2 mrpc stsb qqp mnli qnli rte wnli \
--output_dir runs/glue_cfew_10k_choice_hnet_hardlong_sample_reg${reg}_s64_d256_limit/${lr}/${seed} \
--do_train --eval_period 100000 --eval_at_epoch_end  --wait_step 3 --num_train_epochs 100 --seed ${seed} \
--train_batch_size 64 --gradient_accumulation_steps 2 --learning_rate ${lr} --max_output_length 8 \
--generator_hdim 32 --example_limit 100 --train_limit 10000 --cl_method hnet --h_l2reg ${reg} \
--adapter_dim 256 --adapter_dim_final 64  --hard_long_term  --limit_label_vocab_space \
--sample_batch --scale_loss --stm_size 64

Few-shot evaluation on CLIF-26

python run_model.py --task_collection leopard --k_shot 16 --max_input_length 100  \
--output_dir /runs/glue_cfew_10k_choice_hnet_hardlong_sample_reg${reg}_s64_d256_limit/${lr}/${seed} \
--do_few_shot_predict --eval_period 100000 --eval_at_epoch_end  --wait_step 3 --num_train_epochs 100 \
--seed ${seed} --train_batch_size 64 --predict_batch_size 16 --few_shot_train_batch_size 16 \
--few_shot_wait_step 100000 --few_shot_num_train_epochs 800 --wait_step 3 --gradient_accumulation_steps 4 \
--scale_by_accumulation --learning_rate ${lr} --max_output_length 8  --generator_hdim 32 \
--example_limit 100 --train_limit 10000 --cl_method naive --h_l2reg ${reg} --adapter_dim 256 \
--adapter_dim_final 64 --hard_long_term --limit_label_vocab_space --no_short_term --long_term_task_emb_num 9 \
--postfix "naive_16shot"  --sample_batch --stm_size 64 --few_shot_eval_period 200

Training and evaluation on CLIF-55

reg=0.01
lr=1e-4
seed=0
python run_model.py  --task_collection crossfit_cls_train --crossfit_k_shot 16 --ssd --output_dir runs/crossfit_hnet_merge_space_${reg}/${lr}/${seed} --skip_intermediate_ckpt --add_space --merge_split --split_id ${seed} --seed ${seed} --do_train --eval_every_k_tasks 5 --eval_period 100 --skip_intermediate_ckpt --train_batch_size 64 --wait_step 3 --num_train_epochs 10000000  --learning_rate ${lr} --max_output_length 64 --example_limit 100 --train_limit 10000 --cl_method hnet --h_l2reg ${reg} --adapter_dim 256 --generator_hdim 32 --adapter_dim_final 64 --sample_batch --hard_long_term --stm_size 64
python run_model.py --task_collection crossfit_cls_train --crossfit_k_shot 16 --ssd --output_dir runs/crossfit_hnet_merge_space${reg}/${lr}/${seed} --skip_intermediate_ckpt --add_space --merge_split --split_id ${seed} --seed ${seed} --do_predict --eval_every_k_tasks 5 --eval_period 100 --skip_intermediate_ckpt --train_batch_size 64 --wait_step 3 --num_train_epochs 10000000  --learning_rate ${lr} --max_output_length 64 --example_limit 100 --train_limit 10000 --cl_method hnet --h_l2reg ${reg} --adapter_dim 256 --generator_hdim 32 --adapter_dim_final 64 --sample_batch --hard_long_term --stm_size 64
for split_id in 0 1 2 3 4
do
  python run_model.py --task_collection crossfit_cls_test --crossfit_k_shot 16 --ssd --postfix "split${split_id}"  --long_term_task_emb_num 45 --do_few_shot_predict --few_shot_eval_period 200 --few_shot_num_train_epochs 800 --few_shot_train_batch_size 64 --few_shot_wait_step 100 --mtl_task_num 45 --output_dir runs/crossfit_hnet_merge_space_${reg}/${lr}/${seed} --add_space  --limit_label_vocab_space --split_id ${split_id} --seed ${seed} --eval_period 100 --train_batch_size 64 --gradient_accumulation_steps 1 --wait_step 6 --num_train_epochs 10000  --learning_rate ${lr} --max_output_length 64 --example_limit 100 --train_limit 10000 --cl_method naive --adapter_dim 256 --generator_hdim 32 --adapter_dim_final 64 --sample_batch --hard_long_term
done

Here are mapping between command line arguments and implemented methods.

  • BART-Single without adapter: --cl_method naive --no_param_gen --skip_adapter --train_all
  • BART-Single-MTL: --cl_method naive --no_param_gen --skip_mtl --mtl --train_all
  • BiHNET-Vanilla: --cl_method naive --hard_long_term
  • BiHNET with trained task embeddings: --cl_method hnet --no_short_term --train_task_embs --hard_long_term
  • BART-Adapter-Single: --cl_method naive --no_param_gen --lr 3e-4
Owner
INK Lab @ USC
Intelligence and Knowledge Discovery (INK) Research Lab at University of Southern California
INK Lab @ USC
[ICCV 2021] Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification

Counterfactual Attention Learning Created by Yongming Rao*, Guangyi Chen*, Jiwen Lu, Jie Zhou This repository contains PyTorch implementation for ICCV

Yongming Rao 89 Dec 18, 2022
Various Algorithms for Short Text Mining

Short Text Mining in Python Introduction This package shorttext is a Python package that facilitates supervised and unsupervised learning for short te

Kwan-Yuet 466 Dec 06, 2022
Input english text, then translate it between languages n times using the Deep Translator Python Library.

mass-translator About Input english text, then translate it between languages n times using the Deep Translator Python Library. How to Use Install dep

2 Mar 04, 2022
hashily is a Python module that provides a variety of text decoding and encoding operations.

hashily is a python module that performs a variety of text decoding and encoding functions. It also various functions for encrypting and decrypting text using various ciphers.

DevMysT 5 Jul 17, 2022
An extension for asreview implements a version of the tf-idf feature extractor that saves the matrix and the vocabulary.

Extension - matrix and vocabulary extractor for TF-IDF and Doc2Vec An extension for ASReview that adds a tf-idf extractor that saves the matrix and th

ASReview 4 Jun 17, 2022
A Domain Specific Language (DSL) for building language patterns. These can be later compiled into spaCy patterns, pure regex, or any other format

RITA DSL This is a language, loosely based on language Apache UIMA RUTA, focused on writing manual language rules, which compiles into either spaCy co

Šarūnas Navickas 60 Sep 26, 2022
PhoNLP: A BERT-based multi-task learning toolkit for part-of-speech tagging, named entity recognition and dependency parsing

PhoNLP is a multi-task learning model for joint part-of-speech (POS) tagging, named entity recognition (NER) and dependency parsing. Experiments on Vietnamese benchmark datasets show that PhoNLP prod

VinAI Research 109 Dec 02, 2022
HuggingTweets - Train a model to generate tweets

HuggingTweets - Train a model to generate tweets Create in 5 minutes a tweet generator based on your favorite Tweeter Make my own model with the demo

Boris Dayma 318 Jan 04, 2023
Treemap visualisation of Maya scene files

Ever wondered which nodes are responsible for that 600 mb+ Maya scene file? Features Fast, resizable UI Parsing at 50 mb/sec Dependency-free, single-f

Marcus Ottosson 76 Nov 12, 2022
edge-SR: Super-Resolution For The Masses

edge-SR: Super Resolution For The Masses Citation Pablo Navarrete Michelini, Yunhua Lu and Xingqun Jiang. "edge-SR: Super-Resolution For The Masses",

Pablo 40 Nov 10, 2022
Parrot is a paraphrase based utterance augmentation framework purpose built to accelerate training NLU models

Parrot is a paraphrase based utterance augmentation framework purpose built to accelerate training NLU models. A paraphrase framework is more than just a paraphrasing model.

Prithivida 681 Jan 01, 2023
Black for Python docstrings and reStructuredText (rst).

Style-Doc Style-Doc is Black for Python docstrings and reStructuredText (rst). It can be used to format docstrings (Google docstring format) in Python

Telekom Open Source Software 13 Oct 24, 2022
Sploitus - Command line search tool for sploitus.com. Think searchsploit, but with more POCs

Sploitus Command line search tool for sploitus.com. Think searchsploit, but with

watchdog2000 5 Mar 07, 2022
Question answering app is used to answer for a user given question from user given text.

Question answering app is used to answer for a user given question from user given text.It is created using HuggingFace's transformer pipeline and streamlit python packages.

Siva Prakash 3 Apr 05, 2022
A design of MIDI language for music generation task, specifically for Natural Language Processing (NLP) models.

MIDI Language Introduction Reference Paper: Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions: code This

Robert Bogan Kang 3 May 25, 2022
Pytorch implementation of winner from VQA Chllange Workshop in CVPR'17

2017 VQA Challenge Winner (CVPR'17 Workshop) pytorch implementation of Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challeng

Mark Dong 166 Dec 11, 2022
Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written form.

Neural G2P to portuguese language Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written for

fluz 11 Nov 16, 2022
A fast and lightweight python-based CTC beam search decoder for speech recognition.

pyctcdecode A fast and feature-rich CTC beam search decoder for speech recognition written in Python, providing n-gram (kenlm) language model support

Kensho 315 Dec 21, 2022
Python package for Turkish Language.

PyTurkce Python package for Turkish Language. Documentation: https://pyturkce.readthedocs.io. Installation pip install pyturkce Usage from pyturkce im

Mert Cobanov 14 Oct 09, 2022
An ActivityWatch watcher to pose questions to the user and record her answers.

aw-watcher-ask An ActivityWatch watcher to pose questions to the user and record her answers. This watcher uses Zenity to present dialog boxes to the

Bernardo Chrispim Baron 33 Dec 03, 2022