Transformers-regression - Regression Bugs Are In Your Model! Measuring, Reducing and Analyzing Regressions In NLP Model Updates

Overview

Regression Free Model Update

Code for the paper: Regression Bugs Are In Your Model! Measuring, Reducing and Analyzing Regressions In NLP Model Updates [Paper]

Modified from the Hugging Face Transformers examples of text-classification. [Original code]

Haven't tested under this environment, and document needs completion.

If you have any question, please contact me through email:[email protected]

Text classification examples

GLUE tasks

Based on the script run_glue.py.

Fine-tuning the library models for sequence classification on the GLUE benchmark: General Language Understanding Evaluation. This script can fine-tune any of the models on the hub and can also be used for a dataset hosted on our hub or your own data in a csv or a JSON file (the script might need some tweaks in that case, refer to the comments inside for help).

GLUE is made up of a total of 9 different tasks. Here is how to run the script on one of them:

export TASK_NAME=mrpc

python run_glue.py \
  --model_name_or_path bert-base-cased \
  --task_name $TASK_NAME \
  --do_train \
  --do_eval \
  --max_seq_length 128 \
  --per_device_train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir /tmp/$TASK_NAME/

where task name can be one of cola, sst2, mrpc, stsb, qqp, mnli, qnli, rte, wnli.

We get the following results on the dev set of the benchmark with the previous commands (with an exception for MRPC and WNLI which are tiny and where we used 5 epochs instead of 3). Trainings are seeded so you should obtain the same results with PyTorch 1.6.0 (and close results with different versions), training times are given for information (a single Titan RTX was used):

Task Metric Result Training time
CoLA Matthews corr 56.53 3:17
SST-2 Accuracy 92.32 26:06
MRPC F1/Accuracy 88.85/84.07 2:21
STS-B Pearson/Spearman corr. 88.64/88.48 2:13
QQP Accuracy/F1 90.71/87.49 2:22:26
MNLI Matched acc./Mismatched acc. 83.91/84.10 2:35:23
QNLI Accuracy 90.66 40:57
RTE Accuracy 65.70 57
WNLI Accuracy 56.34 24

Some of these results are significantly different from the ones reported on the test set of GLUE benchmark on the website. For QQP and WNLI, please refer to FAQ #12 on the website.

The following example fine-tunes BERT on the imdb dataset hosted on our hub:

python run_glue.py \
  --model_name_or_path bert-base-cased \
  --dataset_name imdb  \
  --do_train \
  --do_predict \
  --max_seq_length 128 \
  --per_device_train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir /tmp/imdb/

Mixed precision training

If you have a GPU with mixed precision capabilities (architecture Pascal or more recent), you can use mixed precision training with PyTorch 1.6.0 or latest, or by installing the Apex library for previous versions. Just add the flag --fp16 to your command launching one of the scripts mentioned above!

Using mixed precision training usually results in 2x-speedup for training with the same final results:

Task Metric Result Training time Result (FP16) Training time (FP16)
CoLA Matthews corr 56.53 3:17 56.78 1:41
SST-2 Accuracy 92.32 26:06 91.74 13:11
MRPC F1/Accuracy 88.85/84.07 2:21 88.12/83.58 1:10
STS-B Pearson/Spearman corr. 88.64/88.48 2:13 88.71/88.55 1:08
QQP Accuracy/F1 90.71/87.49 2:22:26 90.67/87.43 1:11:54
MNLI Matched acc./Mismatched acc. 83.91/84.10 2:35:23 84.04/84.06 1:17:06
QNLI Accuracy 90.66 40:57 90.96 20:16
RTE Accuracy 65.70 57 65.34 29
WNLI Accuracy 56.34 24 56.34 12

PyTorch version, no Trainer

Based on the script run_glue_no_trainer.py.

Like run_glue.py, this script allows you to fine-tune any of the models on the hub on a text classification task, either a GLUE task or your own data in a csv or a JSON file. The main difference is that this script exposes the bare training loop, to allow you to quickly experiment and add any customization you would like.

It offers less options than the script with Trainer (for instance you can easily change the options for the optimizer or the dataloaders directly in the script) but still run in a distributed setup, on TPU and supports mixed precision by the mean of the ๐Ÿค— Accelerate library. You can use the script normally after installing it:

pip install accelerate

then

export TASK_NAME=mrpc

python run_glue_no_trainer.py \
  --model_name_or_path bert-base-cased \
  --task_name $TASK_NAME \
  --max_length 128 \
  --per_device_train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir /tmp/$TASK_NAME/

You can then use your usual launchers to run in it in a distributed environment, but the easiest way is to run

accelerate config

and reply to the questions asked. Then

accelerate test

that will check everything is ready for training. Finally, you can launch training with

export TASK_NAME=mrpc

accelerate launch run_glue_no_trainer.py \
  --model_name_or_path bert-base-cased \
  --task_name $TASK_NAME \
  --max_length 128 \
  --per_device_train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir /tmp/$TASK_NAME/

This command is the same and will work for:

  • a CPU-only setup
  • a setup with one GPU
  • a distributed training with several GPUs (single or multi node)
  • a training on TPUs

Note that this library is in alpha release so your feedback is more than welcome if you encounter any problem using it.

Owner
Yuqing Xie
Yuqing Xie
Abhijith Neil Abraham 2 Nov 05, 2021
A minimal Conformer ASR implementation adapted from ESPnet.

Conformer ASR A minimal Conformer ASR implementation adapted from ESPnet. Introduction I want to use the pre-trained English ASR model provided by ESP

Niu Zhe 3 Jan 24, 2022
Code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation".

This repository contains the code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation".

Chenhe Dong 28 Nov 10, 2022
An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.

GPT Neo ๐ŸŽ‰ 1T or bust my dudes ๐ŸŽ‰ An implementation of model & data parallel GPT3-like models using the mesh-tensorflow library. If you're just here t

EleutherAI 6.7k Dec 28, 2022
neural network based speaker embedder

Content What is deepaudio-speaker? Installation Get Started Model Architecture How to contribute to deepaudio-speaker? Acknowledge What is deepaudio-s

20 Dec 29, 2022
Translation to python of Chris Sims' optimization function

pycsminwel This is a locol minimization algorithm. Uses a quasi-Newton method with BFGS update of the estimated inverse hessian. It is robust against

Gustavo Amarante 1 Mar 21, 2022
Common Voice Dataset explorer

Common Voice Dataset Explorer Common Voice Dataset is by Mozilla Made during huggingface finetuning week Usage pip install -r requirements.txt streaml

Ceyda Cinarel 22 Nov 16, 2022
Gold standard corpus annotated with verb-preverb connections for Hungarian.

Hungarian Preverb Corpus A gold standard corpus manually annotated with verb-preverb connections for Hungarian. corpus The corpus consist of the follo

RIL Lexical Knowledge Representation Research Group 3 Jan 27, 2022
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

T5: Text-To-Text Transfer Transformer The t5 library serves primarily as code for reproducing the experiments in Exploring the Limits of Transfer Lear

Google Research 4.6k Jan 01, 2023
Indonesia spellchecker with python

indonesia-spellchecker Ganti kata yang terdapat pada file teks.txt untuk diperiksa kebenaran kata. Run on local machine python3 main.py

Rahmat Agung Julians 1 Sep 14, 2022
Ongoing research training transformer language models at scale, including: BERT & GPT-2

Megatron (1 and 2) is a large, powerful transformer developed by the Applied Deep Learning Research team at NVIDIA.

NVIDIA Corporation 3.5k Dec 30, 2022
Fixes mojibake and other glitches in Unicode text, after the fact.

ftfy: fixes text for you print(fix_encoding("(ร ยธโ€ก'รขล’ยฃ')ร ยธโ€ก")) (เธ‡'โŒฃ')เธ‡ Full documentation: https://ftfy.readthedocs.org Testimonials โ€œMy life is li

Luminoso Technologies, Inc. 3.4k Dec 29, 2022
Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation (SIGGRAPH Asia 2021)

Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation This repository contains the implementation of the following paper: Live Speech

OldSix 575 Dec 31, 2022
Implemented shortest-circuit disambiguation, maximum probability disambiguation, HMM-based lexical annotation and BiLSTM+CRF-based named entity recognition

Implemented shortest-circuit disambiguation, maximum probability disambiguation, HMM-based lexical annotation and BiLSTM+CRF-based named entity recognition

0 Feb 13, 2022
:house_with_garden: Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.

(Framework for Adapting Representation Models) What is it? FARM makes Transfer Learning with BERT & Co simple, fast and enterprise-ready. It's built u

deepset 1.6k Dec 27, 2022
This repository contains the code for "Generating Datasets with Pretrained Language Models".

Datasets from Instructions (DINO ๐Ÿฆ• ) This repository contains the code for Generating Datasets with Pretrained Language Models. The paper introduces

Timo Schick 154 Jan 01, 2023
Anomaly Detection ์ด์ƒ์น˜ ํƒ์ง€ ์ „์ฒ˜๋ฆฌ ๋ชจ๋“ˆ

Anomaly Detection ์‹œ๊ณ„์—ด ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•œ ์ด์ƒ์น˜ ํƒ์ง€ 1. Kernel Density Estimation์„ ํ™œ์šฉํ•œ ์ด์ƒ์น˜ ํƒ์ง€ train_data_path์™€ test_data_path์— ์กด์žฌํ•˜๋Š” ์‹œ์  ์ •๋ณด๋ฅผ ํฌํ•จํ•˜๊ณ  ์žˆ๋Š” csv ํ˜•ํƒœ์˜ train data์™€

CLUST-consortium 43 Nov 28, 2022
Official implementation of MLP Singer: Towards Rapid Parallel Korean Singing Voice Synthesis

MLP Singer Official implementation of MLP Singer: Towards Rapid Parallel Korean Singing Voice Synthesis. Audio samples are available on our demo page.

Neosapience 103 Dec 23, 2022
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency

Simplemma: a simple multilingual lemmatizer for Python Purpose Lemmatization is the process of grouping together the inflected forms of a word so they

Adrien Barbaresi 70 Dec 29, 2022
HAIS_2GNN: 3D Visual Grounding with Graph and Attention

HAIS_2GNN: 3D Visual Grounding with Graph and Attention This repository is for the HAIS_2GNN research project. Tao Gu, Yue Chen Introduction The motiv

Yue Chen 1 Nov 26, 2022