Sequence-Labeling-Early-Exit

Code for the ACL 2021 paper "Accelerating BERT Inference for Sequence Labeling via Early-Exit".

Requirements:

Please refer to requirements.txt

How to run?

For OntoNotes (CN):

Set your dataset path in paths.py, then run the steps below.

First-stage training:

python -u main.py --device 0  --seed 100 --fast_ptm_name bert --lr 5e-5  --use_crf 0 --dataset ontonotes_cn --fix_ptm_epoch 2 --warmup_step 3000 --use_fastnlp_bert 0 --sampler bucket  --after_bert linear --use_char 0 --use_bigram 0 --gradient_clip_norm_other 5 --gradient_clip_norm_bert 1 --train_mode joint --test_mode joint --if_save 1 --warmup_schedule inverse_square --epoch 20 --joint_weighted 1 --ptm_lr_rate 0.1 --cls_common_lr_scale 0

Then find the exp_path in the corresponding fitlog entry and use it to further train the model with self-sampling.

For the self-sampling training:

python -u further_train.py --seed 100 --msg fuxian --if_save 1 --warmup_schedule inverse_square --epoch 30 --keep_norm_same 1 --sandwich_small 2 --sandwich_full 4 --max_t_level_t -0.5 --train_mode joint_sample_copy --further 0 --flooding 1 --flooding_bias 0 --lr 1e-4 --ptm_lr_rate 0.1 --fix_ptm_epoch 2 --min_win_size 5 --copy_wordpiece all --ckpt_epoch 7 --exp_path 05_11_22_20_52.210103 --device 2 --max_threshold 0.25 --max_threshold_2 0.5

Then find the exp_path and the best epoch in the corresponding fitlog entry, and use them for early-exit inference:

2X speedup:

python test.py --device 2 --further 1 --record_flops 1 --win_size 15 --threshold 0.1 --ckpt_epoch [ckpt_path] --exp_path [exp_path]

3X speedup:

python test.py --device 2 --further 1 --record_flops 1 --win_size 5 --threshold 0.15 --ckpt_epoch [ckpt_path] --exp_path [exp_path]

4X speedup:

python test.py --device 2 --further 1 --record_flops 1 --win_size 5 --threshold 0.25 --ckpt_epoch [ckpt_path] --exp_path [exp_path]
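The commands above trade accuracy for speed through --win_size and --threshold. As a rough illustration of how a window size and an uncertainty threshold can interact in token-level early exit, here is a minimal sketch. It is not the code in test.py: the function names are hypothetical, and it assumes that uncertainty is normalized entropy, that the window is centred on each token, and that win_size is the total window width.

```python
import math

def token_entropy(probs):
    """Shannon entropy of one token's label distribution, normalized to [0, 1]."""
    n = len(probs)
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(n) if n > 1 else 0.0

def window_uncertainty(entropies, win_size):
    """For each token, take the max entropy over a window centred on it.

    Assumption: win_size is the total window width, so the window covers
    win_size // 2 neighbours on each side (clipped at sentence boundaries).
    """
    half = win_size // 2
    out = []
    for i in range(len(entropies)):
        lo = max(0, i - half)
        hi = min(len(entropies), i + half + 1)
        out.append(max(entropies[lo:hi]))
    return out

def exit_mask(layer_probs, win_size, threshold):
    """True where a token may exit at this layer (window uncertainty below threshold)."""
    ents = [token_entropy(p) for p in layer_probs]
    return [u < threshold for u in window_uncertainty(ents, win_size)]

# Hypothetical per-token label distributions from one intermediate classifier.
confident = [0.98, 0.01, 0.01]
uncertain = [0.34, 0.33, 0.33]
probs = [confident, uncertain, confident, confident, confident]

narrow = exit_mask(probs, win_size=1, threshold=0.25)
wide = exit_mask(probs, win_size=3, threshold=0.25)
print(narrow)
print(wide)
```

Under these assumptions, a wider window or a lower threshold lets fewer tokens exit early, which matches the direction of the settings above: the 2X command uses the largest window and the smallest threshold, and the 4X command the reverse.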

Scripts for other datasets are coming soon.

If you have any questions, feel free to open an issue (English and Chinese are both fine).

Owner

李孝男
