Tiny-NewsRec: Efficient and Effective PLM-based News Recommendation

Overview

Tiny-NewsRec

The source codes for our paper "Tiny-NewsRec: Efficient and Effective PLM-based News Recommendation".

Requirements

  • PyTorch == 1.6.0
  • TensorFlow == 1.15.0
  • horovod == 0.19.5
  • transformers == 3.0.2

Prepare Data

You can download and unzip the public MIND dataset with the following command:

# Under Tiny-NewsRec/
mkdir MIND && mkdir log_all && mkdir model_all
cd MIND
wget https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip
wget https://mind201910small.blob.core.windows.net/release/MINDlarge_dev.zip
wget https://mind201910small.blob.core.windows.net/release/MINDlarge_test.zip
unzip MINDlarge_train.zip -d MINDlarge_train
unzip MINDlarge_dev.zip -d MINDlarge_dev
unzip MINDlarge_test.zip -d MINDlarge_test
cd ../

Then, you should run python split_file.py under Tiny-NewsRec/ to prepare the training data. Set N in line 13 of split_file.py to the number of available GPUs. This script will construct the training samples and split them into N files for multi-GPU training.

Experiments

  • PLM-NR (FT)

    Tiny-NewsRec/PLM-NR/demo.sh is the script used to train PLM-NR (FT).

    Set hvd_size to the number of available GPUs. Modify the value of num_hidden_layers to change the number of Transformer layers in the PLM and set bert_trainable_layers to the indexes of its last two layers (start from 0). Set use_pretrain_model as False and then you can start training with bash demo.sh train.

  • PLM-NR (FP)

    First, you need to run the notebook Further_Pre-train.ipynb to further pre-train the 12-layer UniLMv2 with the MLM task. This will generate a checkpoint named FP_12_layer.pt under Tiny-NewsRec/.

    Then you can use the script Tiny-NewsRec/PLM-NR/demo.sh to finetune it with the news recommendation task. Remember to set use_pretrain_model as True and set pretrain_model_path as ../FP_12_layer.pt.

  • PLM-NR (DP)

    First, you need to run the notebook Domain-specific_Post-train.ipynb to domain-specifically post-train the 12-layer UniLMv2. This will generate a checkpoint named DP_12_layer.pt under Tiny-NewsRec/. It will also generate two .pkl files named teacher_title_emb.pkl and teacher_body_emb.pkl which are used for the first stage knowledge distillation in our Tiny-NewsRec method.

    Then you can use the script Tiny-NewsRec/PLM-NR/demo.sh to finetune it with the news recommendation task. Remembert to set use_pretrain_model as True and set pretrain_model_path as ../DP_12_layer.pt.

  • TinyBERT

    Tiny-NewsRec/TinyBERT/demo.sh is the script used to train TinyBERT.

    Set hvd_size to the number of available GPUs. Modify the value of num_student_layers to change the number of Transformer layers in the student model and set bert_trainable_layers to the indexes of its last two layers (start from 0). Set teacher_ckpt as the path to the previous PLM-NR-12 (DP) checkpoint. Set use_pretrain_model as False and then you can start training with bash demo.sh train.

  • NewsBERT

    Tiny-NewsRec/NewsBERT/demo.sh is the script used to train NewsBERT.

    Set hvd_size to the number of available GPUs. Modify the value of num_student_layers to change the number of Transformer layers in the student model and set student_trainable_layers to the indexes of its last two layers (start from 0). Set teacher_ckpt as ../DP_12_layer.pt to initialize the teacher model with the domain-specifically post-trained UniLMv2 and then you can start training with bash demo.sh train.

  • Tiny-NewsRec

    First, you need to train 4 PLM-NR-12 (DP) as the teacher models.

    Second, you need to run the notebook First-Stage.ipynb to run the first-stage knowledge distillation in our approach. Modify args.num_hidden_layers to change the number of Transformer layers in the student model. This will generate a checkpoint of the student model under Tiny-NewsRec/.

    Then you need to run bash demo.sh get_teacher_emb under Tiny-NewsRec/Tiny-NewsRec to generate the news embeddings of the teacher models. Set teacher_ckpts as the path to the teacher models (separate by space).

    Finally, you can run the second-stage knowledge distillation in our approach with the script Tiny-NewsRec/Tiny-NewsRec/demo.sh. Modify the value of num_student_layers to change the number of Transformer layers in the student model and set bert_trainable_layers to the indexes of its last two layers (start from 0). Set use_pretrain_model as True and set pretrain_model_path as the path to the checkpoint generated by the notebook First-Stage.ipynb. Then you can start training with bash demo.sh train.

Citation

If you want to cite Tiny-NewsRec in your papers, you can cite it as follows:

@article{yu2021tinynewsrec,
    title={Tiny-NewsRec: Efficient and Effective PLM-based News Recommendation},
    author={Yang Yu and Fangzhao Wu and Chuhan Wu and Jingwei Yi and Tao Qi and Qi Liu},
    year={2021},
    journal={arXiv preprint arXiv:2112.00944}
}
Owner
Yang Yu
Yang Yu
Reproduction of Vision Transformer in Tensorflow2. Train from scratch and Finetune.

Vision Transformer(ViT) in Tensorflow2 Tensorflow2 implementation of the Vision Transformer(ViT). This repository is for An image is worth 16x16 words

sungjun lee 42 Dec 27, 2022
Model Agnostic Interpretability for Multiple Instance Learning

MIL Model Agnostic Interpretability This repo contains the code for "Model Agnostic Interpretability for Multiple Instance Learning". Overview Executa

Joe Early 10 Dec 17, 2022
Industrial Image Anomaly Localization Based on Gaussian Clustering of Pre-trained Feature

Industrial Image Anomaly Localization Based on Gaussian Clustering of Pre-trained Feature Q. Wan, L. Gao, X. Li and L. Wen, "Industrial Image Anomaly

smiler 6 Dec 25, 2022
Discovering and Achieving Goals via World Models

Discovering and Achieving Goals via World Models [Project Website] [Benchmark Code] [Video (2min)] [Oral Talk (13min)] [Paper] Russell Mendonca*1, Ole

Oleg Rybkin 71 Dec 22, 2022
Turning SymPy expressions into JAX functions

sympy2jax Turn SymPy expressions into parametrized, differentiable, vectorizable, JAX functions. All SymPy floats become trainable input parameters. S

Miles Cranmer 38 Dec 11, 2022
A simple AI that will give you si ple task and this is made with python

Crystal-AI A simple AI that will give you si ple task and this is made with python Prerequsites: Python3.6.2 pyttsx3 pip install pyttsx3 pyaudio pip i

CrystalAnd 1 Dec 25, 2021
Code for AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network (ICCV 2021).

AA-RMVSNet Code for AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network (ICCV 2021) in PyTorch. paper link: arXiv | CVF Change Log Ju

Qingtian Zhu 97 Dec 30, 2022
Drone-based Joint Density Map Estimation, Localization and Tracking with Space-Time Multi-Scale Attention Network

DroneCrowd Paper Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark. Introduction This paper proposes a space-time multi-scale atte

VisDrone 98 Nov 16, 2022
A PyTorch Implementation of Single Shot Scale-invariant Face Detector.

S³FD: Single Shot Scale-invariant Face Detector A PyTorch Implementation of Single Shot Scale-invariant Face Detector. Eval python wider_eval_pytorch.

carwin 235 Jan 07, 2023
Train/evaluate a Keras model, get metrics streamed to a dashboard in your browser.

Hera Train/evaluate a Keras model, get metrics streamed to a dashboard in your browser. Setting up Step 1. Plant the spy Install the package pip

Keplr 495 Dec 10, 2022
A curated list and survey of awesome Vision Transformers.

English | 简体中文 A curated list and survey of awesome Vision Transformers. You can use mind mapping software to open the mind mapping source file. You c

OpenMMLab 281 Dec 21, 2022
Official implementation for the paper "SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization".

SAPE Project page Paper Official implementation for the paper "SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization". Environment Cre

36 Dec 09, 2022
This repository contains the code to replicate the analysis from the paper "Moving On - Investigating Inventors' Ethnic Origins Using Supervised Learning"

Replication Code for 'Moving On' - Investigating Inventors' Ethnic Origins Using Supervised Learning This repository contains the code to replicate th

Matthias Niggli 0 Jan 04, 2022
image scene graph generation benchmark

Scene Graph Benchmark in PyTorch 1.7 This project is based on maskrcnn-benchmark Highlights Upgrad to pytorch 1.7 Multi-GPU training and inference Bat

Microsoft 303 Dec 27, 2022
Deep Learning agent of Starcraft2, similar to AlphaStar of DeepMind except size of network.

Introduction This repository is for Deep Learning agent of Starcraft2. It is very similar to AlphaStar of DeepMind except size of network. I only test

Dohyeong Kim 136 Jan 04, 2023
Context-Sensitive Misspelling Correction of Clinical Text via Conditional Independence, CHIL 2022

cim-misspelling Pytorch implementation of Context-Sensitive Spelling Correction of Clinical Text via Conditional Independence, CHIL 2022. This model (

Juyong Kim 11 Dec 19, 2022
This repository contains tutorials for the py4DSTEM Python package

py4DSTEM Tutorials This repository contains tutorials for the py4DSTEM Python package. For more information about py4DSTEM, including installation ins

11 Dec 23, 2022
ViSD4SA, a Vietnamese Span Detection for Aspect-based sentiment analysis dataset

UIT-ViSD4SA PACLIC 35 General Introduction This repository contains the data of the paper: Span Detection for Vietnamese Aspect-Based Sentiment Analys

Nguyễn Thị Thanh Kim 5 Nov 13, 2022
Local Attention - Flax module for Jax

Local Attention - Flax Autoregressive Local Attention - Flax module for Jax Install $ pip install local-attention-flax Usage from jax import random fr

Phil Wang 16 Jun 16, 2022
GenshinMapAutoMarkTools - Tools To add/delete/refresh resources mark in Genshin Impact Map

使用说明 适配 windows7以上 64位 原神1920x1080窗口(其他分辨率后续适配) 待更新渊下宫 English version is to be

Zero_Circle 209 Dec 28, 2022