An example project using OpenPrompt with pytorch-lightning to train a prompt-based SST2 sentiment analysis model

Last update: Oct 21, 2022

pl_prompt_sst

An example project that uses OpenPrompt within the pytorch-lightning framework to train a prompt-based text classification model on the SST2 sentiment analysis dataset. It leverages pytorch-lightning features such as logging, gradient accumulation, and early stopping, and can be used as a template for further development.
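To make the early stopping idea concrete, here is a minimal pure-Python sketch of patience-based stopping. It is not pytorch-lightning's implementation and not this project's code; the function name `epochs_until_stop` and the validation scores are made up for illustration, and it assumes a higher score is better.

```python
def epochs_until_stop(val_scores, patience):
    """Return the 1-based epoch at which training would stop.

    Assumes a higher validation score is better. Training stops once the
    score has failed to improve for `patience` consecutive epochs.
    """
    best = float("-inf")
    waited = 0
    for epoch, score in enumerate(val_scores, start=1):
        if score > best:
            # New best score: remember it and reset the counter.
            best = score
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    # Never ran out of patience: training ran for every epoch.
    return len(val_scores)


# Made-up validation scores, for illustration only.
scores = [0.70, 0.75, 0.74, 0.75, 0.73, 0.72, 0.78]
stop_epoch = epochs_until_stop(scores, patience=4)
print(stop_epoch)
```

With these toy scores the last improvement happens at epoch 2, so the final 0.78 is never reached. That trade-off is what the patience value controls.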

Run

Install the requirements

pip install -r requirements.txt

Set up the prompt to use in sst2/prompt_config.json

{
    "template_text": "{\"placeholder\": \"text_a\"} In summary, the film was {\"mask\"}.",
    "label_words": [["bad"], ["good"]]
}
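As a rough illustration of what this config encodes, the sketch below parses it with the standard library and previews a filled-in prompt. The helpers `load_prompt_config` and `preview`, the `[MASK]` string, and the sample review are hypothetical. The real project hands the template to OpenPrompt rather than doing string replacement, and the assumption that index 0 is the negative class and index 1 the positive class is mine.

```python
import json

# Same content as the sst2/prompt_config.json example above.
CONFIG_TEXT = r'''
{
    "template_text": "{\"placeholder\": \"text_a\"} In summary, the film was {\"mask\"}.",
    "label_words": [["bad"], ["good"]]
}
'''


def load_prompt_config(text):
    """Parse the config and check the two fields shown in the example."""
    config = json.loads(text)
    template = config["template_text"]
    label_words = config["label_words"]
    if '{"mask"}' not in template:
        raise ValueError('template_text needs a {"mask"} slot')
    if not all(isinstance(words, list) and words for words in label_words):
        raise ValueError("label_words needs one non-empty list per class")
    return template, label_words


def preview(template, text_a, mask_token="[MASK]"):
    """Hypothetical helper: show roughly what the filled-in prompt looks like."""
    filled = template.replace('{"placeholder": "text_a"}', text_a)
    return filled.replace('{"mask"}', mask_token)


template, label_words = load_prompt_config(CONFIG_TEXT)
# Invented review text, not taken from the dataset.
example = preview(template, "The plot dragged and the jokes fell flat.")
print(example)
print(label_words)  # assumed order: index 0 = negative, index 1 = positive
```

The model is then asked to fill the masked position, and the label words decide which class each candidate word votes for.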

Adjust the arguments in run.sh, or in the command below, to suit your needs, then run it.

CUDA_VISIBLE_DEVICES=0 python -u main.py --input_dir ./sst2 \
                                         --prompt_config_dir ./sst2/prompt_config.json \
                                         --model_class bert \
                                         --model_name_or_path prajjwal1/bert-tiny \
                                         --lr 2e-4 \
                                         --bs 32 \
                                         --max_seq_length 64 \
                                         --patience 4 \
                                         --accumulation 2 \
                                         --seed 666
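For readers adapting the command, the flags can be mirrored with an argparse sketch like the one below. This is not the parser from main.py, which is not shown here. The types and defaults are assumptions copied from the example command, and reading `--accumulation` as the number of gradient accumulation steps is also an assumption.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags in the command above."""
    parser = argparse.ArgumentParser(description="Sketch of the run flags")
    parser.add_argument("--input_dir", default="./sst2")
    parser.add_argument("--prompt_config_dir", default="./sst2/prompt_config.json")
    parser.add_argument("--model_class", default="bert")
    parser.add_argument("--model_name_or_path", default="prajjwal1/bert-tiny")
    parser.add_argument("--lr", type=float, default=2e-4)
    parser.add_argument("--bs", type=int, default=32)
    parser.add_argument("--max_seq_length", type=int, default=64)
    parser.add_argument("--patience", type=int, default=4)
    parser.add_argument("--accumulation", type=int, default=2)
    parser.add_argument("--seed", type=int, default=666)
    return parser


# Parse an explicit argument list so the sketch runs without a shell.
args = build_parser().parse_args(["--lr", "2e-4", "--bs", "32", "--accumulation", "2"])

# If --accumulation is the number of accumulated batches, each optimizer
# step sees bs * accumulation samples on a single device.
effective_batch_size = args.bs * args.accumulation
print(effective_batch_size)
```

Under that reading, raising `--accumulation` is a way to get a larger effective batch without using more GPU memory per step.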

In my preliminary experiment with the settings above, the model achieved 0.822 F1, compared to 0.820 without the prompt.
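For reference, a binary F1 score can be computed as in the sketch below. Whether the numbers above are binary or macro-averaged F1 is not stated, so the binary form is an assumption, and the labels are toy values rather than results from this project.

```python
def binary_f1(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration only (1 = positive, 0 = negative).
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
score = binary_f1(y_true, y_pred)
print(round(score, 3))
```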

Note

This project can only be executed after this fix on state_dict() has been applied.

Owner

Zhiling Zhang
