for a paper about leveraging discourse markers for training new models

Last update: Nov 02, 2022

Related tags

Deep Learning TSLM-DISCOURSE-MARKERS

Overview

TSLM-DISCOURSE-MARKERS

Scope

This repository contains:

(1) Code to extract discourse markers from Wikipedia (TSA).

(2) Code to extract significant discourse markers from predictions over a sample
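As a rough illustration of the first item, the sketch below collects candidate discourse markers as short sentence-initial phrases followed by a comma. The pattern, the regular expression, and the names `MARKER_PATTERN` and `count_candidate_markers` are assumptions made for illustration; the repository's actual extraction code may work differently.

```python
import re
from collections import Counter

# Assumption: a candidate marker is one to three alphabetic words at the very
# start of a sentence, immediately followed by a comma.
MARKER_PATTERN = re.compile(r"^([A-Za-z]+(?: [a-z]+){0,2}),\s")


def count_candidate_markers(sentences):
    """Count lowercased sentence-initial phrases that precede a comma."""
    counts = Counter()
    for sentence in sentences:
        match = MARKER_PATTERN.match(sentence.strip())
        if match:
            counts[match.group(1).lower()] += 1
    return counts


sentences = [
    "Fortunately, the bridge was reopened within a week.",
    "In addition, the city funded a new ferry line.",
    "The ferry line closed in 1998.",
    "Fortunately, no one was injured.",
]
print(count_candidate_markers(sentences).most_common())
```

Counting over a large sample of sentences would then give a frequency-ranked list of candidates that can be filtered further.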

Usage

Evaluation code:

Installation

Using pip:

pip install git+ssh://[email protected]/IBM/tslm-discourse-markers.git#egg=tslm-discourse-markers

Alternatively, you can first clone the code, and install the requirements:

1. git clone [email protected]:IBM/tslm-discourse-markers.git
2. cd tslm-discourse-markers
3. pip install -r requirements.txt

You also need to download the fastText language identification model:

curl https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.bin -o ~/Downloads/lid.176.bin

and the spaCy English model:

python -m spacy download en_core_web_sm
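One plausible use of these two models is to keep only English text and split it into sentences. The sketch below assumes the `fasttext` and `spacy` Python packages are installed and that both models were downloaded to the locations given above. The helper names and the 0.8 confidence threshold are hypothetical and are not taken from the repository.

```python
import os


def is_english(labels, probabilities, threshold=0.8):
    """Check that the top fastText label is English with enough confidence."""
    return bool(labels) and labels[0] == "__label__en" and float(probabilities[0]) >= threshold


def english_sentences(text, lid_model, nlp, threshold=0.8):
    """Return the sentences of `text` if it is identified as English, else []."""
    # fastText predicts on a single line, so newlines are replaced first.
    labels, probabilities = lid_model.predict(text.replace("\n", " "))
    if not is_english(labels, probabilities, threshold):
        return []
    return [sentence.text for sentence in nlp(text).sents]


def load_models(lid_path="~/Downloads/lid.176.bin"):
    """Load both models; imports are local so the helpers above work without them."""
    import fasttext
    import spacy

    lid_model = fasttext.load_model(os.path.expanduser(lid_path))
    nlp = spacy.load("en_core_web_sm")
    return lid_model, nlp
```

With the models in place, `lid_model, nlp = load_models()` followed by `english_sentences("Fortunately, it worked. We were relieved.", lid_model, nlp)` would return the individual sentences.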

Running

Citing tslm-discourse-markers

If you are using tslm-discourse-markers in a publication, please cite the following paper:

Liat Ein-Dor, Ilya Shnayderman, Artem Spector, Lena Dankin, Ranit Aharonov and Noam Slonim. 2022. Fortunately, Discourse Markers Can Enhance Language Models for Sentiment Analysis. AAAI-2022.

Model

The SenDM model can be found at: https://huggingface.co/ibm/tslm-discourse-markers

Loading dataset

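import datasets

# Each split (train, dev, test) is read from the gzip-compressed CSV files in its own folder.
directory = 'dataset/WIKI_ENGLISH'
dataset = datasets.load_dataset('csv', data_files={folder: [f'{directory}/{folder}/{folder}_*.csv.gz'] for folder in ['train', 'dev', 'test']})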
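To make the `data_files` argument above easier to read, the sketch below builds the same mapping in a small helper and prints it. The helper name `build_data_files` is hypothetical. The directory layout is taken from the snippet above, and passing the result to `datasets.load_dataset` assumes the compressed CSV files exist locally.

```python
def build_data_files(directory, splits=("train", "dev", "test")):
    """Map each split name to the glob pattern matching its compressed CSV files."""
    return {split: [f"{directory}/{split}/{split}_*.csv.gz"] for split in splits}


data_files = build_data_files("dataset/WIKI_ENGLISH")
for split, patterns in data_files.items():
    print(split, patterns)

# With the files in place, the mapping can be passed on unchanged:
# import datasets
# dataset = datasets.load_dataset("csv", data_files=data_files)
# dataset["train"] would then hold the training split.
```

Keeping the mapping in one place makes it straightforward to load only a subset of splits, for example `build_data_files("dataset/WIKI_ENGLISH", splits=("dev",))`.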

Contributing

This project welcomes external contributions. If you would like to contribute, please see the further instructions here.

Pull requests are very welcome! Make sure your patches are well tested. Ideally create a topic branch for every separate change you make. For example:

1. Fork the repo
2. Create your feature branch (git checkout -b my-new-feature)
3. Commit your changes (git commit -am 'Added some feature')
4. Push to the branch (git push origin my-new-feature)
5. Create a new Pull Request

Changelog

Major changes are documented here.

Notes

If you have any questions or issues you can create a new issue here.

License

This code is distributed under the Apache License 2.0. To see the detailed LICENSE, click here.

Authors

The YASO dataset was collected by Liat Ein-Dor, Ilya Shnayderman, Artem Spector, Lena Dankin, Ranit Aharonov and Noam Slonim.

The code was written by Ilya Shnayderman.


Owner

International Business Machines
