Automatic Idiomatic Expression Detection

Related tags

Deep LearningDISC
Overview

IDentifier of Idiomatic Expressions via Semantic Compatibility (DISC)

An Idiomatic identifier that detects the presence and span of idiomatic expression in a given sentence.

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. License
  5. Contact
  6. Acknowledgements

About The Project

This project is a supervised idiomatic expression identification method. Given a sentence that contains a potentially idiomatic expression (PIE), the model identifies the span of the PIE if it is indeed used in an idiomatic sense, otherwise, the model does not identify the PIE. The identification is done via checking the smemantic compatibility. More details will be updated here (Detail description, figures, etc.).

The paper will appear in TACL.

Built With

This model is heavily relying the resources/libraries list as following:

Getting Started

The implementation here includes processed data created for MAGPIE random-split dataset. The model checkpoint that trained with MAGPIE random-split is also provided.

Prerequisites

All the dependencies for this project is listed in requirements.txt. You can install them via a standard command:

pip install -r requirements.txt

It is highly recommanded to start a conda environment with PyTorch properly installed based on your hardward before install the other requirements.

Checkpoint

To run the model with a pre-trained checkpoint, please first create a ./checkpoints folder at root. Then, please download the checkpoint from Google Drive via this Link. Please put the checkpoint in the ./checkpoints folder.

Usage

Configuration

Before running the demo or experiments (training or testing), please see the config.py which sets the configuration of the model. Some parameters there, such as MODE needs to be set appropriately for the model to run correctly. Please see comments for more details.

Demo

To start, please go through the examples provided in demo.ipynb. In there, we process a given input sentence into the model input data and then run model inference to extract the idiomatic expression (if present) from the input sentence (visualized).

Data processing

To process a dataset (such as MAGPIE) for model training and testing, please refer to ./data_processing/MAGPIE/read_comp_data_processing.ipynb. It takes a dataset with sententences and their PIE lcoations as input and generate all the necessary files for model training and inference.

Training and Testing

For training and testing, please refer to train.py and test.py. Note that test.py is used to produce evaluation scores as shown in the paper. inference.py is used to produce prediction for sentences.

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Ziheng Zeng - [email protected]

Project Link: https://github.com/your_username/repo_name

Acknowledgements

[TODO]:

Add the following in README:

  • Method detail descrption
  • Method figure
  • Demo walkthrough
  • Data processing tips and instructions Add requirements.txt
A model which classifies reviews as positive or negative.

SentiMent Analysis In this project I built a model to classify movie reviews fromn the IMDB dataset of 50K reviews. WordtoVec : Neural networks only w

Rishabh Bali 2 Feb 09, 2022
Code release for NeurIPS 2020 paper "Co-Tuning for Transfer Learning"

CoTuning Official implementation for NeurIPS 2020 paper Co-Tuning for Transfer Learning. [News] 2021/01/13 The COCO 70 dataset used in the paper is av

THUML @ Tsinghua University 35 Sep 23, 2022
Code for "Primitive Representation Learning for Scene Text Recognition" (CVPR 2021)

Primitive Representation Learning Network (PREN) This repository contains the code for our paper accepted by CVPR 2021 Primitive Representation Learni

Ruijie Yan 76 Jan 02, 2023
Codebase for "ProtoAttend: Attention-Based Prototypical Learning."

Codebase for "ProtoAttend: Attention-Based Prototypical Learning." Authors: Sercan O. Arik and Tomas Pfister Paper: Sercan O. Arik and Tomas Pfister,

47 2 May 17, 2022
Little Ball of Fur - A graph sampling extension library for NetworKit and NetworkX (CIKM 2020)

Little Ball of Fur is a graph sampling extension library for Python. Please look at the Documentation, relevant Paper, Promo video and External Resour

Benedek Rozemberczki 619 Dec 14, 2022
Brain tumor detection using CNN (InceptionResNetV2 Model)

Brain-Tumor-Detection Building a detection model using a convolutional neural network in Tensorflow & Keras. Used brain MRI images. InceptionResNetV2

1 Feb 13, 2022
Bunch of different tools which helps visualizing and annotating images for semantic/instance segmentation tasks

Data Framework for Semantic/Instance Segmentation Bunch of different tools which helps visualizing, transforming and annotating images for semantic/in

Bruno Fernandes Carvalho 5 Dec 21, 2022
Patient-Survival - Using Python, I developed a Machine Learning model using classification techniques such as Random Forest and SVM classifiers to predict a patient's survival status that have undergone breast cancer surgery.

Patient-Survival - Using Python, I developed a Machine Learning model using classification techniques such as Random Forest and SVM classifiers to predict a patient's survival status that have underg

Nafis Ahmed 1 Dec 28, 2021
Code and models for ICCV2021 paper "Robust Object Detection via Instance-Level Temporal Cycle Confusion".

Robust Object Detection via Instance-Level Temporal Cycle Confusion This repo contains the implementation of the ICCV 2021 paper, Robust Object Detect

Xin Wang 69 Oct 13, 2022
Convert Table data to approximate values with GUI

Table_Editor Convert Table data to approximate values with GUIs... usage - Import methods for extension Tables. Imported method supposed to have only

CLJ 1 Jan 10, 2022
Numenta published papers code and data

Numenta research papers code and data This repository contains reproducible code for selected Numenta papers. It is currently under construction and w

Numenta 293 Jan 06, 2023
PyTorch implementation of CloudWalk's recent work DenseBody

densebody_pytorch PyTorch implementation of CloudWalk's recent paper DenseBody. Note: For most recent updates, please check out the dev branch. Update

Lingbo Yang 401 Nov 19, 2022
DyNet: The Dynamic Neural Network Toolkit

The Dynamic Neural Network Toolkit General Installation C++ Python Getting Started Citing Releases and Contributing General DyNet is a neural network

Chris Dyer's lab @ LTI/CMU 3.3k Jan 06, 2023
Official repository for: Continuous Control With Ensemble DeepDeterministic Policy Gradients

Continuous Control With Ensemble Deep Deterministic Policy Gradients This repository is the official implementation of Continuous Control With Ensembl

4 Dec 06, 2021
Template repository for managing machine learning research projects built with PyTorch-Lightning

Tutorial Repository with a minimal example for showing how to deploy training across various compute infrastructure.

Sidd Karamcheti 3 Feb 11, 2022
Supplementary code for SIGGRAPH 2021 paper: Discovering Diverse Athletic Jumping Strategies

SIGGRAPH 2021: Discovering Diverse Athletic Jumping Strategies project page paper demo video Prerequisites Important Notes We suspect there are bugs i

54 Dec 06, 2022
Second-order Attention Network for Single Image Super-resolution (CVPR-2019)

Second-order Attention Network for Single Image Super-resolution (CVPR-2019) "Second-order Attention Network for Single Image Super-resolution" is pub

516 Dec 28, 2022
Fast Axiomatic Attribution for Neural Networks (NeurIPS*2021)

Fast Axiomatic Attribution for Neural Networks This is the official repository accompanying the NeurIPS 2021 paper: R. Hesse, S. Schaub-Meyer, and S.

Visual Inference Lab @TU Darmstadt 11 Nov 21, 2022
learning and feeling SLAM together with hands-on-experiments

modern-slam-tutorial-python Learning and feeling SLAM together with hands-on-experiments 😀 😃 😆 Dependencies Most of the examples are based on GTSAM

Giseop Kim 59 Dec 22, 2022
TilinGNN: Learning to Tile with Self-Supervised Graph Neural Network (SIGGRAPH 2020)

TilinGNN: Learning to Tile with Self-Supervised Graph Neural Network (SIGGRAPH 2020) About The goal of our research problem is illustrated below: give

59 Dec 09, 2022