Lexical Substitution Framework

Overview

LexSubGen

Lexical Substitution Framework

This repository contains the code to reproduce the results from the paper:

Arefyev Nikolay, Sheludko Boris, Podolskiy Alexander, Panchenko Alexander, "Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution", Proceedings of the 28th International Conference on Computational Linguistics, 2020

Installation

Clone LexSubGen repository from github.com.

git clone https://github.com/Samsung/LexSubGen
cd LexSubGen

Setup anaconda environment

  1. Download and install conda
  2. Create new conda environment
    conda create -n lexsubgen python=3.7.4
  3. Activate conda environment
    conda activate lexsubgen
  4. Install requirements
    pip install -r requirements.txt
  5. Download spacy resources and install context2vec and word_forms from github repositories
    ./init.sh

Setup Web Application

If you do not plan to use the Web Application, skip this section and go to the next!

  1. Download and install NodeJS and npm.
  2. Run script for install dependencies and create build files.
bash web_app_setup.sh

Install lexsubgen library

python setup.py install

Results

Results of the lexical substitution task are presented in the following table. To reproduce them, follow the instructions above to install the correct dependencies.

Model SemEval COINCO
GAP [email protected] [email protected] [email protected] GAP [email protected] [email protected] [email protected]
OOC 44.65 16.82 12.83 18.36 46.3 19.58 15.03 12.99
C2V 55.82 7.79 5.92 11.03 48.32 8.01 6.63 7.54
C2V+embs 53.39 28.01 21.72 33.52 50.73 29.64 24.0 21.97
ELMo 53.66 11.58 8.55 13.88 49.47 13.58 10.86 11.35
ELMo+embs 54.16 32.0 22.2 31.82 52.22 35.96 26.62 23.8
BERT 54.42 38.39 27.73 39.57 50.5 42.56 32.64 28.73
BERT+embs 53.87 41.64 30.59 43.88 50.85 46.05 35.63 31.67
RoBERTa 56.74 32.25 24.26 36.65 50.82 35.12 27.35 25.41
RoBERTa+embs 58.74 43.19 31.19 44.61 54.6 46.54 36.17 32.1
XLNet 59.12 31.75 22.83 34.95 53.39 38.16 28.58 26.47
XLNet+embs 59.62 49.53 34.9 47.51 55.63 51.5 39.92 35.12

Results reproduction

Here we list XLNet reproduction commands that correspond to the results presented in the table above. Reproduction commands for all models you can find in scripts/lexsub-all-models.sh Besides saving to the 'run-directory' all results are saved using mlflow. To check them you can run mlflow ui in LexSubGen directory and then open the web page in a browser.

Also you can use pytest to check the reproducibility. But it may take a long time:

pytest tests/results_reproduction
  • XLNet:

XLNet Semeval07:

python lexsubgen/evaluations/lexsub.py solve --substgen-config-path configs/subst_generators/lexsub/xlnet.jsonnet --dataset-config-path configs/dataset_readers/lexsub/semeval_all.jsonnet --run-dir='debug/lexsub-all-models/semeval_all_xlnet' --force --experiment-name='lexsub-all-models' --run-name='semeval_all_xlnet'

XLNet CoInCo:

python lexsubgen/evaluations/lexsub.py solve --substgen-config-path configs/subst_generators/lexsub/xlnet.jsonnet --dataset-config-path configs/dataset_readers/lexsub/coinco.jsonnet --run-dir='debug/lexsub-all-models/coinco_xlnet' --force --experiment-name='lexsub-all-models' --run-name='coinco_xlnet'

XLNet with embeddings similarity Semeval07:

python lexsubgen/evaluations/lexsub.py solve --substgen-config-path configs/subst_generators/lexsub/xlnet_embs.jsonnet --dataset-config-path configs/dataset_readers/lexsub/semeval_all.jsonnet --run-dir='debug/lexsub-all-models/semeval_all_xlnet_embs' --force --experiment-name='lexsub-all-models' --run-name='semeval_all_xlnet_embs'

XLNet with embeddings similarity CoInCo:

python lexsubgen/evaluations/lexsub.py solve --substgen-config-path configs/subst_generators/lexsub/xlnet_embs.jsonnet --dataset-config-path configs/dataset_readers/lexsub/coinco.jsonnet --run-dir='debug/lexsub-all-models/coinco_xlnet_embs' --force --experiment-name='lexsub-all-models' --run-name='coinco_xlnet_embs'

Word Sense Induction Results

Model SemEval 2013 SemEval 2010
AVG AVG
XLNet 33.4 52.1
XLNet+embs 37.3 54.1

To reproduce these results use 2.3.0 version of transformers and the following command:

bash scripts/wsi.sh

Web application

You could use command line interface to run Web application.

# Run main server
lexsubgen-app run --host HOST 
                  --port PORT 
                  [--model-configs CONFIGS] 
                  [--start-ids START-IDS] 
                  [--start-all] 
                  [--restore-session]

Example:

# Run server and serve models BERT and XLNet. 
# For BERT create server for serving model and substitute generator instantly (load resources in memory).
# For XLNet create only server.
lexsubgen-app run --host '0.0.0.0' 
                  --port 5000 
                  --model-configs '["my_cool_configs/bert.jsonnet", "my_awesome_configs/xlnet.jsonnet"]' 
                  --start-ids '[0]'

# After shutting down server JSON file with session dumps in the '~/.cache/lexsubgen/app_session.json'.
# The content of this file looks like:
# [
#     'my_cool_configs/bert.jsonnet',
#     'my_awesome_configs/xlnet.jsonnet',
# ]
# You can restore it with flag 'restore-session'
lexsubgen-app run --host '0.0.0.0' 
                  --port 5000 
                  --restore-session
# BERT and XLNet restored now
Arguments:
Argument Default Description
--help Show this help message and exit
--host IP address of running server host
--port 5000 Port for starting the server
--model-configs [] List of file paths to the model configs.
--start-ids [] Zero-based indices of served models for which substitute generators will be created
--start-all False Whether to create substitute generators for all served models
--restore-session False Whether to restore session from previous Web application run

FAQ

  1. How to use gpu? - You can use environment variable CUDA_VISIBLE_DEVICES to use gpu for inference: export CUDA_VISIBLE_DEVICES='1' or CUDA_VISIBLE_DEVICES='1' before your command.
  2. How to run tests? - You can use pytest: pytest tests
Owner
Samsung
Samsung Electronics Co.,Ltd.
Samsung
Code for the paper: "On the Bottleneck of Graph Neural Networks and Its Practical Implications"

On the Bottleneck of Graph Neural Networks and its Practical Implications This is the official implementation of the paper: On the Bottleneck of Graph

75 Dec 22, 2022
Tensorflow Implementation of SMU: SMOOTH ACTIVATION FUNCTION FOR DEEP NETWORKS USING SMOOTHING MAXIMUM TECHNIQUE

SMU A Tensorflow Implementation of SMU: SMOOTH ACTIVATION FUNCTION FOR DEEP NETWORKS USING SMOOTHING MAXIMUM TECHNIQUE arXiv https://arxiv.org/abs/211

Fuhang 5 Jan 18, 2022
A MNIST-like fashion product database. Benchmark

Fashion-MNIST Table of Contents Why we made Fashion-MNIST Get the Data Usage Benchmark Visualization Contributing Contact Citing Fashion-MNIST License

Zalando Research 10.5k Jan 08, 2023
The repo for reproducing Seed-driven Document Ranking for Systematic Reviews: A Reproducibility Study

ECIR Reproducibility Paper: Seed-driven Document Ranking for Systematic Reviews: A Reproducibility Study This code corresponds to the reproducibility

ielab 3 Mar 31, 2022
RoMA: Robust Model Adaptation for Offline Model-based Optimization

RoMA: Robust Model Adaptation for Offline Model-based Optimization Implementation of RoMA: Robust Model Adaptation for Offline Model-based Optimizatio

9 Oct 31, 2022
The implementation of our CIKM 2021 paper titled as: "Cross-Market Product Recommendation"

FOREC: A Cross-Market Recommendation System This repository provides the implementation of our CIKM 2021 paper titled as "Cross-Market Product Recomme

Hamed Bonab 16 Sep 12, 2022
Source Code for AAAI 2022 paper "Graph Convolutional Networks with Dual Message Passing for Subgraph Isomorphism Counting and Matching"

Graph Convolutional Networks with Dual Message Passing for Subgraph Isomorphism Counting and Matching This repository is an official implementation of

HKUST-KnowComp 13 Sep 08, 2022
Fit Fast, Explain Fast

FastExplain Fit Fast, Explain Fast Installing pip install fast-explain About FastExplain FastExplain provides an out-of-the-box tool for analysts to

8 Dec 15, 2022
Official PyTorch implementation of the NeurIPS 2021 paper StyleGAN3

Alias-Free Generative Adversarial Networks (StyleGAN3) Official PyTorch implementation of the NeurIPS 2021 paper Alias-Free Generative Adversarial Net

Eugenio Herrera 92 Nov 18, 2022
SymPy-powered, Wolfram|Alpha-like answer engine totally in your browser, without backend computation

SymPy Beta SymPy Beta is a fork of SymPy Gamma. The purpose of this project is to run a SymPy-powered, Wolfram|Alpha-like answer engine totally in you

Liumeo 25 Dec 21, 2022
:hot_pepper: R²SQL: "Dynamic Hybrid Relation Network for Cross-Domain Context-Dependent Semantic Parsing." (AAAI 2021)

R²SQL The PyTorch implementation of paper Dynamic Hybrid Relation Network for Cross-Domain Context-Dependent Semantic Parsing. (AAAI 2021) Requirement

huybery 60 Dec 31, 2022
.NET bindings for the Pytorch engine

TorchSharp TorchSharp is a .NET library that provides access to the library that powers PyTorch. It is a work in progress, but already provides a .NET

Matteo Interlandi 17 Aug 30, 2021
Python scripts for performing lane detection using the LSTR model in ONNX

ONNX LSTR Lane Detection Python scripts for performing lane detection using the Lane Shape Prediction with Transformers (LSTR) model in ONNX. Requirem

Ibai Gorordo 29 Aug 30, 2022
Self-supervised Augmentation Consistency for Adapting Semantic Segmentation (CVPR 2021)

Self-supervised Augmentation Consistency for Adapting Semantic Segmentation This repository contains the official implementation of our paper: Self-su

Visual Inference Lab @TU Darmstadt 132 Dec 21, 2022
Official Pytorch implementation for AAAI2021 paper (RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning)

RSPNet Official Pytorch implementation for AAAI2021 paper "RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning" [Suppleme

35 Jun 24, 2022
A library of multi-agent reinforcement learning components and systems

Mava: a research framework for distributed multi-agent reinforcement learning Table of Contents Overview Getting Started Supported Environments System

InstaDeep Ltd 463 Dec 23, 2022
Bayesian Inference Tools in Python

BayesPy Bayesian Inference Tools in Python Our goal is, given the discrete outcomes of events, estimate the distribution of categories. Using gradient

Max Sklar 99 Dec 14, 2022
ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning. In ICCV, 2021.

ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning This repository contains the code for our ICCV 202

sangho.lee 28 Nov 08, 2022
This repository contains PyTorch models for SpecTr (Spectral Transformer).

SpecTr: Spectral Transformer for Hyperspectral Pathology Image Segmentation This repository contains PyTorch models for SpecTr (Spectral Transformer).

Boxiang Yun 45 Dec 13, 2022
Single-stage Keypoint-based Category-level Object Pose Estimation from an RGB Image

CenterPose Overview This repository is the official implementation of the paper "Single-stage Keypoint-based Category-level Object Pose Estimation fro

NVIDIA Research Projects 188 Dec 27, 2022