Linear programming solver for paper-reviewer matching and mind-matching

Last update: Jul 05, 2022

Overview

Paper-Reviewer Matcher

A Python package for paper-reviewer matching based on topic modeling and linear programming. The algorithm is implemented based on this article. The package assigns papers to reviewers under constraints by solving a linear programming problem: it minimizes the global distance between papers and reviewers in topic space (the topic model can be, e.g., principal component analysis or Latent Semantic Analysis (LSA)).
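The formulation described above can be sketched as a small linear program. This is a minimal illustration, not the package's own code: it uses SciPy's linprog rather than or-tools, and the function name assign_reviewers and its two constraint parameters are hypothetical. It assumes a precomputed paper-by-reviewer distance matrix, gives every paper a fixed number of reviewers, and caps the load of each reviewer.

```python
import numpy as np
from scipy.optimize import linprog


def assign_reviewers(distance, reviewers_per_paper, max_papers_per_reviewer):
    """Minimize total paper-reviewer distance (hypothetical helper, for illustration)."""
    n_papers, n_reviewers = distance.shape
    n_vars = n_papers * n_reviewers  # x[i, j] flattened in row-major order

    # Each paper receives exactly `reviewers_per_paper` reviewers.
    A_eq = np.zeros((n_papers, n_vars))
    for i in range(n_papers):
        A_eq[i, i * n_reviewers:(i + 1) * n_reviewers] = 1
    b_eq = np.full(n_papers, reviewers_per_paper)

    # Each reviewer receives at most `max_papers_per_reviewer` papers.
    A_ub = np.zeros((n_reviewers, n_vars))
    for j in range(n_reviewers):
        A_ub[j, j::n_reviewers] = 1
    b_ub = np.full(n_reviewers, max_papers_per_reviewer)

    # Dual simplex returns a vertex solution; for this constraint structure
    # the vertices are 0/1, so thresholding at 0.5 only removes float noise.
    result = linprog(distance.ravel(), A_ub=A_ub, b_ub=b_ub,
                     A_eq=A_eq, b_eq=b_eq, bounds=(0, 1), method="highs-ds")
    if not result.success:
        raise ValueError(result.message)
    return result.x.reshape(n_papers, n_reviewers) > 0.5


# Toy data: 6 papers, 4 reviewers, random distances (assumed, not real data).
rng = np.random.default_rng(0)
distance = rng.random((6, 4))
assignment = assign_reviewers(distance, reviewers_per_paper=2,
                              max_papers_per_reviewer=3)
print(assignment.astype(int))
```

Row i of the printed matrix marks the reviewers chosen for paper i. Relaxing the 0/1 variables to the interval [0, 1] is what makes this a linear program rather than an integer program.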

Here is a diagram of the problem setup and how we solve the problem.

Mind-Match Command Line

Mind-Match is a session we run at the Cognitive Computational Neuroscience (CCN) conference. We use a combination of topic modeling and linear programming to solve the optimal matching problem. To run the example Mind-Match algorithm on a sample of 500 people, clone the repository and run the following

python mindmatch.py data/mindmatch_example.csv --n_match=6 --n_trim=50

in the root of this repo. This should produce a matching output file, output_match.csv, in the same location. However, when the number of people gets much larger, this script takes quite a long time to run. We pre-cluster people into groups before running the mind-matching to make the script run faster. Below is an example script for pre-clustering and mind-matching on all data:

python mindmatch_cluster.py data/mindmatch_example.csv --n_match=6 --n_trim=50 --n_clusters=4
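To illustrate why pre-clustering helps, here is a minimal sketch that splits people into groups and compares the number of candidate pairs before and after. It is an assumption-laden illustration: plain k-means on random topic vectors stands in for whatever clustering mindmatch_cluster.py actually performs, and the helper names kmeans_labels and split_into_groups are hypothetical.

```python
import numpy as np


def kmeans_labels(vectors, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of `vectors`."""
    rng = np.random.default_rng(seed)
    # Start from randomly chosen data points (fancy indexing returns a copy).
    centers = vectors[rng.choice(len(vectors), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        d2 = ((vectors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for c in range(n_clusters):
            members = vectors[labels == c]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[c] = members.mean(axis=0)
    return labels


def split_into_groups(vectors, n_clusters):
    """Return a list of index arrays, one per cluster (hypothetical helper)."""
    labels = kmeans_labels(vectors, n_clusters)
    return [np.flatnonzero(labels == c) for c in range(n_clusters)]


# Toy stand-in for topic vectors of 500 people (random, not real data).
rng = np.random.default_rng(1)
topic_vectors = rng.normal(size=(500, 10))
groups = split_into_groups(topic_vectors, n_clusters=4)

# The matching problem has one variable per pair of people, so its size
# grows with the square of the group size.
full_size = len(topic_vectors) ** 2
clustered_size = sum(len(g) ** 2 for g in groups)
print(full_size, clustered_size)
```

Each group can then be matched independently, at the cost of never pairing people who land in different groups.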

Example script for the conferences

Here, I include recent scripts for our Mind Matching session for the CCN conference.

ccn_mind_matching_2019.py contains the script for the Mind Matching session (matching scientists to scientists) for the CCN conference
ccn_paper_reviewer_matching.py contains the script for matching publications to reviewers for the CCN conference; see the example CSV files in the data folder

The code computes the topic distance between incoming papers and reviewers (for ccn_paper_reviewer_matching.py) and between people (for ccn_mind_matching_2019.py). We trim the distance matrix so that the problem is not too big to solve using or-tools. It then solves a linear programming problem to assign the best matches, which minimize the global distance between papers and reviewers. After that, we produce the output that can be used by the organizers of the CCN conference: pairs of papers and reviewers, or a mind-matching schedule between people at the conference. You can see how it works below.
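The first two steps, computing topic distances and trimming them, can be sketched with NumPy alone. This is a rough illustration under stated assumptions: the helpers lsa_vectors, cosine_distance, and trim_distances are hypothetical, the texts are made up, and the n_trim argument here (replace the largest distances in each row with a penalty) is only loosely modeled on the --n_trim flag, whose exact semantics are not described in this document.

```python
import numpy as np


def lsa_vectors(texts, n_topics=3):
    """Project bag-of-words counts onto the top singular directions (LSA)."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    index = {w: k for k, w in enumerate(vocab)}
    counts = np.zeros((len(texts), len(vocab)))
    for row, text in enumerate(texts):
        for word in text.lower().split():
            counts[row, index[word]] += 1
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :n_topics] * s[:n_topics]


def cosine_distance(a, b):
    """Pairwise cosine distance between the rows of `a` and the rows of `b`."""
    a = a / np.maximum(np.linalg.norm(a, axis=1, keepdims=True), 1e-12)
    b = b / np.maximum(np.linalg.norm(b, axis=1, keepdims=True), 1e-12)
    return 1.0 - a @ b.T


def trim_distances(distance, n_trim, penalty=1e6):
    """Replace the `n_trim` largest entries of each row with a large penalty,
    so those pairs are effectively excluded from the matching."""
    trimmed = distance.copy()
    if n_trim > 0:
        worst = np.argsort(distance, axis=1)[:, -n_trim:]
        np.put_along_axis(trimmed, worst, penalty, axis=1)
    return trimmed


# Made-up example texts (not taken from the repository's data folder).
papers = ["neural network models of vision",
          "bayesian inference in decision making"]
reviewers = ["deep neural network vision",
             "bayesian decision theory",
             "motor control and learning"]

# Fit LSA on all texts together so papers and reviewers share one topic space.
vectors = lsa_vectors(papers + reviewers, n_topics=3)
distance = cosine_distance(vectors[:len(papers)], vectors[len(papers):])
trimmed = trim_distances(distance, n_trim=1)
print(np.round(distance, 2))
```

The trimmed matrix would then be the cost input to the linear program, and the resulting assignment would be written out as paper-reviewer pairs or a mind-matching schedule.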

Dependencies

Use pip to install dependencies

pip install -r requirements.txt

Please see Stack Overflow if you have a problem installing or-tools on macOS. You can use pip to install protobuf before installing or-tools:

pip install protobuf==3.0.0b4
pip install ortools

For Python 3.6, use

pip install --user --upgrade ortools

Citations

If you use Paper-Reviewer Matcher in your work or conference, please cite us as follows

@misc{achakulvisut2018,
    author = {Achakulvisut, Titipat and Acuna, Daniel E. and Kording, Konrad},
    title = {Paper-Reviewer Matcher},
    year = {2018},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/titipata/paper-reviewer-matcher}},
    commit = {9d346ee008e2789d34034c2b330b6ba483537674}
}

Members

Daniel Acuna (original author)
Titipat Achakulvisut (refactor)
Konrad Kording

Owner

Titipat Achakulvisut
