Pytorch implementation of “Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement”

Overview

Graph-to-Graph Transformers

Self-attention models, such as Transformer, have been hugely successful in a wide range of natural language processing (NLP) tasks, especially when combined with language-model pre-training, such as BERT.

We propose "Graph-to-Graph Transformer" and "Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement"(accepted to TACL) to generalize vanilla Transformer to encode graph structure, and builds the desired output graph.

Note : To use G2GTr model for transition-based dependency parsing, please refer to G2GTr repository.

Contents

Installation

Following packages should be included in your environment:

  • Python >= 3.7
  • PyTorch >= 1.4.0
  • Transformers(huggingface) = 2.4.1

The easier way is to run the following command:

conda env create -f environment.yml
conda activate rngtr

Quick Start

Graph-to-Graph Transformer architecture is general and can be applied to any NLP tasks which interacts with graphs. To use our implementation in your task, you just need to add BertGraphModel class to your code to encode both token-level and graph-level information. Here is a sample usage:

#Loading BertGraphModel and initialize it with available BERT models.
import torch
from parser.utils.graph import initialize_bertgraph,BertGraphModel
# inputing unlabelled graph with label size 5, and Layer Normalization of key
# you can load other BERT pre-trained models too.
encoder = initialize_bertgraph('bert-base-cased',layernorm_key=True,layernorm_value=False,
             input_label_graph=False,input_unlabel_graph=True,label_size=5)

#sample input
input = torch.tensor([[1,2],[3,4]])
graph = torch.tensor([ [[1,0],[0,1]],[[0,1],[1,0]] ])
graph_rel = torch.tensor([[0,1],[3,4]])
output = encoder(input_ids=input,graph_arc=graph,graph_rel=graph_rel)
print(output[0].shape)
## torch.Size([2, 2, 768])

# inputting labelled graph
encoder = initialize_bertgraph('bert-base-cased',layernorm_key=True,layernorm_value=False,
             input_label_graph=True,input_unlabel_graph=False,label_size=5)

#sample input
input = torch.tensor([[1,2],[3,4]])
graph = torch.tensor([ [[2,0],[0,3]],[[0,1],[4,0]] ])
output = encoder(input_ids=input,graph_arc=graph,)
print(output[0].shape)
## torch.Size([2, 2, 768])

If you just want to use BertGraphModel in your research, you can just import it from our repository:

from parser.utils.graph import BertGraphModel,BertGraphConfig
config = BertGraphConfig(YOUR-CONFIG)
config.add_graph_par(GRAPH-CONFIG)
encoder = BertGraphModel(config)

Data Pre-processing and Initial Parser

Dataset Preparation

We evaluated our model on UD Treebanks, English and Chinese Penn Treebanks, and CoNLL 2009 Shared Task. In following sections, we prepare datasets and their evaluation scripts.

Penn Treebanks

English Penn Treebank can be downloaded from english and chinese under LDC license. For English Penn Treebank, replace gold POS tags with Stanford POS tagger with following command in this repository:

bash scripts/postag.sh ${data_dir}/ptb3-wsj-[train|dev|dev.proj|test].conllx

CoNLL 2009 Treebanks

You can download Treebanks from here under LDC license. We use predicted POS tags provided by organizers.

UD Treebanks

You can find required Treebanks from here. (use version 2.3)

Initial Parser

As mentioned in our paper, you can use any initial parser to produce dependency graph. Here we use Biaffine Parser for Penn Treebanks, and German Corpus. We also apply our model to ouput prediction of UDify parser for UD Treebanks.
Biaffine Parser: To prepare biaffine initial parser, we use this repository to produce output predictions.
UDify Parser: For UD Treebanks, we use UDify repository to produce required initial dependency graph.
Alternatively, you can easily run the following command file to produce all required outputs:

bash job_scripts/udify_dataset.bash

Training

To train your own model, you can easily fill out the script in job_scripts directory, and run it. Here is the list of sample scripts:

Model Script
Syntactic Transformer baseline.bash
Any initial parser+RNGTr rngtr.bash
Empty+RNGTr empty_rngtr.bash

Evaluation

First you should download official scripts from UD, Penn Treebaks, and German. Then, run the following command:

bash job_scripts/predict.bash

To replicate refinement analysis and error analysis results, you should use MaltEval tools.

Predict Raw Sentences

You can also predict dependency graphs of raw texts with a pre-trained model by modifying predict.bash file. Just set input_type to raw. Then, put all your sentences in a .txt file, and the output will be in CoNNL format.

Citations

If you use this code for your research, please cite these works as:

@misc{mohammadshahi2020recursive,
      title={Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement}, 
      author={Alireza Mohammadshahi and James Henderson},
      year={2020},
      eprint={2003.13118},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
@inproceedings{mohammadshahi-henderson-2020-graph,
    title = "Graph-to-Graph Transformer for Transition-based Dependency Parsing",
    author = "Mohammadshahi, Alireza  and
      Henderson, James",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.findings-emnlp.294",
    pages = "3278--3289",
    abstract = "We propose the Graph2Graph Transformer architecture for conditioning on and predicting arbitrary graphs, and apply it to the challenging task of transition-based dependency parsing. After proposing two novel Transformer models of transition-based dependency parsing as strong baselines, we show that adding the proposed mechanisms for conditioning on and predicting graphs of Graph2Graph Transformer results in significant improvements, both with and without BERT pre-training. The novel baselines and their integration with Graph2Graph Transformer significantly outperform the state-of-the-art in traditional transition-based dependency parsing on both English Penn Treebank, and 13 languages of Universal Dependencies Treebanks. Graph2Graph Transformer can be integrated with many previous structured prediction methods, making it easy to apply to a wide range of NLP tasks.",
}

Have a question not listed here? Open a GitHub Issue or send us an email.

Owner
Idiap Research Institute
Idiap Research Institute
Libraries, tools and tasks created and used at DeepMind Robotics.

Libraries, tools and tasks created and used at DeepMind Robotics.

DeepMind 270 Nov 30, 2022
Self-Supervised Monocular DepthEstimation with Internal Feature Fusion(arXiv), BMVC2021

DIFFNet This repo is for Self-Supervised Monocular Depth Estimation with Internal Feature Fusion(arXiv), BMVC2021 A new backbone for self-supervised d

Hang 94 Dec 25, 2022
PaSST: Efficient Training of Audio Transformers with Patchout

PaSST: Efficient Training of Audio Transformers with Patchout This is the implementation for Efficient Training of Audio Transformers with Patchout Pa

165 Dec 26, 2022
Official pytorch implementation of "Scaling-up Disentanglement for Image Translation", ICCV 2021.

Official pytorch implementation of "Scaling-up Disentanglement for Image Translation", ICCV 2021.

Aviv Gabbay 41 Nov 29, 2022
Crawl & visualize ICLR papers and reviews

Crawl and Visualize ICLR 2022 OpenReview Data Descriptions This Jupyter Notebook contains the data crawled from ICLR 2022 OpenReview webpages and thei

Federico Berto 75 Dec 05, 2022
An Exact Solver for Semi-supervised Minimum Sum-of-Squares Clustering

PC-SOS-SDP: an Exact Solver for Semi-supervised Minimum Sum-of-Squares Clustering PC-SOS-SDP is an exact algorithm based on the branch-and-bound techn

Antonio M. Sudoso 1 Nov 13, 2022
💡 Learnergy is a Python library for energy-based machine learning models.

Learnergy: Energy-based Machine Learners Welcome to Learnergy. Did you ever reach a bottleneck in your computational experiments? Are you tired of imp

Gustavo Rosa 57 Nov 17, 2022
[CVPR'2020] DeepDeform: Learning Non-rigid RGB-D Reconstruction with Semi-supervised Data

DeepDeform (CVPR'2020) DeepDeform is an RGB-D video dataset containing over 390,000 RGB-D frames in 400 videos, with 5,533 optical and scene flow imag

Aljaz Bozic 165 Jan 09, 2023
Meta-learning for NLP

Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks Code for training the meta-learning models and fine-tuning on downstr

IESL 43 Nov 08, 2022
Spiking Neural Network for Computer Vision using SpikingJelly framework and Pytorch-Lightning

Spiking Neural Network for Computer Vision using SpikingJelly framework and Pytorch-Lightning

Sami BARCHID 2 Oct 20, 2022
Classification Modeling: Probability of Default

Credit Risk Modeling in Python Introduction: If you've ever applied for a credit card or loan, you know that financial firms process your information

Aktham Momani 2 Nov 07, 2022
The second project in Python course on FCC

Assignment Write a function named add_time that takes in two required parameters and one optional parameter: a start time in the 12-hour clock format

Denise T 1 Dec 13, 2021
This repo contains implementation of different architectures for emotion recognition in conversations.

Emotion Recognition in Conversations Updates 🔥 🔥 🔥 Date Announcements 03/08/2021 🎆 🎆 We have released a new dataset M2H2: A Multimodal Multiparty

Deep Cognition and Language Research (DeCLaRe) Lab 1k Dec 30, 2022
PyGCL: A PyTorch Library for Graph Contrastive Learning

PyGCL is a PyTorch-based open-source Graph Contrastive Learning (GCL) library, which features modularized GCL components from published papers, standa

PyGCL 588 Dec 31, 2022
Hierarchical Uniform Manifold Approximation and Projection

HUMAP Hierarchical Manifold Approximation and Projection (HUMAP) is a technique based on UMAP for hierarchical non-linear dimensionality reduction. HU

Wilson Estécio Marcílio Júnior 160 Jan 06, 2023
Text-Based Ideal Points

Text-Based Ideal Points Source code for the paper: Text-Based Ideal Points by Keyon Vafa, Suresh Naidu, and David Blei (ACL 2020). Update (June 29, 20

Keyon Vafa 37 Oct 09, 2022
A large-scale video dataset for the training and evaluation of 3D human pose estimation models

ASPset-510 (Australian Sports Pose Dataset) is a large-scale video dataset for the training and evaluation of 3D human pose estimation models. It contains 17 different amateur subjects performing 30

Aiden Nibali 25 Jun 20, 2021
DGL-TreeSearch and the Gurobi-MWIS interface

Independent Set Benchmarking Suite This repository contains the code for our maximum independent set benchmarking suite as well as our implementations

Maximilian Böther 19 Nov 22, 2022
Code and data for the EMNLP 2021 paper "Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts". Coming soon!

ToxiChat Code and data for the EMNLP 2021 paper "Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts". Install depen

Ashutosh Baheti 11 Jan 01, 2023
Catbird is an open source paraphrase generation toolkit based on PyTorch.

Catbird is an open source paraphrase generation toolkit based on PyTorch. Quick Start Requirements and Installation The project is based on PyTorch 1.

Afonso Salgado de Sousa 5 Dec 15, 2022