A toolkit for document-level event extraction, containing some SOTA model implementations

Overview

Document-level Event Extraction via Heterogeneous Graph-based Interaction Model with a Tracker

Source code for ACL-IJCNLP 2021 Long paper: Document-level Event Extraction via Heterogeneous Graph-based Interaction Model with a Tracker.

Our code is based on Doc2EDAG.

0. Introduction

Document-level event extraction aims to extract events within a document. Different from sentence-level event extraction, the arguments of an event record may scatter across sentences, which requires a comprehensive understanding of the cross-sentence context. Besides, a document may express several correlated events simultaneously, and recognizing the interdependency among them is fundamental to successful extraction. To tackle the aforementioned two challenges, We propose a novel heterogeneous Graph-based Interaction Model with a Tracker (GIT). A graph-based interaction network is introduced to capture the global context for the scattered event arguments across sentences with different heterogeneous edges. We also decode event records with a Tracker module, which tracks the extracted event records, so that the interdependency among events is taken into consideration. Our approach delivers better results over the state-of-the-art methods, especially in cross-sentence events and multiple events scenarios.

  • Architecture model overview

  • Overall Results

1. Package Description

GIT/
├─ dee/
    ├── __init__.py
    ├── base_task.py
    ├── dee_task.py
    ├── ner_task.py
    ├── dee_helper.py: data features constrcution and evaluation utils
    ├── dee_metric.py: data evaluation utils
    ├── config.py: process command arguments
    ├── dee_model.py: GIT model
    ├── ner_model.py
    ├── transformer.py: transformer module
    ├── utils.py: utils
├─ run_dee_task.py: the main entry
├─ train_multi.sh
├─ run_train.sh: script for training (including evaluation)
├─ run_eval.sh: script for evaluation
├─ Exps/: experiment outputs
├─ Data.zip
├─ Data: unzip Data.zip
├─ LICENSE
├─ README.md

2. Environments

  • python (3.6.9)
  • cuda (11.1)
  • Ubuntu-18.0.4 (5.4.0-73-generic)

3. Dependencies

  • numpy (1.19.5)
  • torch (1.8.1+cu111)
  • pytorch-pretrained-bert (0.4.0)
  • dgl-cu111 (0.6.1)
  • tensorboardX (2.2)

PS: The environments and dependencies listed here is different from what we use in our paper, so the results may be a bit different.

4. Preparation

  • Unzip Data.zip and you can get an Data folder, where the training/dev/test data locate.

5. Training

>> bash run_train.sh

6. Evaluation

>> bash run_eval.sh

(The evaluation is also conducted after the training)

7. License

This project is licensed under the MIT License - see the LICENSE file for details.

8. Citation

If you use this work or code, please kindly cite the following paper:

@inproceedings{xu-etal-2021-git,
    title = "Document-level Event Extraction via Heterogeneous Graph-based Interaction Model with a Tracker",
    author = "Runxin Xu  and
      Tianyu Liu  and
      Lei Li and
      Baobao Chang",
    booktitle = "The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021)",
    year = "2021",
    publisher = "Association for Computational Linguistics",
}
Owner
人生苦短 及时行乐
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism This repository is the official PyTorch implementation of our AAAI-2022 paper, in

Jinglin Liu 829 Jan 07, 2023
DeepSpeech - Easy-to-use Speech Toolkit including SOTA ASR pipeline, influential TTS with text frontend and End-to-End Speech Simultaneous Translation.

(简体中文|English) Quick Start | Documents | Models List PaddleSpeech is an open-source toolkit on PaddlePaddle platform for a variety of critical tasks i

5.6k Jan 03, 2023
The proliferation of disinformation across social media has led the application of deep learning techniques to detect fake news.

Fake News Detection Overview The proliferation of disinformation across social media has led the application of deep learning techniques to detect fak

Kushal Shingote 1 Feb 08, 2022
Python wrapper for Stanford CoreNLP tools v3.4.1

Python interface to Stanford Core NLP tools v3.4.1 This is a Python wrapper for Stanford University's NLP group's Java-based CoreNLP tools. It can eit

Dustin Smith 610 Sep 07, 2022
A linter to manage all your python exceptions and try/except blocks (limited only for those who like dinosaurs).

Manage your exceptions in Python like a PRO Currently in BETA. Inspired by this blog post. I shared the building process of this tool here. “For those

Guilherme Latrova 353 Dec 31, 2022
Translators - is a library which aims to bring free, multiple, enjoyable translation to individuals and students in Python

Translators - is a library which aims to bring free, multiple, enjoyable translation to individuals and students in Python

UlionTse 907 Dec 27, 2022
TFPNER: Exploration on the Named Entity Recognition of Token Fused with Part-of-Speech

TFPNER TFPNER: Exploration on the Named Entity Recognition of Token Fused with Part-of-Speech Named entity recognition (NER), which aims at identifyin

1 Feb 07, 2022
The ability of computer software to identify words and phrases in spoken language and convert them to human-readable text

speech-recognition-py Speech recognition is the ability of computer software to identify words and phrases in spoken language and convert them to huma

Deepangshi 1 Apr 03, 2022
Train 🤗-transformers model with Poutyne.

poutyne-transformers Train 🤗 -transformers models with Poutyne. Installation pip install poutyne-transformers Example import torch from transformers

Lennart Keller 2 Dec 18, 2022
⚖️ A Statutory Article Retrieval Dataset in French.

A Statutory Article Retrieval Dataset in French This repository contains the Belgian Statutory Article Retrieval Dataset (BSARD), as well as the code

Maastricht Law & Tech Lab 19 Nov 17, 2022
End-2-end speech synthesis with recurrent neural networks

Introduction New: Interactive demo using Google Colaboratory can be found here TTS-Cube is an end-2-end speech synthesis system that provides a full p

Tiberiu Boros 214 Dec 07, 2022
An official repository for tutorials of Probabilistic Modelling and Reasoning (2021/2022) - a University of Edinburgh master's course.

PMR computer tutorials on HMMs (2021-2022) This is a repository for computer tutorials of Probabilistic Modelling and Reasoning (2021/2022) - a Univer

Vaidotas Šimkus 10 Dec 06, 2022
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

Phil Wang 5k Jan 02, 2023
A Python script which randomly chooses and prints a file from a directory.

___ ____ ____ _ __ ___ / _ \ | _ \ | _ \ ___ _ __ | '__| / _ \ | |_| || | | || | | | / _ \| '__| | | | __/ | _ || |_| || |_| || __

yesmaybenookay 0 Aug 06, 2021
Unsupervised text tokenizer for Neural Network-based text generation.

SentencePiece SentencePiece is an unsupervised text tokenizer and detokenizer mainly for Neural Network-based text generation systems where the vocabu

Google 6.4k Jan 01, 2023
Reproducing the Linear Multihead Attention introduced in Linformer paper (Linformer: Self-Attention with Linear Complexity)

Linear Multihead Attention (Linformer) PyTorch Implementation of reproducing the Linear Multihead Attention introduced in Linformer paper (Linformer:

Kui Xu 58 Dec 23, 2022
Code for ACL 2020 paper "Rigid Formats Controlled Text Generation"

SongNet SongNet: SongCi + Song (Lyrics) + Sonnet + etc. @inproceedings{li-etal-2020-rigid, title = "Rigid Formats Controlled Text Generation",

Piji Li 212 Dec 17, 2022
Python SDK for working with Voicegain Speech-to-Text

Voicegain Speech-to-Text Python SDK Python SDK for the Voicegain Speech-to-Text API. This API allows for large vocabulary speech-to-text transcription

Voicegain 3 Dec 14, 2022
Tool which allow you to detect and translate text.

Text detection and recognition This repository contains tool which allow to detect region with text and translate it one by one. Description Two pretr

Damian Panek 176 Nov 28, 2022