PyTorch implementation of the paper "Topic Modeling Revisited: A Document Graph-based Neural Network Perspective"

Last update: Sep 14, 2022

Graph Neural Topic Model (GNTM)

This is the PyTorch implementation of the paper "Topic Modeling Revisited: A Document Graph-based Neural Network Perspective".

Requirements

Python >= 3.6
PyTorch == 1.6.0
torch-geometric == 1.7.0
torch-scatter == 2.0.6
torch-sparse == 0.6.9
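
Because the packages above are pinned to exact versions, it can help to check an environment before running anything. The following is a minimal sketch, not part of this repository: the helper names `find_mismatches` and `installed_versions` are hypothetical, the pip distribution name for PyTorch is assumed to be `torch`, and `installed_versions` relies on `importlib.metadata`, which requires Python 3.8 or newer.

```python
# Pins copied from the Requirements list; "torch" is the assumed
# pip distribution name for PyTorch.
PINS = {
    "torch": "1.6.0",
    "torch-geometric": "1.7.0",
    "torch-scatter": "2.0.6",
    "torch-sparse": "0.6.9",
}


def base_version(version):
    """Strip a local build suffix such as '+cu101' from a version string."""
    return version.split("+", 1)[0]


def find_mismatches(installed, pins=PINS):
    """Return {package: (expected, found)} for every pin that is not met."""
    mismatches = {}
    for name, expected in pins.items():
        found = installed.get(name)
        if found is None or base_version(found) != expected:
            mismatches[name] = (expected, found)
    return mismatches


def installed_versions(pins=PINS):
    """Look up installed versions (None if a package is not installed)."""
    from importlib import metadata  # Python 3.8+ only

    versions = {}
    for name in pins:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

Calling `find_mismatches(installed_versions())` returns an empty dictionary when every pin is satisfied, and otherwise maps each offending package to its expected and installed versions.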

Dataset

Links to the datasets can be found below:

The GloVe word embeddings can be downloaded from this link.

Place the datasets and word embeddings at the paths specified in settings.py.

Usage

Before training GNTM, preprocess the data with the following scripts (some parameters need to be adjusted for each dataset, following the descriptions in our paper):

cd dataPrepare
python preprocess.py
python graph_data.py

Example script to train GNTM:

python main.py \
--device cuda:0 \
--dataset News20 \
--model GDGNNMODEL \
--num_topic 20 \
--num_epoch 400 \
--ni 300 \
--word \
--taskid 0 \
--nwindow 3

Here,

--dataset specifies the dataset name. It currently supports News20, TMN, BNC, and Reuters, for 20 Newsgroups, Tag My News, British National Corpus, and Reuters, respectively.
--device specifies the computation device, such as cpu or cuda:0.
--model specifies the model to use; GDGNNMODEL corresponds to GNTM.
--num_topic specifies the number of topics.
--num_epoch specifies the maximum number of training epochs.
--ni specifies the dimension of the word embeddings.
--taskid corresponds to the random seed.
--nwindow specifies the window size used to construct document graphs.
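
To illustrate what a window size means for a document graph, here is a minimal sketch of one common construction: words become nodes, and a directed edge is counted from each word to every word that follows it within the same window. This is an illustration under stated assumptions, not the logic in graph_data.py. The function name `build_document_graph` is hypothetical, and a window is assumed to span `nwindow` consecutive tokens.

```python
from collections import Counter


def build_document_graph(tokens, nwindow=3):
    """Count directed edges between words co-occurring within a window.

    Assumption: a window covers `nwindow` consecutive tokens, so each
    token is linked to the next `nwindow - 1` tokens that follow it.
    """
    edges = Counter()
    for i, source in enumerate(tokens):
        # Pair the current token with the tokens after it in the window.
        for j in range(i + 1, min(i + nwindow, len(tokens))):
            target = tokens[j]
            if source != target:  # skip self-loops
                edges[(source, target)] += 1
    return edges


doc = ["graph", "neural", "topic", "model"]
graph = build_document_graph(doc, nwindow=3)
```

With `nwindow=3`, "graph" is linked to "neural" and "topic" but not to "model", which lies outside its window. A larger window yields a denser graph.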

Reference

If you find our methods or code helpful, please cite the paper:

@inproceedings{shen2021topic,
  title={Topic Modeling Revisited: A Document Graph-based Neural Network Perspective},
  author={Shen, Dazhong and Qin, Chuan and Wang, Chao and Dong, Zheng and Zhu, Hengshu and Xiong, Hui},
  booktitle={Proceedings of Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS-2021)},
  year={2021}
}

Owner

Dazhong Shen