README

Code for the paper "Asymptotics of ℓ2 Regularized Network Embeddings".

Requirements

Requires Stellargraph 1.2.1, Tensorflow 2.6.0, scikit-learn 0.24.1, and tqdm, along with any other packages required by these.

Code

To run node classification or link prediction experiments, run

python -m code.train_embed [[args]]

python -m code.train_embed_link [[args]]

from the command line respectively, where [[args]] correspond to the command line arguments for each function. Note that the scripts expect to run from the parent directory of the code folder; you will need to change the import statements in the associated python files if you move them around. The -h command line argument will display the arguments (with descriptions) of each of the two files.
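As a rough illustration of how the invocations above might be scripted, the sketch below assembles an argument list for `python -m code.train_embed` from keyword options. The helper name `build_command` is hypothetical and not part of this repository; the flag names are taken from the argument tables in this README. Only scalar and boolean options are handled, since how the scripts parse list-valued options is not stated here.

```python
import sys


def build_command(module, **options):
    """Build an argv list for `python -m <module>` (hypothetical helper).

    Keyword names are mapped to flags by replacing underscores with
    hyphens, e.g. reg_weight -> --reg-weight.
    """
    cmd = [sys.executable, "-m", module]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            # Toggle-style flag such as --norm-reg: no value follows it.
            cmd.append(flag)
        elif value is False or value is None:
            # Leave the flag out so the script's own default applies.
            continue
        else:
            cmd.extend([flag, str(value)])
    return cmd


cmd = build_command(
    "code.train_embed",
    dataset="Cora",
    method="node2vec",
    reg_weight=0.001,
    norm_reg=True,
)
# Show the command without the interpreter path.
print(" ".join(cmd[1:]))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)`, launched from the parent directory of the code folder as noted above.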

train_embed.py arguments

short	long	default	help
`-h`	`--help`		show this help message and exit
	`--dataset`	`Cora`	Dataset to perform training on. Available options: Cora,CiteSeer,PubMedDiabetes
	`--emb-size`	`128`	Embedding dimension. Defaults to 128.
	`--reg-weight`	`0.0`	Weight to use for L2 regularization. If norm_reg is True, then reg_weight/num_of_nodes is used instead.
	`--norm-reg`		Boolean for whether to normalize the L2 regularization weight by the number of nodes in the graph. Defaults to false.
	`--method`	`node2vec`	Algorithm to perform training on. Available options: node2vec,GraphSAGE,GCN,DGI
	`--verbose`	`1`	Level of verbosity. Defaults to 1.
	`--epochs`	`5`	Number of epochs through the dataset to be used for training.
	`--optimizer`	`Adam`	Optimization algorithm to use for training.
	`--learning-rate`	`0.001`	Learning rate to use for optimization.
	`--batch-size`	`64`	Batch size used for training.
	`--train-split`	`[0.01, 0.025, 0.05]`	Fraction(s) to use for the training split when using the learned embeddings for downstream classification tasks.
	`--train-split-num`	`25`	Number of random training/test splits to use for evaluating performance. Defaults to 25.
	`--output-fname`	`None`	If not None, saves the hyperparameters and testing results to a .json file with filename given by the argument.
	`--node2vec-p`	`1.0`	Hyperparameter governing probability of returning to source node.
	`--node2vec-q`	`1.0`	Hyperparameter governing probability of moving to a node away from the source node.
	`--node2vec-walk-number`	`50`	Number of walks used to generate a sample for node2vec.
	`--node2vec-walk-length`	`5`	Walk length to use for node2vec.
	`--dgi-sampler`	`fullbatch`	Specifies either a fullbatch or a minibatch sampling scheme for DGI.
	`--gcn-activation`	`['relu']`	Determines the activations of each layer within a GCN. Defaults to a single layer with relu activation.
	`--graphSAGE-aggregator`	`mean`	Specifies the aggregation rule used in GraphSAGE. Defaults to mean pooling.
	`--graphSAGE-nbhd-sizes`	`[10, 5]`	Specify multiple neighbourhood sizes for sampling in GraphSAGE. Defaults to [10, 5].
	`--tensorboard`		If toggled, saves Tensorboard logs for debugging purposes.
	`--visualize-embeds`	`None`	If specified with a directory, saves an image of a TSNE 2D projection of the learned embeddings at the specified directory.
	`--save-spectrum`	`None`	If specified, saves the spectrum of the learned embeddings output by the algorithm.
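
The interaction between `--reg-weight` and `--norm-reg` described in the table can be sketched as follows. This is a minimal illustration of the stated rule (reg_weight/num_of_nodes when normalization is on), not the repository's actual code; the function name `effective_reg_weight` and the node count in the example are assumptions.

```python
def effective_reg_weight(reg_weight: float, norm_reg: bool, num_nodes: int) -> float:
    """Return the L2 weight actually applied (hypothetical helper).

    With norm_reg enabled, the weight is divided by the number of
    nodes in the graph; otherwise it is used unchanged.
    """
    if norm_reg:
        if num_nodes <= 0:
            raise ValueError("num_nodes must be positive when norm_reg is set")
        return reg_weight / num_nodes
    return reg_weight


# Illustrative graph size only; not taken from any of the datasets.
print(effective_reg_weight(0.5, norm_reg=True, num_nodes=1000))
print(effective_reg_weight(0.5, norm_reg=False, num_nodes=1000))
```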

train_embed_link.py arguments

short	long	default	help
`-h`	`--help`		show this help message and exit
	`--dataset`	`Cora`	Dataset to perform training on. Available options: Cora,CiteSeer,PubMedDiabetes
	`--emb-size`	`128`	Embedding dimension. Defaults to 128.
	`--reg-weight`	`0.0`	Weight to use for L2 regularization. If norm_reg is True, then reg_weight/num_of_nodes is used instead.
	`--norm-reg`		Boolean for whether to normalize the L2 regularization weight by the number of nodes in the graph. Defaults to false.
	`--method`	`node2vec`	Algorithm to perform training on. Available options: node2vec,GraphSAGE,GCN,DGI
	`--verbose`	`1`	Level of verbosity. Defaults to 1.
	`--epochs`	`5`	Number of epochs through the dataset to be used for training.
	`--optimizer`	`Adam`	Optimization algorithm to use for training.
	`--learning-rate`	`0.001`	Learning rate to use for optimization.
	`--batch-size`	`64`	Batch size used for training.
	`--test-split`	`0.1`	Split of edge/non-edge set to be used for testing.
	`--output-fname`	`None`	If not None, saves the hyperparameters and testing results to a .json file with filename given by the argument.
	`--node2vec-p`	`1.0`	Hyperparameter governing probability of returning to source node.
	`--node2vec-q`	`1.0`	Hyperparameter governing probability of moving to a node away from the source node.
	`--node2vec-walk-number`	`50`	Number of walks used to generate a sample for node2vec.
	`--node2vec-walk-length`	`5`	Walk length to use for node2vec.
	`--gcn-activation`	`['relu']`	Specifies layers in terms of their output activation (either relu or linear), with the number of arguments determining the length of the GCN. Defaults to a single layer with relu activation.
	`--graphSAGE-aggregator`	`mean`	Specifies the aggregation rule used in GraphSAGE. Defaults to mean pooling.
	`--graphSAGE-nbhd-sizes`	`[10, 5]`	Specify multiple neighbourhood sizes for sampling in GraphSAGE. Defaults to [10, 5].
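
For the list-valued options (`--gcn-activation`, `--graphSAGE-nbhd-sizes`), one plausible way such arguments could be declared is with argparse and `nargs="+"`, so that the number of values given sets the number of layers. The sketch below is an assumption about the parsing style, not a copy of the repository's parser; the actual scripts may accept these values differently, so check `-h` before relying on it.

```python
import argparse

# Hypothetical parser covering only the two list-valued options.
parser = argparse.ArgumentParser(description="Sketch of list-valued options")
parser.add_argument(
    "--gcn-activation",
    nargs="+",
    default=["relu"],
    help="One activation per GCN layer (relu or linear).",
)
parser.add_argument(
    "--graphSAGE-nbhd-sizes",
    nargs="+",
    type=int,
    default=[10, 5],
    help="One neighbourhood sample size per GraphSAGE layer.",
)

# Two activations are passed, so this would describe a two-layer GCN.
args = parser.parse_args(["--gcn-activation", "relu", "linear"])
num_gcn_layers = len(args.gcn_activation)
print(args.gcn_activation, num_gcn_layers, args.graphSAGE_nbhd_sizes)
```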


Owner

Andrew Davison
