TigerLily: Finding drug interactions in silico with the Graph.

Overview

PyPI Version Docs Status Code Coverage Build Status Arxiv


Drug Interaction Prediction with Tigerlily

Documentation | Example Notebook | Youtube Video | Project Report

Tigerlily is a TigerGraph based system designed to solve the drug interaction prediction task. In this machine learning task, we want to predict whether two drugs have an adverse interaction. Our framework allows us to solve this highly relevant real-world problem using graph mining techniques in these steps:


(A) Creating and populating a Graph

As a first step, the basic TigerLily tools are imported, and we load the example dataset that integrated DrugBankDDI and the BioSNAP datasets. We create a PersonalizedPageRankMachine and connect to the host with the Graph. The settings of this machine should be the appropriate user credentials and details; a secret is obtained in the TigerGraph Graph Studio. We install the default Personalized PageRank query and upload the edges of the example dataset used in our demonstrations. This graph has drug and protein nodes, drug-protein and protein-protein interactions. Our goal is to predict the drug-drug interactions.

from tigerlily.dataset import ExampleDataset
from tigerlily.embedding import EmbeddingMachine
from tigerlily.operator import hadamard_operator
from tigerlily.pagerank import PersonalizedPageRankMachine

dataset = ExampleDataset()

edges = dataset.read_edges()
target = dataset.read_target()

machine = PersonalizedPageRankMachine(host="host_name",
                                      graphname="graph_name",
                                      username="username_value",
                                      secret="secret_value",
                                      password="password_value")
                           
machine.connect()
machine.install_query()

machine.upload_graph(new_graph=True, edges=edges)

(B) Computing the Approximate Personalized PageRank vectors

We are only interested in describing the neighbourhood of drug nodes in the biological graph. Because of this, we only retrieve the neighbourhood of the drugs - for each drug we retrieve those nodes (top-k closest neighbors) which are the closest based on the Personalized PageRank scores. We are going to learn the drug embeddings based on these scores.

drug_node_ids = machine.connection.getVertices("drug")

pagerank_scores = machine.get_personalized_pagerank(drug_node_ids)

(C) Learning the Drug Embeddings and Drug Pair Feature Generation

We create an embedding machine that creates drug node representations. The embedding machine instance has a random seed, a dimensions hyperparameter (this sets the number of factors), and a maximal iteration count for the factorization. An embedding is learned from the Personalized PageRank scores and using the drug features we create drug pair features with the operator function.

embedding_machine = EmbeddingMachine(seed=42,
                                     dimensions=32,
                                     max_iter=100)

embedding = embedding_machine.fit(pagerank_scores)

drug_pair_features = embedding_machine.create_features(target, hadamard_operator)

(D) Predicting Drug Interactions and Inference

We load a gradient boosting-based classifier, an evaluation metric for binary classification, and a function to create train-test splits. We create a train and test portion of the drug pairs using 80% of the pairs for training. A gradient boosted tree model is trained, score the model on the test set. We compute an AUROC score on the test portion of the dataset and print it out.

from lightgbm import LGBMClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(drug_pair_features,
                                                    target,
                                                    train_size=0.8,
                                                    random_state=42)

model = LGBMClassifier(learning_rate=0.01,
                       n_estimators=100)

model.fit(X_train,y_train["label"])

predicted_label = model.predict_proba(X_test)

auroc_score_value = roc_auc_score(y_test["label"], predicted_label[:,1])

print(f'AUROC score: {auroc_score_value :.4f}')

Head over to the documentation to find out more about installation and a full API reference. For a quick start, check out the example notebook. If you notice anything unexpected, please open an issue.


Citing

If you find Tigerlily useful in your research, please consider adding the following citation:

@misc{tigerlily2022,
  author = {Benedek Rozemberczki},
  title = {TigerLily: Finding drug interactions in silico with the Graph},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/benedekrozemberczki/tigerlily}},
}

Installation

To install tigerlily, simply run:

pip install tigerlily

Running tests

Running tests requires that you run:

$ tox -e py

License


Credit

The TigerLily logo and the high level machine learning workflow image are based on:

Benedek Rozemberczki has a yearly subscription to the Noun Project that allows the customization and commercial use of the icons.

You might also like...
git《Self-Attention Attribution: Interpreting Information Interactions Inside Transformer》(AAAI 2021) GitHub:

Self-Attention Attribution This repository contains the implementation for AAAI-2021 paper Self-Attention Attribution: Interpreting Information Intera

Semi-Supervised 3D Hand-Object Poses Estimation with Interactions in Time
Semi-Supervised 3D Hand-Object Poses Estimation with Interactions in Time

Semi Hand-Object Semi-Supervised 3D Hand-Object Poses Estimation with Interactions in Time (CVPR 2021).

Official code for
Official code for "Focal Self-attention for Local-Global Interactions in Vision Transformers"

Focal Transformer This is the official implementation of our Focal Transformer -- "Focal Self-attention for Local-Global Interactions in Vision Transf

A simple API wrapper for Discord interactions.
A simple API wrapper for Discord interactions.

Your ultimate Discord interactions library for discord.py. About | Installation | Examples | Discord | PyPI About What is discord-py-interactions? dis

Neon: an add-on for Lightbulb making it easier to handle component interactions

Neon Neon is an add-on for Lightbulb making it easier to handle component interactions. Installation pip install git+https://github.com/neonjonn/light

A sample pytorch Implementation of ACL 2021 research paper "Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction".

Span-ASTE-Pytorch This repository is a pytorch version that implements Ali's ACL 2021 research paper Learning Span-Level Interactions for Aspect Senti

Implementation of TransGanFormer, an all-attention GAN that combines the finding from the recent GanFormer and TransGan paper

TransGanFormer (wip) Implementation of TransGanFormer, an all-attention GAN that combines the finding from the recent GansFormer and TransGan paper. I

[CVPR21] LightTrack: Finding Lightweight Neural Network for Object Tracking via One-Shot Architecture Search
[CVPR21] LightTrack: Finding Lightweight Neural Network for Object Tracking via One-Shot Architecture Search

LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search The official implementation of the paper LightTra

Finding an Unsupervised Image Segmenter in each of your Deep Generative Models

Finding an Unsupervised Image Segmenter in each of your Deep Generative Models Description Recent research has shown that numerous human-interpretable

Releases(v0.1.0)
Owner
Benedek Rozemberczki
Machine Learning Engineer at AstraZeneca | PhD from The University of Edinburgh.
Benedek Rozemberczki
Source code of our work: "Benchmarking Deep Models for Salient Object Detection"

SALOD Source code of our work: "Benchmarking Deep Models for Salient Object Detection". In this works, we propose a new benchmark for SALient Object D

22 Dec 30, 2022
Fast and robust clustering of point clouds generated with a Velodyne sensor.

Depth Clustering This is a fast and robust algorithm to segment point clouds taken with Velodyne sensor into objects. It works with all available Velo

Photogrammetry & Robotics Bonn 957 Dec 21, 2022
Python scripts form performing stereo depth estimation using the HITNET model in Tensorflow Lite.

TFLite-HITNET-Stereo-depth-estimation Python scripts form performing stereo depth estimation using the HITNET model in Tensorflow Lite. Stereo depth e

Ibai Gorordo 22 Oct 20, 2022
A decent AI that solves daily Wordle puzzles. Works with different websites with similar wordlists,.

Wordle-AI A decent AI that solves daily "Wordle" puzzles. Works with different websites with similar wordlists. When prompted with "Word:" enter the w

Ethan 1 Feb 10, 2022
A parametric soroban written with CADQuery.

A parametric soroban written in CADQuery The purpose of this project is to demonstrate how "code CAD" can be intuitive to learn. See soroban.py for a

Lee 4 Aug 13, 2022
MLP-Numpy - A simple modular implementation of Multi Layer Perceptron in pure Numpy.

MLP-Numpy A simple modular implementation of Multi Layer Perceptron in pure Numpy. I used the Iris dataset from scikit-learn library for the experimen

Soroush Omranpour 1 Jan 01, 2022
[CVPR'21] MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation

MonoRUn MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation. CVPR 2021. [paper] Hansheng Chen, Yuyao Huang, Wei Tian*

同济大学智能汽车研究所综合感知研究组 ( Comprehensive Perception Research Group under Institute of Intelligent Vehicles, School of Automotive Studies, Tongji University) 96 Dec 10, 2022
Warning: This project does not have any current developer. See bellow.

Pylearn2: A machine learning research library Warning : This project does not have any current developer. We will continue to review pull requests and

Laboratoire d’Informatique des Systèmes Adaptatifs 2.7k Dec 26, 2022
ICNet and PSPNet-50 in Tensorflow for real-time semantic segmentation

Real-Time Semantic Segmentation in TensorFlow Perform pixel-wise semantic segmentation on high-resolution images in real-time with Image Cascade Netwo

Oles Andrienko 219 Nov 21, 2022
RODD: A Self-Supervised Approach for Robust Out-of-Distribution Detection

RODD Official Implementation of 2022 CVPRW Paper RODD: A Self-Supervised Approach for Robust Out-of-Distribution Detection Introduction: Recent studie

Umar Khalid 17 Oct 11, 2022
Spatial-Temporal Transformer for Dynamic Scene Graph Generation, ICCV2021

Spatial-Temporal Transformer for Dynamic Scene Graph Generation Pytorch Implementation of our paper Spatial-Temporal Transformer for Dynamic Scene Gra

Yuren Cong 119 Jan 01, 2023
Self-Guided Contrastive Learning for BERT Sentence Representations

Self-Guided Contrastive Learning for BERT Sentence Representations This repository is dedicated for releasing the implementation of the models utilize

Taeuk Kim 16 Dec 04, 2022
A TikTok-like recommender system for GitHub repositories based on Gorse

GitRec GitRec is the missing recommender system for GitHub repositories based on Gorse. Architecture The trending crawler crawls trending repositories

337 Jan 04, 2023
This package contains a PyTorch Implementation of IB-GAN of the submitted paper in AAAI 2021

The PyTorch implementation of IB-GAN model of AAAI 2021 This package contains a PyTorch implementation of IB-GAN presented in the submitted paper (IB-

Insu Jeon 9 Mar 30, 2022
A library for preparing, training, and evaluating scalable deep learning hybrid recommender systems using PyTorch.

collie Collie is a library for preparing, training, and evaluating implicit deep learning hybrid recommender systems, named after the Border Collie do

ShopRunner 96 Dec 29, 2022
Official implementation for the paper: Multi-label Classification with Partial Annotations using Class-aware Selective Loss

Multi-label Classification with Partial Annotations using Class-aware Selective Loss Paper | Pretrained models Official PyTorch Implementation Emanuel

99 Dec 27, 2022
Implementation for paper "Towards the Generalization of Contrastive Self-Supervised Learning"

Contrastive Self-Supervised Learning on CIFAR-10 Paper "Towards the Generalization of Contrastive Self-Supervised Learning", Weiran Huang, Mingyang Yi

Weiran Huang 13 Nov 30, 2022
Lyapunov-guided Deep Reinforcement Learning for Stable Online Computation Offloading in Mobile-Edge Computing Networks

PyTorch code to reproduce LyDROO algorithm [1], which is an online computation offloading algorithm to maximize the network data processing capability subject to the long-term data queue stability an

Liang HUANG 87 Dec 28, 2022
Autonomous Movement from Simultaneous Localization and Mapping

Autonomous Movement from Simultaneous Localization and Mapping About us Built by a group of Clarkson University students with the help from Professor

14 Nov 07, 2022
Improving XGBoost survival analysis with embeddings and debiased estimators

xgbse: XGBoost Survival Embeddings "There are two cultures in the use of statistical modeling to reach conclusions from data

Loft 242 Dec 30, 2022