Deep Survival Machines - Fully Parametric Survival Regression

Overview

Build Status     codecov     License: MIT     GitHub Repo stars

Package: dsm

Python package dsm provides an API to train the Deep Survival Machines and associated models for problems in survival analysis. The underlying model is implemented in pytorch.

For full documentation of the module, please see https://autonlab.github.io/DeepSurvivalMachines/

What is Survival Analysis?

Survival Analysis involves estimating when an event of interest, T would take place given some features or covariates X. In statistics and ML, these scenarios are modelled as regression to estimate the conditional survival distribution, P(T>t|X).
As compared to typical regression problems, Survival Analysis differs in two major ways:

  • The Event distribution, T has positive support i.e. T ∈ [0, ∞).
  • There is presence of censoring i.e. a large number of instances of data are lost to follow up.

Deep Survival Machines

Deep Survival Machines (DSM) is a fully parametric approach to model Time-to-Event outcomes in the presence of Censoring, first introduced in [1]. In the context of Healthcare ML and Biostatistics, this is known as 'Survival Analysis'. The key idea behind Deep Survival Machines is to model the underlying event outcome distribution as a mixure of some fixed ( K ) parametric distributions. The parameters of these mixture distributions as well as the mixing weights are modelled using Neural Networks.

Usage Example

from dsm import DeepSurvivalMachines
model = DeepSurvivalMachines()
model.fit()
model.predict_risk()

Recurrent Deep Survival Machines

Recurrent Deep Survival Machines (RDSM) builds on the original DSM model and allows for learning of representations of the input covariates using Recurrent Neural Networks like LSTMs, GRUs. Deep Recurrent Survival Machines is a natural fit to model problems where there are time dependendent covariates.

Deep Convolutional Survival Machines

Predictive maintenance and medical imaging sometimes requires to work with image streams. Deep Convolutional Survival Machines extends DSM and DRSM to learn representations of the input image data using convolutional layers. If working with streaming data, the learnt representations are then passed through an LSTM to model temporal dependencies before determining the underlying survival distributions.

⚠️ Not Implemented Yet!

Deep Cox Mixtures

The Cox Mixture involves the assumption that the survival function of the individual to be a mixture of K Cox Models. Conditioned on each subgroup Z=k; the PH assumptions are assumed to hold and the baseline hazard rates is determined non-parametrically using an spline-interpolated Breslow's estimator. For full details on Deep Cox Mixture, refer to the paper:

Deep Cox Mixtures for Survival Regression. Machine Learning in Health Conference (2021)

Installation

[email protected]:~$ git clone https://github.com/autonlab/DeepSurvivalMachines.git
[email protected]:~$ cd DeepSurvivalMachines
[email protected]:~$ pip install -r requirements.txt

Examples

  1. Deep Survival Machines on the SUPPORT Dataset
  2. Recurrent Deep Survival Machines on the PBC Dataset

References

Please cite the following papers if you are using the dsm package.

[1] Deep Survival Machines: Fully Parametric Survival Regression and Representation Learning for Censored Data with Competing Risks. IEEE Journal of Biomedical & Health Informatics (2021)

  @article{nagpal2021deep,
  title={Deep Survival Machines: Fully Parametric Survival Regression and\
  Representation Learning for Censored Data with Competing Risks},
  author={Nagpal, Chirag and Li, Xinyu and Dubrawski, Artur},
  journal={IEEE Journal of Biomedical and Health Informatics},
  year={2021}
  }

[2] Deep Parametric Time-to-Event Regression with Time-Varying Covariates. AAAI Spring Symposium (2021)

@InProceedings{pmlr-v146-nagpal21a,
  title = 	 {Deep Parametric Time-to-Event Regression with Time-Varying Covariates},
  author =       {Nagpal, Chirag and Jeanselme, Vincent and Dubrawski, Artur},
  booktitle = 	 {Proceedings of AAAI Spring Symposium on Survival Prediction - Algorithms, Challenges, and Applications 2021},
  series = 	 {Proceedings of Machine Learning Research},
  publisher =    {PMLR},
  }

[3] Deep Cox Mixtures for Survival Regression. Machine Learning for Healthcare (2021)

@InProceedings{nagpal2021dcm,
  title={Deep Cox Mixtures for Survival Regression},
  author={Nagpal, Chirag and Yadlowsky, Steve and Rostamzadeh, Negar and Heller, Katherine},
  booktitle={Proceedings of the 6th Machine Learning for Healthcare Conference},
  pages={674--708},
  year={2021},
  volume={149},
  series={Proceedings of Machine Learning Research},
  publisher={PMLR},
}

Compatibility

dsm requires python 3.5+ and pytorch 1.1+.

To evaluate performance using standard metrics dsm requires scikit-survival.

Contributing

dsm is on GitHub. Bug reports and pull requests are welcome.

License

MIT License

Copyright (c) 2020 Carnegie Mellon University, Auton Lab

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Owner
Carnegie Mellon University Auton Lab
Carnegie Mellon University Auton Lab
A python library for easy manipulation and forecasting of time series.

Time Series Made Easy in Python darts is a python library for easy manipulation and forecasting of time series. It contains a variety of models, from

Unit8 5.2k Jan 04, 2023
Production Grade Machine Learning Service

This project is made to help you scale from a basic Machine Learning project for research purposes to a production grade Machine Learning web service

Abdullah Zaiter 10 Apr 04, 2022
Practical Time-Series Analysis, published by Packt

Practical Time-Series Analysis This is the code repository for Practical Time-Series Analysis, published by Packt. It contains all the supporting proj

Packt 325 Dec 23, 2022
A Python implementation of FastDTW

fastdtw Python implementation of FastDTW [1], which is an approximate Dynamic Time Warping (DTW) algorithm that provides optimal or near-optimal align

tanitter 651 Jan 04, 2023
Machine Learning Algorithms ( Desion Tree, XG Boost, Random Forest )

implementation of machine learning Algorithms such as decision tree and random forest and xgboost on darasets then compare results for each and implement ant colony and genetic algorithms on tsp map,

Mohamadreza Rezaei 1 Jan 19, 2022
TIANCHI Purchase Redemption Forecast Challenge

TIANCHI Purchase Redemption Forecast Challenge

Haorui HE 4 Aug 26, 2022
AutoTabular automates machine learning tasks enabling you to easily achieve strong predictive performance in your applications.

AutoTabular AutoTabular automates machine learning tasks enabling you to easily achieve strong predictive performance in your applications. With just

wenqi 2 Jun 26, 2022
A naive Bayes model for cancer classification using a set of documents

Naivebayes text classifcation model for cancer and noncancer documents Author: Alex King Purpose Requirements/files included How to use 1. Purpose The

Alex W King 1 Nov 24, 2021
This repository demonstrates the usage of hover to understand and supervise a machine learning task.

Hover Example Apps (works out-of-the-box on Binder) This repository demonstrates the usage of hover to understand and supervise a machine learning tas

Pavel 43 Dec 03, 2021
Hierarchical Time Series Forecasting using Prophet

htsprophet Hierarchical Time Series Forecasting using Prophet Credit to Rob J. Hyndman and research partners as much of the code was developed with th

Collin Rooney 131 Dec 02, 2022
Evidently helps analyze machine learning models during validation or production monitoring

Evidently helps analyze machine learning models during validation or production monitoring. The tool generates interactive visual reports and JSON profiles from pandas DataFrame or csv files. Current

Evidently AI 3.1k Jan 07, 2023
Create large-scale ML-driven multiscale simulation ensembles to study the interactions

MuMMI RAS v0.1 Released: Nov 16, 2021 MuMMI RAS is the application component of the MuMMI framework developed to create large-scale ML-driven multisca

4 Feb 16, 2022
A quick reference guide to the most commonly used patterns and functions in PySpark SQL

Using PySpark we can process data from Hadoop HDFS, AWS S3, and many file systems. PySpark also is used to process real-time data using Streaming and

Sundar Ramamurthy 53 Dec 21, 2022
Iris-Heroku - Putting a Machine Learning Model into Production with Flask and Heroku

Puesta en Producción de un modelo de aprendizaje automático con Flask y Heroku L

Jesùs Guillen 1 Jun 03, 2022
Distributed Deep learning with Keras & Spark

Elephas: Distributed Deep Learning with Keras & Spark Elephas is an extension of Keras, which allows you to run distributed deep learning models at sc

Max Pumperla 1.6k Dec 29, 2022
Course files for "Ocean/Atmosphere Time Series Analysis"

time-series This package contains all necessary files for the course Ocean/Atmosphere Time Series Analysis, an introduction to data and time series an

Jonathan Lilly 107 Nov 29, 2022
Model Agnostic Confidence Estimator (MACEST) - A Python library for calibrating Machine Learning models' confidence scores

Model Agnostic Confidence Estimator (MACEST) - A Python library for calibrating Machine Learning models' confidence scores

Oracle 95 Dec 28, 2022
Mosec is a high-performance and flexible model serving framework for building ML model-enabled backend and microservices

Mosec is a high-performance and flexible model serving framework for building ML model-enabled backend and microservices. It bridges the gap between any machine learning models you just trained and t

164 Jan 04, 2023
XGBoost + Optuna

AutoXGB XGBoost + Optuna: no brainer auto train xgboost directly from CSV files auto tune xgboost using optuna auto serve best xgboot model using fast

abhishek thakur 517 Dec 31, 2022
ThunderSVM: A Fast SVM Library on GPUs and CPUs

What's new We have recently released ThunderGBM, a fast GBDT and Random Forest library on GPUs. add scikit-learn interface, see here Overview The miss

Xtra Computing Group 1.4k Dec 22, 2022