WTTE-RNN a framework for churn and time to event prediction

Overview

WTTE-RNN

Build Status

Weibull Time To Event Recurrent Neural Network

A less hacky machine-learning framework for churn- and time to event prediction. Forecasting problems as diverse as server monitoring to earthquake- and churn-prediction can be posed as the problem of predicting the time to an event. WTTE-RNN is an algorithm and a philosophy about how this should be done.

Installation

Python

Check out README for Python package.

If this seems like overkill, the basic implementation can be found inlined as a jupyter notebook

Ideas and Basics

You have data consisting of many time-series of events and want to use historic data to predict the time to the next event (TTE). If you haven't observed the last event yet we've only observed a minimum bound of the TTE to train on. This results in what's called censored data (in red):

Censored data

Instead of predicting the TTE itself the trick is to let your machine learning model output the parameters of a distribution. This could be anything but we like the Weibull distribution because it's awesome. The machine learning algorithm could be anything gradient-based but we like RNNs because they are awesome too.

example WTTE-RNN architecture

The next step is to train the algo of choice with a special log-loss that can work with censored data. The intuition behind it is that we want to assign high probability at the next event or low probability where there wasn't any events (for censored data):

WTTE-RNN prediction over a timeline

What we get is a pretty neat prediction about the distribution of the TTE in each step (here for a single event):

WTTE-RNN prediction

A neat sideresult is that the predicted params is a 2-d embedding that can be used to visualize and group predictions about how soon (alpha) and how sure (beta). Here by stacking timelines of predicted alpha (left) and beta (right):

WTTE-RNN alphabeta.png

Warnings

There's alot of mathematical theory basically justifying us to use this nice loss function in certain situations:

loss-equation

So for censored data it only rewards pushing the distribution up, beyond the point of censoring. To get this to work you need the censoring mechanism to be independent from your feature data. If your features contains information about the point of censoring your algorithm will learn to cheat by predicting far away based on probability of censoring instead of tte. A type of overfitting/artifact learning. Global features can have this effect if not properly treated.

Status and Roadmap

The project is under development. The goal is to create a forkable and easily deployable model framework. WTTE is the algorithm but the whole project aims to be more. It's a visual philosophy and an opinionated idea about how churn-monitoring and reporting can be made beautiful and easy.

Pull-requests, recommendations, comments and contributions very welcome.

What's in the repository

  • Transformations
    • Data pipeline transformations (pandas.DataFrame of expected format to numpy)
    • Time to event and censoring indicator calculations
  • Weibull functions (cdf, pdf, quantile, mean etc)
  • Objective functions:
    • Tensorflow
    • Keras (Tensorflow + Theano)
  • Keras helpers
    • Weibull output layers
    • Loss functions
    • Callbacks
  • ~~ Lots of example-implementations ~~

Licensing

  • MIT license

Citation

@MastersThesis{martinsson:Thesis:2016,
    author = {Egil Martinsson},
    title  = {{WTTE-RNN : Weibull Time To Event Recurrent Neural Network}},
    school = {Chalmers University Of Technology},
    year   = {2016},
}

Contributing

Contributions/PR/Comments etc are very welcome! Post an issue if you have any questions and feel free to reach out to egil.martinsson[at]gmail.com.

Contributors (by order of commit)

  • Egil Martinsson
  • Dayne Batten (made the first keras-implementation)
  • Clay Kim
  • Jannik Hoffjann
  • Daniel Klevebring
  • Jeongkyu Shin
  • Joongi Kim
  • Jonghyun Park
Owner
Egil Martinsson
Egil Martinsson
Code for "R-GCN: The R Could Stand for Random"

RR-GCN: Random Relational Graph Convolutional Networks PyTorch Geometric code for the paper "R-GCN: The R Could Stand for Random" RR-GCN is an extensi

PreDiCT.IDLab 31 Sep 07, 2022
Progressive Growing of GANs for Improved Quality, Stability, and Variation

Progressive Growing of GANs for Improved Quality, Stability, and Variation — Official TensorFlow implementation of the ICLR 2018 paper Tero Karras (NV

Tero Karras 5.9k Jan 05, 2023
Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images

SASSnet Code for paper: Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images(MICCAI 2020) Our code is origin from UA-MT You can fin

klein 125 Jan 03, 2023
PyTorch Implementation of ECCV 2020 Spotlight TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images

TuiGAN-PyTorch Official PyTorch Implementation of "TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images" (ECCV 2020 Spotligh

181 Dec 09, 2022
Implementations of the algorithms in the paper Approximative Algorithms for Multi-Marginal Optimal Transport and Free-Support Wasserstein Barycenters

Implementations of the algorithms in the paper Approximative Algorithms for Multi-Marginal Optimal Transport and Free-Support Wasserstein Barycenters

Johannes von Lindheim 3 Oct 29, 2022
Code and real data for the paper "Counterfactual Temporal Point Processes", available at arXiv.

counterfactual-tpp This is a repository containing code and real data for the paper Counterfactual Temporal Point Processes. Pre-requisites This code

Networks Learning 11 Dec 09, 2022
Transformer model implemented with Pytorch

transformer-pytorch Transformer model implemented with Pytorch Attention is all you need-[Paper] Architecture Self-Attention self_attention.py class

Mingu Kang 12 Sep 03, 2022
A modification of Daniel Russell's notebook merged with Katherine Crowson's hq-skip-net changes

Edits made to this repo by Katherine Crowson I have added several features to this repository for use in creating higher quality generative art (featu

Paul Fishwick 10 May 07, 2022
System-oriented IR evaluations are limited to rather abstract understandings of real user behavior

Validating Simulations of User Query Variants This repository contains the scripts of the experiments and evaluations, simulated queries, as well as t

IR Group at Technische Hochschule Köln 2 Nov 23, 2022
Implementation for HFGI: High-Fidelity GAN Inversion for Image Attribute Editing

HFGI: High-Fidelity GAN Inversion for Image Attribute Editing High-Fidelity GAN Inversion for Image Attribute Editing Update: We released the inferenc

Tengfei Wang 371 Dec 30, 2022
PointPillars inference with TensorRT

A project demonstrating how to use CUDA-PointPillars to deal with cloud points data from lidar.

NVIDIA AI IOT 315 Dec 31, 2022
[NeurIPS 2021] Towards Better Understanding of Training Certifiably Robust Models against Adversarial Examples | ⛰️⚠️

Towards Better Understanding of Training Certifiably Robust Models against Adversarial Examples This repository is the official implementation of "Tow

Sungyoon Lee 4 Jul 12, 2022
Video-based open-world segmentation

UVO_Challenge Team Alpes_runner Solutions This is an official repo for our UVO Challenge solutions for Image/Video-based open-world segmentation. Our

Yuming Du 84 Dec 22, 2022
Intrusion Detection System using ensemble learning (machine learning)

IDS-ML implementation of an intrusion detection system using ensemble machine learning methods Data set This project is carried out using the UNSW-15

4 Nov 25, 2022
A Python package to process & model ChEMBL data.

insilico: A Python package to process & model ChEMBL data. ChEMBL is a manually curated chemical database of bioactive molecules with drug-like proper

Steven Newton 0 Dec 09, 2021
Tooling for the Common Objects In 3D dataset.

CO3D: Common Objects In 3D This repository contains a set of tools for working with the Common Objects in 3D (CO3D) dataset. Download the dataset The

Facebook Research 724 Jan 06, 2023
Memory-Augmented Model Predictive Control

Memory-Augmented Model Predictive Control This repository hosts the source code for the journal article "Composing MPC with LQR and Neural Networks fo

Fangyu Wu 1 Jun 19, 2022
[CVPR 2022] Thin-Plate Spline Motion Model for Image Animation.

[CVPR2022] Thin-Plate Spline Motion Model for Image Animation Source code of the CVPR'2022 paper "Thin-Plate Spline Motion Model for Image Animation"

yoyo-nb 1.4k Dec 30, 2022
Official Pytorch implementation of "Learning Debiased Representation via Disentangled Feature Augmentation (Neurips 2021, Oral)"

Learning Debiased Representation via Disentangled Feature Augmentation (Neurips 2021, Oral): Official Project Webpage This repository provides the off

Kakao Enterprise Corp. 68 Dec 17, 2022
This is the source code for the experiments related to the paper Unsupervised Audio Source Separation Using Differentiable Parametric Source Models

Unsupervised Audio Source Separation Using Differentiable Parametric Source Models This is the source code for the experiments related to the paper Un

30 Oct 19, 2022