Supervised domain-agnostic prediction framework for probabilistic modelling

Last update: Oct 23, 2022

Overview

A supervised domain-agnostic framework that allows for probabilistic modelling, namely the prediction of probability distributions for individual data points.

The package offers a variety of features and specifically allows for

the implementation of probabilistic prediction strategies in supervised contexts
comparison of frequentist and Bayesian prediction methods
strategy optimization through hyperparameter tuning and ensemble methods (e.g. bagging)
workflow automation

List of developers and contributors

Documentation

The full documentation is available here.

Installation

Installation is easy using Python's package manager:

$ pip install skpro

Contributing & Citation

We welcome contributions to the skpro project. Please read our contribution guide.

If you use skpro in a scientific publication, we would appreciate citations.

Comments

Distributions as return objects
Re-opening the sub-issue opened in #3 and commented upon by @murphyk

Question: should skpro's predict methods return a vector of distribution objects? For example, using the distributions from scipy.stats which implement methods pdf, cdf, mean, var, etc.

Pro:

this would be using an existing, consolidated, and well-supported interface

it might be easier to use

it might be easier to understand

Contra:

mixture types are not supported

the L2 norm is not supported (as would be needed for the squared/Gneiting loss)

mixed distributions on the reals are not supported, especially empirical distributions (weighted sums of deltas), which are what Bayesian packages return

vectors of distributions (or, alternatively, Cartesian products of distributions) are not supported

this is not the status quo
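
To make the question concrete, here is a minimal sketch of what a scipy.stats-based return object could look like. It assumes Gaussian predictions and made-up parameter values; it is not skpro's current interface. A frozen scipy.stats distribution broadcasts over array-valued parameters, which gives a vector of distributions only within a single parametric family and does not address the mixture or empirical cases listed above.

```python
import numpy as np
from scipy import stats

# Hypothetical predicted parameters for three test points (illustrative values).
means = np.array([0.0, 1.0, 2.0])
sds = np.array([1.0, 0.5, 2.0])

# One frozen distribution object whose parameters are arrays: each method
# call is evaluated element-wise, one entry per test point.
y_pred = stats.norm(loc=means, scale=sds)

pred_mean = y_pred.mean()        # per-point means
pred_var = y_pred.var()          # per-point variances
pdf_at_mean = y_pred.pdf(means)  # density of each distribution at its own mean
cdf_at_mean = y_pred.cdf(means)  # cdf of each distribution at its own mean

print(pred_mean, pred_var, pdf_at_mean, cdf_at_mean)
```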

help wanted
opened by fkiraly 11

documentation: np.mean(y_pred) does not work

I'm following along with this intro example. However, this line fails:

(numpy.mean(y_pred) * 2).shape

Error below (seems to be because Distribution objects don't support the mean() function but instead insist on obscurely calling it point!)

np.mean(y_pred)
Traceback (most recent call last):

  File "<ipython-input-38-19819be87ab5>", line 1, in <module>
    np.mean(y_pred)

  File "/home/kpmurphy/anaconda3/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 2920, in mean
    out=out, **kwargs)

  File "/home/kpmurphy/anaconda3/lib/python3.7/site-packages/numpy/core/_methods.py", line 75, in _mean
    ret = umr_sum(arr, axis, dtype, out, keepdims)

TypeError: unsupported operand type(s) for +: 'Distribution' and 'Distribution'
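
The failure can be reproduced without skpro. The sketch below uses a hypothetical stand-in class named Distribution (not skpro's own class) that, like the objects in the report, defines no addition; the point() method name is taken from the report above and everything else is an assumption. Extracting point predictions first and averaging those avoids the error.

```python
import numpy as np


class Distribution:
    """Hypothetical stand-in for a predicted distribution (not skpro's class)."""

    def __init__(self, mu, sigma):
        self.mu = mu
        self.sigma = sigma

    def point(self):
        # Point prediction; in this sketch simply the stored mean.
        return self.mu


# An object array of distributions, as a predict call might return.
y_pred = np.array([Distribution(1.0, 0.5), Distribution(3.0, 1.0)], dtype=object)

# np.mean sums the elements, which needs '+' on the objects themselves.
try:
    np.mean(y_pred)
    failed = False
except TypeError as err:
    failed = True
    print("np.mean failed:", err)

# Averaging the point predictions works, because they are plain floats.
points = np.array([d.point() for d in y_pred])
mean_point = np.mean(points)
print(mean_point)
```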

opened by murphyk 3

First example: 'utils' not found

The first example in your documentation (DensityBaseline) does not run right on my machine: it throws a 'module not found' exception at the call to 'utils'.

This might be a Python version problem (I am using 3.6), so perhaps it's not an error in the normal sense, though I don't see any specification that the package requires a particular Python version. Apologies if I missed it. In any case, I fixed it by importing matplotlib instead, i.e.

import matplotlib.pyplot as plt
plt.scatter(y_test, y_pred)

instead of:

import utils
utils.plot_performance(y_test, y_pred)

opened by Thomas-M-H-Hope 2
problem in loading the skpro

I have been trying to import skpro for two days, but I cannot. I keep getting this error:

cannot import name 'six' from 'sklearn.externals' (C:\Users\My Book\anaconda3\lib\site-packages\sklearn\externals\__init__.py)

opened by honestee 1
(wish)list of probabilistic regressors to implement or to interface
A wishlist of probabilistic regression methods to implement or interface. This is partly copied from the R counterpart, https://github.com/mlr-org/mlr3proba/issues/32. The number of stars at the end of each item is the estimated difficulty or time investment.

GLM

[ ] generalized linear model(s) with regression link, e.g., Gaussian *

[ ] generalized linear model(s) with count link, e.g., Poisson *

[ ] heteroscedastic linear regression ***

[ ] Bayesian GLM where conjugate priors are available, e.g., GLM with Gaussian link ***

KRR aka Gaussian process regression

[ ] vanilla kernel ridge regression with fixed kernel parameters and variance *

[ ] kernel ridge regression with MLE for kernel parameters and regularization parameter **

[ ] heteroscedastic KRR or Gaussian processes ***

CDE

[ ] variants of conditional density estimation (Nadaraya-Watson type) **

[ ] reduction to density estimation by binning of input variables, then apply unconditional density estimation **

Tree-based

[ ] probabilistic regression trees **

Neural networks

[ ] interface tensorflow probability - some hard-coded NN architectures **

[ ] generic tensorflow probability interface - some hard-coded NN architectures ***

Bayesian toolboxes

[ ] generic pymc3 interface ***

[ ] generic pyro interface ****

[ ] generic Stan interface ****

[ ] generic JAGS interface ****

[ ] generic BUGS interface ****

[ ] generic Bayesian interface - prior-valued hyperparameters *****

Pipeline elements for target transformation

[ ] distr fixed target transformation **

[ ] distr predictive target calibration **

Composite techniques, reduction to deterministic regression

[ ] stick mean, sd, from a deterministic regressor which already has these as return types into some location/scale distr family (Gaussian, Laplace) *

[ ] use model 1 for the mean, model 2 fit to residuals (squared, absolute, or log), put this in some location/scale distr family (Gaussian, Laplace) **

[ ] upper/lower thresholder for a regression prediction, to use as a pipeline element for a forced lower variance bound **

[ ] generic parameter prediction by elicitation, output being plugged into parameters of a distr object not necessarily scale/location ****

[ ] reduction via bootstrapped sampling of a deterministic regressor **
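
As an illustration of the two-model item above (model 1 for the mean, model 2 fitted to squared residuals, plugged into a Gaussian), here is a minimal sketch under stated assumptions: synthetic data, least-squares lines standing in for both models, and a hypothetical helper named predict_distribution. The lower clip on the variance plays the role of the forced lower variance bound mentioned in the list.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic heteroscedastic data: the noise scale grows with x.
x = rng.uniform(0.0, 4.0, size=500)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5 + 0.5 * x)

# Model 1: least-squares line for the conditional mean.
mean_coef = np.polyfit(x, y, deg=1)
residuals = y - np.polyval(mean_coef, x)

# Model 2: least-squares line fitted to the squared residuals,
# used as an estimate of the conditional variance.
var_coef = np.polyfit(x, residuals ** 2, deg=1)


def predict_distribution(x_new, min_var=1e-6):
    """Return a frozen Gaussian with per-point mean and standard deviation."""
    x_new = np.asarray(x_new, dtype=float)
    mu = np.polyval(mean_coef, x_new)
    # Clip from below so the predicted variance is always positive.
    var = np.maximum(np.polyval(var_coef, x_new), min_var)
    return stats.norm(loc=mu, scale=np.sqrt(var))


y_dist = predict_distribution([0.5, 3.5])
print(y_dist.mean(), y_dist.std())
```

Any other location/scale family (for example stats.laplace) could be swapped in for the Gaussian in the last line of the helper.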

Ensembling type pipeline elements and compositors

[ ] simple bagging, averaging of pdf/cdf **

[ ] probabilistic boosting ***

[ ] probabilistic stacking ***
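
The simple bagging item can be sketched as an equal-weight average of the members' pdfs and cdfs, which amounts to a mixture distribution. The three Gaussian members and the function names bagged_pdf and bagged_cdf below are illustrative assumptions, not part of skpro.

```python
import numpy as np
from scipy import stats

# Hypothetical predictions of three ensemble members for the same test point.
members = [
    stats.norm(loc=0.0, scale=1.0),
    stats.norm(loc=0.5, scale=1.5),
    stats.norm(loc=-0.5, scale=0.8),
]


def bagged_pdf(x):
    """Equal-weight average of the member densities."""
    return np.mean([m.pdf(x) for m in members], axis=0)


def bagged_cdf(x):
    """Equal-weight average of the member cdfs."""
    return np.mean([m.cdf(x) for m in members], axis=0)


grid = np.linspace(-10.0, 10.0, 2001)

# Riemann-sum check that the averaged density still integrates to about one.
area = np.sum(bagged_pdf(grid)) * (grid[1] - grid[0])
cdf_values = bagged_cdf(grid)
print(area, cdf_values[0], cdf_values[-1])
```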

baselines

[ ] always predict a Gaussian with mean = training mean, var = training var *

[ ] IMPORTANT as featureless baseline: reduction to distr/density estimation to produce an unconditional probabilistic regressor **

[ ] IMPORTANT as deterministic style baseline: reduction to deterministic regression, mean = prediction by det.regressor, var = training sample var, distr type = Gaussian (or Laplace) **
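
A minimal sketch of two of the baselines above, on synthetic data: the Gaussian with training mean and variance, and the deterministic-style baseline with a point regressor for the mean. The least-squares line standing in for the deterministic regressor and the helper mean_log_loss are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=200)
y_train = 3.0 * x_train + rng.normal(scale=0.3, size=200)
x_test = np.array([0.1, 0.5, 0.9])
y_test = 3.0 * x_test

train_sd = y_train.std(ddof=1)

# Featureless baseline: the same Gaussian for every test point,
# with mean = training mean and var = training var.
featureless = stats.norm(loc=np.full(x_test.shape, y_train.mean()), scale=train_sd)

# Deterministic-style baseline: mean from a point regressor (here a
# least-squares line), variance fixed to the training sample variance.
coef = np.polyfit(x_train, y_train, deg=1)
det_style = stats.norm(loc=np.polyval(coef, x_test), scale=train_sd)


def mean_log_loss(dist, y):
    """Average negative log-density of the observations under the predictions."""
    return -np.mean(dist.logpdf(y))


print(mean_log_loss(featureless, y_test), mean_log_loss(det_style, y_test))
```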

Other reduction from/to probabilistic regression

[ ] reducing deterministic regression to probabilistic regression - take mean, median or mode **

[ ] reduction(s) to quantile regression, use predictive quantiles to make a distr ***

[ ] reducing deterministic (quantile) regression to probabilistic regression - take quantile(s) **

[ ] reducing interval regression to probabilistic regression - take mean/sd, or take quantile(s) **

[ ] reduction to survival, as the sub-case of no censoring **

[ ] reduction to classification, by binning ***
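
The first reductions in this group (taking the mean, median or quantiles of a predicted distribution) can be sketched with scipy.stats. The log-normal predictions and their parameter values are assumptions chosen so that mean and median differ; they do not come from skpro.

```python
import numpy as np
from scipy import stats

# Hypothetical probabilistic predictions for two test points.
y_dist = stats.lognorm(s=np.array([0.5, 1.0]), scale=np.array([1.0, 2.0]))

# Deterministic regression: summarise each distribution by one number.
point_mean = y_dist.mean()
point_median = y_dist.median()

# Quantile or interval regression: read off predictive quantiles.
q10 = y_dist.ppf(0.1)
q90 = y_dist.ppf(0.9)

print(point_mean, point_median, q10, q90)
```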

good first issue
opened by fkiraly 0
skpro-refactoring (version-2)
Below are some comments on and a description of the contents of the upcoming refactoring:

Distribution classes refactored in a more OOD way (see skpro->distribution)

Loss functions (see metrics->distribution)

Estimators (see metrics->distribution)

Some descriptive notebooks (in docs->notebooks) and a full set of unit tests (in tests) are also available.
opened by jesellier 24

Releases (v1.0.1-beta)

v1.0.1-beta (Feb 18, 2019)

Documentation improvements and small fixes

1.0.0b1 (Dec 8, 2017)

The first public beta release of skpro!

Owner

The Alan Turing Institute

The UK's national institute for data science and artificial intelligence.

GitHub Repository https://alan-turing-institute.github.io/skpro/
