A data-driven approach to quantify the value of classifiers in a machine learning ensemble.

Last update: Dec 29, 2022

Overview

Documentation | External Resources | Research Paper

Shapley is a Python library for evaluating binary classifiers in a machine learning ensemble.

The library consists of various methods to compute (or approximate) the Shapley value of players (models) in weighted voting games (ensemble games), a class of transferable utility cooperative games. It covers exact enumeration-based computation and several widely known approximation methods from economics and computer science research papers. There is also functionality to identify the heterogeneity of the player pool based on the Shapley entropy. In addition, the framework comes with detailed documentation, an intuitive tutorial, 100% test coverage and illustrative toy examples.
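As a rough illustration of the Shapley entropy idea, the sketch below assumes it is the Shannon entropy of the normalized Shapley values. That definition and the function name `shapley_entropy` are assumptions made for this example and are not taken from the library's API. A uniform vector of values (a homogeneous player pool) gives the maximum entropy, while a single dominant player gives zero.

```python
import math


def shapley_entropy(phi):
    """Shannon entropy (in nats) of a vector of non-negative Shapley values.

    Hypothetical helper for illustration; not part of the shapley package.
    """
    total = sum(phi)
    # Normalize so the values form a probability distribution.
    probs = [value / total for value in phi]
    # Terms with zero probability contribute nothing (0 * log 0 is taken as 0).
    return -sum(p * math.log(p) for p in probs if p > 0)


if __name__ == "__main__":
    print(shapley_entropy([0.25, 0.25, 0.25, 0.25]))  # homogeneous pool
    print(shapley_entropy([1.0, 0.0, 0.0, 0.0]))      # one dominant player
```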

Citing

If you find Shapley useful in your research, please consider adding the following citation:

@misc{rozemberczki2021shapley,
      title = {{The Shapley Value of Classifiers in Ensemble Games}}, 
      author = {Benedek Rozemberczki and Rik Sarkar},
      year = {2021},
      eprint = {2101.02153},
      archivePrefix = {arXiv},
      primaryClass = {cs.LG}
}

A simple example

Shapley makes solving voting games quite easy - see the accompanying tutorial. For example, this is all it takes to solve a weighted voting game defined on the fly with permutation sampling:

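import numpy as np
from shapley import PermutationSampler

# Random weights for seven players in a single game, normalized to sum to 1.
W = np.random.uniform(0, 1, (1, 7))
W = W/W.sum()
# Quota: the total weight a coalition needs in order to win.
q = 0.5

# Approximate the Shapley values with permutation sampling.
solver = PermutationSampler()
solver.solve_game(W, q)
shapley_values = solver.get_solution()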
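To show what permutation sampling does conceptually, here is a minimal standard-library sketch. It is not the library's implementation: the function name `permutation_shapley`, its parameters, and the convention that a coalition wins when its total weight is at least the quota are all assumptions made for this illustration. In each random ordering, the player whose weight first pushes the running total to the quota is pivotal, and a player's estimated Shapley value is the fraction of orderings in which it is pivotal.

```python
import random


def permutation_shapley(weights, quota, n_permutations=10000, seed=0):
    """Monte Carlo estimate of Shapley values in a weighted voting game.

    Hypothetical helper; assumes a coalition wins when its weight >= quota.
    """
    rng = random.Random(seed)
    players = list(range(len(weights)))
    pivot_counts = [0] * len(weights)
    for _ in range(n_permutations):
        rng.shuffle(players)  # draw a random ordering of the players
        running_weight = 0.0
        for player in players:
            running_weight += weights[player]
            if running_weight >= quota:
                # This player turns a losing coalition into a winning one.
                pivot_counts[player] += 1
                break
    return [count / n_permutations for count in pivot_counts]


if __name__ == "__main__":
    print(permutation_shapley([0.5, 0.25, 0.25], quota=0.5))
```

As long as the total weight reaches the quota, every ordering has exactly one pivotal player, so the estimates sum to one.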

Methods Included

In detail, the following methods can be used.

Expected Marginal Contribution Approximation from Fatima et al.: A Linear Approximation Method for the Shapley Value
Multilinear Extension from Owen: Multilinear Extensions of Games
Monte Carlo Permutation Sampling from Maleki et al.: Bounding the Estimation Error of Sampling-based Shapley Value Approximation
Exact Enumeration from Shapley: A Value for N-Person Games
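For a small game, exact enumeration can be sketched directly from the classical Shapley formula, which weights each player's marginal contribution to a coalition S by |S|! (n - |S| - 1)! / n!. The code below is an illustrative standard-library sketch rather than the library's solver; the name `exact_shapley` and the "weight at least quota" winning rule are assumptions. Because it visits every coalition, the cost grows exponentially with the number of players, which is why the approximation methods listed above exist.

```python
from itertools import combinations
from math import factorial


def exact_shapley(weights, quota):
    """Exact Shapley values of a weighted voting game by coalition enumeration.

    Hypothetical helper; assumes a coalition wins when its weight >= quota.
    """
    n = len(weights)

    def wins(coalition):
        return 1.0 if sum(weights[j] for j in coalition) >= quota else 0.0

    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Probability that exactly this coalition precedes player i.
            coeff = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                # Marginal contribution of player i to the coalition.
                phi += coeff * (wins(coalition + (i,)) - wins(coalition))
        values.append(phi)
    return values


if __name__ == "__main__":
    print(exact_shapley([0.5, 0.25, 0.25], quota=0.5))
```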

Head over to our documentation to find out more about installation, creation of datasets and a full list of implemented methods and available datasets. For a quick start, check out the examples in the examples/ directory.

If you notice anything unexpected, please open an issue. If you are missing a specific method, feel free to open a feature request.

Installation

$ pip install shapley

Running tests

$ python setup.py test

Running examples

$ cd examples
$ python permutation_sampler_example.py

License

MIT License


Comments

Error in running MLE example

Thank you for sharing your great work. I truly enjoyed reading it. However, I met an error when I tried the example. It seems to be fine for the MC example.

$ python multilinear_extension_example.py
RuntimeWarning: invalid value encountered in true_divide
  self._Phi = self._Phi / np.sum(self._Phi, axis=1).reshape(-1, 1)
Traceback (most recent call last):
  File "multilinear_extension_example.py", line 11, in <module>
    solver.solve_game(W, q)
  File "/lib/python3.6/site-packages/shapley/solvers/multilinear_extension.py", line 34, in solve_game
    self._run_sanity_check(W, self._Phi)
  File "/lib/python3.6/site-packages/shapley/solution_concept.py", line 28, in _run_sanity_check
    self._verify_distribution(Phi)
  File "/lib/python3.6/site-packages/shapley/solution_concept.py", line 22, in _verify_distribution
    assert np.sum(Phi) - Phi.shape[0] < 0.001
AssertionError

opened by xxlya 2

Releases(v_10003)

v_10003 (Apr 28, 2022)
Moves the Shapley library to an ABC based design.
Adds a version attribute.

v_10002 (May 16, 2021)

v_10001 (Feb 1, 2021)
Fixed the expectations and variances.

v_10000 (Dec 31, 2020)
The official first release of Shapley.


Owner

Benedek Rozemberczki
