ELI5 is a Python package which helps to debug machine learning classifiers and explain their predictions

Last update: Dec 17, 2022


Overview

ELI5

ELI5 is a Python package which helps to debug machine learning classifiers and explain their predictions.

It provides support for the following machine learning frameworks and packages:

scikit-learn - explain weights and predictions of scikit-learn linear classifiers and regressors, print decision trees as text or as SVG, show feature importances, and explain predictions of decision trees and tree-based ensembles. ELI5 understands text processing utilities from scikit-learn and can highlight text data accordingly. Pipeline and FeatureUnion are supported. ELI5 can also debug scikit-learn pipelines that contain HashingVectorizer by undoing the hashing.
Keras - explain predictions of image classifiers via Grad-CAM visualizations.
xgboost - show feature importances and explain predictions of XGBClassifier, XGBRegressor and xgboost.Booster.
LightGBM - show feature importances and explain predictions of LGBMClassifier, LGBMRegressor and lightgbm.Booster.
CatBoost - show feature importances of CatBoostClassifier, CatBoostRegressor and catboost.CatBoost.
lightning - explain weights and predictions of lightning classifiers and regressors.
sklearn-crfsuite - check weights of sklearn_crfsuite.CRF models.
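
The idea of "undoing hashing" mentioned in the scikit-learn item can be illustrated with a minimal, standard-library-only sketch. It is not ELI5's implementation: the helper names (`bucket`, `hash_vectorize`, `invert_hashing`), the CRC32 hash and the tiny bucket count are all assumptions made for illustration. A hashing vectorizer maps each term to a column index and forgets the term, so readable column names can be recovered only by hashing a sample of known terms again and recording which column each one lands in.

```python
import zlib
from collections import defaultdict

N_BUCKETS = 16  # deliberately tiny, so collisions between terms are possible


def bucket(term, n_buckets=N_BUCKETS):
    """Map a term to a column index with a stable hash (illustrative choice)."""
    return zlib.crc32(term.encode("utf-8")) % n_buckets


def hash_vectorize(tokens, n_buckets=N_BUCKETS):
    """Count tokens per hashed column; the original terms are not stored."""
    vec = [0] * n_buckets
    for token in tokens:
        vec[bucket(token, n_buckets)] += 1
    return vec


def invert_hashing(sample_docs, n_buckets=N_BUCKETS):
    """Recover column index -> sorted terms by re-hashing terms from a sample."""
    names = defaultdict(set)
    for doc in sample_docs:
        for term in doc.split():
            names[bucket(term, n_buckets)].add(term)
    return {col: sorted(terms) for col, terms in names.items()}


docs = ["good movie", "bad movie", "good plot"]
column_names = invert_hashing(docs)

vector = hash_vectorize("good movie".split())
# Colliding terms share one column, so their names are joined with "|".
readable = {"|".join(column_names[i]): v for i, v in enumerate(vector) if v}
print(readable)
```

The recovered names are only as complete as the sample: a term that never appears in the sample documents leaves its column unnamed, and colliding terms cannot be told apart.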

ELI5 also implements several algorithms for inspecting black-box models (see Inspecting Black-Box Estimators):

TextExplainer explains predictions of any text classifier using the LIME algorithm (Ribeiro et al., 2016). There are also utilities for using LIME with non-text data and arbitrary black-box classifiers, but this feature is currently experimental.
The permutation importance method can be used to compute feature importances for black-box estimators.
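
As a rough illustration of the permutation importance idea, the sketch below shuffles one feature column at a time and measures how much the score drops. It is written from scratch with the standard library and is not ELI5's code; the function names, the toy model and the synthetic data are assumptions made for the example.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model predicts the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_iter=5, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    base = accuracy(model, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_iter):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this one column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:]
                      for row, v in zip(X, shuffled)]
            drops.append(base - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_iter)
    return importances


# Toy black box: it looks only at feature 0 and ignores feature 1.
def model(row):
    return int(row[0] > 0.5)


data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]

importances = permutation_importance(model, X, y)
print(importances)
```

Because the toy model never reads feature 1, shuffling that column leaves every prediction unchanged and its importance is exactly zero, while shuffling feature 0 costs a large share of the accuracy. The model is called only through its predictions, which is why the method works for black-box estimators.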

Explanation and formatting are separated: you can get a text-based explanation to display in a console, an HTML version embeddable in an IPython notebook or web dashboard, a pandas.DataFrame object if you want to process results further, or a JSON version that allows custom rendering and formatting on a client.
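
The separation between explanation and formatting can be sketched as follows. This is an illustrative design only, assuming a simple linear model; the classes and functions (`FeatureWeight`, `Explanation`, `explain_linear_prediction`, `to_text`, `to_json`) are hypothetical stand-ins and not ELI5's actual API. The explanation is computed once as plain data, and each formatter renders that same object differently.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class FeatureWeight:
    feature: str
    weight: float


@dataclass
class Explanation:
    target: str
    score: float
    feature_weights: list


def explain_linear_prediction(coef, bias, x, target="y"):
    """Per-feature contributions (coefficient * value) of a linear model."""
    contributions = [FeatureWeight(name, coef[name] * value)
                     for name, value in x.items()]
    contributions.append(FeatureWeight("<BIAS>", bias))
    # Largest absolute contribution first.
    contributions.sort(key=lambda fw: abs(fw.weight), reverse=True)
    score = sum(fw.weight for fw in contributions)
    return Explanation(target, score, contributions)


def to_text(expl):
    """Render an explanation for a console."""
    lines = [f"{expl.target} (score={expl.score:+.2f})"]
    lines += [f"{fw.weight:+8.2f}  {fw.feature}" for fw in expl.feature_weights]
    return "\n".join(lines)


def to_json(expl):
    """Render the same explanation as JSON for client-side rendering."""
    return json.dumps(asdict(expl))


expl = explain_linear_prediction(
    coef={"age": 0.5, "income": -2.0}, bias=1.0,
    x={"age": 4.0, "income": 0.5},
)
print(to_text(expl))
print(to_json(expl))
```

Adding another output format under this design means writing one more function over `Explanation`, without touching the code that computes the explanation.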

The license is MIT.

Check the docs for more.

Note

This is the same project as https://github.com/TeamHG-Memex/eli5/, but due to temporary GitHub access issues, the 0.11 release is prepared in https://github.com/eli5-org/eli5 (this repo).

