onelearn: Online learning in Python

Last update: Nov 06, 2022

Overview


onelearn stands for ONE-shot LEARNing. It is a small Python package for online learning. It provides:

online (or one-shot) learning algorithms: each sample is processed once, in a single pass over the data
multi-class classification and regression algorithms
for now, only ensemble methods, namely Random Forests
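
The single-pass idea above can be illustrated with a toy learner. This is only a sketch and not onelearn's algorithm: `RunningMajority`, `predict_proba_one`, `partial_fit_one` and `prequential_accuracy` are hypothetical names, and the toy model ignores features entirely. It shows the "predict first, then learn" loop in which every sample is used exactly once.

```python
from collections import Counter


class RunningMajority:
    """Toy online classifier (hypothetical, not onelearn's algorithm).

    It tracks the class frequencies seen so far and updates them one
    sample at a time, so no sample is ever revisited.
    """

    def __init__(self, n_classes):
        self.n_classes = n_classes
        self.counts = Counter()
        self.n_seen = 0

    def predict_proba_one(self):
        # Laplace-smoothed class frequencies; uniform before any sample is seen
        total = self.n_seen + self.n_classes
        return [(self.counts[c] + 1) / total for c in range(self.n_classes)]

    def partial_fit_one(self, y):
        # Single update per sample: this is the only time the sample is used
        self.counts[y] += 1
        self.n_seen += 1


def prequential_accuracy(labels, n_classes):
    """Predict each label before learning from it, in one pass over the stream."""
    model = RunningMajority(n_classes)
    correct = 0
    for y in labels:
        proba = model.predict_proba_one()
        y_pred = proba.index(max(proba))
        correct += int(y_pred == y)
        model.partial_fit_one(y)
    return correct / len(labels), model


accuracy, model = prequential_accuracy([0, 0, 1, 0, 0, 2, 0, 0], n_classes=3)
```

Because the prediction for each sample is made before the model learns from it, the resulting accuracy is measured on data the model had not yet seen, without a separate test set.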

Installation

The easiest way to install onelearn is with pip:

pip install onelearn

You can also install the latest development version directly from GitHub with:

pip install git+https://github.com/onelearn/onelearn.git

References

@article{mourtada2019amf,
  title={AMF: Aggregated Mondrian Forests for Online Learning},
  author={Mourtada, Jaouad and Ga{\"\i}ffas, St{\'e}phane and Scornet, Erwan},
  journal={arXiv preprint arXiv:1906.10529},
  year={2019}
}

Comments

Unable to pickle AMFClassifier.
I would like to save the AMFClassifier, but am unable to pickle it. I have also tried to use dill or joblib, but they also don't seem to work.

Is there maybe another way to somehow export the AMFClassifier in any way, such that I can save it and load it in another kernel?

Below I added a snippet of code which reproduces the error. Note that the error only occurs when pickling after partial_fit has been called. When the AMFClassifier has not been fit yet, pickling works without problems; however, exporting an empty model is pretty useless.

Any help or tips are much appreciated.

    from onelearn import AMFClassifier
    import dill as pickle
    from sklearn import datasets

    iris = datasets.load_iris()
    X = iris.data
    y = iris.target

    amf = AMFClassifier(n_classes=3)
    dump = pickle.dumps(amf)
    amf = pickle.loads(dump)

    amf.partial_fit(X, y)
    dump = pickle.dumps(amf)
    amf = pickle.loads(dump)
opened by w-feijen 1
Move the experiments of the paper into an experiments folder
Update the documentation

Explain that we must clone the repo

Also move the short experiments to an examples folder and build a Sphinx gallery with them
enhancement
opened by stephanegaiffas 1
Add some extra tests
Test that batch versus online training leads to the exact same forest

Test the behavior of reserve_samples, with several calls to partial_fit to check that memory is correctly allocated and

tests
opened by stephanegaiffas 1
What if predict_proba receives a single sample

    get_amf_decision_online
      amf.partial_fit(X_train[iteration - 1], y_train[iteration - 1])
    File "/Users/stephanegaiffas/Code/onelearn/onelearn/forest.py", line 259, in partial_fit
      n_samples, n_features = X.shape

opened by stephanegaiffas 1
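
The traceback above stops at `n_samples, n_features = X.shape`, which cannot succeed when `X` is one-dimensional, as it is when a single row is selected by index. A possible caller-side workaround, assuming partial_fit expects a 2-D feature array and a 1-D label array (an assumption drawn from the traceback), is to reshape the sample into a batch of size one. `as_single_sample_batch` is a hypothetical helper, not part of onelearn.

```python
import numpy as np


def as_single_sample_batch(x, y):
    """Turn one sample and its label into a batch of size one."""
    X = np.asarray(x).reshape(1, -1)  # shape (n_features,) -> (1, n_features)
    y = np.asarray(y).reshape(1)      # scalar -> shape (1,)
    return X, y


X_train = np.arange(12.0).reshape(3, 4)
y_train = np.array([0, 1, 2])

# Indexing one row drops a dimension, so unpacking its shape into two names fails
try:
    n_samples, n_features = X_train[0].shape
    unpack_failed = False
except ValueError:
    unpack_failed = True

# After reshaping, the same unpacking works
X_one, y_one = as_single_sample_batch(X_train[0], y_train[0])
n_samples, n_features = X_one.shape
```

Slicing instead of indexing, for example `X_train[i:i + 1]` and `y_train[i:i + 1]`, keeps the extra dimension and avoids the reshape.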
Improve coverage

A problem is that @jit functions don't work with coverage. A workaround is to disable JIT compilation with the NUMBA_DISABLE_JIT environment variable, but this breaks the code that uses @jitclass and the .class_type.instance_type attributes.
enhancement bug fix

opened by stephanegaiffas 1
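
Numba reads `NUMBA_DISABLE_JIT` from the environment when it is imported, so the variable should be set before that import happens. The sketch below sets it from Python, for instance at the top of a test configuration file. `disable_numba_jit` is a hypothetical helper name, and as the issue notes, this does not help code that relies on @jitclass.

```python
import os
import sys


def disable_numba_jit():
    """Ask Numba to run @jit functions as plain Python so coverage can trace them.

    Returns False when numba is already imported, in which case the
    setting may come too late to take effect.
    """
    already_imported = "numba" in sys.modules
    os.environ["NUMBA_DISABLE_JIT"] = "1"
    return not already_imported


effective = disable_numba_jit()
# Import numba (or any package that uses it) only after this point.
```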

Releases (v0.3)

v0.3 (Sep 29, 2021)
This release adds the following improvements

AMFClassifier and AMFRegressor can be serialized to files (using pickle internally) with the save and load methods
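
The release note says only that save and load wrap pickle internally and gives no signatures. The following is a generic sketch of that pattern, not onelearn's code: `TinyModel`, its attributes and the method signatures are all hypothetical. The idea is to export the model state as plain Python data, pickle that, and rebuild a fresh instance on load.

```python
import os
import pickle
import tempfile


class TinyModel:
    """Hypothetical stand-in for a model whose state is plain Python data."""

    def __init__(self, n_classes):
        self.n_classes = n_classes
        self.counts = [0] * n_classes

    def partial_fit(self, y):
        for label in y:
            self.counts[label] += 1
        return self

    def save(self, filename):
        # Export only plain data (no compiled objects), then pickle it
        state = {"n_classes": self.n_classes, "counts": self.counts}
        with open(filename, "wb") as f:
            pickle.dump(state, f)

    @classmethod
    def load(cls, filename):
        # Rebuild a fresh instance and restore the exported state
        with open(filename, "rb") as f:
            state = pickle.load(f)
        model = cls(state["n_classes"])
        model.counts = state["counts"]
        return model


model = TinyModel(n_classes=3).partial_fit([0, 2, 2])
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    model.save(path)
    restored = TinyModel.load(path)
```

As with any pickle-based format, such files should only be loaded from trusted sources.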

v0.2.0 (Apr 6, 2020)
This release adds the following improvements

SampleCollection pre-allocates more samples instead of the bare minimum for faster computation

The playground can be launched from the library

Documentation on Read the Docs

Faster computations and a lot of code cleaning

Unit tests for Python 3.6-3.8


Owner

GitHub Repository https://onelearn.readthedocs.io

