Model Factory is an ML training platform that helps engineers build ML models at scale.

Last update: Sep 23, 2022

Overview

Model Factory

Machine learning powers many businesses today, e.g., search engines, e-commerce, and news or feed recommendation. Training high-quality ML models is critical to all of these systems.

However, training a model is not trivial. Traditionally, engineers train models on a single devvm. That is workable if you only need to build a few models. If you want to explore hundreds or even thousands of ideas, repeating the workflow manually becomes a painful process.

There are many issues with the above workflow:

Hard to scale
No tracking
No monitoring
No end-to-end automation
Not easy to share with others
No centralized model management

These pain points slow engineers down when they are developing ML models. Model Factory is a project that aims to address them.

Background

Existing work in the industry also tries to address these issues, e.g., Facebook's FBLearner and Google's Kubeflow.

The key difference between Model Factory and other projects is that Model Factory promotes a pure Python authoring experience, while most others use a DAG (Directed Acyclic Graph) to define pipelines. This philosophy gives Model Factory the following advantages:

Easy to learn: there is almost no learning curve. As long as you know how to write Python, you know how to use Model Factory.
More flexible: control flow logic such as loops and branches can be written directly in Python.
Allows communication between nodes: operators can communicate with each other in free form, which opens up the possibility of building distributed training on top of Model Factory.
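
To make the last two points concrete, here is a minimal sketch in plain Python. It does not use Model Factory's actual API, which this page does not show. Every name in it (train_one_epoch, evaluate, pipeline, run_with_monitor) is hypothetical, and a thread plus an in-process queue stands in for whatever transport real operators would use. It only illustrates the kind of data-dependent loop and operator-to-operator messaging that tends to be awkward to express in a static DAG.

```python
"""Illustrative only: plain functions stand in for pipeline operators.
None of these names come from Model Factory itself."""
import queue
import threading


def train_one_epoch(weight: float, lr: float) -> float:
    """Hypothetical operator: one gradient step on the toy loss (w - 3)^2."""
    grad = 2.0 * (weight - 3.0)
    return weight - lr * grad


def evaluate(weight: float) -> float:
    """Hypothetical operator: toy loss for the current weight."""
    return (weight - 3.0) ** 2


def pipeline(lr: float, max_epochs: int = 100, tol: float = 1e-6):
    """Control flow in ordinary Python: train until the loss is small enough."""
    weight = 0.0
    loss = evaluate(weight)
    epoch = 0
    for epoch in range(1, max_epochs + 1):
        weight = train_one_epoch(weight, lr)
        loss = evaluate(weight)
        if loss < tol:  # data-dependent early exit
            break
    return weight, loss, epoch


def run_with_monitor(lr: float, epochs: int) -> list:
    """Free-form communication: a trainer streams losses to a monitor."""
    metrics: queue.Queue = queue.Queue()
    seen: list = []

    def trainer() -> None:
        weight = 0.0
        for _ in range(epochs):
            weight = train_one_epoch(weight, lr)
            metrics.put(evaluate(weight))
        metrics.put(None)  # sentinel: no more metrics

    def monitor() -> None:
        while True:
            loss = metrics.get()
            if loss is None:
                break
            seen.append(loss)

    threads = [threading.Thread(target=trainer), threading.Thread(target=monitor)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


if __name__ == "__main__":
    print(pipeline(lr=0.25))
    print(run_with_monitor(lr=0.25, epochs=5))
```

In a DAG-based system, the early exit in pipeline would typically need a special conditional or loop construct, whereas here it is an ordinary if statement inside a for loop.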

Installation

Follow the Installation page to deploy Model Factory in your production or testing environment.

Development Guide

Follow the Development Guide page to try out your first Model Factory pipeline.

K-Means clustering example with Python and Scikit-learn

Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.

Tools for diffing and merging of Jupyter notebooks.

Distributed deep learning on Hadoop and Spark clusters.

Predict profitability of trades based on indicator buy / sell signals

A modular active learning framework for Python

STUMPY is a powerful and scalable Python library for computing a Matrix Profile, which can be used for a variety of time series data mining tasks

Reproduction of Li Hang's "Statistical Learning Methods"

Machine Learning Algorithms

Toolkit for building machine learning models that generalize to unseen domains and are robust to privacy and other attacks.

This machine-learning algorithm takes in data from the last 60 days and tries to predict tomorrow's price of any crypto you ask it.

Various machine learning algorithms (Logistic Regression, Random Forest, KNN, Gradient Boosting, XGBoost, etc.)

Code for the TCAV ML interpretability project

Azure MLOps (v2) solution accelerators.

Magenta: Music and Art Generation with Machine Intelligence

moDel Agnostic Language for Exploration and eXplanation

A framework for building (and incrementally growing) graph-based data structures used in hierarchical or DAG-structured clustering and nearest neighbor search

Decision Tree Regression algorithm implemented on Python from scratch.

Responsible AI Workshop: a series of tutorials & walkthroughs to illustrate how to put responsible AI into practice

A benchmark of data-centric tasks from across the machine learning lifecycle.