Implementation of hyperparameter optimization/tuning methods for machine learning & deep learning models

Overview

Hyperparameter Optimization of Machine Learning Algorithms

This code provides a hyper-parameter optimization implementation for machine learning algorithms, as described in the paper:
L. Yang and A. Shami, “On hyperparameter optimization of machine learning algorithms: Theory and practice,” Neurocomputing, vol. 415, pp. 295–316, 2020, doi: https://doi.org/10.1016/j.neucom.2020.07.061.

To fit a machine learning model into different problems, its hyper-parameters must be tuned. Selecting the best hyper-parameter configuration for machine learning models has a direct impact on the model's performance. In this paper, optimizing the hyper-parameters of common machine learning models is studied. We introduce several state-of-the-art optimization techniques and discuss how to apply them to machine learning algorithms. Many available libraries and frameworks developed for hyper-parameter optimization problems are provided, and some open challenges of hyper-parameter optimization research are also discussed in this paper. Moreover, experiments are conducted on benchmark datasets to compare the performance of different optimization methods and provide practical examples of hyper-parameter optimization.

This paper and code will help industrial users, data analysts, and researchers to better develop machine learning models by identifying the proper hyper-parameter configurations effectively.

Paper

On Hyperparameter Optimization of Machine Learning Algorithms: Theory and Practice
One-column version: arXiv
Two-column version: Elsevier

Quick Navigation

Section 3: Important hyper-parameters of common machine learning algorithms
Section 4: Hyper-parameter optimization techniques introduction
Section 5: How to choose optimization techniques for different machine learning models
Section 6: Common Python libraries/tools for hyper-parameter optimization
Section 7: Experimental results (sample code in "HPO_Regression.ipynb" and "HPO_Classification.ipynb")
Section 8: Open challenges and future research directions
Summary table for Sections 3-6: Table 2: A comprehensive overview of common ML models, their hyper-parameters, suitable optimization techniques, and available Python libraries
Summary table for Sections 8: Table 10: The open challenges and future directions of HPO research

Implementation

Sample code for hyper-parameter optimization implementation for machine learning algorithms is provided in this repository.

Sample code for Regression problems

HPO_Regression.ipynb
Dataset used: Boston-Housing

Sample code for Classification problems

HPO_Classification.ipynb
Dataset used: MNIST

Machine Learning & Deep Learning Algorithms

  • Random forest (RF)
  • Support vector machine (SVM)
  • K-nearest neighbor (KNN)
  • Artificial Neural Networks (ANN)

Hyperparameter Configuration Space

ML Model Hyper-parameter Type Search Space
RF Classifier n_estimators Discrete [10,100]
max_depth Discrete [5,50]
min_samples_split Discrete [2,11]
min_samples_leaf Discrete [1,11]
criterion Categorical 'gini', 'entropy'
max_features Discrete [1,64]
SVM Classifier C Continuous [0.1,50]
kernel Categorical 'linear', 'poly', 'rbf', 'sigmoid'
KNN Classifier n_neighbors Discrete [1,20]
ANN Classifier optimizer Categorical 'adam', 'rmsprop', 'sgd'
activation Categorical 'relu', 'tanh'
batch_size Discrete [16,64]
neurons Discrete [10,100]
epochs Discrete [20,50]
patience Discrete [3,20]
RF Regressor n_estimators Discrete [10,100]
max_depth Discrete [5,50]
min_samples_split Discrete [2,11]
min_samples_leaf Discrete [1,11]
criterion Categorical 'mse', 'mae'
max_features Discrete [1,13]
SVM Regressor C Continuous [0.1,50]
kernel Categorical 'linear', 'poly', 'rbf', 'sigmoid'
epsilon Continuous [0.001,1]
KNN Regressor n_neighbors Discrete [1,20]
ANN Regressor optimizer Categorical 'adam', 'rmsprop'
activation Categorical 'relu', 'tanh'
loss Categorical 'mse', 'mae'
batch_size Discrete [16,64]
neurons Discrete [10,100]
epochs Discrete [20,50]
patience Discrete [3,20]

HPO Algorithms

  • Grid search
  • Random search
  • Hyperband
  • Bayesian Optimization with Gaussian Processes (BO-GP)
  • Bayesian Optimization with Tree-structured Parzen Estimator (BO-TPE)
  • Particle swarm optimization (PSO)
  • Genetic algorithm (GA)

Requirements

Contact-Info

Please feel free to contact me for any questions or cooperation opportunities. I'd be happy to help.

Citation

If you find this repository useful in your research, please cite this article as:

L. Yang and A. Shami, “On hyperparameter optimization of machine learning algorithms: Theory and practice,” Neurocomputing, vol. 415, pp. 295–316, 2020, doi: https://doi.org/10.1016/j.neucom.2020.07.061.

@article{YANG2020295,
title = "On hyperparameter optimization of machine learning algorithms: Theory and practice",
author = "Li Yang and Abdallah Shami",
volume = "415",
pages = "295 - 316",
journal = "Neurocomputing",
year = "2020",
issn = "0925-2312",
doi = "https://doi.org/10.1016/j.neucom.2020.07.061",
url = "http://www.sciencedirect.com/science/article/pii/S0925231220311693"
}
Owner
Li Yang
Ph.D. Candidate in OC2 Lab at Western University
Li Yang
The Adapter-Bot: All-In-One Controllable Conversational Model

The Adapter-Bot: All-In-One Controllable Conversational Model This is the implementation of the paper: The Adapter-Bot: All-In-One Controllable Conver

CAiRE 37 Nov 04, 2022
Intent parsing and slot filling in PyTorch with seq2seq + attention

PyTorch Seq2Seq Intent Parsing Reframing intent parsing as a human - machine translation task. Work in progress successor to torch-seq2seq-intent-pars

Sean Robertson 160 Jan 07, 2023
Source Code For Template-Based Named Entity Recognition Using BART

Template-Based NER Source Code For Template-Based Named Entity Recognition Using BART Training Training train.py Inference inference.py Corpus ATIS (h

174 Dec 19, 2022
[IJCAI'21] Deep Automatic Natural Image Matting

Deep Automatic Natural Image Matting [IJCAI-21] This is the official repository of the paper Deep Automatic Natural Image Matting. Introduction | Netw

Jizhizi_Li 316 Jan 06, 2023
Code for Motion Representations for Articulated Animation paper

Motion Representations for Articulated Animation This repository contains the source code for the CVPR'2021 paper Motion Representations for Articulat

Snap Research 851 Jan 09, 2023
CIFS: Improving Adversarial Robustness of CNNs via Channel-wise Importance-based Feature Selection

CIFS This repository provides codes for CIFS (ICML 2021). CIFS: Improving Adversarial Robustness of CNNs via Channel-wise Importance-based Feature Sel

Hanshu YAN 19 Nov 12, 2022
Implementation of Shape and Electrostatic similarity metric in deepFMPO.

DeepFMPO v3D Code accompanying the paper "On the value of using 3D-shape and electrostatic similarities in deep generative methods". The paper can be

34 Nov 28, 2022
The codebase for our paper "Generative Occupancy Fields for 3D Surface-Aware Image Synthesis" (NeurIPS 2021)

Generative Occupancy Fields for 3D Surface-Aware Image Synthesis (NeurIPS 2021) Project Page | Paper Xudong Xu, Xingang Pan, Dahua Lin and Bo Dai GOF

xuxudong 97 Nov 10, 2022
A Game-Theoretic Perspective on Risk-Sensitive Reinforcement Learning

Officile code repository for "A Game-Theoretic Perspective on Risk-Sensitive Reinforcement Learning"

Mathieu Godbout 1 Nov 19, 2021
A tutorial showing how to train, convert, and run TensorFlow Lite object detection models on Android devices, the Raspberry Pi, and more!

A tutorial showing how to train, convert, and run TensorFlow Lite object detection models on Android devices, the Raspberry Pi, and more!

Evan 1.3k Jan 02, 2023
Learning Super-Features for Image Retrieval

Learning Super-Features for Image Retrieval This repository contains the code for running our FIRe model presented in our ICLR'22 paper: @inproceeding

NAVER 101 Dec 28, 2022
My solutions for Stanford University course CS224W: Machine Learning with Graphs Fall 2021 colabs (GNN, GAT, GraphSAGE, GCN)

machine-learning-with-graphs My solutions for Stanford University course CS224W: Machine Learning with Graphs Fall 2021 colabs Course materials can be

Marko Njegomir 7 Dec 14, 2022
Joint parameterization and fitting of stroke clusters

StrokeStrip: Joint Parameterization and Fitting of Stroke Clusters Dave Pagurek van Mossel1, Chenxi Liu1, Nicholas Vining1,2, Mikhail Bessmeltsev3, Al

Dave Pagurek 44 Dec 01, 2022
Data pipelines for both TensorFlow and PyTorch!

rapidnlp-datasets Data pipelines for both TensorFlow and PyTorch ! If you want to load public datasets, try: tensorflow/datasets huggingface/datasets

1 Dec 08, 2021
Denoising Diffusion Probabilistic Models

Denoising Diffusion Probabilistic Models Jonathan Ho, Ajay Jain, Pieter Abbeel Paper: https://arxiv.org/abs/2006.11239 Website: https://hojonathanho.g

Jonathan Ho 1.5k Jan 08, 2023
Tools to create pixel-wise object masks, bounding box labels (2D and 3D) and 3D object model (PLY triangle mesh) for object sequences filmed with an RGB-D camera.

Tools to create pixel-wise object masks, bounding box labels (2D and 3D) and 3D object model (PLY triangle mesh) for object sequences filmed with an RGB-D camera. This project prepares training and t

305 Dec 16, 2022
Official implementation for Scale-Aware Neural Architecture Search for Multivariate Time Series Forecasting

1 SNAS4MTF This repo is the official implementation for Scale-Aware Neural Architecture Search for Multivariate Time Series Forecasting. 1.1 The frame

SZJ 5 Sep 21, 2022
A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

Segnet is deep fully convolutional neural network architecture for semantic pixel-wise segmentation. This is implementation of http://arxiv.org/pdf/15

Pradyumna Reddy Chinthala 190 Dec 15, 2022
Code associated with the paper "Deep Optics for Single-shot High-dynamic-range Imaging"

Deep Optics for Single-shot High-dynamic-range Imaging Code associated with the paper "Deep Optics for Single-shot High-dynamic-range Imaging" CVPR, 2

Stanford Computational Imaging Lab 40 Dec 12, 2022
hipCaffe: the HIP port of Caffe

Caffe Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Cent

ROCm Software Platform 126 Dec 05, 2022