Code for the Population-Based Bandits Algorithm, presented at NeurIPS 2020.

Last update: Nov 16, 2022

Population-Based Bandits (PB2)

Code for the Population-Based Bandits (PB2) Algorithm, from the paper Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits.

The framework combines Ray (using RLlib and Tune) with GPy, and is heavily inspired by the Ray Tune pbt_ppo example.

NOTE: PB2 is included in the ray.tune library, which is the officially supported implementation. The link to the code is here, and the accompanying blog post is here.

Running the Code

To run the IMPALA experiment, use the command:

python run_impala.py

To run the PPO experiment, use the command:

python run_ppo.py

Config

Within each script, there are several settings you can vary:

- env_name: the environment to train on, for example BreakoutNoFrameskip-v4.
- method: either pb2 or pbt (or asha for PPO).
- freq: how often the hyperparameters are updated; we used 500,000 for IMPALA and 50,000 for PPO.
- seed: the random seed; we used 0 1 2 3 4 5 6... and plan to add more seeds.
- max: the maximum number of timesteps; we used 10,000,000 for IMPALA and 1,000,000 for PPO.
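As a rough illustration, the settings above could be exposed as command-line arguments with argparse. This is only a sketch and not necessarily how run_ppo.py or run_impala.py handle them: the flag spellings, types, and the build_parser helper are assumptions, and the defaults simply mirror the PPO values quoted above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the settings listed in this section."""
    parser = argparse.ArgumentParser(description="PB2 experiment settings (illustrative)")
    # Environment ID to train on.
    parser.add_argument("--env_name", type=str, default="BreakoutNoFrameskip-v4")
    # Which scheduler to use; asha is only mentioned for PPO.
    parser.add_argument("--method", type=str, default="pb2", choices=["pb2", "pbt", "asha"])
    # How often hyperparameters are updated (PPO value from the text).
    parser.add_argument("--freq", type=int, default=50_000)
    # Random seed.
    parser.add_argument("--seed", type=int, default=0)
    # Maximum number of timesteps (PPO value from the text).
    parser.add_argument("--max", type=int, default=1_000_000)
    return parser


# Example: override the defaults with the IMPALA values from the text.
args = build_parser().parse_args(["--freq", "500000", "--max", "10000000", "--seed", "3"])
print(vars(args))
```

Passing an explicit list to parse_args keeps the example independent of the real command line; in a script you would call parse_args() with no arguments instead.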

It should also be possible to adapt this code to run other Ray Tune schedulers; we used it for ASHA in our PPO experiments, and we are also working to include a BOHB baseline.
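One way to picture swapping schedulers is a small factory keyed on the method name. The sketch below is an assumption about how this could be wired up, not the repository's code: make_scheduler is a hypothetical helper, the tuned hyperparameter ("lr") and its range are placeholders, and the metric and time attribute names ("episode_reward_mean", "timesteps_total") depend on your Ray version and trainable. It requires ray[tune], plus GPy for PB2.

```python
import random


def make_scheduler(method: str, freq: int, max_t: int):
    """Build a Ray Tune scheduler for the given method name.

    Ray imports are deferred so the function can be defined without Ray installed.
    """
    if method not in ("pb2", "pbt", "asha"):
        raise ValueError(f"unknown method: {method!r}")

    # Assumed metric and time attribute names; adjust to what your trainable reports.
    common = dict(time_attr="timesteps_total", metric="episode_reward_mean", mode="max")

    if method == "pb2":
        from ray.tune.schedulers.pb2 import PB2
        # PB2 takes continuous [min, max] bounds per hyperparameter.
        return PB2(perturbation_interval=freq,
                   hyperparam_bounds={"lr": [1e-5, 1e-3]}, **common)
    if method == "pbt":
        from ray.tune.schedulers import PopulationBasedTraining
        # PBT takes resampling distributions (here, a callable) instead of bounds.
        return PopulationBasedTraining(
            perturbation_interval=freq,
            hyperparam_mutations={"lr": lambda: random.uniform(1e-5, 1e-3)},
            **common)
    from ray.tune.schedulers import AsyncHyperBandScheduler
    # ASHA stops under-performing trials early rather than perturbing them.
    return AsyncHyperBandScheduler(max_t=max_t, grace_period=freq, **common)
```

With Ray installed, make_scheduler("pb2", 50_000, 1_000_000) would return an object that can be passed as the scheduler to a Tune run; adding another baseline would mean adding one more branch.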

Please get in touch with any questions: jackph [at] robots [dot] ox [dot] ac [dot] uk

Citing PB2

Finally, if you found this repo useful, please consider citing us:

@inproceedings{NEURIPS2020_c7af0926,
 author = {Parker-Holder, Jack and Nguyen, Vu and Roberts, Stephen J},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin},
 pages = {17200--17211},
 publisher = {Curran Associates, Inc.},
 title = {Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits},
 url = {https://proceedings.neurips.cc/paper/2020/file/c7af0926b294e47e52e46cfebe173f20-Paper.pdf},
 volume = {33},
 year = {2020}
}

Owner

Jack Parker-Holder
