Multi-Objective Reinforced Active Learning

Last update: Nov 19, 2022

Overview

Dependencies

wandb
tqdm
pytorch >= 1.7.0
numpy >= 1.20.0
scipy >= 1.1.0
pycolab == 1.2

Weights and Biases

Our code depends on Weights and Biases (wandb) for visualizing and logging results during training. As a result, we call wandb.init(), which will prompt you to add an API key for linking the training runs with your personal wandb account. This can be done by pasting your WANDB_API_KEY into the respective box when running the code for the first time.
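If the interactive prompt is inconvenient (for example on a remote machine), the key can also be supplied through the environment before wandb.init() is called. The snippet below is a minimal sketch of that idea and is not part of the repository: the helper name configure_wandb_env is hypothetical, while WANDB_API_KEY and WANDB_MODE are environment variables that wandb itself reads.

```python
import os


def configure_wandb_env(api_key=None, env=None):
    """Prepare the environment so that wandb.init() does not need to prompt.

    Hypothetical helper, not part of the repository.
    Returns "online" if an API key is available, otherwise "offline".
    """
    env = os.environ if env is None else env
    if api_key:
        # wandb picks the key up from WANDB_API_KEY when a run starts.
        env["WANDB_API_KEY"] = api_key
    if env.get("WANDB_API_KEY"):
        return "online"
    # No key available: ask wandb to log to local files only.
    env["WANDB_MODE"] = "offline"
    return "offline"
```

Calling configure_wandb_env() at the top of a training script, before wandb.init(), either reuses an existing key or falls back to offline logging. Offline runs are stored locally and can be uploaded later with the wandb sync command.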

Environments

Our gridworlds (Emergency: randomized_v2.py, Delivery: randomized_v3.py) build on the pycolab game engine with a custom wrapper that provides functionality similar to gym environments. The engine comes with a user interface, and any environment can be played in the console by running python environment.py, using the arrow keys and w, a, s, d as controls.
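To illustrate what a gym-like wrapper around a pycolab-style engine can look like, here is a minimal sketch. It is not the wrapper shipped in this repository. It assumes an engine object offering its_showtime(), play(action) and a game_over attribute, as pycolab engines do, and it returns the classic four-value step tuple. GymLikeWrapper and CountdownGame are hypothetical names; CountdownGame is a toy stand-in so the example runs without pycolab installed.

```python
class GymLikeWrapper:
    """Expose reset()/step() on top of a pycolab-style engine (sketch)."""

    def __init__(self, make_game):
        # make_game: hypothetical factory that returns a fresh engine.
        self.make_game = make_game
        self.game = None

    def reset(self):
        # Build a new game and return its first observation.
        self.game = self.make_game()
        observation, _, _ = self.game.its_showtime()
        return observation

    def step(self, action):
        observation, reward, _ = self.game.play(action)
        # The engine may report no reward as None; map that to 0.0.
        reward = 0.0 if reward is None else reward
        return observation, reward, self.game.game_over, {}


class CountdownGame:
    """Toy stand-in engine: the episode ends after a fixed number of steps."""

    def __init__(self, steps=3):
        self.remaining = steps
        self.game_over = False

    def its_showtime(self):
        return self.remaining, None, 1.0

    def play(self, action):
        self.remaining -= 1
        self.game_over = self.remaining <= 0
        return self.remaining, 1.0, 1.0


env = GymLikeWrapper(CountdownGame)
state = env.reset()
done, total = False, 0.0
while not done:
    state, reward, done, info = env.step(0)
    total += reward
print(total)  # 3.0
```

The point of such a wrapper is that training code written against reset() and step() does not need to know about the underlying game engine.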

Training

There are four training scripts for

manually training a PPO agent on custom rewards (ppo_train.py),
training AIRL on a single expert dataset (airl_train.py),
active MORL with custom/automatic preferences (moral_train.py) and
training DRLHP with custom/automatic preferences (drlhp_train.py).

When using automatic preferences, a desired ratio can be passed as an argument. For example,

python moral_train.py --ratio a b c

will run MORAL using a (real-valued) ratio of a:b:c among the three explicit objectives in Delivery.
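As a rough illustration of how such a ratio could be read from the command line and turned into relative weights, consider the sketch below. It is an assumption about the general pattern, not the parsing code in moral_train.py: the function names are hypothetical, and whether the script normalizes the ratio in this way is not stated here.

```python
import argparse


def parse_ratio(argv=None):
    """Read a --ratio argument consisting of one or more real numbers."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--ratio", nargs="+", type=float, default=None,
                        help="desired ratio among the explicit objectives")
    return parser.parse_args(argv).ratio


def normalize_ratio(ratio):
    """Scale a ratio such as 1:2:1 so that its entries sum to one."""
    total = sum(ratio)
    if total <= 0 or any(r < 0 for r in ratio):
        raise ValueError("ratio entries must be non-negative and not all zero")
    return [r / total for r in ratio]


weights = normalize_ratio(parse_ratio(["--ratio", "1", "2", "1"]))
print(weights)  # [0.25, 0.5, 0.25]
```

Because only the proportions matter in this sketch, passing 1 2 1 and 2 4 2 yields the same weights.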

Hyperparameters

Hyperparameters are passed as arguments to wandb.init() and can be changed by modifying the respective training files.
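The general pattern of passing hyperparameters through wandb.init() can be sketched as follows. The keys and values in DEFAULT_CONFIG are hypothetical placeholders, not the settings used in the training files, and the project name is likewise made up. wandb is imported inside start_run so that the configuration helper can be used on its own.

```python
# Hypothetical hyperparameters for illustration only.
DEFAULT_CONFIG = {
    "env_id": "randomized_v3",
    "lr": 3e-4,
    "gamma": 0.99,
    "batch_size": 64,
}


def build_config(overrides=None):
    """Return a copy of the defaults with selected values replaced."""
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULT_CONFIG)
    if unknown:
        raise KeyError(f"unknown hyperparameters: {sorted(unknown)}")
    config = dict(DEFAULT_CONFIG)
    config.update(overrides)
    return config


def start_run(config, project="moral-demo"):
    """Start a wandb run that records the given hyperparameters."""
    import wandb  # imported lazily; requires wandb to be installed

    run = wandb.init(project=project, config=config)
    # After init, the values are available as wandb.config.
    return run, wandb.config
```

For example, start_run(build_config({"lr": 1e-3})) would log a run whose recorded learning rate differs from the default while all other values stay unchanged.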

Owner

Markus Peschl
