PyTorch implementation of Decoupling Value and Policy for Generalization in Reinforcement Learning

Last update: Dec 08, 2022

Overview

IDAAC: Invariant Decoupled Advantage Actor-Critic

This is a PyTorch implementation of the methods proposed in Decoupling Value and Policy for Generalization in Reinforcement Learning by Roberta Raileanu and Rob Fergus.

Citation

If you use this code in your own work, please cite our paper:

@article{Raileanu2021DecouplingVA,
  title={Decoupling Value and Policy for Generalization in Reinforcement Learning},
  author={Roberta Raileanu and R. Fergus},
  journal={ArXiv},
  year={2021},
  volume={abs/2102.10330}
}

Requirements

To install all the required dependencies:

conda create -n idaac python=3.7
conda activate idaac

cd idaac
pip install -r requirements.txt

pip install procgen

git clone https://github.com/openai/baselines.git
cd baselines
python setup.py install
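
After installation, a quick check that the main packages can be found may save a failed training run later. The snippet below is a minimal sketch, not part of this repo: the module list is an assumption based on the steps above (PyTorch, Procgen, and OpenAI Baselines, plus gym, which Procgen depends on) and should be adjusted to match requirements.txt.

```python
import importlib.util

# Assumed top-level module names; adjust to match requirements.txt.
REQUIRED_MODULES = ["torch", "gym", "procgen", "baselines"]


def missing_modules(names):
    """Return the module names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it inside the activated idaac conda environment. Any module reported as missing points to an install step that did not complete.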

Instructions

This repo provides instructions for training IDAAC, DAAC, and PPO on the Procgen benchmark.

Train IDAAC on CoinRun

python train.py --env_name coinrun --algo idaac

Train DAAC on CoinRun

python train.py --env_name coinrun --algo daac

Train PPO on CoinRun

python train.py --env_name coinrun --algo ppo --ppo_epoch 3

Note: By default, the code uses a single set of hyperparameters (HPs) for all environments, chosen because it performs best overall. In our studies, we found that some games benefit further from slightly different HPs, so we provide those as well. To use the best hyperparameters for each environment, pass the --use_best_hps flag.
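
To run the three algorithms back to back, the commands above can be assembled in a small launcher. This is a hypothetical helper, not a script shipped with the repo. It uses only the flags shown in this README (--env_name, --algo, --ppo_epoch, --use_best_hps) and assumes that --use_best_hps can be appended to any of those commands. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys


def build_command(env_name, algo, use_best_hps=False):
    """Build the train.py command line for one (environment, algorithm) pair."""
    cmd = [sys.executable, "train.py", "--env_name", env_name, "--algo", algo]
    if algo == "ppo":
        # The PPO example in this README passes --ppo_epoch 3.
        cmd += ["--ppo_epoch", "3"]
    if use_best_hps:
        # Assumption: the flag combines with any of the commands above.
        cmd.append("--use_best_hps")
    return cmd


def launch(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    commands = [build_command("coinrun", algo) for algo in ("idaac", "daac", "ppo")]
    launch(commands, dry_run=True)
```

Call launch(commands, dry_run=False) from the repo root to start the runs one after another. Because of check=True, the sequence stops at the first run that fails.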

Overview of DAAC and IDAAC

Procgen Results

IDAAC achieves state-of-the-art performance on the Procgen benchmark (easy mode), significantly improving the agent's generalization ability over standard RL methods such as PPO.

[Figure: Test results on Procgen]

Acknowledgements

This code is based on an open-source PyTorch implementation of PPO.
