Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning

Last update: Nov 22, 2022

This is the official repository for Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning (CAP). It provides the commands to run the PETS and PlaNet experiments included in the paper. The repository is kept minimal for ease of experimentation.

Installation

This repository requires Python 3.6 and PyTorch 1.3 or above. Run the following command to create a conda environment (tested with CUDA 10.2):

conda env create -f environment.yml
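Once the environment is activated (its name is set in environment.yml), a quick check can confirm that the interpreter and PyTorch match the requirements above. The helper below is a small sketch and not part of this repository; the function name `describe_environment` is hypothetical.

```python
import importlib.util
import sys


def describe_environment():
    """Report the Python version and, if installed, the PyTorch version and CUDA status."""
    info = {
        "python": "%d.%d" % sys.version_info[:2],
        "torch": None,  # stays None when PyTorch is not installed
        "cuda": None,   # stays None when PyTorch is not installed
    }
    # Only import torch if it can be found, so the check never raises ImportError.
    if importlib.util.find_spec("torch") is not None:
        import torch

        info["torch"] = torch.__version__
        info["cuda"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(key, value)
```

A `torch` value of `None` means the environment was not created or activated correctly.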

Experiments

To run the PETS experiments on the HalfCheetah environment used in our ablation study, run:

cd cap-pets

CAP

python cap-pets/run_cap_pets.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--cost_constrained --penalize_uncertainty --learn_kappa --seed 1

CAP with fixed kappa

python cap-pets/run_cap_pets.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--cost_constrained --penalize_uncertainty --kappa 1.0 --seed 1

CCEM

python cap-pets/run_cap_pets.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--cost_constrained --seed 1

CEM

python cap-pets/run_cap_pets.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--seed 1
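The four PETS commands above differ only in their constraint flags, so running them over several seeds is easy to script. The sketch below only builds and prints the command lines. The flags are copied from the commands above, while the helper names and the extra seeds (2 and 3) are illustrative assumptions.

```python
import shlex

# Arguments shared by all four PETS variants, copied from the commands above.
BASE = [
    "python", "cap-pets/run_cap_pets.py",
    "--algo", "cem",
    "--env", "HalfCheetah-v3",
    "--cost_lim", "152",
]

# Variant-specific flags, copied from the commands above.
VARIANTS = {
    "cap": ["--cost_constrained", "--penalize_uncertainty", "--learn_kappa"],
    "cap_fixed_kappa": ["--cost_constrained", "--penalize_uncertainty", "--kappa", "1.0"],
    "ccem": ["--cost_constrained"],
    "cem": [],
}


def build_command(variant, seed):
    """Return the argument list for one variant and one seed."""
    return BASE + VARIANTS[variant] + ["--seed", str(seed)]


def build_sweep(seeds):
    """Return one argument list per (variant, seed) pair."""
    return [build_command(variant, seed) for variant in VARIANTS for seed in seeds]


if __name__ == "__main__":
    # Seeds other than 1 are hypothetical; the README only lists seed 1.
    for command in build_sweep([1, 2, 3]):
        print(" ".join(shlex.quote(part) for part in command))
```

Each printed line can be pasted into a shell, or the argument lists can be passed to `subprocess.run` to launch the runs one after another.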

The commands for the PlaNet experiment on the CarRacing environment are:

CAP

python cap-planet/run_cap_planet.py --env CarRacingSkiddingConstrained-v0 \
--cost-limit 0 --binary-cost \
--cost-constrained --penalize-uncertainty \
--learn-kappa --penalty-kappa 0.1 \
--id CarRacing-cap --seed 1

CAP with fixed kappa

python cap-planet/run_cap_planet.py --env CarRacingSkiddingConstrained-v0 \
--cost-limit 0 --binary-cost \
--cost-constrained --penalize-uncertainty \
--penalty-kappa 1.0 \
--id CarRacing-kappa1 --seed 1

CCEM

python cap-planet/run_cap_planet.py --env CarRacingSkiddingConstrained-v0 \
--cost-limit 0 --binary-cost \
--cost-constrained \
--id CarRacing-ccem --seed 1

CEM

python cap-planet/run_cap_planet.py --env CarRacingSkiddingConstrained-v0 \
--cost-limit 0 --binary-cost \
--id CarRacing-cem --seed 1
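In both experiment sets the four variants form a progression: CEM plans without a cost constraint, CCEM adds the constraint, CAP with fixed kappa adds an uncertainty penalty with a constant weight, and CAP learns that weight. The sketch below shows one plausible reading of those flag names. It is a conceptual illustration, not code from this repository: the function names, the additive penalty form, the `step_size` parameter, and the kappa update rule are all assumptions.

```python
def penalized_cost(predicted_cost, uncertainty, kappa, penalize_uncertainty=True):
    """Inflate the predicted cost by kappa times the model uncertainty (assumed form)."""
    if not penalize_uncertainty:
        return predicted_cost
    return predicted_cost + kappa * uncertainty


def is_feasible(trajectory_cost, cost_limit, cost_constrained=True):
    """With the constraint enabled, a candidate plan is kept only if its cost is within the limit."""
    return (not cost_constrained) or trajectory_cost <= cost_limit


def update_kappa(kappa, episode_cost, cost_limit, step_size=0.01):
    """Assumed adaptive rule: raise kappa after a violation, lower it otherwise, never below zero."""
    return max(0.0, kappa + step_size * (episode_cost - cost_limit))


if __name__ == "__main__":
    kappa = 0.1
    # Hypothetical episode costs, measured against the PETS cost limit of 152.
    for episode_cost in [200.0, 170.0, 140.0]:
        kappa = update_kappa(kappa, episode_cost, cost_limit=152.0)
        print(episode_cost, kappa)
```

Under this reading, a larger kappa makes the planner more conservative in regions where the learned model is uncertain, and adapting kappa from observed costs avoids tuning it by hand.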

Contact

If you have any questions regarding the code or paper, feel free to contact [email protected] or open an issue on this repository.

Acknowledgement

This repository contains code adapted from the following repositories: PETS and PlaNet. We thank the authors and contributors for open-sourcing their code.
