Repository containing the PhD Thesis "Formal Verification of Deep Reinforcement Learning Agents"

Last update: Aug 31, 2022


Getting Started

This repository contains the code used for the following publications:

Probabilistic Guarantees for Safe Deep Reinforcement Learning (FORMATS 2020)
Verifying Reinforcement Learning up to Infinity (IJCAI 2021)
Verified Probabilistic Policies for Deep Reinforcement Learning (NFM 2022)

These instructions will help you set up the project.

Prerequisites

Create a virtual environment with conda:

conda env create -f environment.yml
conda activate safedrl

This installs all the Python dependencies the project needs.

In addition, download PRISM from the following link: https://github.com/phate09/prism

Ensure you have Gradle installed (https://gradle.org/install/)

Running the code

Before running any code, open a new terminal, go to the PRISM project folder and run:

gradle run

This enables the communication channel between PRISM and the rest of the repository.
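The README does not say how the channel is implemented. As a minimal sketch, assuming it is a TCP socket on the local machine, the helper below checks whether anything is listening before you launch an experiment. The function name `channel_is_open` and the port number are placeholders, not taken from the repository; replace the port with whatever the PRISM process actually listens on.

```python
import socket


def channel_is_open(host: str = "127.0.0.1", port: int = 25333, timeout: float = 1.0) -> bool:
    """Return True if some process accepts TCP connections on host:port.

    The default port is a placeholder (assumption), not a value from the repository.
    """
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("PRISM channel reachable:", channel_is_open())
```

If the check fails, the likely cause is that `gradle run` is not running in the other terminal, or that the port assumed here is wrong.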

Probabilistic Guarantees for Safe Deep Reinforcement Learning (FORMATS 2020)

Training

Run train_pendulum.py inside agents/dqn to train the agent on the inverted pendulum problem, and note where the trained agent is saved.

Analysis

Run domain_analysis_sym.py inside runnables/symbolic/dqn, changing the paths so that they point to the saved network.

Verifying Reinforcement Learning up to Infinity (IJCAI 2021)

Paper results

Download and unzip experiment_collection_final.zip in the 'save' directory, then run:

tensorboard --logdir=./save/experiment_collection_final

(Results for the output range analysis experiments are in experiment_collection_ora_final.zip.)

Train neural networks from scratch

Run one of:

training/tune_train_PPO_bouncing_ball.py
training/tune_train_PPO_car.py
training/tune_train_PPO_cartpole.py

Check safety of pretrained agents

Download and unzip pretrained_agents.zip in the 'save' directory, then run:

verification/run_tune_experiments.py

(To monitor the progress of the algorithm, run tensorboard --logdir=./save/experiment_collection_final)

The results in TensorBoard can be filtered using regular expressions (e.g. "bouncing_ball.* template: 0") in the search bar on the left.

The name of each experiment contains the name of the problem (bouncing_ball, cartpole, stopping_car), the amount of adversarial noise ("eps", only for stopping_car), the time step length for the dynamics of the system ("tau", only for cartpole) and the choice of restriction, in order of complexity (0 is box, 1 is the chosen template, and 2 is octagon).
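To try a filter expression before typing it into the search bar, the sketch below applies it to a few experiment names with Python's `re` module. The names in `names` are hypothetical: they follow the description above, but the exact naming format used by the repository is an assumption. It also assumes the filter matches anywhere in the name (unanchored search). `filter_runs`, `restriction_kind` and `RESTRICTIONS` are illustrative helpers, not part of the repository.

```python
import re

# Restriction index -> meaning, as described in the README.
RESTRICTIONS = {0: "box", 1: "chosen template", 2: "octagon"}

# Hypothetical experiment names (assumption: the real format may differ).
names = [
    "bouncing_ball template: 0",
    "bouncing_ball template: 2",
    "cartpole tau: 0.02 template: 1",
    "stopping_car eps: 0.1 template: 0",
]


def filter_runs(pattern, run_names):
    """Keep the names in which the regular expression matches anywhere."""
    rx = re.compile(pattern)
    return [name for name in run_names if rx.search(name)]


def restriction_kind(name):
    """Translate the 'template: N' index in a name into its meaning, if present."""
    match = re.search(r"template: (\d)", name)
    return RESTRICTIONS[int(match.group(1))] if match else None


if __name__ == "__main__":
    for name in filter_runs("bouncing_ball.* template: 0", names):
        print(name, "->", restriction_kind(name))
```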

The table in the paper is filled in using some of the metrics reported in TensorBoard:

max_t: Avg timesteps
seen: Avg polyhedra
time_since_restore: Avg clock time (s)
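The mapping above can be sketched as a small aggregation step. This assumes that each "Avg" column is a plain arithmetic mean over repeated runs of the same experiment, which the README does not state. The numbers in `runs` are made-up placeholders rather than results from the paper, and `table_row` is an illustrative helper, not a function from the repository.

```python
from statistics import mean

# TensorBoard metric name -> column header used in the paper's table.
COLUMNS = {
    "max_t": "Avg timesteps",
    "seen": "Avg polyhedra",
    "time_since_restore": "Avg clock time (s)",
}


def table_row(runs):
    """Average each metric over the given runs (assumption: arithmetic mean)."""
    return {label: mean(run[key] for run in runs) for key, label in COLUMNS.items()}


# Placeholder values for illustration only; not real experiment results.
runs = [
    {"max_t": 10, "seen": 100, "time_since_restore": 2.0},
    {"max_t": 20, "seen": 300, "time_since_restore": 4.0},
]

if __name__ == "__main__":
    for label, value in table_row(runs).items():
        print(f"{label}: {value}")
```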

Owner

Edoardo Bacci
