RE3: State Entropy Maximization with Random Encoders for Efficient Exploration

State Entropy Maximization with Random Encoders for Efficient Exploration (RE3) (ICML 2021)

Code for State Entropy Maximization with Random Encoders for Efficient Exploration.

This repository provides code for the RE3 algorithm described in the paper linked above. The code is organized into three sub-directories: rad_re3 combines RE3 with RAD, dreamer_re3 combines RE3 with Dreamer, and a2c_re3 combines RE3 with A2C.

We also provide raw data (.csv) and visualization code in the data directory.
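As a rough illustration of the idea named in the title, the sketch below estimates state novelty from k-nearest-neighbor distances in the feature space of a fixed, randomly initialized encoder. It is not taken from this repository: the linear encoder (the paper uses convolutional encoders on images), the function names random_encoder, knn_intrinsic_reward and combine_rewards, the default k=3, and the additive beta weighting are all assumptions made for illustration.

```python
import numpy as np


def random_encoder(obs_dim, feat_dim, rng):
    """Return a fixed, randomly initialized linear encoder (hypothetical stand-in).

    The weights are sampled once and never updated.
    """
    weights = rng.standard_normal((obs_dim, feat_dim)) / np.sqrt(obs_dim)
    return lambda obs: obs @ weights


def knn_intrinsic_reward(features, k=3):
    """Intrinsic reward log(distance to the k-th nearest neighbor + 1) per sample."""
    # Pairwise Euclidean distances between all feature vectors in the batch.
    diffs = features[:, None, :] - features[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # After sorting each row, column 0 is the zero distance to the sample itself,
    # so column k holds the distance to the k-th nearest other sample.
    kth_dist = np.sort(dists, axis=1)[:, k]
    return np.log(kth_dist + 1.0)


def combine_rewards(extrinsic, intrinsic, beta):
    """Assumed combination: task reward plus beta-weighted intrinsic reward."""
    return extrinsic + beta * intrinsic


# Usage on random observations (shapes are arbitrary illustration values).
rng = np.random.default_rng(0)
encode = random_encoder(obs_dim=8, feat_dim=4, rng=rng)
observations = rng.standard_normal((16, 8))
intrinsic = knn_intrinsic_reward(encode(observations), k=3)
total = combine_rewards(np.zeros(16), intrinsic, beta=0.1)
```

Samples that lie far from their neighbors in feature space receive a larger bonus, which is what pushes the agent toward rarely visited states. Setting beta to 0 recovers the plain task reward.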

If you find this repository useful for your research, please cite:

@inproceedings{seo2021state,
  title={State Entropy Maximization with Random Encoders for Efficient Exploration},
  author={Seo, Younggyo and Chen, Lili and Shin, Jinwoo and Lee, Honglak and Abbeel, Pieter and Lee, Kimin},
  booktitle={International Conference on Machine Learning},
  year={2021}
}

RAD + RE3

Our code is built on top of the DrQ repository.

Installation

You can install all dependencies with the following command:

conda env create -f conda_env.yml

To run experiments on Walker Run Sparse and Cheetah Run Sparse, you also need to install the custom version of dm_control with the following commands:

cd ../envs/dm_control
pip install .

Instructions

RAD

python train.py env=hopper_hop batch_size=512 action_repeat=2 logdir=runs_rad_re3 use_state_entropy=false

RAD + RE3

python train.py env=hopper_hop batch_size=512 action_repeat=2 logdir=runs_rad_re3
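The two commands above differ only in the use_state_entropy override, so a small helper can generate both variants for a given environment. This is a minimal sketch, not part of the repository: rad_command is a hypothetical name, it uses only the overrides shown above, and it assumes that omitting use_state_entropy leaves RE3 enabled, as the RAD + RE3 command suggests.

```python
import shlex


def rad_command(env, use_re3=True, batch_size=512, action_repeat=2,
                logdir="runs_rad_re3"):
    """Build the train.py argument list for RAD (use_re3=False) or RAD + RE3."""
    args = [
        "python", "train.py",
        f"env={env}",
        f"batch_size={batch_size}",
        f"action_repeat={action_repeat}",
        f"logdir={logdir}",
    ]
    if not use_re3:
        # The baseline is obtained by switching the state-entropy bonus off.
        args.append("use_state_entropy=false")
    return args


# Print the baseline and RE3 commands for the environment used above.
for use_re3 in (False, True):
    print(shlex.join(rad_command("hopper_hop", use_re3=use_re3)))
```

The argument lists can also be passed directly to subprocess.run from inside the rad_re3 directory.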

All scripts needed to reproduce Figure 4 (RAD, RAD + RE3) are in the scripts directory.

Dreamer + RE3

Our code is built on top of the Dreamer repository.

Installation

You can install all dependencies with the following commands:

pip3 install --user tensorflow-gpu==2.2.0
pip3 install --user tensorflow_probability
pip3 install --user git+https://github.com/deepmind/dm_control.git
pip3 install --user pandas
pip3 install --user matplotlib

# Install custom dm_control environments for walker_run_sparse / cheetah_run_sparse
cd ../envs/dm_control
pip3 install .

Instructions

Dreamer

python dreamer.py --logdir ./logdir/dmc_pendulum_swingup/dreamer/12345 --task dmc_pendulum_swingup --precision 32 --beta 0.0 --seed 12345

Dreamer + RE3

python dreamer.py --logdir ./logdir/dmc_pendulum_swingup/dreamer_re3/12345 --task dmc_pendulum_swingup --precision 32 --k 53 --beta 0.1 --seed 12345
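The Dreamer commands follow a regular pattern: the log directory encodes task, variant and seed, and the baseline sets --beta 0.0 while the RE3 run passes --k and --beta. The sketch below reproduces that pattern so runs can be generated for several seeds. It is an illustration only: dreamer_command is a hypothetical helper, and any seed other than 12345 is an arbitrary example value.

```python
import shlex


def dreamer_command(task, seed, use_re3=True, k=53, beta=0.1):
    """Build the dreamer.py argument list for Dreamer or Dreamer + RE3."""
    variant = "dreamer_re3" if use_re3 else "dreamer"
    logdir = f"./logdir/{task}/{variant}/{seed}"
    args = ["python", "dreamer.py", "--logdir", logdir,
            "--task", task, "--precision", "32"]
    if use_re3:
        # RE3 run: pass the k and beta values used in the command above.
        args += ["--k", str(k), "--beta", str(beta)]
    else:
        # Baseline run: the command above sets beta to 0.0.
        args += ["--beta", "0.0"]
    args += ["--seed", str(seed)]
    return args


# Print baseline and RE3 commands for a few example seeds.
for seed in (12345, 23451, 34512):
    for use_re3 in (False, True):
        print(shlex.join(dreamer_command("dmc_pendulum_swingup", seed, use_re3)))
```

Keeping the seed in the log directory path, as the original commands do, prevents runs from overwriting each other.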

All scripts needed to reproduce Figure 4 (Dreamer, Dreamer + RE3) are in the scripts directory.

A2C + RE3

The training code is in the rl-starter-files directory. It is forked from rl-starter-files and uses a modified A2C implementation from torch-ac. Currently only A2C is supported.

Installation

All of the dependencies are in the requirements.txt file in rl-starter-files. They can be installed manually or with the following command:

pip3 install -r requirements.txt

You will also need to install our cloned version of torch-ac with these commands:

cd torch-ac
pip3 install -e .

Instructions

See the instructions in the rl-starter-files directory. Example scripts can be found in rl-starter-files/rl-starter-files/run_sent.sh.

Owner
Younggyo Seo
Ph.D Student @ Graduate School of AI, KAIST