OREO: Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning (NeurIPS 2021)

Last update: Nov 25, 2022

Video demo

We provide a video demo from the confounded Enduro environment (see Figure 8 of the main paper). We also visualize the spatial attention maps of convolutional encoders trained with BC (middle) and OREO (right).

Installation

OREO requires CUDA 10.1 to run.

Install the dependencies:

conda install pytorch torchvision torchaudio cudatoolkit=10.1 -c pytorch
pip install dopamine_rl scikit-learn tqdm kornia dropblock atari-py==0.2.6 gsutil
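
Before downloading the data, it can help to confirm that the dependencies are importable. The following is a minimal sketch using only the standard library. The import names are assumptions based on the usual module names for these packages (dopamine for dopamine_rl, atari_py for atari-py), and missing_modules is a hypothetical helper, not part of this repository. gsutil is a command-line tool and is not checked here.

```python
import importlib.util

# Top-level import names for the packages installed above.
# Assumed mapping: dopamine_rl -> dopamine, atari-py -> atari_py.
REQUIRED_MODULES = [
    "torch", "torchvision", "dopamine", "sklearn",
    "tqdm", "kornia", "dropblock", "atari_py",
]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```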

Download DQN Replay dataset for expert demonstrations on Atari environments:

mkdir DATAPATH
cp download.sh DATAPATH
cd DATAPATH
sh download.sh

Pre-training

We provide pretraining scripts for beta-VAE (used by CCIL) and VQ-VAE (used by CRLR and OREO). To pretrain on a different game, change the --env option.

beta-VAE

CUDA_VISIBLE_DEVICES=0,1,2,3 python atari_beta_vae.py --env=KungFuMaster --datapath DATAPATH --num_episodes 20 --seed 1 --ch_div 4 --lmd 10

VQ-VAE

CUDA_VISIBLE_DEVICES=0,1,2,3 python atari_vqvae.py --env=KungFuMaster --datapath DATAPATH --num_episodes 20 --seed 1
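
To pretrain on several games, the two commands above can be generated programmatically. This is a sketch only: the helper names are hypothetical, the flags and default values are copied from the commands above, and "Enduro" is assumed to be a valid --env value because the video demo uses that environment. CUDA_VISIBLE_DEVICES is left to the caller's shell.

```python
import shlex


def vqvae_pretrain_command(env, datapath="DATAPATH", num_episodes=20, seed=1):
    """Build the VQ-VAE pretraining command for one game."""
    return [
        "python", "atari_vqvae.py",
        f"--env={env}",
        "--datapath", datapath,
        "--num_episodes", str(num_episodes),
        "--seed", str(seed),
    ]


def beta_vae_pretrain_command(env, datapath="DATAPATH", num_episodes=20,
                              seed=1, ch_div=4, lmd=10):
    """Build the beta-VAE pretraining command for one game."""
    return [
        "python", "atari_beta_vae.py",
        f"--env={env}",
        "--datapath", datapath,
        "--num_episodes", str(num_episodes),
        "--seed", str(seed),
        "--ch_div", str(ch_div),
        "--lmd", str(lmd),
    ]


if __name__ == "__main__":
    # Print one command per game; run them with your preferred launcher.
    for env in ["KungFuMaster", "Enduro"]:
        print(shlex.join(vqvae_pretrain_command(env)))
        print(shlex.join(beta_vae_pretrain_command(env)))
```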

Training BC policy

We provide training scripts for the baselines and OREO. To train on a different game, change the --env, --beta_vae_path, and --vqvae_path options accordingly.

Behavioral cloning

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --num_episodes 20 --num_eval_episodes 100

Dropout

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --original_dropout --prob 0.5 --num_episodes 20 --num_eval_episodes 100

DropBlock

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --dropblock --prob 0.3 --num_episodes 20 --num_eval_episodes 100

Cutout

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --input_cutout --num_episodes 20 --num_eval_episodes 100

RandomShift

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --random_shift --num_episodes 20 --num_eval_episodes 100
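
The five commands above (behavioral cloning through RandomShift) all call atari_cnn_actor.py and differ only in a few flags. A small sketch that makes the differences explicit is shown here. BASELINE_FLAGS and baseline_command are hypothetical names; the flag values are copied from the commands above.

```python
# Extra flags per method, copied from the commands above.
BASELINE_FLAGS = {
    "bc": [],
    "dropout": ["--original_dropout", "--prob", "0.5"],
    "dropblock": ["--dropblock", "--prob", "0.3"],
    "cutout": ["--input_cutout"],
    "random_shift": ["--random_shift"],
}


def baseline_command(method, env="KungFuMaster", datapath="DATAPATH", seed=1):
    """Build the atari_cnn_actor.py command for one baseline method."""
    if method not in BASELINE_FLAGS:
        raise ValueError(f"unknown method: {method!r}")
    return [
        "python", "atari_cnn_actor.py",
        f"--env={env}",
        "--datapath", datapath,
        "--seed", str(seed),
        "--eval_interval", "1000",
        *BASELINE_FLAGS[method],
        "--num_episodes", "20",
        "--num_eval_episodes", "100",
    ]


if __name__ == "__main__":
    for method in BASELINE_FLAGS:
        print(" ".join(baseline_command(method)))
```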

CCIL (w/o interaction)

CUDA_VISIBLE_DEVICES=0 python atari_beta_vae_actor.py --env=KungFuMaster --datapath DATAPATH --num_episodes 20 --num_eval_episodes 100 --seed 1 --eval_interval 1000 --prob 0.5 --ch_div 4 --beta_vae_path models_beta_vae_coord_conv_chdiv4_actor_lmd10.0/KungFuMaster_s1_epi20_con1_seed1_zdim50_beta4_kltol0_ep1000_beta_vae.pth

CRLR

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor_crlr.py --fixed_size 15000 --num_sub_iters 10 --eval_interval 10 --save_interval 10 --n_epochs 10 --env=KungFuMaster --datapath DATAPATH --num_episodes 20 --num_eval_episodes 100 --seed 1 --vqvae_path models_vqvae/KungFuMaster_s1_epi20_con1_seed1_ne512_c0.25_ep1000_vqvae.pth

OREO

CUDA_VISIBLE_DEVICES=0 python atari_vqvae_oreo.py --env=KungFuMaster --datapath DATAPATH --num_mask 5 --num_episodes 20 --num_eval_episodes 100 --seed 1 --eval_interval 1000 --prob 0.5 --vqvae_path models_vqvae/KungFuMaster_s1_epi20_con1_seed1_ne512_c0.25_ep1000_vqvae.pth
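
For intuition, the paper describes OREO as randomly dropping together all units that share the same VQ-VAE discrete code. The following pure-Python sketch illustrates that masking step on a small grid of code indices. It is not the repository's implementation: code_dropout_mask is a hypothetical helper, drop_prob is assumed to play the role of --prob, and the role of --num_mask is not modeled.

```python
import random


def code_dropout_mask(code_map, drop_prob, rng):
    """Return a 0/1 mask that drops every position sharing a sampled code.

    code_map: 2D list of integer code indices, one per spatial location.
    drop_prob: probability of dropping each distinct code.
    rng: a random.Random instance, passed in for reproducibility.
    """
    codes = sorted({code for row in code_map for code in row})
    # One Bernoulli draw per distinct code, not per spatial location.
    dropped = {code for code in codes if rng.random() < drop_prob}
    return [[0 if code in dropped else 1 for code in row] for row in code_map]


if __name__ == "__main__":
    code_map = [
        [3, 3, 7],
        [3, 7, 7],
        [1, 1, 7],
    ]
    mask = code_dropout_mask(code_map, drop_prob=0.5, rng=random.Random(0))
    for row in mask:
        print(row)
```

Because the draw is made once per code, all locations assigned to the same code are kept or dropped together, unlike standard dropout, which samples each unit independently.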
