Code for NeurIPS 2021 paper: Invariant Causal Imitation Learning for Generalizable Policies

Last update: Dec 01, 2022

Overview

Invariant Causal Imitation Learning for Generalizable Policies

Ioana Bica, Daniel Jarrett, Mihaela van der Schaar

Neural Information Processing Systems (NeurIPS) 2021

Dependencies

The code was implemented in Python 3.6 and requires the following packages:

gym==0.17.2
numpy==1.18.2
pandas==1.0.4
tensorflow==1.15.0
torch==1.6.0
tqdm==4.32.1
scipy==1.1.0
scikit-learn==0.22.2
stable-baselines==2.10.1

Running and evaluating the model:

The control tasks used in the experiments are from OpenAI Gym [1]. Each control task is associated with a true reward function (unknown to the imitation algorithm). In each case, the “expert” demonstrator can be obtained by using a pre-trained and hyperparameter-optimized agent from the RL Baselines Zoo [2], which is built on Stable Baselines [3].

This implementation provides the expert demonstrations for 2 environments of CartPole-v1 in 'volume/CartPole-v1'. Note that the code in 'contrib/baselines_zoo' was taken from [2].

To train and evaluate ICIL on CartPole-v1, run the following command with the chosen command line arguments. For reference, the expert performance is 500.

python testing/il.py

Options:
   --env                  # Environment name.
   --num_trajectories     # Number of expert trajectories used for training the imitation learning algorithm.
   --trial                # Trial number.
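
For orientation, the three options could be parsed roughly as follows. This is a minimal sketch using argparse from the Python standard library. The function name build_parser, the argument types, and the default values are assumptions for illustration; the actual parsing in testing/il.py may differ.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the documented options.
    # Types and defaults are assumed, not taken from testing/il.py.
    parser = argparse.ArgumentParser(description="Train and evaluate ICIL (sketch).")
    parser.add_argument("--env", type=str, default="CartPole-v1",
                        help="Environment name.")
    parser.add_argument("--num_trajectories", type=int, default=20,
                        help="Number of expert trajectories used for training.")
    parser.add_argument("--trial", type=int, default=0,
                        help="Trial number.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is self-contained.
args = build_parser().parse_args(
    ["--env", "CartPole-v1", "--num_trajectories", "20", "--trial", "0"]
)
print(args.env, args.num_trajectories, args.trial)
```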

Outputs:

The average reward over 10 repetitions of running ICIL.
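
The reported number is a plain mean over repetitions. The sketch below shows that aggregation, assuming each repetition yields one scalar reward. The function name average_reward is hypothetical, and the listed values are placeholders for illustration only, not results produced by this code or reported in the paper.

```python
def average_reward(rewards):
    """Return the arithmetic mean of the per-repetition rewards."""
    if not rewards:
        raise ValueError("at least one repetition is required")
    return sum(rewards) / len(rewards)


# Placeholder rewards for 10 repetitions (illustrative values only).
example_rewards = [500.0, 480.0, 500.0, 465.0, 500.0,
                   500.0, 490.0, 500.0, 475.0, 500.0]
avg = average_reward(example_rewards)
print(f"Average reward over {len(example_rewards)} repetitions: {avg:.1f}")
```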

Example usage

python testing/il.py  --env='CartPole-v1' --num_trajectories=20 --trial=0
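
To run several trials, the same command can be generated once per trial number. The helper below is a sketch that only builds the command strings and does not execute them. The name build_commands is hypothetical, and treating --trial as a simple index that can be swept is an assumption, since the option is documented only as "Trial number".

```python
import shlex


def build_commands(env, num_trajectories, trials):
    """Build one command line per trial without executing anything."""
    commands = []
    for trial in trials:
        argv = [
            "python", "testing/il.py",
            f"--env={env}",
            f"--num_trajectories={num_trajectories}",
            f"--trial={trial}",
        ]
        # shlex.quote leaves shell-safe arguments unchanged and quotes the rest.
        commands.append(" ".join(shlex.quote(part) for part in argv))
    return commands


for command in build_commands("CartPole-v1", 20, range(3)):
    print(command)
```

Each printed line can be pasted into a shell, or passed to a job scheduler, as is.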

References

[1] Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. OpenAI Gym. OpenAI, 2016.

[2] Antonin Raffin. RL Baselines Zoo. https://github.com/araffin/rl-baselines-zoo, 2018.

[3] Ashley Hill, Antonin Raffin, Maximilian Ernestus, Adam Gleave, Anssi Kanervisto, Rene Traore, Prafulla Dhariwal, Christopher Hesse, Oleg Klimov, Alex Nichol, Matthias Plappert, Alec Radford, John Schulman, Szymon Sidor, and Yuhuai Wu. Stable Baselines. https://github.com/hill-a/stable-baselines, 2018.

Citation

If you use this code, please cite:

@inproceedings{bica2021invariant,
  title={Invariant Causal Imitation Learning for Generalizable Policies},
  author={Bica, Ioana and Jarrett, Daniel and van der Schaar, Mihaela},
  booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
  year={2021}
}
