(NeurIPS '21 Spotlight) IQ-Learn: Inverse Q-Learning for Imitation

Last update: Dec 20, 2022


Overview

Inverse Q-Learning (IQ-Learn)

Official code base for IQ-Learn: Inverse soft-Q Learning for Imitation, NeurIPS '21 Spotlight

IQ-Learn is an easy-to-use algorithm that works as a drop-in replacement for methods like Behavior Cloning and GAIL, to boost your imitation learning pipelines!
Update: IQ-Learn was recently used to create the best AI agent for playing Minecraft, placing #1 in the NeurIPS MineRL BASALT Challenge using only human demos (overall leaderboard rank #2).

[Project Page]

We introduce Inverse Q-Learning (IQ-Learn), a novel state-of-the-art framework for Imitation Learning (IL) that directly learns soft-Q functions from expert data. IQ-Learn enables non-adversarial imitation learning and works in both offline and online IL settings. It performs well even with very sparse expert data and scales to complex image-based environments, surpassing prior methods by more than 3x. It is simple to implement, requiring ~15 lines of code on top of existing RL methods.
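
As a rough illustration of how little is needed, the sketch below computes an offline IQ-Learn-style loss for a tabular, discrete-action Q-function, using a chi-squared regularizer on the implied reward. It is not the code from the iq_learn folder: the names `soft_value` and `iq_loss`, the `(s, a, s_next, done)` batch layout, the fixed temperature of 1, and the default `gamma` and `alpha` are all assumptions made for illustration.

```python
import math


def soft_value(q_row):
    """Soft value V(s) = log sum_a exp(Q(s, a)), computed stably (temperature 1)."""
    m = max(q_row)
    return m + math.log(sum(math.exp(q - m) for q in q_row))


def iq_loss(q_table, batch, gamma=0.99, alpha=0.5):
    """Offline IQ-Learn-style loss (to be minimized) on expert transitions.

    q_table[s][a] holds Q(s, a); batch is a list of (s, a, s_next, done)
    tuples taken from expert demonstrations. Hypothetical layout.
    """
    rewards, value_gaps = [], []
    for s, a, s_next, done in batch:
        v_next = 0.0 if done else soft_value(q_table[s_next])
        # Reward implied by Q through the inverse soft Bellman operator.
        rewards.append(q_table[s][a] - gamma * v_next)
        # V(s) - gamma * V(s'), evaluated on expert states (offline setting).
        value_gaps.append(soft_value(q_table[s]) - gamma * v_next)

    n = len(batch)
    expert_term = -sum(rewards) / n               # push implied expert reward up
    value_term = sum(value_gaps) / n              # keep soft values in check
    chi2_term = sum(r * r for r in rewards) / (4 * alpha * n)  # chi-squared penalty
    return expert_term + value_term + chi2_term


# Toy usage: two states, two actions, one expert transition.
q = [[0.0, 0.0], [0.0, 0.0]]
print(iq_loss(q, [(0, 0, 1, False)], gamma=0.5))
```

On expert transitions the first two terms combine to V(s) - Q(s, a), the negative log-likelihood of the expert action under the softmax policy, so this offline form can be read as Behavior Cloning plus a regularizer on the implied reward. In practice the table would be replaced by a Q-network and the loss minimized with a gradient-based optimizer.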

Inverse Q-Learning is theoretically equivalent to Inverse Reinforcement Learning, i.e. learning rewards from expert data. However, it is much more powerful in practice: it admits very simple non-adversarial training and works in completely offline IL settings (without any access to the environment), greatly exceeding Behavior Cloning.
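
To make the link between a learned soft-Q function, the policy, and the recovered reward concrete, here is a minimal sketch. It assumes discrete actions, a tabular Q, a temperature of 1, and deterministic transitions (so the expectation over next states reduces to a single term). The function names are hypothetical and not part of the official code base.

```python
import math


def soft_value(q_row):
    """Soft value V(s) = log sum_a exp(Q(s, a)), computed stably."""
    m = max(q_row)
    return m + math.log(sum(math.exp(q - m) for q in q_row))


def policy(q_row):
    """Softmax policy implied by Q: pi(a | s) = exp(Q(s, a) - V(s))."""
    v = soft_value(q_row)
    return [math.exp(q - v) for q in q_row]


def recovered_reward(q_table, s, a, s_next, gamma=0.99, done=False):
    """Reward implied by Q: r(s, a) = Q(s, a) - gamma * V(s')."""
    v_next = 0.0 if done else soft_value(q_table[s_next])
    return q_table[s][a] - gamma * v_next


# Toy usage with a hand-written Q-table (values are made up).
q = [[1.0, 2.0], [0.0, 0.0]]
print(policy(q[0]))
print(recovered_reward(q, s=0, a=1, s_next=1, gamma=0.5))
```

The same Q-function therefore gives both the imitation policy and a reward estimate, with no separate reward network or discriminator.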

IQ-Learn is the successor to Adversarial Imitation Learning methods like GAIL (coming from the same lab).
It extends the theoretical framework for Inverse RL to non-adversarial and scalable learning, showing guaranteed convergence for the first time.

Citation

@inproceedings{garg2021iqlearn,
title={IQ-Learn: Inverse soft-Q Learning for Imitation},
author={Divyansh Garg and Shuvam Chakraborty and Chris Cundy and Jiaming Song and Stefano Ermon},
booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
year={2021},
url={https://openreview.net/forum?id=Aeo-xqtb5p}
}

Key Advantages

✅ Drop-in replacement for Behavior Cloning
✅ Non-adversarial online IL (successor to GAIL & AIRL)
✅ Simple to implement
✅ Performant with very sparse data (single expert demo)
✅ Scales to complex image envs (SOTA on Atari and playing Minecraft)
✅ Recovers rewards from envs

Usage

To install and use IQ-Learn, check the instructions provided in the iq_learn folder.

Imitation

Reaching human-level performance on Atari with pure imitation:

Rewards

Recovering environment rewards on GridWorld:

Questions

Please feel free to email us if you have any questions.

Div Garg ([email protected])


Owner

Divyansh Garg

