Offline Multi-Agent Reinforcement Learning Implementations: Solving Overcooked Game with Data-Driven Method

Last update: Sep 16, 2022


Overview

Overcooked-AI

We aim to apply conventional offline reinforcement learning techniques to multi-agent algorithms.
In this repository, we implemented behavior cloning (BC), offline MADDPG, MADDPG+REM (MADDPG w/ REM), and MADDPG+BCQ (MADDPG w/ BCQ) in PyTorch. BCQ is currently a work in progress and is not fully implemented.

We collected a 0.5M multi-agent offline RL dataset and ran experiments with each of the compared methods. The data was collected with online MADDPG agents and includes exploration trajectories generated with Ornstein-Uhlenbeck (OU) noise. The experiments were run on the Asymmetric Advantages layout of the Overcooked environment.
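For readers unfamiliar with OU noise, the following is a minimal, standard-library sketch of an Ornstein-Uhlenbeck process. It is not taken from this repository: the class name `OUNoise` and the default values of `theta`, `sigma`, and `dt` are assumptions chosen for illustration, and how the noise is combined with the MADDPG actors' outputs here may differ.

```python
import random


class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated noise that drifts back toward mu.

    Hypothetical helper for illustration; parameter defaults are assumptions.
    """

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        self.size = size
        self.mu = mu
        self.theta = theta    # strength of the pull back toward mu
        self.sigma = sigma    # scale of the random perturbation
        self.dt = dt
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Start every dimension at the long-run mean.
        self.state = [self.mu] * self.size

    def sample(self):
        # Euler-Maruyama step:
        # x <- x + theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
        self.state = [
            x
            + self.theta * (self.mu - x) * self.dt
            + self.sigma * (self.dt ** 0.5) * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)
```

Because each sample depends on the previous one, consecutive perturbations are correlated, which tends to produce smoother exploration than independent noise at every step.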

We look forward to your contributions!

How to Run

Collect Offline Data

python train_online.py agent=maddpg save_replay_buffer=true

While the agents train for 0.5M steps, the trajectory replay buffer is dumped to the experiment/{date}/{time}_maddpg_{exp_name}/buffer folder.
After training, replace the path in config/data/local.yaml with that output directory.
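If several runs exist, locating the right buffer folder by hand can be tedious. The sketch below searches for folders that match the experiment/{date}/{time}_maddpg_{exp_name}/buffer pattern described above and returns the most recently modified one. The function name `find_latest_buffer_dir` is hypothetical and not part of this repository, and the resulting path still needs to be copied into config/data/local.yaml manually.

```python
from pathlib import Path


def find_latest_buffer_dir(root="experiment"):
    """Return the most recently modified */*_maddpg_*/buffer directory under root.

    Hypothetical helper: returns None when no matching directory exists.
    """
    candidates = [p for p in Path(root).glob("*/*_maddpg_*/buffer") if p.is_dir()]
    if not candidates:
        return None
    # Pick the directory with the newest modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    print(find_latest_buffer_dir())
```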

Download Dataset

Alternatively, if you want to use our pre-collected dataset, please use this link.
We provide 0.5M trajectories collected on the Asymmetric Advantages layout.
Download the dataset to your local machine and replace the path in config/data/local.yaml with its location.

Train Offline Models

Behavior Cloning

python train_bc.py agent=bc data=local
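As a rough illustration of what behavior cloning optimizes, the sketch below computes the negative log-likelihood of a dataset action under a policy's action logits, which is the usual supervised objective for discrete-action BC. It is an assumption that train_bc.py uses this exact loss; the function name `bc_loss` is hypothetical, and a PyTorch implementation would typically operate on batched tensors instead of Python lists.

```python
import math


def bc_loss(logits, action):
    """Negative log-likelihood of `action` under softmax(logits).

    Hypothetical helper: `logits` is a list of unnormalized action scores,
    `action` is the index of the action recorded in the offline dataset.
    """
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[action]


# The loss shrinks as the policy puts more weight on the recorded action.
uniform = bc_loss([0.0, 0.0, 0.0], action=1)
confident = bc_loss([0.0, 5.0, 0.0], action=1)
assert confident < uniform
```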

Offline MADDPG (Vanilla)

python train_offline.py agent=maddpg data=local

Offline MADDPG (w/ REM)

python train_offline.py agent=rem_maddpg data=local
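The core idea of REM (Random Ensemble Mixture) is to train on a random convex combination of several Q-value heads instead of a single estimate. The sketch below shows only that mixing step in plain Python. The function names are hypothetical, and how this repository wires the mixture into the MADDPG critic is not shown here and may differ.

```python
import random


def sample_simplex_weights(num_heads, rng=random):
    """Draw non-negative weights that sum to 1 by normalizing uniform samples."""
    # 1 - random() lies in (0, 1], so the total is never zero.
    raw = [1.0 - rng.random() for _ in range(num_heads)]
    total = sum(raw)
    return [r / total for r in raw]


def mix_q_values(head_q_values, weights):
    """Convex combination of per-head Q-value estimates for one state-action pair."""
    return sum(w * q for w, q in zip(weights, head_q_values))


# Example: four hypothetical critic heads scoring the same state-action pair.
heads = [1.2, 0.8, 1.0, 1.1]
weights = sample_simplex_weights(len(heads), random.Random(0))
mixed = mix_q_values(heads, weights)
assert min(heads) <= mixed <= max(heads)
```

New weights are normally drawn for every training batch, so each update sees a different mixture of the heads.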

Offline MADDPG (w/ BCQ) (WIP)

python train_offline.py agent=bcq_maddpg data=local

Result

Graph

Panels: Online, Offline (0.5M Data), Offline (0.25M Data)

Video

Panels: Online, BC, Offline w/ REM

Owner

Baek In-Chang
