Multi-Stage Episodic Control for Strategic Exploration in Text Games

Last update: May 24, 2022

Overview

XTX: eXploit - Then - eXplore

Requirements

First clone this repo using git clone https://github.com/princeton-nlp/XTX.git

Please create two conda environments as follows:

1. conda env create -f yml_envs/jericho-wt.yml
   a. conda activate jericho-wt
   b. pip install git+https://github.com/jens321/[email protected]
2. conda env create -f yml_envs/jericho-no-wt.yml

The first set of commands creates a conda environment called jericho-wt, in which extra actions have been added to the game grammar for specific games (see the games marked with * in the paper). The last command creates a second conda environment called jericho-no-wt, which installs an unmodified version of the Jericho library.

Training

All code can be run from the root folder of this project. Please follow the commands below for each specific model:

XTX: sh scripts/run_xtx.sh
XTX (no-mix): sh scripts/run_xtx_no_mix.sh
XTX (uniform): sh scripts/run_xtx_uniform.sh
XTX ($\lambda$ = 0, 0.5, or 1): sh scripts/run_xtx_ablation.sh
INV DY: sh scripts/run_inv_dy.sh
DRRN: sh scripts/run_drrn.sh
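
If you switch between models often, the list above can be wrapped in a small launcher. The following is only a convenience sketch and not part of the repo: the short keys (xtx, inv_dy, ...) and the function names build_command and launch are hypothetical, and only the script paths come from the list above. It assumes you run it from the root folder of the project.

```python
import subprocess

# Model key (hypothetical short names) -> training script listed in this README.
SCRIPTS = {
    "xtx": "scripts/run_xtx.sh",
    "xtx_no_mix": "scripts/run_xtx_no_mix.sh",
    "xtx_uniform": "scripts/run_xtx_uniform.sh",
    "xtx_ablation": "scripts/run_xtx_ablation.sh",
    "inv_dy": "scripts/run_inv_dy.sh",
    "drrn": "scripts/run_drrn.sh",
}


def build_command(model):
    """Return the argv list that runs the training script for `model`."""
    if model not in SCRIPTS:
        known = ", ".join(sorted(SCRIPTS))
        raise ValueError(f"unknown model {model!r}; expected one of: {known}")
    return ["sh", SCRIPTS[model]]


def launch(model, dry_run=True):
    """Run the script for `model`; with dry_run=True only return the command."""
    cmd = build_command(model)
    if dry_run:
        return cmd
    # check=True raises CalledProcessError if the script exits with an error.
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: show what would be executed without starting training.
    print(launch("xtx"))
```

Calling launch("drrn", dry_run=False) would then start the DRRN run in the currently active conda environment.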

Notes

You can use analysis/sample_env.py for quickly playing around with a sample Jericho environment. Run it using python3 -m analysis.sample_env.
You can use analysis/augment_wt.py for generating the missing action candidates that can be added to the game grammar (games with * in the paper). Run it using python3 -m analysis.augment_wt.
Note that all models should finish within a day or two given 1 GPU and 8 CPUs, except for games where Jericho's valid action handicap is slow (e.g. Library, Dragon). Since Jericho's valid action handicap relies heavily on parallelization, increasing the number of CPUs (e.g. 8 -> 16) also yields good speedups.
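
For a rough idea of what interacting with a Jericho environment looks like, here is a minimal random-agent sketch. It is not the contents of analysis/sample_env.py. The ToyEnv class is a hypothetical stand-in that only mimics the reset / step / get_valid_actions interface of jericho.FrotzEnv, so the example runs without a game file; random_rollout is likewise an illustrative name.

```python
import random


class ToyEnv:
    """Hypothetical stand-in mimicking the jericho.FrotzEnv interface
    (reset, step, get_valid_actions) so the sketch runs without a game file."""

    def __init__(self, goal=3):
        self.goal = goal
        self.pos = 0

    def reset(self):
        self.pos = 0
        return "You are at position 0.", {"score": 0}

    def get_valid_actions(self):
        return ["east", "wait"]

    def step(self, action):
        if action == "east":
            self.pos += 1
        done = self.pos >= self.goal
        reward = 1 if done else 0
        return f"You are at position {self.pos}.", reward, done, {"score": reward}


def random_rollout(env, max_steps=50, seed=0):
    """Play one episode, picking uniformly among the valid actions.

    Returns the total reward and the list of actions taken."""
    rng = random.Random(seed)
    obs, info = env.reset()
    total, trajectory = 0, []
    for _ in range(max_steps):
        valid = env.get_valid_actions()
        if not valid:
            break
        action = rng.choice(valid)
        obs, reward, done, info = env.step(action)
        total += reward
        trajectory.append(action)
        if done:
            break
    return total, trajectory


if __name__ == "__main__":
    # With Jericho installed, a jericho.FrotzEnv built from a game file
    # could be passed here instead of the toy environment.
    print(random_rollout(ToyEnv()))
```

In a real game, the get_valid_actions call is the valid action handicap mentioned above, which is why it tends to dominate run time.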

Acknowledgements

We used Weights & Biases for experiment tracking and visualizations to develop insights for this paper.

Some of the code borrows from the TDQN repo.

For any questions please contact Jens Tuyls ([email protected]).
