Conjugated Discrete Distributions for Distributional Reinforcement Learning (C2D)

Last update: Jan 11, 2022

Code & Data Appendix for Conjugated Discrete Distributions for Distributional Reinforcement Learning.

Björn Lindenberg, Jonas Nordqvist, Karl-Olof Lindahl

Citation

If you use C2D in your research, please cite the following:

@misc{lindenberg2021conjugated,
      title={Conjugated Discrete Distributions for Distributional Reinforcement Learning}, 
      author={Björn Lindenberg and Jonas Nordqvist and Karl-Olof Lindahl},
      year={2021},
      eprint={2112.07424},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Data

Agent scores are available in the data folder.
Raw experiment data for each seed is available in data/supplementary.
Each seed was run on an Ubuntu 20.04 server VM with 64 GB of RAM, a single Nvidia Quadro P4000 GPU, and TensorFlow 2.5.

Code

The C++20 source code that handles the Arcade Learning Environment (ALE) and transition buffering resides in src.
The agent code, including the algorithms, is written in Python with TensorFlow and resides in c2d.
Requirements: cuDNN, TensorFlow 2.x, Python 3, the Arcade Learning Environment, a C++20 compiler, and LZ4. For the full list of dependencies, see our VM setup files in install_scripts.

Atari Games

To avoid legal issues, our Atari 2600 ROM directory ale_roms is left empty. However, the corresponding binaries are widely available elsewhere; for example, the Breakout ROM breakout.bin can be extracted from the atari-py Python package.

Library

The directory ale_roms must be populated with the ROM binaries of the Atari games to be run. ALE's checksum file md5.txt, used to check binary compatibility, is in the root directory.
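
Before building, it can help to confirm that the files placed in ale_roms match the checksums in md5.txt. The following is a minimal sketch, not part of the repository: the helper names `load_expected` and `check_roms` are hypothetical, and it assumes that each relevant line of md5.txt has the form `<md5 hex digest> <file name>` and that ROM files use the `.bin` extension. Lines that do not fit that shape are skipped.

```python
import hashlib
import string
from pathlib import Path


def load_expected(md5_file):
    """Map file name -> expected MD5 digest.

    Assumes lines of the form '<32-char hex digest> <file name>';
    this layout is an assumption about md5.txt, not a documented format.
    """
    expected = {}
    for line in Path(md5_file).read_text().splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank or single-field line
        digest, name = parts[0].lower(), parts[1].strip()
        if len(digest) != 32 or any(c not in string.hexdigits for c in digest):
            continue  # first field is not an MD5 hex digest
        expected[name] = digest
    return expected


def check_roms(rom_dir, md5_file):
    """Return {file name: 'ok' | 'mismatch' | 'not listed'} for every *.bin file."""
    expected = load_expected(md5_file)
    report = {}
    for rom in sorted(Path(rom_dir).glob("*.bin")):
        digest = hashlib.md5(rom.read_bytes()).hexdigest()
        if rom.name not in expected:
            report[rom.name] = "not listed"
        elif expected[rom.name] == digest:
            report[rom.name] = "ok"
        else:
            report[rom.name] = "mismatch"
    return report
```

Calling `check_roms("ale_roms", "md5.txt")` from the repository root returns one status per ROM file; any entry other than "ok" suggests the binary may not be the version ALE expects.
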
The initial library setup, and any later change to settings.cmake, requires compiling with:
```
bash build_lib.sh
```
To train for one iteration (1M frames) on Breakout, run:
```
python3 run.py --game breakout --tag test --iterations 1
```
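To queue several such runs from Python, the command above can be assembled programmatically. This is only a sketch under stated assumptions: `build_run_command` and `total_frames` are hypothetical helpers, only the three flags shown above are used, and the frame count relies on the stated figure of 1M frames per iteration.

```python
import subprocess

FRAMES_PER_ITERATION = 1_000_000  # "one iteration (1M frames)", as stated above


def build_run_command(game, tag, iterations):
    """Assemble the run.py invocation using only the flags shown above."""
    return [
        "python3", "run.py",
        "--game", game,
        "--tag", tag,
        "--iterations", str(iterations),
    ]


def total_frames(iterations):
    """Number of frames a run of the given length covers."""
    return iterations * FRAMES_PER_ITERATION


def launch(game, tag, iterations):
    """Run one training job and raise if run.py exits with an error."""
    subprocess.run(build_run_command(game, tag, iterations), check=True)
```

For example, `launch("breakout", "test", 1)` reproduces the command above, provided it is called from the repository root after the library has been built.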

Figures

Performance Profile (Deep reinforcement learning at the edge of the statistical precipice, Agarwal et al. 2021)

Sampling Efficiency: Mean and Median

Training Graphs

Strong/Weak Examples

Support Evolution
