Author's PyTorch implementation of TD3 for OpenAI gym tasks

Last update: Dec 25, 2022

Overview

Addressing Function Approximation Error in Actor-Critic Methods

PyTorch implementation of Twin Delayed Deep Deterministic Policy Gradients (TD3). If you use our code or data please cite the paper.

The method is tested on MuJoCo continuous control tasks in OpenAI gym. Networks are trained using PyTorch 1.2 and Python 3.7.

Usage

The paper results can be reproduced by running:

./run_experiments.sh

Experiments on single environments can be run by calling:

python main.py --env HalfCheetah-v2

Hyper-parameters can be modified by passing different arguments to main.py. We include an implementation of DDPG (DDPG.py), which is not used in the paper, for easy comparison of hyper-parameters with TD3. This is not the "Our DDPG" implementation used in the paper (see OurDDPG.py).
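To run several single-environment experiments in a row, the command lines can be generated programmatically. The following is a minimal sketch, not part of the repository: only the --env argument and HalfCheetah-v2 appear above. The --seed flag and the second environment name are assumptions used for illustration, so check `python main.py --help` for the arguments that are actually supported.

```python
import shlex


def build_commands(envs, seeds, script="main.py"):
    """Build one argument list per (environment, seed) pair."""
    commands = []
    for env in envs:
        for seed in seeds:
            # "--seed" is an assumed flag; "--env" is the one shown in Usage.
            commands.append(["python", script, "--env", env, "--seed", str(seed)])
    return commands


# "Hopper-v2" is an illustrative second environment, not taken from this README.
commands = build_commands(["HalfCheetah-v2", "Hopper-v2"], seeds=range(3))

for cmd in commands:
    # Print the commands; pass each list to subprocess.run(cmd, check=True) to execute.
    print(shlex.join(cmd))
```

Building argument lists rather than shell strings avoids quoting problems if the commands are later handed to subprocess.run.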

The algorithms that TD3 is compared against (PPO, TRPO, ACKTR, DDPG) can be found in the OpenAI baselines repository.

Results

The code is no longer exactly representative of the code used in the paper: minor adjustments to hyper-parameters and other details have been made to improve performance. The learning curves are still the original results from the paper.

The learning curves from the paper are stored under /learning_curves. Each learning curve is formatted as a NumPy array of 201 evaluations, with shape (201,), where each evaluation is the average total reward from running the policy for 10 episodes with no exploration. The first evaluation is of the randomly initialized policy network (unused in the paper). Evaluations are performed every 5000 time steps, over a total of 1 million time steps.
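The layout described above means that index i of a curve corresponds to time step i * 5000, so index 0 is the untrained policy and index 200 is the evaluation at 1 million time steps. The sketch below illustrates that mapping. The helper names are hypothetical, and a synthetic array stands in for a real curve, since the file names under /learning_curves are not listed here.

```python
import numpy as np

EVAL_FREQ = 5000  # time steps between evaluations
NUM_EVALS = 201   # evaluations per curve, including the one at step 0


def eval_timesteps(num_evals=NUM_EVALS, eval_freq=EVAL_FREQ):
    """Return the time step at which each evaluation was taken."""
    return np.arange(num_evals) * eval_freq


def summarize_curve(curve):
    """Pair a (201,) learning curve with its time steps and report basic statistics."""
    curve = np.asarray(curve, dtype=float)
    if curve.shape != (NUM_EVALS,):
        raise ValueError(f"expected shape ({NUM_EVALS},), got {curve.shape}")
    steps = eval_timesteps()
    trained = curve[1:]  # drop the randomly initialized policy at step 0
    return {
        "final_step": int(steps[-1]),
        "final_return": float(curve[-1]),
        "max_return": float(trained.max()),
        "max_return_step": int(steps[1:][trained.argmax()]),
    }


# Synthetic stand-in; replace with np.load(...) on a file from /learning_curves.
demo_curve = np.linspace(0.0, 1000.0, NUM_EVALS)
summary = summarize_curve(demo_curve)
print(summary)
```

Dropping index 0 before taking the maximum matches the note that the randomly initialized policy is unused in the paper.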

Numerical results can be found in the paper or derived from the learning curves. A video of the learned agent is also available.

Bibtex

@inproceedings{fujimoto2018addressing,
  title={Addressing Function Approximation Error in Actor-Critic Methods},
  author={Fujimoto, Scott and Hoof, Herke and Meger, David},
  booktitle={International Conference on Machine Learning},
  pages={1582--1591},
  year={2018}
}

Owner

Scott Fujimoto
