RL-GAN: Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation

Related tags

Deep LearningRL-GAN
Overview

RL-GAN: Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation

RL-GAN is an official implementation of the paper: Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation.

Paper

Shani Gamrian, Yoav Goldberg, "Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation"

@article{DBLP:journals/corr/abs-1806-07377,
  author    = {Shani Gamrian and
               Yoav Goldberg},
  title     = {Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image
               Translation},
  journal   = {CoRR},
  volume    = {abs/1806.07377},
  year      = {2018},
  url       = {http://arxiv.org/abs/1806.07377},
  archivePrefix = {arXiv},
  eprint    = {1806.07377},
  timestamp = {Mon, 13 Aug 2018 16:48:23 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/abs-1806-07377},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Videos:

Breakout

RoadFighter

Installation

  • The code was tested on Ubuntu 16.04 with Python 3.6
  • Install packages by typing the command: pip install -r requirements.txt.
  • For Road Fighter, clone and install the repo: https://github.com/ShaniGam/retro

Getting Started

Breakout Examples

  • Train Breakout from scratch:
python -m breakout_a3c.main --num-processes 32 --variation 'standart'
  • Transfer from standart to diagonals variation and fine-tune the model:
python -m breakout_a3c.main --num-processes 32 --variation diagonals --ft-setting full-ft --test
  • Collect images for UNIT training:
python -m breakout_a3c.main --collect-images --num-collected-imgs 100000 --variation diagonals --num-processes 1
  • Train UNIT:
python -m unit.train --trainer UNIT --config unit/configs/breakout-diagonals.yaml
  • Run Breakout with UNIT:
python -m breakout_a3c.main --variation diagonals --test --ft-setting full-ft --test-gan --gan-dir breakout-diagonals --num-processes 0

Road Fighter Examples

  • Train level 1 of Road Fighter
python -m roadfighter_a2c.main --num-processes 84
  • Collect images for UNIT training:
python -m roadfighter_a2c.main -level 1 --collect-images --num-collected-imgs 100000 --num-processes 1
python -m roadfighter_a2c.main -level 2 --collect-images --num-collected-imgs 100000 --num-processes 1
  • Train UNIT:
python -m unit.train --trainer UNIT --config unit/configs/roadfighter-lvl2.yaml
  • Run Road Fighter with UNIT:
python -m roadfighter_a2c.main --load --level 2 --test-gan --gan-dir roadfighter-lvl2-kl01 --num-processes 1
  • Run Road Fighter with UNIT and Imitation Learning:
python -m roadfighter_a2c.main_imitation --load --gan-dir roadfighter-lvl2-kl01 --gan-imitation-file '00320000' --log-name lvl2.log --super-during-rl --level 2 --det-score 5350

Acknowledgments

The code was written by Shani Gamrian and is based on the repositories: pytorch-a3c, pytorch-a2c, UNIT

TO-DO

  • Add links for pretrained models.
  • Create videos.
Diffgram - Supervised Learning Data Platform

Data Annotation, Data Labeling, Annotation Tooling, Training Data for Machine Learning

Diffgram 1.6k Jan 07, 2023
CVPR 2021 - Official code repository for the paper: On Self-Contact and Human Pose.

SMPLify-XMC This repo is part of our project: On Self-Contact and Human Pose. [Project Page] [Paper] [MPI Project Page] License Software Copyright Lic

Lea Müller 83 Dec 14, 2022
AI Based Smart Exam Proctoring Package

AI Based Smart Exam Proctoring Package It takes image (base64) as input: Provide Output as: Detection of Mobile phone. Detection of More than 1 person

NARENDER KESWANI 3 Sep 09, 2022
Train SN-GAN with AdaBelief

SNGAN-AdaBelief Train a state-of-the-art spectral normalization GAN with AdaBelief https://github.com/juntang-zhuang/Adabelief-Optimizer Acknowledgeme

Juntang Zhuang 10 Jun 11, 2022
A Python script that creates subtitles of a given length from text paragraphs that can be easily imported into any Video Editing software such as FinalCut Pro for further adjustments.

Text to Subtitles - Python This python file creates subtitles of a given length from text paragraphs that can be easily imported into any Video Editin

Dmytro North 9 Dec 24, 2022
Pytorch implementations of the paper Value Functions Factorization with Latent State Information Sharing in Decentralized Multi-Agent Policy Gradients

LSF-SAC Pytorch implementations of the paper Value Functions Factorization with Latent State Information Sharing in Decentralized Multi-Agent Policy G

Hanhan 2 Aug 14, 2022
VOGUE: Try-On by StyleGAN Interpolation Optimization

VOGUE is a StyleGAN interpolation optimization algorithm for photo-realistic try-on. Top: shirt try-on automatically synthesized by our method in two different examples.

Wei ZHANG 66 Dec 09, 2022
StyleGAN2-ADA-training-jupyter - Training custom datasets in styleGAN2-ADA by NVIDIA using Jupyter

styleGAN2-ADA-training-jupyter Training custom datasets in styleGAN2-ADA on Jupyter Official StyleGAN2-ADA by NIVIDIA Paper Training Generative Advers

Mang Su Hyun 2 Feb 24, 2022
Implementation of the paper NAST: Non-Autoregressive Spatial-Temporal Transformer for Time Series Forecasting.

Non-AR Spatial-Temporal Transformer Introduction Implementation of the paper NAST: Non-Autoregressive Spatial-Temporal Transformer for Time Series For

Chen Kai 66 Nov 28, 2022
Face Recognition & AI Based Smart Attendance Monitoring System.

In today’s generation, authentication is one of the biggest problems in our society. So, one of the most known techniques used for authentication is h

Sagar Saha 1 Jan 14, 2022
A weakly-supervised scene graph generation codebase. The implementation of our CVPR2021 paper ``Linguistic Structures as Weak Supervision for Visual Scene Graph Generation''

README.md shall be finished soon. WSSGG 0 Overview 1 Installation 1.1 Faster-RCNN 1.2 Language Parser 1.3 GloVe Embeddings 2 Settings 2.1 VG-GT-Graph

Keren Ye 35 Nov 20, 2022
Company clustering with K-means/GMM and visualization with PCA, t-SNE, using SSAN relation extraction

RE results graph visualization and company clustering Installation pip install -r requirements.txt python -m nltk.downloader stopwords python3.7 main.

Jieun Han 1 Oct 06, 2022
Learning Representations that Support Robust Transfer of Predictors

Transfer Risk Minimization (TRM) Code for Learning Representations that Support Robust Transfer of Predictors Prepare the Datasets Preprocess the Scen

Yilun Xu 15 Dec 07, 2022
The Official Repository for "Generalized OOD Detection: A Survey"

Generalized Out-of-Distribution Detection: A Survey 1. Overview This repository is with our survey paper: Title: Generalized Out-of-Distribution Detec

Jingkang Yang 338 Jan 03, 2023
MicRank is a Learning to Rank neural channel selection framework where a DNN is trained to rank microphone channels.

MicRank: Learning to Rank Microphones for Distant Speech Recognition Application Scenario Many applications nowadays envision the presence of multiple

Samuele Cornell 20 Nov 10, 2022
A GPU-optional modular synthesizer in pytorch, 16200x faster than realtime, for audio ML researchers.

torchsynth The fastest synth in the universe. Introduction torchsynth is based upon traditional modular synthesis written in pytorch. It is GPU-option

torchsynth 229 Jan 02, 2023
Equivariant layers for RC-complement symmetry in DNA sequence data

Equi-RC Equivariant layers for RC-complement symmetry in DNA sequence data This is a repository that implements the layers as described in "Reverse-Co

7 May 19, 2022
Graph Representation Learning via Graphical Mutual Information Maximization

GMI (Graphical Mutual Information) Graph Representation Learning via Graphical Mutual Information Maximization (Peng Z, Huang W, Luo M, et al., WWW 20

93 Dec 29, 2022
Imbalanced Gradients: A Subtle Cause of Overestimated Adversarial Robustness

Imbalanced Gradients: A Subtle Cause of Overestimated Adversarial Robustness Code for Paper "Imbalanced Gradients: A Subtle Cause of Overestimated Adv

Hanxun Huang 11 Nov 30, 2022
Official PyTorch implementation of Data-free Knowledge Distillation for Object Detection, WACV 2021.

Introduction This repository is the official PyTorch implementation of Data-free Knowledge Distillation for Object Detection, WACV 2021. Data-free Kno

NVIDIA Research Projects 50 Jan 05, 2023