Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods

Overview

Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods

This repository is the official implementation of

  • Seohong Park, Jaekyeom Kim, Gunhee Kim. Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods. In NeurIPS, 2021.

It contains the implementations for SAR, FiGAR-C and base policy gradient algorithms (PPO, TRPO and A2C).

The code is based on Stable Baselines3 (SB3) for PPO and A2C, and Stable Baselines (SB) for TRPO.

Requirements

Run examples

PPO and A2C (based on Stable Baselines3)

To install requirements:

cd sb3
pip install -r requirements.txt
pip install -e .

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.01 --max_t=0.05 --max_d=0.5

Train FiGAR-C-PPO on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.01 --max_t=0.05

Train PPO on InvertedPendulum-v2 with the original δ:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg

Train SAR-A2C on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --algo=a2c_rg --frame_skip=1 --dt=0.01 --max_t=0.05 --max_d=0.5

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.002 and the "Action Noise" setting:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.002 --max_t=0.05 --max_d=0.5 --anoise_type=action --anoise_prob=0.05 --anoise_std=3

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.002 and the "External Force" setting:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.002 --max_t=0.05 --max_d=0.5 --anoise_type=ext_f --anoise_prob=0.05 --anoise_std=300

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.002 and the "Strong External Force (Perceptible)" setting:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.002 --max_t=0.05 --max_d=0.5 --anoise_type=ext_fpc --anoise_prob=0.05 --anoise_std=1000

TRPO (based on Stable Baselines)

To install requirements:

cd sb
pip install -r requirements.txt
pip install -e .

Train SAR-TRPO on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --frame_skip=1 --dt=0.01 --max_t=0.05 --max_d=0.5

License

This codebase is licensed under the MIT License. See also sb3/LICENSE_SB3 and sb/LICENSE_SB.

Owner
Seohong Park
Seohong Park
Keystroke logging, often referred to as keylogging or keyboard capturing

Keystroke logging, often referred to as keylogging or keyboard capturing, is the action of recording the keys struck on a keyboard, typically covertly, so that a person using the keyboard is unaware

Harsha G 2 Jan 11, 2022
Malware Configuration And Payload Extraction

CAPE: Malware Configuration And Payload Extraction CAPE is a malware sandbox. It is derived from Cuckoo and is designed to automate the process of mal

Kevin O'Reilly 1k Dec 30, 2022
log4j2 passive burp rce scanning tool get post cookie full parameter recognition

log4j2_burp_scan 自用脚本log4j2 被动 burp rce扫描工具 get post cookie 全参数识别,在ceye.io api速率限制下,最大线程扫描每一个参数,记录过滤已检测地址,重复地址 token替换为你自己的http://ceye.io/ token 和域名地址

5 Dec 10, 2021
Description Basic Recon tool for beginners. Especially those who faces issue on how to recon or what all tools to use

Description Basic Recon tool for beginners. Especially those who faces issue on how to recon or what all tools to use. Will try to add atleast 10 more tools currently use 7 sources to gather domains.

Harinder Singh 7 Jan 03, 2022
Deltaspy - an advanced keylogger that can send keylogs and screenshots to gmail

Deltaspy Deltaspy is a advanced keylogger which sends keylogs and screenshot to

Praanesh S 1 Dec 31, 2021
Genpyteal - Experiment to rewrite Python into PyTeal using RedBaron

genpyteal Converts Python to PyTeal. Your mileage will vary depending on how muc

Jason Livesay 9 Oct 19, 2022
Patching - Interactive Binary Patching for IDA Pro

Patching - Interactive Binary Patching for IDA Pro Overview Patching assembly code to change the behavior of an existing program is not uncommon in ma

589 Dec 30, 2022
Malware for Discord, designed to steal passwords, tokens, and inject discord folders for long-term use.

Vital What is Vital? Vital is malware primarily used to collect and extract information from the Discord desktop client. While it has other features (

HellSec 59 Dec 01, 2022
PyExtractor is a decompiler that can fully decompile exe's compiled with pyinstaller or py2exe

PyExtractor is a decompiler that can fully decompile exe's compiled with pyinstaller or py2exe with additional features such as malware checker/detector! Also checks file(s) for suspicious words, dis

Rdimo 56 Jul 31, 2022
Local server for IDA Lumina feature

About POC of an offline server for IDA Lumina feature.

Synacktiv 166 Dec 30, 2022
Tools for converting Nintendo DS binaries to an ELF file for Ghidra/IDA

nds2elf Requirements nds2elf.py uses LIEF and template.elf to form a new binary. LIEF is available via pip: pip3 install lief Usage DSi and DSi-enhan

Max Thomas 17 Aug 14, 2022
Vulmap 是一款 web 漏洞扫描和验证工具, 可对 webapps 进行漏洞扫描, 并且具备漏洞利用功能

Vulmap 是一款 web 漏洞扫描和验证工具, 可对 webapps 进行漏洞扫描, 并且具备漏洞利用功能

之乎者也 2.8k Dec 29, 2022
Hashpic - Hashpic creates an image from a MD5 or SHA512 hash

Hashpic Hashpic creates an image from the MD5 hash of your input. Since v0.2.0 i

0xflotus 15 Nov 23, 2022
log4j2 dos exploit,CVE-2021-45105 exploit,Denial of Service poc

说明 about author: 我超怕的 blog: https://www.cnblogs.com/iAmSoScArEd/ github: https://github.com/iAmSOScArEd/ date: 2021-12-20 log4j2 dos exploit log4j2 do

3 Aug 13, 2022
Make files with as many random bytes as you want

Lots o' Bytes 🔣 Make files with as many random bytes as you want! Use case Can be used to package malware that is normally small by making the downlo

Addi 1 Jan 13, 2022
A GitHub action for organizations that enables advanced security code scanning on all new repos

Advanced-Security-Enforcer What this repository does This code is for an active GitHub Action written in Python to check (on a schedule) for new repos

Zack Koppert 30 May 17, 2022
logmap: Log4j2 jndi injection fuzz tool

logmap - Log4j2 jndi injection fuzz tool Used for fuzzing to test whether there are log4j2 jndi injection vulnerabilities in header/body/path Use http

之乎者也 67 Oct 25, 2022
AttractionFinder - 2022 State Qualified FBLA Attraction Finder Application

Attraction Finder Developers: Riyon Praveen, Aaron Bijoy, & Yash Vora How It Wor

$ky 2 Feb 09, 2022
A bitcoin private keys brute-forcing tool. Educational purpose only.

BitForce A bitcoin private keys brute-forcing tool. If you have an average computer, his will take decades to find a private key with balance. Run Mak

Gilad Leef 2 Dec 20, 2022
Dependency injection in python with autoconfiguration

The base is a DynamicContainer to autoconfigure services using the decorators @services for regular services and @command_handler for using command pattern.

Sergio Gómez 2 Jan 17, 2022