Neural Dynamic Policies for End-to-End Sensorimotor Learning

Overview

Neural Dynamic Policies for End-to-End Sensorimotor Learning

In NeurIPS 2020 (Spotlight) [Project Website] [Project Video]

Shikhar Bahl, Mustafa Mukadam, Abhinav Gupta, Deepak Pathak
Carnegie Mellon University & Facebook AI Research

This is a PyTorch based implementation for our NeurIPS 2020 paper on Neural Dynamic Policies for end-to-end sensorimotor learning. In this work, we begin to close this gap and embed dynamics structure into deep neural network-based policies by reparameterizing action spaces with differential equations. We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space as opposed to prior policy learning methods where action represents the raw control space. The embedded structure allow us to perform end-to-end policy learning under both reinforcement and imitation learning setups. If you find this work useful in your research, please cite:

  @inproceedings{bahl2020neural,
    Author = { Bahl, Shikhar and Mukadam, Mustafa and
    Gupta, Abhinav and Pathak, Deepak},
    Title = {Neural Dynamic Policies for End-to-End Sensorimotor Learning},
    Booktitle = {NeurIPS},
    Year = {2020}
  }

1) Installation and Usage

  1. This code is based on PyTorch. This code needs MuJoCo 1.5 to run. To install and setup the code, run the following commands:
#create directory for data and add dependencies
cd neural-dynamic-polices; mkdir data/
git clone https://github.com/rll/rllab.git
git clone https://github.com/openai/baselines.git

#create virtual env
conda create --name ndp python=3.5
source activate ndp

#install requirements
pip install -r requirements.txt
#OR try
conda env create -f ndp.yaml
  1. Training imitation learning
cd neural-dynamic-polices
# name of the experiment
python main_il.py --name NAME
  1. Training RL: run the script run_rl.sh. ENV_NAME is the environment (could be throw, pick, push, soccer, faucet). ALGO-TYPE is the algorithm (dmp for NDPs, ppo for PPO [Schulman et al., 2017] and ppo-multi for the multistep actor-critic architecture we present in our paper).
sh run_rl.sh ENV_NAME ALGO-TYPE EXP_ID SEED
  1. In order to visualize trained models/policies, use the same exact arguments as used for training but call vis_policy.sh
  sh vis_policy.sh ENV_NAME ALGO-TYPE EXP_ID SEED

2) Other helpful pointers

3) Acknowledgements

We use the PPO infrastructure from: https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail. We use environment code from: https://github.com/dibyaghosh/dnc/tree/master/dnc/envs, https://github.com/rlworkgroup/metaworld, https://github.com/vitchyr/multiworld. We use pytorch and RL utility functions from: https://github.com/vitchyr/rlkit. We use the DMP skeleton code from https://github.com/abr-ijs/imednet, https://github.com/abr-ijs/digit_generator. We also use https://github.com/rll/rllab.git and https://github.com/openai/baselines.git.

Owner
Shikhar Bahl
AI Researcher at CMU (PhD, Robotics Institute) interested in deep RL, machine learning, robotics and optimization
Shikhar Bahl
A simple pytorch pipeline for semantic segmentation.

SegmentationPipeline -- Pytorch A simple pytorch pipeline for semantic segmentation. Requirements : torch=1.9.0 tqdm albumentations=1.0.3 opencv-pyt

petite7 4 Feb 22, 2022
Official implementation of "Generating 3D Molecules for Target Protein Binding"

Generating 3D Molecules for Target Protein Binding This is the official implementation of the GraphBP method proposed in the following paper. Meng Liu

DIVE Lab, Texas A&M University 74 Dec 07, 2022
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning

This is a release of our VIMPAC paper to illustrate the implementations. The pretrained checkpoints and scripts will be soon open-sourced in HuggingFace transformers.

Hao Tan 74 Dec 03, 2022
The official implementation of paper Siamese Transformer Pyramid Networks for Real-Time UAV Tracking, accepted by WACV22

SiamTPN Introduction This is the official implementation of the SiamTPN (WACV2022). The tracker intergrates pyramid feature network and transformer in

Robotics and Intelligent Systems Control @ NYUAD 28 Nov 25, 2022
SCNet: Learning Semantic Correspondence

SCNet Code Region matching code is contributed by Kai Han ([email protected]). Dense

Kai Han 34 Sep 06, 2022
Code for intrusion detection system (IDS) development using CNN models and transfer learning

Intrusion-Detection-System-Using-CNN-and-Transfer-Learning This is the code for the paper entitled "A Transfer Learning and Optimized CNN Based Intrus

Western OC2 Lab 38 Dec 12, 2022
Official Code for VideoLT: Large-scale Long-tailed Video Recognition (ICCV 2021)

Pytorch Code for VideoLT [Website][Paper] Updates [10/29/2021] Features uploaded to Google Drive, for access please send us an e-mail: zhangxing18 at

Skye 26 Sep 18, 2022
A torch implementation of "Pixel-Level Domain Transfer"

Pixel Level Domain Transfer A torch implementation of "Pixel-Level Domain Transfer". based on dcgan.torch. Dataset The dataset used is "LookBook", fro

Fei Xia 260 Sep 02, 2022
Diverse Branch Block: Building a Convolution as an Inception-like Unit

Diverse Branch Block: Building a Convolution as an Inception-like Unit (PyTorch) (CVPR-2021) DBB is a powerful ConvNet building block to replace regul

253 Dec 24, 2022
Code for "Human Pose Regression with Residual Log-likelihood Estimation", ICCV 2021 Oral

Human Pose Regression with Residual Log-likelihood Estimation [Paper] [arXiv] [Project Page] Human Pose Regression with Residual Log-likelihood Estima

JeffLi 347 Dec 24, 2022
Image augmentation library in Python for machine learning.

Augmentor is an image augmentation library in Python for machine learning. It aims to be a standalone library that is platform and framework independe

Marcus D. Bloice 4.8k Jan 07, 2023
Official implementation of the paper Chunked Autoregressive GAN for Conditional Waveform Synthesis

PyEmits, a python package for easy manipulation in time-series data. Time-series data is very common in real life. Engineering FSI industry (Financial

Descript 150 Dec 06, 2022
This repository contains the official implementation code of the paper Transformer-based Feature Reconstruction Network for Robust Multimodal Sentiment Analysis

This repository contains the official implementation code of the paper Transformer-based Feature Reconstruction Network for Robust Multimodal Sentiment Analysis, accepted at ACMMM 2021.

Ziqi Yuan 10 Sep 30, 2022
Omnidirectional Scene Text Detection with Sequential-free Box Discretization (IJCAI 2019). Including competition model, online demo, etc.

Box_Discretization_Network This repository is built on the pytorch [maskrcnn_benchmark]. The method is the foundation of our ReCTs-competition method

Yuliang Liu 266 Nov 24, 2022
[AAAI-2022] Official implementations of MCL: Mutual Contrastive Learning for Visual Representation Learning

Mutual Contrastive Learning for Visual Representation Learning This project provides source code for our Mutual Contrastive Learning for Visual Repres

winycg 48 Jan 02, 2023
Controlling the MicriSpotAI robot from scratch

Abstract: The SpotMicroAI project is designed to be a low cost, easily built quadruped robot. The design is roughly based off of Boston Dynamics quadr

Florian Wilk 405 Jan 05, 2023
(ICCV 2021) PyTorch implementation of Paper "Progressive Correspondence Pruning by Consensus Learning"

CLNet (ICCV 2021) PyTorch implementation of Paper "Progressive Correspondence Pruning by Consensus Learning" [project page] [paper] Citing CLNet If yo

Chen Zhao 22 Aug 26, 2022
PlenOctree Extraction algorithm

PlenOctrees_NeRF-SH This is an implementation of the Paper PlenOctrees for Real-time Rendering of Neural Radiance Fields. Not only the code provides t

49 Nov 05, 2022
Pytorch Implementation for Dilated Continuous Random Field

DilatedCRF Pytorch implementation for fully-learnable DilatedCRF. If you find my work helpful, please consider our paper: @article{Mo2022dilatedcrf,

DunnoCoding_Plus 3 Nov 13, 2022
Deep Learning and Reinforcement Learning Library for Scientists and Engineers 🔥

TensorLayer is a novel TensorFlow-based deep learning and reinforcement learning library designed for researchers and engineers. It provides an extens

TensorLayer Community 7.1k Dec 29, 2022