SceneCollisionNet This repo contains the code for "Object Rearrangement Using Learned Implicit Collision Functions", an ICRA 2021 paper. For more info

Overview

SceneCollisionNet

This repo contains the code for "Object Rearrangement Using Learned Implicit Collision Functions", an ICRA 2021 paper. For more information, please visit the project website.

License

This repo is released under NVIDIA source code license. For business inquiries, please contact [email protected]. For press and other inquiries, please contact Hector Marinez at [email protected]

Install and Setup

Clone and install the repo (we recommend a virtual environment, especially if training or benchmarking, to avoid dependency conflicts):

git clone --recursive https://github.com/mjd3/SceneCollisionNet.git
cd SceneCollisionNet
pip install -e .

These commands install the minimum dependencies needed for generating a mesh dataset and then training/benchmarking using Docker. If you instead wish to train or benchmark without using Docker, please first install an appropriate version of PyTorch and corresponding version of PyTorch Scatter for your system. Then, execute these commands:

git clone --recursive https://github.com/mjd3/SceneCollisionNet.git
cd SceneCollisionNet
pip install -e .[train]

If benchmarking, replace train in the last command with bench.

To rollout the object rearrangement MPPI policy in a simulated tabletop environment, first download Isaac Gym and place it in the extern folder within this repo. Next, follow the previous installation instructions for training, but replace the train option with policy.

To download the pretrained weights for benchmarking or policy rollout, run bash scripts/download_weights.sh.

Generating a Mesh Dataset

To save time during training/benchmarking, meshes are preprocessed and mesh stable poses are calculated offline. SceneCollisionNet was trained using the ACRONYM dataset. To use this dataset for training or benchmarking, download the ShapeNetSem meshes here (note: you must first register for an account) and the ACRONYM grasps here. Next, build Manifold (an external library included as a submodule):

./scripts/install_manifold.sh

Then, use the following script to generate a preprocessed version of the ACRONYM dataset:

python tools/generate_acronym_dataset.py /path/to/shapenetsem/meshes /path/to/acronym datasets/shapenet

If you have your own set of meshes, run:

python tools/generate_mesh_dataset.py /path/to/meshes datasets/your_dataset_name

Note that this dataset will not include grasp data, which is not needed for training or benchmarking SceneCollisionNet, but is be used for rolling out the MPPI policy.

Training/Benchmarking with Docker

First, install Docker and nvidia-docker2 following the instructions here. Pull the SceneCollisionNet docker image from DockerHub (tag scenecollisionnet) or build locally using the provided Dockerfile (docker build -t scenecollisionnet .). Then, use the appropriate configuration .yaml file in cfg to set training or benchmarking parameters (note that cfg file paths are relative to the Docker container, not the local machine) and run one of the commands below (replacing paths with your local paths as needed; -v requires absolute paths).

Train a SceneCollisionNet

Edit cfg/train_scenecollisionnet.yaml, then run:

docker run --gpus all --rm -it -v /path/to/dataset:/dataset:ro -v /path/to/models:/models:rw -v /path/to/cfg:/cfg:ro scenecollisionnet /SceneCollisionNet/scripts/train_scenecollisionnet_docker.sh

Train a RobotCollisionNet

Edit cfg/train_robotcollisionnet.yaml, then run:

docker run --gpus all --rm -it -v /path/to/models:/models:rw -v /path/to/cfg:/cfg:ro scenecollisionnet /SceneCollisionNet/scripts/train_robotcollisionnet_docker.sh

Benchmark a SceneCollisionNet

Edit cfg/benchmark_scenecollisionnet.yaml, then run:

docker run --gpus all --rm -it -v /path/to/dataset:/dataset:ro -v /path/to/models:/models:ro -v /path/to/cfg:/cfg:ro -v /path/to/benchmark_results:/benchmark:rw scenecollisionnet /SceneCollisionNet/scripts/benchmark_scenecollisionnet_docker.sh

Benchmark a RobotCollisionNet

Edit cfg/benchmark_robotcollisionnet.yaml, then run:

docker run --gpus all --rm -it -v /path/to/models:/models:rw -v /path/to/cfg:/cfg:ro -v /path/to/benchmark_results:/benchmark:rw scenecollisionnet /SceneCollisionNet/scripts/train_robotcollisionnet_docker.sh

Loss Plots

To get loss plots while training, run:

docker exec -d <container_name> python3 tools/loss_plots.py /models/<model_name>/log.csv

Benchmark FCL or SDF Baselines

Edit cfg/benchmark_baseline.yaml, then run:

docker run --gpus all --rm -it -v /path/to/dataset:/dataset:ro -v /path/to/benchmark_results:/benchmark:rw -v /path/to/cfg:/cfg:ro scenecollisionnet /SceneCollisionNet/scripts/benchmark_baseline_docker.sh

Training/Benchmarking without Docker

First, install system dependencies. The system dependencies listed assume an Ubuntu 18.04 install with NVIDIA drivers >= 450.80.02 and CUDA 10.2. You can adjust the dependencies accordingly for different driver/CUDA versions. Note that the NVIDIA drivers come packaged with EGL, which is used during training and benchmarking for headless rendering on the GPU.

System Dependencies

See Dockerfile for a full list. For training/benchmarking, you will need:

python3-dev
python3-pip
ninja-build
libcudnn8=8.1.1.33-1+cuda10.2
libcudnn8-dev=8.1.1.33-1+cuda10.2
libsm6
libxext6
libxrender-dev
freeglut3-dev
liboctomap-dev
libfcl-dev
gifsicle
libfreetype6-dev
libpng-dev

Python Dependencies

Follow the instructions above to install the necessary dependencies for your use case (either the train, bench, or policy options).

Train a SceneCollisionNet

Edit cfg/train_scenecollisionnet.yaml, then run:

PYOPENGL_PLATFORM=egl python tools/train_scenecollisionnet.py

Train a RobotCollisionNet

Edit cfg/train_robotcollisionnet.yaml, then run:

python tools/train_robotcollisionnet.py

Benchmark a SceneCollisionNet

Edit cfg/benchmark_scenecollisionnet.yaml, then run:

PYOPENGL_PLATFORM=egl python tools/benchmark_scenecollisionnet.py

Benchmark a RobotCollisionNet

Edit cfg/benchmark_robotcollisionnet.yaml, then run:

python tools/benchmark_robotcollisionnet.py

Benchmark FCL or SDF Baselines

Edit cfg/benchmark_baseline.yaml, then run:

PYOPENGL_PLATFORM=egl python tools/benchmark_baseline.py

Policy Rollout

To view a rearrangement MPPI policy rollout in a simulated Isaac Gym tabletop environment, run the following command (note that this requires a local machine with an available GPU and display):

python tools/rollout_policy.py --self-coll-nn weights/self_coll_nn --scene-coll-nn weights/scene_coll_nn --control-frequency 1

There are many possible options for this command that can be viewed using the --help command line argument and set with the appropriate argument. If you get RuntimeError: CUDA out of memory, try reducing the horizon (--mppi-horizon, default 40), number of trajectories (--mppi-num-rollouts, default 200) or collision steps (--mppi-collision-steps, default 10). Note that this may affect policy performance.

Citation

If you use this code in your own research, please consider citing:

@inproceedings{danielczuk2021object,
  title={Object Rearrangement Using Learned Implicit Collision Functions},
  author={Danielczuk, Michael and Mousavian, Arsalan and Eppner, Clemens and Fox, Dieter},
  booktitle={Proc. IEEE Int. Conf. Robotics and Automation (ICRA)},
  year={2021}
}
Owner
NVIDIA Research Projects
NVIDIA Research Projects
[python3.6] 运用tf实现自然场景文字检测,keras/pytorch实现ctpn+crnn+ctc实现不定长场景文字OCR识别

本文基于tensorflow、keras/pytorch实现对自然场景的文字检测及端到端的OCR中文文字识别 update20190706 为解决本项目中对数学公式预测的准确性,做了其他的改进和尝试,效果还不错,https://github.com/xiaofengShi/Image2Katex 希

xiaofeng 2.7k Dec 25, 2022
Implement 'Single Shot Text Detector with Regional Attention, ICCV 2017 Spotlight'

SSTDNet Implement 'Single Shot Text Detector with Regional Attention, ICCV 2017 Spotlight' using pytorch. This code is work for general object detecti

HotaekHan 84 Jan 05, 2022
Select range and every time the screen changes, OCR is activated.

ASOCR(Auto Screen OCR) Select range and every time you press Space key, OCR is activated. 範囲を選ぶと、あなたがスペースキーを押すたびに、画面が変わる度にOCRが起動します。 usage1: simple OC

1 Feb 13, 2022
Using Opencv ,based on Augmental Reality(AR) and will show the feature matching of image and then by finding its matching

Using Opencv ,this project is based on Augmental Reality(AR) and will show the feature matching of image and then by finding its matching ,it will just mask that image . This project ,if used in cctv

1 Feb 13, 2022
Msos searcher - A half-hearted attempt at finding a magic square of squares

MSOS searcher A half-hearted attempt at finding (or rather searching) a MSOS (Magic Square of Squares) in the spirit of the Parker Square. Running I r

Niels Mündler 1 Jan 02, 2022
nofacedb/faceprocessor is a face recognition engine for NoFaceDB program complex.

faceprocessor nofacedb/faceprocessor is a face recognition engine for NoFaceDB program complex. Tech faceprocessor uses a number of open source projec

NoFaceDB 3 Sep 06, 2021
Pure Javascript OCR for more than 100 Languages 📖🎉🖥

Version 2 is now available and under development in the master branch, read a story about v2: Why I refactor tesseract.js v2? Check the support/1.x br

Project Naptha 29.2k Jan 05, 2023
scantailor - Scan Tailor is an interactive post-processing tool for scanned pages.

Scan Tailor - scantailor.org This project is no longer maintained, and has not been maintained for a while. About Scan Tailor is an interactive post-p

1.5k Dec 28, 2022
Histogram specification using openCV in python .

histogram specification using openCV in python . Have to input miu and sigma to draw gausssian distribution which will be used to map the input image . Example input can be miu = 128 sigma = 30

Tamzid hasan 6 Nov 17, 2021
Create single line SVG illustrations from your pictures

Create single line SVG illustrations from your pictures

Javier Bórquez 686 Dec 26, 2022
Programa que viabiliza a OCR (Optical Character Reading - leitura óptica de caracteres) de um PDF.

Este programa tem o intuito de ser um modificador de arquivos PDF. Os arquivos PDFs podem ser 3: PDFs verdadeiros - em que podem ser selecionados o ti

Daniel Soares Saldanha 2 Oct 11, 2021
MeshToGeotiff - A fast Python algorithm to convert a 3D mesh into a GeoTIFF

MeshToGeotiff - A fast Python algorithm to convert a 3D mesh into a GeoTIFF Python class for converting (very fast) 3D Meshes/Surfaces to Raster DEMs

8 Sep 10, 2022
This is a real life mario project using python and mediapipe

real-life-mario This is a real life mario project using python and mediapipe How to run to run this just run - realMario.py file requirements This req

Programminghut 42 Dec 22, 2022
It is a image ocr tool using the Tesseract-OCR engine with the pytesseract package and has a GUI.

OCR-Tool It is a image ocr tool made in Python using the Tesseract-OCR engine with the pytesseract package and has a GUI. This is my second ever pytho

Khant Htet Aung 4 Jul 11, 2022
Distilling Knowledge via Knowledge Review, CVPR 2021

ReviewKD Distilling Knowledge via Knowledge Review Pengguang Chen, Shu Liu, Hengshuang Zhao, Jiaya Jia This project provides an implementation for the

DV Lab 194 Dec 28, 2022
Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.

hocr-tools About About the code Installation System-wide with pip System-wide from source virtualenv Available Programs hocr-check -- check the hOCR f

OCRopus 285 Dec 08, 2022
Layout Analysis Evaluator for the ICDAR 2017 competition on Layout Analysis for Challenging Medieval Manuscripts

LayoutAnalysisEvaluator Layout Analysis Evaluator for: ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records ICD

17 Dec 08, 2022
TedEval: A Fair Evaluation Metric for Scene Text Detectors

TedEval: A Fair Evaluation Metric for Scene Text Detectors Official Python 3 implementation of TedEval | paper | slides Chae Young Lee, Youngmin Baek,

Clova AI Research 167 Nov 20, 2022
question‘s area recognition using image processing and regular expression

======================================== Paper-Question-recognition ======================================== question‘s area recognition using image p

Yuta Mizuki 7 Dec 27, 2021
MONAI Label is a server-client system that facilitates interactive medical image annotation by using AI.

MONAI Label is a server-client system that facilitates interactive medical image annotation by using AI. It is an open-source and easy-to-install ecosystem that can run locally on a machine with one

Project MONAI 344 Dec 23, 2022