TensorFlow AffordanceNet and AffContext implementations

Overview

AffordanceNet and AffContext

This repository contains TensorFlow implementations of AffordanceNet and AffContext. Both are implemented and tested with TensorFlow 2.3.

The main objective of both architectures is to identify action affordances, so that they can be used in real robotic applications to understand the diverse objects present in the environment.

Both models have been trained on the IIT-AFF and UMD datasets.

Detections on a novel image

Novel image

Example of ground truth affordances compared with the affordance detection results by AffordanceNet and AffContext on the IIT-AFF dataset.

IIT results

IIT colours

Example of ground truth affordances compared with the affordance detection results by AffordanceNet and AffContext on the UMD dataset.

UMD results

UMD colours

AffordanceNet simultaneously detects multiple objects with their corresponding classes and affordances. This network mainly consists of two branches: an object detection branch to localise and classify the objects in the image, and an affordance detection branch to predict the most probable affordance label for each pixel in the object.

AffordanceNet
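The per-pixel labelling step of the affordance detection branch can be illustrated with a minimal NumPy sketch. It assumes the branch outputs one score per affordance class for every pixel of a region; the function name `affordance_labels` and the toy array are hypothetical and are not taken from this repository.

```python
import numpy as np


def affordance_labels(mask_scores: np.ndarray) -> np.ndarray:
    """Return the index of the highest-scoring affordance for each pixel.

    mask_scores has shape (height, width, num_affordances) and holds
    per-pixel class probabilities or unnormalised scores.
    """
    return np.argmax(mask_scores, axis=-1)


# Toy 2x2 region with 3 hypothetical affordance classes.
scores = np.array([
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.2, 0.2, 0.6], [0.5, 0.3, 0.2]],
])
labels = affordance_labels(scores)
print(labels)
```

Each entry of `labels` is the class index with the largest score at that pixel, which is what "most probable affordance label" means for a single region.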

AffContext predicts pixel-wise affordances independently of the object class, which allows it to infer affordances for unseen objects. The structure of this network is similar to AffordanceNet, but the object detection branch only performs binary classification into foreground and background areas. It also adds two new blocks: an auxiliary task that infers the affordances present in the region, and a self-attention mechanism that captures rich contextual dependencies across the region.

AffContext
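To give an intuition for how self-attention lets every position in a region draw on context from all other positions, here is a generic scaled dot-product self-attention sketch in NumPy. It is not necessarily the attention block used in AffContext or in this repository: the projection matrices, feature sizes and the function name `self_attention` are all assumptions chosen for illustration.

```python
import numpy as np


def self_attention(features, w_q, w_k, w_v):
    """Generic scaled dot-product self-attention over flattened positions.

    features: (num_positions, channels)
    w_q, w_k, w_v: (channels, dim) projection matrices (hypothetical).
    Returns the attended features and the attention weights.
    """
    q = features @ w_q
    k = features @ w_k
    v = features @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Subtract the row maximum for a numerically stable softmax.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


rng = np.random.default_rng(0)
height, width, channels, dim = 7, 7, 16, 8  # illustrative sizes only
region = rng.normal(size=(height, width, channels))

# Flatten the spatial grid so every position can attend to every other one.
flat = region.reshape(height * width, channels)
w_q, w_k, w_v = (rng.normal(size=(channels, dim)) for _ in range(3))

attended, weights = self_attention(flat, w_q, w_k, w_v)
context = attended.reshape(height, width, dim)
print(context.shape, weights.shape)
```

Each row of `weights` sums to one, so every output position is a weighted mixture of the value vectors from the whole region.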

Results

The results of the TensorFlow implementation are compared with the values reported in the AffordanceNet and AffContext papers. However, since the papers may process the outputs and compute the final metrics differently, the results are also compared with the values obtained by running the original trained models while processing their outputs and computing the metrics with the code from this repository. These results are marked with * in the comparison tables.

Affordances   AffordanceNet (Caffe)   AffordanceNet*   AffordanceNet (tf)
contain       79.61                   73.68            74.17
cut           75.68                   64.71            66.97
display       77.81                   82.81            81.84
engine        77.50                   81.09            82.63
grasp         68.48                   64.13            65.49
hit           70.75                   82.13            83.25
pound         69.57                   65.90            65.73
support       69.57                   74.43            75.26
w-grasp       70.98                   77.63            78.45
Average       73.35                   74.06            74.87

Affordances   AffContext (Caffe)      AffContext*      AffContext (tf)
grasp         0.60                    0.51             0.55
cut           0.37                    0.31             0.26
scoop         0.60                    0.52             0.52
contain       0.61                    0.55             0.57
pound         0.80                    0.68             0.64
support       0.88                    0.69             0.21
w-grasp       0.94                    0.88             0.85
Average       0.69                    0.59             0.51
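As a quick sanity check on the AffContext table, the sketch below recomputes the Average row, assuming it is the unweighted arithmetic mean of the per-affordance scores rounded to two decimals. The scores are copied from the table; the dictionary layout and the helper name `mean_score` are illustrative only.

```python
# Per-affordance AffContext scores in table order:
# grasp, cut, scoop, contain, pound, support, w-grasp.
scores = {
    "AffContext (Caffe)": [0.60, 0.37, 0.60, 0.61, 0.80, 0.88, 0.94],
    "AffContext*": [0.51, 0.31, 0.52, 0.55, 0.68, 0.69, 0.88],
    "AffContext (tf)": [0.55, 0.26, 0.52, 0.57, 0.64, 0.21, 0.85],
}


def mean_score(values):
    """Unweighted mean of the per-affordance scores, rounded to 2 decimals."""
    return round(sum(values) / len(values), 2)


averages = {name: mean_score(values) for name, values in scores.items()}
for name, value in averages.items():
    print(f"{name}: {value:.2f}")
```

Under that assumption the recomputed means agree with the Average row of the AffContext table.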

Setup guide

Requirements

  • Python 3
  • CUDA 10.1

Installation

  1. Clone the repository into your $AffordanceNet_ROOT folder.

  2. Install the required Python 3 packages with: pip3 install -r requirements.txt
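After installing the requirements, it can help to confirm that the installed TensorFlow matches the version the code was tested with (2.3). The following is a small sketch using only the standard library; the distribution name `tensorflow` is an assumption (requirements.txt may pin a differently named package), and the helper functions are hypothetical rather than part of this repository.

```python
from importlib import metadata

TESTED_VERSION = (2, 3)  # the README states the code was tested with 2.3


def matches_tested_version(version, tested=TESTED_VERSION):
    """Return True if the major.minor part of `version` equals `tested`."""
    parts = version.split(".")
    try:
        return (int(parts[0]), int(parts[1])) == tuple(tested)
    except (IndexError, ValueError):
        return False


def installed_tensorflow_version():
    """Return the installed version string, or None if it is not installed.

    Assumes the distribution is published under the name "tensorflow".
    """
    try:
        return metadata.version("tensorflow")
    except metadata.PackageNotFoundError:
        return None


version = installed_tensorflow_version()
if version is None:
    print("TensorFlow does not appear to be installed.")
elif matches_tested_version(version):
    print(f"TensorFlow {version} matches the tested 2.3 series.")
else:
    print(f"TensorFlow {version} differs from the tested 2.3 series.")
```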

Testing

  1. Download the pretrained weights:

    • AffordanceNet weights trained on IIT-AFF dataset.
    • AffContext weights trained on UMD dataset.
  2. Extract the files into the $AffordanceNet_ROOT/weights folder.

  3. Visualize results for AffordanceNet trained on the IIT-AFF dataset:

python3 affordancenet_predictor.py --config_file config_iit_test

  4. Visualize results for AffContext trained on the UMD dataset:

python3 affcontext_predictor.py --config_file config_umd_test

Training

  1. Download the IIT-AFF or UMD dataset in Pascal-VOC format, following the instructions in AffordanceNet (IIT-AFF) and AffContext (UMD).

  2. Extract it into the $AffordanceNet_ROOT/data folder and make sure that the IIT-AFF dataset has the following folder structure:

    • cache/
    • VOCdevkit2012/

The same applies to the UMD dataset, but the folder names should be cache_UMD and VOCdevkit2012_UMD.

  3. Run the following command to train AffordanceNet on the IIT-AFF dataset:

python3 affordancenet_trainer.py --config_file config_iit_train

  4. Run the following command to train AffContext on the UMD dataset:

python3 affcontext_trainer.py --config_file config_umd_train
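Before starting a training run, the expected dataset folders under $AffordanceNet_ROOT/data can be checked with a short script. This is a sketch, not part of the repository: the helper `missing_dataset_dirs` and the dataset keys "iit" and "umd" are hypothetical, while the folder names are the ones listed in the steps above. The demonstration uses a temporary directory in place of the real repository root.

```python
import tempfile
from pathlib import Path

# Folder names from the training instructions; the keys are illustrative.
EXPECTED_DIRS = {
    "iit": ("cache", "VOCdevkit2012"),
    "umd": ("cache_UMD", "VOCdevkit2012_UMD"),
}


def missing_dataset_dirs(root, dataset):
    """Return the expected folders that are absent from <root>/data."""
    data_dir = Path(root) / "data"
    return [
        name for name in EXPECTED_DIRS[dataset]
        if not (data_dir / name).is_dir()
    ]


# Demonstration with a temporary directory standing in for the repo root.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "data" / "cache").mkdir(parents=True)
    demo_missing = missing_dataset_dirs(root, "iit")
    print("Missing folders:", demo_missing)
```

In practice the function would be called with the actual repository root, and training started only when the returned list is empty.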

Acknowledgements

This repository uses source code from AffordanceNet and Faster-RCNN.

Owner
Beatriz Pérez
MSc student in Computer Science at Universität Bonn, Germany. Computer Engineer from Universidad de Zaragoza, Spain.