Hooks for VCOCO

Last update: Nov 24, 2022

Overview

Verbs in COCO (V-COCO) Dataset

This repository hosts the Verbs in COCO (V-COCO) dataset and associated code to evaluate models for the Visual Semantic Role Labeling (VSRL) task, as described in the technical report (arXiv:1505.04474).

Citing

If you find this dataset or code base useful in your research, please consider citing the following papers:

@article{gupta2015visual,
  title={Visual Semantic Role Labeling},
  author={Gupta, Saurabh and Malik, Jitendra},
  journal={arXiv preprint arXiv:1505.04474},
  year={2015}
}

@incollection{lin2014microsoft,
  title={Microsoft COCO: Common objects in context},
  author={Lin, Tsung-Yi and Maire, Michael and Belongie, Serge and Hays, James and Perona, Pietro and Ramanan, Deva and Doll{\'a}r, Piotr and Zitnick, C Lawrence},
  booktitle={Computer Vision--ECCV 2014},
  pages={740--755},
  year={2014},
  publisher={Springer}
}

Installation

Clone repository (recursively, so as to include COCO API).

git clone --recursive https://github.com/s-gupta/v-coco.git

This dataset builds on MS-COCO, so please download the MS-COCO images and annotations first.
The current V-COCO release uses only a subset of the MS-COCO images (image IDs are listed in data/splits/vcoco_all.ids). Use the following script to pick the matching annotations out of the COCO annotations, which allows faster loading in V-COCO.
```
# Assume you cloned the repository to $VCOCO_DIR
cd $VCOCO_DIR
# If you downloaded coco annotations to coco-data/annotations
python script_pick_annotations.py coco-data/annotations
```
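
The filtering step can be illustrated with a minimal sketch. This is not the repository's script_pick_annotations.py; the function names `pick_annotations` and `read_ids` are hypothetical, and the assumption that the .ids file holds whitespace-separated integer image IDs has not been checked against the actual file. The sketch only shows the idea of keeping the COCO images and annotations whose image ID is in the V-COCO subset.

```python
import json


def read_ids(path):
    """Read integer image IDs from a text file.

    Assumption: IDs are whitespace-separated (for example one per line).
    """
    with open(path) as f:
        return [int(tok) for tok in f.read().split()]


def pick_annotations(coco, keep_ids):
    """Return a COCO-style dict restricted to the images in keep_ids."""
    keep = set(keep_ids)
    subset = dict(coco)  # shallow copy keeps 'categories' and other keys as-is
    subset["images"] = [im for im in coco["images"] if im["id"] in keep]
    subset["annotations"] = [
        ann for ann in coco["annotations"] if ann["image_id"] in keep
    ]
    return subset


# Tiny in-memory stand-in for a real COCO annotation file.
toy_coco = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "annotations": [
        {"id": 10, "image_id": 1},
        {"id": 11, "image_id": 2},
        {"id": 12, "image_id": 3},
    ],
    "categories": [{"id": 1, "name": "person"}],
}
toy_subset = pick_annotations(toy_coco, [2])
print(json.dumps(toy_subset["annotations"]))
```

With real data, one would load the COCO JSON with `json.load`, pass the IDs returned by `read_ids`, and write the result back with `json.dump`. The smaller file is what makes later loading faster.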

Build coco/PythonAPI/pycocotools/_mask.so and cython_bbox.so.

# Assume you cloned the repository to $VCOCO_DIR
cd $VCOCO_DIR/coco/PythonAPI/ && make
cd $VCOCO_DIR && make

Using the dataset

An IPython notebook illustrating how to use the annotations in the dataset is available in V-COCO.ipynb.
The current release of the dataset includes the annotations indicated in Table 1 of the paper. We are collecting role annotations for the 6 categories that are still missing them and will make them public shortly.

Evaluation

We provide evaluation code that computes agent AP and role AP, as explained in the paper.

In order to use the evaluation code, store your predictions as a pickle file (.pkl) in the following format:

[ {'image_id':        # the coco image id,
   'person_box':      #[x1, y1, x2, y2] the box prediction for the person,
   '[action]_agent':  # the score for action corresponding to the person prediction,
   '[action]_[role]': # [x1, y1, x2, y2, s], the predicted box for role and 
                      # associated score for the action-role pair.
   } ]
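
As a minimal sketch of the format above, the following builds one detection entry and pickles the list. The helper names `make_detection` and `save_detections`, the image ID, the boxes, and the scores are all hypothetical; the action `hold` with role `obj` is used only as an illustration of the `[action]_agent` and `[action]_[role]` key pattern.

```python
import pickle


def make_detection(image_id, person_box, agent_scores, role_boxes):
    """Build one detection dict in the format described above.

    agent_scores: {action: score}
    role_boxes:   {(action, role): [x1, y1, x2, y2, s]}
    """
    det = {
        "image_id": int(image_id),
        "person_box": [float(v) for v in person_box],
    }
    for action, score in agent_scores.items():
        det[action + "_agent"] = float(score)
    for (action, role), box_and_score in role_boxes.items():
        if len(box_and_score) != 5:
            raise ValueError("expected [x1, y1, x2, y2, s]")
        det[action + "_" + role] = [float(v) for v in box_and_score]
    return det


def save_detections(detections, path):
    """Write the list of detection dicts to a .pkl file."""
    with open(path, "wb") as f:
        # Protocol 2 can be read by both Python 2 and Python 3.
        pickle.dump(detections, f, protocol=2)


# Hypothetical values, for illustration only.
detections = [
    make_detection(
        image_id=12345,
        person_box=[10, 20, 110, 220],
        agent_scores={"hold": 0.9},
        role_boxes={("hold", "obj"): [50, 60, 90, 120, 0.8]},
    )
]
print(sorted(detections[0].keys()))
```

Calling `save_detections(detections, "detections.pkl")` would then produce a file of the kind the evaluation code expects as `det_file`.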

Assuming your detections are stored in det_file=/path/to/detections/detections.pkl, run:

from vsrl_eval import VCOCOeval
vcocoeval = VCOCOeval(vsrl_annot_file, coco_file, split_file)
  # e.g. vsrl_annot_file: data/vcoco/vcoco_val.json
  #      coco_file:       data/instances_vcoco_all_2014.json
  #      split_file:      data/splits/vcoco_val.ids
vcocoeval._do_eval(det_file, ovr_thresh=0.5)

We introduce two scenarios for role AP evaluation.

[Scenario 1] For test cases with missing role annotations, an agent-role prediction is correct if the action is correct, the overlap between the person boxes is > 0.5, and the corresponding role is empty, e.g. [0,0,0,0] or [NaN,NaN,NaN,NaN]. This scenario suits roles that are missing due to occlusion.
[Scenario 2] For test cases with missing role annotations, an agent-role prediction is correct if the action is correct and the overlap between the person boxes is > 0.5 (the corresponding role is ignored). This scenario suits roles that fall outside the COCO categories.
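
The two rules can be written out as a small sketch. This is an illustration of the text above, not the code in vsrl_eval.py: the function names are hypothetical, the overlap is computed as a plain intersection-over-union on continuous coordinates (the repository may use a different pixel convention), and "empty" is read here as every coordinate of the predicted role box being 0 or NaN.

```python
import math


def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_empty_role(box):
    """True if every coordinate is 0 or NaN (assumed meaning of 'empty')."""
    return all(v == 0 or math.isnan(v) for v in box)


def correct_when_role_missing(action_correct, gt_person, pred_person,
                              pred_role_box, scenario, ovr_thresh=0.5):
    """Apply Scenario 1 or 2 to a test case whose role annotation is missing."""
    if not action_correct or iou(gt_person, pred_person) <= ovr_thresh:
        return False
    if scenario == 1:
        return is_empty_role(pred_role_box)  # role must be predicted empty
    if scenario == 2:
        return True  # role prediction is ignored
    raise ValueError("scenario must be 1 or 2")


gt = [0, 0, 10, 10]
pred = [0, 0, 10, 8]  # overlaps the ground-truth person box with IoU 0.8
print(correct_when_role_missing(True, gt, pred, [1, 1, 5, 5], scenario=1))
print(correct_when_role_missing(True, gt, pred, [1, 1, 5, 5], scenario=2))
```

The only difference between the scenarios is the treatment of the predicted role box: Scenario 1 penalizes a non-empty prediction, while Scenario 2 does not look at it.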

Owner

Saurabh Gupta
