Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd.

Overview

Head Detector

Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd. The head_detection module can be installed using pip in order to be able to plug-and-play with HeadHunter-T.

Requirements

  1. Nvidia Driver >= 418

  2. Cuda 10.0 and compaitible CudNN

  3. Python packages : To install the required python packages; conda env create -f head_detection.yml.

  4. Use the anaconda environment head_detection by activating it, source activate head_detection or conda activate head_detection.

  5. Alternatively pip can be used to install required packages using pip install -r requirements.txt or update your existing environment with the aforementioned yml file.

Training

  1. To train a model, define environment variable NGPU, config file and use the following command

$python -m torch.distributed.launch --nproc_per_node=$NGPU --use_env train.py --cfg_file config/config_chuman.yaml --world_size $NGPU --num_workers 4

  1. Training is currently supported over (a) ScutHead dataset (b) CrowdHuman + ScutHead combined, (c) Our proposed CroHD dataset. This can be mentioned in the config file.

  2. To train the model, config files must be defined. More details about the config files are mentioned in the section below

Evaluation and Testing

  1. Unlike the training, testing and evaluation does not have a config file. Rather, all the parameters are set as argument variable while executing the code. Refer to the respective files, evaluate.py and test.py.
  2. evaluate.py evaluates over the validation/test set using AP, MMR, F1, MODA and MODP metrics.
  3. test.py runs the detector over a "bunch of images" in the testing set for qualitative evaluation.

Config file

A config file is necessary for all training. It's built to ease the number of arg variable passed during each execution. Each sub-sections are as elaborated below.

  1. DATASET

    1. Set the base_path as the parent directory where the dataset is situated at.
    2. Train and Valid are .txt files that contains relative path to respective images from the base_path defined above and their corresponding Ground Truth in (x_min, y_min, x_max, y_max) format. Generation files for the three datasets can be seen inside data directory. For example,
    /path/to/image.png
    x_min_1, y_min_1, x_max_1, y_max_1
    x_min_2, y_min_2, x_max_2, y_max_2
    x_min_3, y_min_3, x_max_3, y_max_3
    .
    .
    .
    
    1. mean_std are RGB means and stdev of the training dataset. If not provided, can be computed prior to the start of the training
  2. TRAINING

    1. Provide pretrained_model and corresponding start_epoch for resuming.
    2. milestones are epoch at which the learning rates are set to 0.1 * lr.
    3. only_backbone option loads just the Resnet backbone and not the head. Not applicable for mobilenet.
  3. NETWORK

    1. The mentioned parameters are as described in experiment section of the paper.
    2. When using median_anchors, the anchors have to be defined in anchors.py.
    3. We experimented with mobilenet, resnet50 and resnet150 as alternative backbones. This experiment was not reported in the paper due to space constraints. We found the accuracy to significantly decrease with mobilenet but resnet50 and resnet150 yielded an almost same performance.
    4. We also briefly experimented with Deformable Convolutions but again didn't see noticable improvements in performance. The code we used are available in this repository.

Note :

This codebase borrows a noteable portion from pytorch-vision owing to the fact some of their modules cannot be "imported" as a package.

Citation :

@InProceedings{Sundararaman_2021_CVPR,
    author    = {Sundararaman, Ramana and De Almeida Braga, Cedric and Marchand, Eric and Pettre, Julien},
    title     = {Tracking Pedestrian Heads in Dense Crowd},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {3865-3875}
}
Owner
Ramana Sundararaman
Ramana Sundararaman
A GOOD REPRESENTATION DETECTS NOISY LABELS

A GOOD REPRESENTATION DETECTS NOISY LABELS This code is a PyTorch implementation of the paper: Prerequisites Python 3.6.9 PyTorch 1.7.1 Torchvision 0.

<a href=[email protected]"> 64 Jan 04, 2023
Object Tracking and Detection Using OpenCV

Object tracking is one such application of computer vision where an object is detected in a video, otherwise interpreted as a set of frames, and the object’s trajectory is estimated. For instance, yo

Happy N. Monday 4 Aug 21, 2022
HarDNeXt: Official HarDNeXt repository

HarDNeXt-Pytorch HarDNeXt: A Stage Receptive Field and Connectivity Aware Convolution Neural Network HarDNeXt-MSEG for Medical Image Segmentation in 0

5 May 26, 2022
TCPNet - Temporal-attentive-Covariance-Pooling-Networks-for-Video-Recognition

Temporal-attentive-Covariance-Pooling-Networks-for-Video-Recognition This is an implementation of TCPNet. Introduction For video recognition task, a g

Zilin Gao 21 Dec 08, 2022
Hand Gesture Volume Control | Open CV | Computer Vision

Gesture Volume Control Hand Gesture Volume Control | Open CV | Computer Vision Use gesture control to change the volume of a computer. First we look i

Jhenil Parihar 3 Jun 15, 2022
ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation

ST++ This is the official PyTorch implementation of our paper: ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation. Lihe Ya

Lihe Yang 147 Jan 03, 2023
Nerf pl - NeRF (Neural Radiance Fields) and NeRF in the Wild using pytorch-lightning

nerf_pl Update: an improved NSFF implementation to handle dynamic scene is open! Update: NeRF-W (NeRF in the Wild) implementation is added to nerfw br

AI葵 1.8k Dec 30, 2022
Code from the paper "High-Performance Brain-to-Text Communication via Handwriting"

High-Performance Brain-to-Text Communication via Handwriting Overview This repo is associated with this manuscript, preprint and dataset. The code can

Francis R. Willett 306 Jan 03, 2023
An open source app to help calm you down when needed.

By: Seanpm2001, Et; Al. Top README.md Read this article in a different language Sorted by: A-Z Sorting options unavailable ( af Afrikaans Afrikaans |

Sean P. Myrick V19.1.7.2 2 Oct 24, 2022
End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)

PDVC Official implementation for End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021) [paper] [valse论文速递(Chinese)] This repo supports:

Teng Wang 118 Dec 16, 2022
Official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers

Visual Parser (ViP) This is the official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers. Key Feature

Shuyang Sun 117 Dec 11, 2022
deep_image_prior_extension

Code for "Is Deep Image Prior in Need of a Good Education?" Project page: https://jleuschn.github.io/docs.educated_deep_image_prior/. Supplementary Ma

riccardo barbano 7 Jan 09, 2022
This repository is for Competition for ML_data class

This repository is for Competition for ML_data class. Based on mmsegmentatoin,mainly using swin transformer to completed the competition.

jianlong 2 Oct 23, 2022
YOLOv5 + ROS2 object detection package

YOLOv5-ROS YOLOv5 + ROS2 object detection package This program changes the input of detect.py (ultralytics/yolov5) to sensor_msgs/Image of ROS2. Requi

Ar-Ray 23 Dec 19, 2022
BlueFog Tutorials

BlueFog Tutorials Welcome to the BlueFog tutorials! In this repository, we've put together a collection of awesome Jupyter notebooks. These notebooks

4 Oct 27, 2021
Official Implementation for "ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement" https://arxiv.org/abs/2104.02699

ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement Recently, the power of unconditional image synthesis has significantly advanced th

967 Jan 04, 2023
Aesara is a Python library that allows one to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays.

Aesara is a Python library that allows one to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays.

Aesara 898 Jan 07, 2023
The MLOps platform for innovators 🚀

​ DS2.ai is an integrated AI operation solution that supports all stages from custom AI development to deployment. It is an AI-specialized platform service that collects data, builds a training datas

9 Jan 03, 2023
PyTorch implementation of ''Background Activation Suppression for Weakly Supervised Object Localization''.

Background Activation Suppression for Weakly Supervised Object Localization PyTorch implementation of ''Background Activation Suppression for Weakly S

35 Jan 06, 2023
[ICCV 2021] FaPN: Feature-aligned Pyramid Network for Dense Image Prediction

FaPN: Feature-aligned Pyramid Network for Dense Image Prediction [arXiv] [Project Page] @inproceedings{ huang2021fapn, title={{FaPN}: Feature-alig

EMI-Group 175 Dec 30, 2022