Official PyTorch implementation of "The Center of Attention: Center-Keypoint Grouping via Attention for Multi-Person Pose Estimation" (ICCV 21).

Last update: Dec 25, 2022

Overview

CenterGroup

This is the official implementation of our ICCV 2021 paper:

The Center of Attention: Center-Keypoint Grouping via Attention for Multi-Person Pose Estimation,
Guillem Brasó, Nikita Kister, Laura Leal-Taixé
We introduce CenterGroup, an attention-based framework to estimate human poses from a set of identity-agnostic keypoints and person center predictions in an image. Our approach uses a transformer to obtain context-aware embeddings for all detected keypoints and centers and then applies multi-head attention to directly group joints into their corresponding person centers. While most bottom-up methods rely on non-learnable clustering at inference, CenterGroup uses a fully differentiable attention mechanism that we train end-to-end together with our keypoint detector. As a result, our method obtains state-of-the-art performance with up to 2.5x faster inference time than competing bottom-up methods.
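
The grouping idea described above can be illustrated with a toy, single-head sketch in plain Python. This is not the code in this repo: the function names, the 2-D embeddings, and the choice to return an attention-weighted average of keypoint coordinates are assumptions made for illustration, and the transformer encoder that produces the embeddings is omitted entirely.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def group_keypoints(center_emb, keypoint_embs, keypoint_xy):
    """Softly assign detected keypoints of one joint type to one person center.

    Hypothetical helper: the center embedding acts as the query, the keypoint
    embeddings act as keys, and the keypoint coordinates act as values.
    """
    d = len(center_emb)
    # Scaled dot-product attention scores between the center and each keypoint.
    scores = [
        sum(c * k for c, k in zip(center_emb, emb)) / math.sqrt(d)
        for emb in keypoint_embs
    ]
    weights = softmax(scores)
    # Assumed read-out: attention-weighted average of the keypoint locations.
    x = sum(w * xy[0] for w, xy in zip(weights, keypoint_xy))
    y = sum(w * xy[1] for w, xy in zip(weights, keypoint_xy))
    return (x, y), weights


# Two candidate keypoints: the first embedding aligns with the center's.
center = [4.0, 0.0]
embeddings = [[4.0, 0.0], [-4.0, 0.0]]
locations = [(10.0, 20.0), (100.0, 200.0)]
location, weights = group_keypoints(center, embeddings, locations)
```

Because every step is a differentiable operation, a grouping of this kind can be trained with gradient descent, in contrast to a non-learnable clustering step applied only at inference.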

@article{Braso_2021_ICCV,
    author    = {Bras\'o, Guillem and Kister, Nikita and Leal-Taix\'e, Laura},
    title     = {The Center of Attention: Center-Keypoint Grouping via Attention for Multi-Person Pose Estimation},
    journal = {ICCV},
    year      = {2021}
}

Main Results

With the code contained in this repo, you should be able to reproduce the following results.

Results on COCO val2017

Method	Detector	Multi-Scale Test	Input size	AP	AP .5	AP .75	AP (M)	AP (L)
CenterGroup	HigherHRNet-w32	✘	512	69.0	87.7	74.4	59.9	75.3
CenterGroup	HigherHRNet-w48	✘	640	71.0	88.7	76.5	63.1	75.2
CenterGroup	HigherHRNet-w32	✔	512	71.9	89.0	78.0	63.7	77.4
CenterGroup	HigherHRNet-w48	✔	640	73.3	89.7	79.2	66.4	76.7

Results on COCO test2017

Method	Detector	Multi-Scale Test	Input size	AP	AP .5	AP .75	AP (M)	AP (L)
CenterGroup	HigherHRNet-w32	✘	512	67.6	88.6	73.6	62.0	75.6
CenterGroup	HigherHRNet-w48	✘	640	69.5	89.7	76.0	65.0	76.2
CenterGroup	HigherHRNet-w32	✔	512	70.3	90.0	76.9	65.4	77.5
CenterGroup	HigherHRNet-w48	✔	640	71.4	90.5	78.1	67.2	77.5

Results on CrowdPose test

Method	Detector	Multi-Scale Test	Input size	AP	AP .5	AP .75	AP (E)	AP (M)	AP (H)
CenterGroup	HigherHRNet-w48	✘	640	67.6	87.6	72.7	74.2	68.1	61.1
CenterGroup	HigherHRNet-w48	✔	640	70.3	89.1	75.7	77.3	70.8	63.2

Installation

Please see docs/INSTALL.md

Model Zoo

Please see docs/MODEL_ZOO.md

Evaluation

To evaluate a model, you have to specify its configuration file, its checkpoint, and the number of GPUs you want to use. All of our configurations and checkpoints are available in the Model Zoo (see docs/MODEL_ZOO.md). For example, to run CenterGroup with a HigherHRNet-w32 detector on a single GPU, you can run the following:

NUM_GPUS=1
./tools/dist_test.sh configs/centergroup2/coco/higherhrnet_w32_coco_512x512 models/centergroup/centergroup_higherhrnet_w32_coco_512x512.pth $NUM_GPUS 1234

If you want to use multi-scale testing, please add the --multi-scale flag, e.g.:

./tools/dist_test.sh configs/centergroup2/coco/higherhrnet_w32_coco_512x512 models/centergroup/centergroup_higherhrnet_w32_coco_512x512.pth $NUM_GPUS 1234 --multi-scale

You can also modify any other config entry with the --cfg-options flag. For example, to disable flip testing, which is enabled by default, you can run:

./tools/dist_test.sh configs/centergroup2/coco/higherhrnet_w32_coco_512x512 models/centergroup/centergroup_higherhrnet_w32_coco_512x512.pth $NUM_GPUS 1234 --cfg-options model.test_cfg.flip_test=False

You may need to modify the checkpoint's path, depending on where you downloaded it, and the entry data_root in the config file, depending on where you stored your data.
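
If you launch evaluations from a Python script, the command variants above could be assembled with a small helper like the following. This is a hypothetical convenience wrapper, not part of the repo: the function name and its parameters are assumptions, and the fourth positional argument (1234 in the examples above) is simply passed through; calling it `port` is a guess about its meaning.

```python
import shlex


def build_test_command(config, checkpoint, num_gpus=1, port=1234,
                       multi_scale=False, cfg_options=None):
    """Build the argument list for ./tools/dist_test.sh (hypothetical helper)."""
    cmd = ["./tools/dist_test.sh", config, checkpoint, str(num_gpus), str(port)]
    if multi_scale:
        # Same flag as in the multi-scale example above.
        cmd.append("--multi-scale")
    if cfg_options:
        # Each override is written as key=value after --cfg-options.
        cmd.append("--cfg-options")
        cmd.extend(f"{key}={value}" for key, value in cfg_options.items())
    return cmd


command = build_test_command(
    "configs/centergroup2/coco/higherhrnet_w32_coco_512x512",
    "models/centergroup/centergroup_higherhrnet_w32_coco_512x512.pth",
    cfg_options={"model.test_cfg.flip_test": False},
)
# Print a shell-ready string; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```

Building the command as a list rather than a single string avoids quoting problems if a checkpoint path contains spaces.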

Training HigherHRNet with Centers

TODO

Training CenterGroup

TODO

Demo

TODO

Acknowledgements

Our code is based on mmpose, which includes a reimplementation of HigherHRNet. We thank the authors of both codebases for their great work!

Owner

Dynamic Vision and Learning Group
