Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting

Last update: Dec 18, 2022

Overview

Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting (official PyTorch implementation)

This paper, submitted to TIP, is an extension of the previous arXiv paper.

This project aims to

provide a baseline of pedestrian attribute recognition.
provide two new datasets, RAPzs and PETAzs, that follow the zero-shot pedestrian identity setting.
provide a general training pipeline for pedestrian attribute recognition and multi-label classification tasks.

This project provides

DDP training, which is mainly used for multi-label classification.
Training on all attributes while testing only on "selected" attributes. The remaining attributes are excluded from evaluation because their proportion of positive samples is below a threshold, such as 0.01.
1. For PETA and PETAzs, 35 of the 105 attributes are selected for performance evaluation.
2. For RAPv1, 51 of the 92 attributes are selected for performance evaluation.
3. For RAPv2 and RAPzs, 54 and 53 of the 152 attributes are selected for performance evaluation.
4. For PA100k, all attributes are selected for performance evaluation.
- However, training on all attributes does not bring a consistent performance improvement across datasets.
EMA model.
Transformer-based models, such as swin-transformer (82.19 ma on PA100k versus 80.21 for resnet50) and vit.
Convenient dataset info files, such as dataset_all.pkl.
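
The "selected" attribute rule described above can be illustrated with a minimal sketch. It assumes the labels are available as a list of binary rows (one row per image, one column per attribute); the function name `select_attributes` and this data layout are hypothetical and are not taken from the repository. Whether an attribute exactly at the threshold is kept is also an assumption here.

```python
def select_attributes(labels, threshold=0.01):
    """Return indices of attributes whose positive ratio is at least `threshold`.

    `labels` is a list of per-image binary label rows (hypothetical layout).
    """
    num_samples = len(labels)
    num_attrs = len(labels[0])
    selected = []
    for attr in range(num_attrs):
        positive_ratio = sum(row[attr] for row in labels) / num_samples
        if positive_ratio >= threshold:
            selected.append(attr)
    return selected


# Toy data: 200 images, 3 attributes.
labels = [[1, 0, 0] for _ in range(200)]
labels[0][1] = 1            # attribute 1: 1/200 = 0.005, below the threshold
for i in range(4):
    labels[i][2] = 1        # attribute 2: 4/200 = 0.02, above the threshold

print(select_attributes(labels))  # [0, 2]
```

In this toy case the model would still be trained on all three attributes, and only attributes 0 and 2 would count toward the reported metrics.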

Dataset Info

PETA: Pedestrian Attribute Recognition At Far Distance [Paper][Project]
PA100K[Paper][Github]
RAP : A Richly Annotated Dataset for Pedestrian Attribute Recognition
- v1 [Paper][Project]
- v2 [Paper][Project]
PETAzs & RAPzs : Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting [Paper][Project]

Performance

Pedestrian Attribute Recognition

Datasets	Models	ma	Acc	Prec	Rec	F1
PA100k	resnet50	80.21	79.15	87.79	87.01	87.40
--	resnet50*	79.85	79.13	89.45	85.40	87.38
--	resnet50 + EMA	81.97	80.20	88.06	88.17	88.11
--	bninception	79.13	78.19	87.42	86.21	86.81
--	TresnetM	74.46	68.72	79.82	80.71	80.26
--	swin_s	82.19	80.35	87.85	88.51	88.18
--	vit_s	79.40	77.61	86.41	86.22	86.32
--	vit_b	81.01	79.38	87.60	87.49	87.55
PETA	resnet50	83.96	78.65	87.08	85.62	86.35
PETAzs	resnet50	71.43	58.69	74.41	69.82	72.04
RAPv1	resnet50	79.27	67.98	80.19	79.71	79.95
RAPv2	resnet50	78.52	66.09	77.20	80.23	78.68
RAPzs	resnet50	71.76	64.83	78.75	76.60	77.66

The resnet50* model is trained with the weighting function proposed by Tan et al. in AAAI 2020.
Performance on PETAzs and RAPzs is based on the first version of PETAzs and RAPzs, as described in the paper.
Experiments use an input size of (256, 192), so there may be minor differences from the results in the paper.
The reported performance can be achieved at the first drop of the learning rate. We also take the model at that point as the best model.
Pretrained models are now available on Google Drive.
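
For readers unfamiliar with the column names in the table above, the following is a minimal sketch of how these metrics are commonly defined in the pedestrian attribute recognition literature: ma as the label-based mean of true-positive and true-negative rates per attribute, and Acc, Prec, Rec and F1 as instance-based scores averaged over images. The function names and the choice to return 0.0 for an empty denominator are assumptions of this sketch; the repository's own evaluation code may differ in details.

```python
def safe_div(num, den):
    """Divide, returning 0.0 for an empty denominator (an assumption of this sketch)."""
    return num / den if den else 0.0


def par_metrics(gt, pred):
    """Compute label-based mA and instance-based Acc/Prec/Rec/F1 from binary rows."""
    n, m = len(gt), len(gt[0])

    # Label-based mean accuracy: average of TPR and TNR for each attribute.
    per_attr = []
    for j in range(m):
        tp = sum(1 for i in range(n) if gt[i][j] == 1 and pred[i][j] == 1)
        tn = sum(1 for i in range(n) if gt[i][j] == 0 and pred[i][j] == 0)
        pos = sum(1 for i in range(n) if gt[i][j] == 1)
        neg = n - pos
        per_attr.append((safe_div(tp, pos) + safe_div(tn, neg)) / 2)
    mean_acc = sum(per_attr) / m

    # Instance-based metrics: computed per image, then averaged over images.
    acc = prec = rec = 0.0
    for y, p in zip(gt, pred):
        inter = sum(1 for a, b in zip(y, p) if a == 1 and b == 1)
        union = sum(1 for a, b in zip(y, p) if a == 1 or b == 1)
        acc += safe_div(inter, union)
        prec += safe_div(inter, sum(p))
        rec += safe_div(inter, sum(y))
    acc, prec, rec = acc / n, prec / n, rec / n
    f1 = safe_div(2 * prec * rec, prec + rec)
    return {"mA": mean_acc, "Acc": acc, "Prec": prec, "Rec": rec, "F1": f1}


# Toy example: 4 images, 3 attributes.
gt = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
pred = [[1, 0, 0], [0, 1, 1], [1, 0, 0], [0, 1, 1]]
print(par_metrics(gt, pred))
```

The table reports these values as percentages, while the sketch returns fractions in the range 0 to 1.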

Multi-label Classification

Datasets	Models	mAP	CP	CR	CF1	OP	OR	OF1
COCO	resnet101	82.75	84.17	72.07	77.65	85.16	75.47	80.02
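
The per-class (CP, CR, CF1) and overall (OP, OR, OF1) columns follow the definitions commonly used for multi-label classification. A minimal sketch under that assumption is shown below; mAP is omitted because it needs ranked scores rather than binary predictions. The function name `multilabel_pr_metrics` and the zero-denominator handling are hypothetical and not taken from the repository.

```python
def ratio(num, den):
    """Divide, returning 0.0 for an empty denominator (an assumption of this sketch)."""
    return num / den if den else 0.0


def multilabel_pr_metrics(gt, pred):
    """Compute CP, CR, CF1 (per-class average) and OP, OR, OF1 (overall)."""
    m = len(gt[0])
    tp, fp, fn = [0] * m, [0] * m, [0] * m
    for y, p in zip(gt, pred):
        for j in range(m):
            if p[j] == 1 and y[j] == 1:
                tp[j] += 1
            elif p[j] == 1:
                fp[j] += 1
            elif y[j] == 1:
                fn[j] += 1

    # Per-class: compute precision/recall for each class, then average.
    cp = sum(ratio(tp[j], tp[j] + fp[j]) for j in range(m)) / m
    cr = sum(ratio(tp[j], tp[j] + fn[j]) for j in range(m)) / m
    cf1 = ratio(2 * cp * cr, cp + cr)

    # Overall: pool the counts over all classes before dividing.
    op = ratio(sum(tp), sum(tp) + sum(fp))
    orr = ratio(sum(tp), sum(tp) + sum(fn))
    of1 = ratio(2 * op * orr, op + orr)
    return {"CP": cp, "CR": cr, "CF1": cf1, "OP": op, "OR": orr, "OF1": of1}


# Toy example: 4 images, 3 classes.
gt = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
pred = [[1, 0, 0], [0, 1, 1], [1, 0, 0], [0, 1, 1]]
print(multilabel_pr_metrics(gt, pred))
```

Per-class scores weight every class equally, whereas overall scores are dominated by frequent classes, which is why both sets are usually reported together.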

Pretrained Models

Dependencies

python 3.7
pytorch 1.7.0
torchvision 0.8.2
cuda 10.1

Get Started

Run git clone https://github.com/valencebond/Rethinking_of_PAR.git
Create a directory to download the above datasets.
```
cd Rethinking_of_PAR
mkdir data
```

Prepare the datasets to have the following structure:

${project_dir}/data
    PETA
        images/
        PETA.mat
        dataset_all.pkl
        dataset_zs_run0.pkl
    PA100k
        data/
        dataset_all.pkl
    RAP
        RAP_dataset/
        RAP_annotation/
        dataset_all.pkl
    RAP2
        RAP_dataset/
        RAP_annotation/
        dataset_zs_run0.pkl
    COCO14
        train2014/
        val2014/
        ml_anno/
            category.json
            coco14_train_anno.pkl
            coco14_val_anno.pkl
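
Before training, it can help to confirm that the layout above is in place. The following is a small, optional sketch that only checks for the paths listed in the structure above; the helper name `missing_entries` is hypothetical and is not part of the repository.

```python
from pathlib import Path

# Relative paths copied from the directory structure listed above.
EXPECTED = [
    "PETA/images", "PETA/PETA.mat",
    "PETA/dataset_all.pkl", "PETA/dataset_zs_run0.pkl",
    "PA100k/data", "PA100k/dataset_all.pkl",
    "RAP/RAP_dataset", "RAP/RAP_annotation", "RAP/dataset_all.pkl",
    "RAP2/RAP_dataset", "RAP2/RAP_annotation", "RAP2/dataset_zs_run0.pkl",
    "COCO14/train2014", "COCO14/val2014",
    "COCO14/ml_anno/category.json",
    "COCO14/ml_anno/coco14_train_anno.pkl",
    "COCO14/ml_anno/coco14_val_anno.pkl",
]


def missing_entries(data_root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under `data_root`."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the project directory, next to `data/`.
    for rel in missing_entries("data"):
        print("missing:", rel)
```

Only the datasets you plan to use need to be present, so a non-empty list is not necessarily an error.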

Train the baseline based on resnet50:
```
sh train.sh
```

Acknowledgements

The code is based on the repositories of Dangwei Li and Houjing Huang. Thanks to them for releasing their code.

Citation

If you use this method or this code in your research, please cite as:

@article{jia2021rethinking,
  title={Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting},
  author={Jia, Jian and Huang, Houjing and Chen, Xiaotang and Huang, Kaiqi},
  journal={arXiv preprint arXiv:2107.03576},
  year={2021}
}

