Fashion Landmark Estimation with HRNet

Overview

HRNet for Fashion Landmark Estimation

(Modified from deep-high-resolution-net.pytorch)

Introduction

This code applies the HRNet (Deep High-Resolution Representation Learning for Human Pose Estimation) onto fashion landmark estimation task using the DeepFashion2 dataset. HRNet maintains high-resolution representations throughout the forward path. As a result, the predicted keypoint heatmap is potentially more accurate and spatially more precise.

Illustrating the architecture of the proposed HRNet

Please note that every image in DeepFashion2 contains multiple fashion items, while our model assumes that there exists only one item in each image. Therefore, what we feed into the HRNet is not the original image but the cropped ones provided by a detector. In experiments, one can either use the ground truth bounding box annotation to generate the input data or use the output of a detecter.

Main Results

Landmark Estimation Performance on DeepFashion2 Test set

We won the third place in the "DeepFashion2 Challenge 2020 - Track 1 Clothes Landmark Estimation" competition. DeepFashion2 Challenge 2020 - Track 1 Clothes Landmark Estimation

Landmark Estimation Performance on DeepFashion2 Validation Set

Arch BBox Source AP Ap .5 AP .75 AP (M) AP (L) AR AR .5 AR .75 AR (M) AR (L)
pose_hrnet Detector 0.579 0.793 0.658 0.460 0.581 0.706 0.939 0.784 0.548 0.708
pose_hrnet GT 0.702 0.956 0.801 0.579 0.703 0.740 0.965 0.827 0.592 0.741

Quick start

Installation

  1. Install pytorch >= v1.2 following official instruction. Note that if you use pytorch's version < v1.0.0, you should follow the instruction at https://github.com/Microsoft/human-pose-estimation.pytorch to disable cudnn's implementations of BatchNorm layer. We encourage you to use higher pytorch's version(>=v1.0.0)

  2. Clone this repo, and we'll call the directory that you cloned as ${POSE_ROOT}.

  3. Install dependencies:

    pip install -r requirements.txt
    
  4. Make libs:

    cd ${POSE_ROOT}/lib
    make
    
  5. Init output(training model output directory) and log(tensorboard log directory) directory:

    mkdir output 
    mkdir log
    

    Your directory tree should look like this:

    ${POSE_ROOT}
    |-- lib
    |-- tools 
    |-- experiments
    |-- models
    |-- data
    |-- log
    |-- output
    |-- README.md
    `-- requirements.txt
    
  6. Download pretrained models from our Onedrive Cloud Storage

Data preparation

Our experiments were conducted on DeepFashion2, clone this repo, and we'll call the directory that you cloned as ${DF2_ROOT}.

1) Download the dataset

Extract the dataset under ${POSE_ROOT}/data.

2) Convert annotations into coco-type

The above code repo provides a script to convert annotations into coco-type.

We uploaded our converted annotation file onto OneDrive named as train/val-coco_style.json. We also made truncated json files such as train-coco_style-32.json meaning the first 32 samples in the dataset to save the loading time during development period.

3) Install the deepfashion_api

Enter ${DF2_ROOT}/deepfashion2_api/PythonAPI and run

python setup.py install

Note that the deepfashion2_api is modified from the cocoapi without changing the package name. Therefore, conflicts occur if you try to install this package when you have installed the original cocoapi in your computer. We provide two feasible solutions: 1) run our code in a virtualenv 2) use the deepfashion2_api as a local pacakge. Also note that deepfashion2_api is different with cocoapi mainly in the number of classes and the values of standard variations for keypoints.

At last the directory should look like this:

${POSE_ROOT}
|-- data
`-- |-- deepfashion2
    `-- |-- train
        |   |-- image
        |   |-- annos                           (raw annotation)
        |   |-- train-coco_style.json           (converted annotation file)
        |   `-- train-coco_style-32.json      (truncated for fast debugging)
        |-- validation
        |   |-- image
        |   |-- annos                           (raw annotation)
        |   |-- val-coco_style.json             (converted annotation file)
        |   `-- val-coco_style-64.json        (truncated for fast debugging)
        `-- json_for_test
            `-- keypoints_test_information.json

Training and Testing

Note that the GPUS parameter in the yaml config file is deprecated. To select GPUs, use the environment varaible:

 export CUDA_VISIBLE_DEVICES=1

Testing on DeepFashion2 dataset with BBox from ground truth using trained models:

python tools/test.py \
    --cfg experiments/deepfashion2/hrnet/w48_384x288_adam_lr1e-3.yaml \
    TEST.MODEL_FILE models/pose_hrnet-w48_384x288-deepfashion2_mAP_0.7017.pth \
    TEST.USE_GT_BBOX True

Testing on DeepFashion2 dataset with BBox from a detector using trained models:

python tools/test.py \
    --cfg experiments/deepfashion2/hrnet/w48_384x288_adam_lr1e-3.yaml \
    TEST.MODEL_FILE models/pose_hrnet-w48_384x288-deepfashion2_mAP_0.7017.pth \
    TEST.DEEPFASHION2_BBOX_FILE data/bbox_result_val.pkl \

Training on DeepFashion2 dataset using pretrained models:

python tools/train.py \
    --cfg experiments/deepfashion2/hrnet/w48_384x288_adam_lr1e-3.yaml \
     MODEL.PRETRAINED models/pose_hrnet-w48_384x288-deepfashion2_mAP_0.7017.pth

Other options

python tools/test.py \
    ... \
    DATASET.MINI_DATASET True \ # use a subset of the annotation to save loading time
    TAG 'experiment description' \ # this info will appear in the output directory name
    WORKERS 4 \ # num_of_worker for the dataloader
    TEST.BATCH_SIZE_PER_GPU 8 \
    TRAIN.BATCH_SIZE_PER_GPU 8 \

OneDrive Cloud Storage

OneDrive

We provide the following files:

  • Model checkpoint files
  • Converted annotation files in coco-type
  • Bounding box results from our self-implemented detector in a pickle file.
hrnet-for-fashion-landmark-estimation.pytorch
|-- models
|   `-- pose_hrnet-w48_384x288-deepfashion2_mAP_0.7017.pth
|
|-- data
|   |-- bbox_result_val.pkl
|   |
`-- |-- deepfashion2
    `---|-- train
        |   |-- train-coco_style.json           (converted annotation file)
        |   `-- train-coco_style-32.json      (truncated for fast debugging)
        `-- validation
            |-- val-coco_style.json             (converted annotation file)
            `-- val-coco_style-64.json        (truncated for fast debugging)
        

Discussion

Experiment Configuration

  • For the regression target of keypoint heatmaps, we tuned the standard deviation value sigma and finally set it to 2.
  • During training, we found that the data augmentation from the original code was too intensive which makes the training process unstable. We weakened the augmentation parameters and observed performance gain.
  • Due to the imbalance of classes in DeepFashion2 dataset, the model's performance on different classes varies a lot. Therefore, we adopted a weighted sampling strategy rather than the naive random shuffling strategy, and observed performance gain.
  • We expermented with the value of weight decay, and found that either 1e-4 or 1e-5 harms the performance. Therefore, we simply set weight decay to 0.
Owner
SVIP Lab
ShanghaiTech Vision and Intelligent Perception Lab
SVIP Lab
COLMAP - Structure-from-Motion and Multi-View Stereo

COLMAP About COLMAP is a general-purpose Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipeline with a graphical and command-line interface.

4.7k Jan 07, 2023
MVGCN: a novel multi-view graph convolutional network (MVGCN) framework for link prediction in biomedical bipartite networks.

MVGCN MVGCN: a novel multi-view graph convolutional network (MVGCN) framework for link prediction in biomedical bipartite networks. Developer: Fu Hait

13 Dec 01, 2022
Yas CRNN model training - Yet Another Genshin Impact Scanner

Yas-Train Yet Another Genshin Impact Scanner 又一个原神圣遗物导出器 介绍 该仓库为 Yas 的模型训练程序 相关资料 MobileNetV3 CRNN 使用 假设你会设置基本的pytorch环境。 生成数据集 python main.py gen 训练

wormtql 18 Jan 08, 2023
QSYM: A Practical Concolic Execution Engine Tailored for Hybrid Fuzzing

QSYM: A Practical Concolic Execution Engine Tailored for Hybrid Fuzzing Environment Tested on Ubuntu 14.04 64bit and 16.04 64bit Installation # disabl

gts3.org (<a href=[email protected])"> 581 Dec 30, 2022
N-Person-Check-Checker-Splitter - A calculator app use to divide checks

N-Person-Check-Checker-Splitter This is my from-scratch programmed calculator ap

2 Feb 15, 2022
Visualizing lattice vibration information from phonon dispersion to atoms (For GPUMD)

Phonon-Vibration-Viewer (For GPUMD) Visualizing lattice vibration information from phonon dispersion for primitive atoms. In this tutorial, we will in

Liangting 6 Dec 10, 2022
A medical imaging framework for Pytorch

Welcome to MedicalTorch MedicalTorch is an open-source framework for PyTorch, implementing an extensive set of loaders, pre-processors and datasets fo

Christian S. Perone 799 Jan 03, 2023
Interactive Terraform visualization. State and configuration explorer.

Rover - Terraform Visualizer Rover is a Terraform visualizer. In order to do this, Rover: generates a plan file and parses the configuration in the ro

Tu Nguyen 2.3k Jan 07, 2023
Code for the Paper "Diffusion Models for Handwriting Generation"

Code for the Paper "Diffusion Models for Handwriting Generation"

62 Dec 21, 2022
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Apache MXNet (incubating) for Deep Learning Apache MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to m

The Apache Software Foundation 20.2k Jan 08, 2023
Mesh TensorFlow: Model Parallelism Made Easier

Mesh TensorFlow - Model Parallelism Made Easier Introduction Mesh TensorFlow (mtf) is a language for distributed deep learning, capable of specifying

1.3k Dec 26, 2022
Zsseg.baseline - Zero-Shot Semantic Segmentation

This repo is for our paper A Simple Baseline for Zero-shot Semantic Segmentation

98 Dec 20, 2022
Runtime type annotations for the shape, dtype etc. of PyTorch Tensors.

torchtyping Type annotations for a tensor's shape, dtype, names, ... Turn this: def batch_outer_product(x: torch.Tensor, y: torch.Tensor) - torch.Ten

Patrick Kidger 1.2k Jan 03, 2023
Deep Markov Factor Analysis (NeurIPS2021)

Deep Markov Factor Analysis (DMFA) Codes and experiments for deep Markov factor analysis (DMFA) model accepted for publication at NeurIPS2021: A. Farn

Sarah Ostadabbas 2 Dec 16, 2022
This repository is the official implementation of Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning (NeurIPS21).

Core-tuning This repository is the official implementation of ``Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regular

vanint 18 Dec 17, 2022
The source codes for TME-BNA: Temporal Motif-Preserving Network Embedding with Bicomponent Neighbor Aggregation.

TME The source codes for TME-BNA: Temporal Motif-Preserving Network Embedding with Bicomponent Neighbor Aggregation. Our implementation is based on TG

2 Feb 10, 2022
The code for 'Deep Residual Fourier Transformation for Single Image Deblurring'

Deep Residual Fourier Transformation for Single Image Deblurring Xintian Mao, Yiming Liu, Wei Shen, Qingli Li and Yan Wang code will be released soon

145 Dec 13, 2022
GPU implementation of $k$-Nearest Neighbors and Shared-Nearest Neighbors

GPU implementation of kNN and SNN GPU implementation of $k$-Nearest Neighbors and Shared-Nearest Neighbors Supported by numba cuda and faiss library E

Hyeon Jeon 7 Nov 23, 2022
Apache Flink

Apache Flink Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities. Learn more about Flin

The Apache Software Foundation 20.4k Dec 30, 2022
LieTransformer: Equivariant Self-Attention for Lie Groups

LieTransformer This repository contains the implementation of the LieTransformer used for experiments in the paper LieTransformer: Equivariant Self-At

OxCSML (Oxford Computational Statistics and Machine Learning) 50 Dec 28, 2022