Direct Multi-view Multi-person 3D Human Pose Estimation

Related tags

Miscellaneousmvp
Overview

Implementation of NeurIPS-2021 paper: Direct Multi-view Multi-person 3D Human Pose Estimation

[paper] [video-YouTube, video-Bilibili] [slides]

This is the official implementation of our NeurIPS-2021 work: Multi-view Pose Transformer (MvP). MvP is a simple algorithm that directly regresses multi-person 3D human pose from multi-view images.

Framework

mvp_framework

Example Result

mvp_framework

Reference

@article{wang2021mvp,
  title={Direct Multi-view Multi-person 3D Human Pose Estimation},
  author={Tao Wang and Jianfeng Zhang and Yujun Cai and Shuicheng Yan and Jiashi Feng},
  journal={Advances in Neural Information Processing Systems},
  year={2021}
}

1. Installation

  1. Set the project root directory as ${POSE_ROOT}.
  2. Install all the required python packages (with requirements.txt).
  3. compile deformable operation for projective attention.
cd ./models/ops
sh ./make.sh

2. Data and Pre-trained Model Preparation

2.1 CMU Panoptic

Please follow VoxelPose to download the CMU Panoptic Dataset and PoseResNet-50 pre-trained model.

The directory tree should look like this:

${POSE_ROOT}
|-- models
|   |-- pose_resnet50_panoptic.pth.tar
|-- data
|   |-- panoptic
|   |   |-- 16060224_haggling1
|   |   |   |-- hdImgs
|   |   |   |-- hdvideos
|   |   |   |-- hdPose3d_stage1_coco19
|   |   |   |-- calibration_160224_haggling1.json
|   |   |-- 160226_haggling1
|   |   |-- ...

2.2 Shelf/Campus

Please follow VoxelPose to download the Shelf/Campus Dataset.

Due to the limited and incomplete annotations of the two datasets, we use psudo ground truth 3D pose generated from VoxelPose to train the model, we expect mvp would perform much better with absolute ground truth pose data.

Please use voxelpose or other methods to generate psudo ground truth for the training set, you can also use our generated psudo GT: psudo_gt_shelf. psudo_gt_campus. psudo_gt_campus_fix_gtmorethanpred.

Due to the small dataset size, we fine-tune Panoptic pre-trained model to Shelf and Campus. Download the pretrained MvP on Panoptic from model_best_5view and model_best_3view_horizontal_view or model_best_3view_2horizon_1lookdown

The directory tree should look like this:

${POSE_ROOT}
|-- models
|   |-- model_best_5view.pth.tar
|   |-- model_best_3view_horizontal_view.pth.tar
|   |-- model_best_3view_2horizon_1lookdown.pth.tar
|-- data
|   |-- Shelf
|   |   |-- Camera0
|   |   |-- ...
|   |   |-- Camera4
|   |   |-- actorsGT.mat
|   |   |-- calibration_shelf.json
|   |   |-- pesudo_gt
|   |   |   |-- voxelpose_pesudo_gt_shelf.pickle
|   |-- CampusSeq1
|   |   |-- Camera0
|   |   |-- Camera1
|   |   |-- Camera2
|   |   |-- actorsGT.mat
|   |   |-- calibration_campus.json
|   |   |-- pesudo_gt
|   |   |   |-- voxelpose_pesudo_gt_campus.pickle
|   |   |   |-- voxelpose_pesudo_gt_campus_fix_gtmorethanpred_case.pickle

2.3 Human3.6M dataset

Please follow CHUNYUWANG/H36M-Toolbox to prepare the data.

2.4 Full Directory Tree

The data and pre-trained model directory tree should look like this, you can only download the Panoptic dataset and PoseResNet-50 for reproducing the main MvP result and ablation studies:

${POSE_ROOT}
|-- models
|   |-- pose_resnet50_panoptic.pth.tar
|   |-- model_best_5view.pth.tar
|   |-- model_best_3view_horizontal_view.pth.tar
|   |-- model_best_3view_2horizon_1lookdown.pth.tar
|-- data
|   |-- pesudo_gt
|   |   |-- voxelpose_pesudo_gt_shelf.pickle
|   |   |-- voxelpose_pesudo_gt_campus.pickle
|   |   |-- voxelpose_pesudo_gt_campus_fix_gtmorethanpred_case.pickle
|   |-- panoptic
|   |   |-- 16060224_haggling1
|   |   |   |-- hdImgs
|   |   |   |-- hdvideos
|   |   |   |-- hdPose3d_stage1_coco19
|   |   |   |-- calibration_160224_haggling1.json
|   |   |-- 160226_haggling1
|   |   |-- ...
|   |-- Shelf
|   |   |-- Camera0
|   |   |-- ...
|   |   |-- Camera4
|   |   |-- actorsGT.mat
|   |   |-- calibration_shelf.json
|   |   |-- pesudo_gt
|   |   |   |-- voxelpose_pesudo_gt_shelf.pickle
|   |-- CampusSeq1
|   |   |-- Camera0
|   |   |-- Camera1
|   |   |-- Camera2
|   |   |-- actorsGT.mat
|   |   |-- calibration_campus.json
|   |   |-- pesudo_gt
|   |   |   |-- voxelpose_pesudo_gt_campus.pickle
|   |   |   |-- voxelpose_pesudo_gt_campus_fix_gtmorethanpred_case.pickle
|   |-- HM36

3. Training and Evaluation

The evaluation result will be printed after every epoch, the best result can be found in the log.

3.1 CMU Panoptic dataset

We train and validate on the five selected camera views. We trained our models on 8 GPUs and batch_size=1 for each GPU, note the total iteration per epoch should be 3205, if not, please check your data.

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/panoptic/best_model_config.yaml

Pre-trained models

Datasets AP25 AP25 AP25 AP25 MPJPE pth
Panoptic 92.3 96.6 97.5 97.7 15.8 here

3.1.1 Ablation Experiments

You can find several ablation experiment configs under ./configs/panoptic/, for example, removing RayConv:

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/panoptic/ablation_remove_rayconv.yaml

3.2 Shelf/Campus datasets

As shelf/campus are very small dataset with incomplete annotation, we finetune pretrained MvP with pseudo ground truth 3D pose extracted with VoxelPose, we expect more accurate GT would help MvP achieve much higher performance.

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/shelf/mvp_shelf.yaml

Pre-trained models

Datasets Actor 1 Actor 2 Actor 2 Average pth
Shelf 99.3 95.1 97.8 97.4 here
Campus 98.2 94.1 97.4 96.6 here

3.3 Human3.6M dataset

MvP also applies to the naive single-person setting, with dataset like Human3.6, to come

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/h36m/mvp_h36m.yaml

4. Evaluation Only

To evaluate a trained model, pass the config and model pth:

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/validate_3d.py --cfg xxx --model_path xxx

LICENSE

This repo is under the Apache-2.0 license. For commercial use, please contact the authors.

Owner
Sea AI Lab
Sea AI Lab
A collection of full-stack resources for programmers.

A collection of full-stack resources for programmers.

Charles-Axel Dein 22.3k Dec 30, 2022
Programmatic startup/shutdown of ASGI apps.

asgi-lifespan Programmatically send startup/shutdown lifespan events into ASGI applications. When used in combination with an ASGI-capable HTTP client

Florimond Manca 129 Dec 27, 2022
A framework that let's you compose websites in Python with ease!

Perry Perry = A framework that let's you compose websites in Python with ease! Perry works similar to Qt and Flutter, allowing you to create componen

Linkus 13 Oct 09, 2022
A numbers check python package

A numbers check python package

Fayas Noushad 3 Nov 28, 2021
This is the old code for bitcoin risk metric, the whole purpose form it is to help you DCA your investment according to bitcoin risk.

About The Project This is the old code for bitcoin risk metric, the whole purpose form it is to help you DCA your investment according to bitcoin risk

BitcoinRaven 2 Aug 03, 2022
Video Stream is an Advanced Telegram Bot that's allow you to play Video & Music on Telegram Group Video Chat

Video Stream is an Advanced Telegram Bot that's allow you to play Video & Music on Telegram Group Video Chat šŸ“Š Stats 🧪 Get SESSION_NAME from below:

dark phoenix 12 May 08, 2022
Batch obfuscator based on the obfuscation method used by the trick bot launcher

Batch obfuscator based on the obfuscation method used by the trick bot launcher

SlizBinksman 2 Mar 19, 2022
An unofficial opensource Pokemon cursor theme for Windows and Linux.

pokemon-cursor An unofficial opensource Pokemon cursor theme for Windows and Linux. Cursor Sizes 22 24 28 32 40 48 56 64 72 80 88 96 Colors Quick inst

Kaiz Khatri 72 Dec 26, 2022
The learning agent learns firstly approaching to the football and then kicking the football to the target position

Football Court This project utilized Pytorch and Tensorflow so that the learning agent learns firstly approaching to the football and then kicking the

1 Nov 19, 2021
How to use Microsoft Bing to search for leaks?

Installation In order to install the project, you need install its dependencies: $ pip3 install -r requirements.txt Add your Bing API key to bingKey.t

Ernestas Kardzys 2 Sep 21, 2022
TurtleBot Control App - TurtleBot Control App With Python

TURTLEBOT CONTROL APP INDEX: 1. Introduction 2. Environments 2.1. Simulated Envi

Rafanton 4 Aug 03, 2022
nbsafety adds a layer of protection to computational notebooks by solving the stale dependency problem when executing cells out-of-order

nbsafety adds a layer of protection to computational notebooks by solving the stale dependency problem when executing cells out-of-order

150 Jan 07, 2023
Python script to combine the statistical results of a TOPAS simulation that was split up into multiple batches.

topas-merge-simulations Python script to combine the statistical results of a TOPAS simulation that was split up into multiple batches At the top of t

Sebastian SchƤfer 1 Aug 16, 2022
Would upload anything I do with/related to brainfuck

My Brainfu*k Repo Basically wanted to create something with Brainfu*k but realized that with the smol brain I have, I need to see the cell values real

Rafeed 1 Mar 22, 2022
Laurence Billingham 1 Feb 16, 2022
Gerenciador de processos e registros pessoais do Departamento de Fiscalização de Produtos Controlados.

CRManager Gerenciador de processos e registros pessoais do Departamento de Fiscalização de Produtos Controlados. Descrição Este projeto tem como objet

Wolfgang Almeida 1 Nov 15, 2021
Astroquery is an astropy affiliated package that contains a collection of tools to access online Astronomical data.

Astroquery is an astropy affiliated package that contains a collection of tools to access online Astronomical data.

The Astropy Project 631 Jan 05, 2023
Two predictive attributes (Speed and Angle) and one attribute target (Power)

Two predictive attributes (Speed and Angle) and one attribute target (Power). A container crane has the function of transporting containers from one point to another point. The difficulty of this tas

Astitva Veer Garg 1 Jan 11, 2022
Graphsignal Logger

Graphsignal Logger Overview Graphsignal is an observability platform for monitoring and troubleshooting production machine learning applications. It h

Graphsignal 143 Dec 05, 2022
A Gura parser implementation for Python

Gura parser This repository contains the implementation of a Gura format parser in Python. Installation pip install gura-parser Usage import gura gur

JWare Solutions 19 Jan 25, 2022