Official Implementation and Dataset of "PPR10K: A Large-Scale Portrait Photo Retouching Dataset with Human-Region Mask and Group-Level Consistency", CVPR 2021

Portrait Photo Retouching with PPR10K

Paper | Supplementary Material

PPR10K: A Large-Scale Portrait Photo Retouching Dataset with Human-Region Mask and Group-Level Consistency
Jie Liang*, Hui Zeng*, Miaomiao Cui, Xuansong Xie and Lei Zhang.
In CVPR 2021.

The proposed Portrait Photo Retouching dataset (PPR10K) is a large-scale and diverse dataset that contains:

  • 11,161 high-quality raw portrait photos (resolutions from 4K to 8K) in 1,681 groups;
  • 3 versions of manually retouched targets for all photos, produced by 3 expert retouchers;
  • full-resolution human-region masks of all photos.

Samples

Two example groups of photos from the PPR10K dataset. Top: the raw photos; Bottom: the retouched results from expert-a and the human-region masks. The raw photos exhibit poor visual quality and large variance in subject views, background contexts, lighting conditions and camera settings. In contrast, the retouched results demonstrate both good visual quality (with human-region priority) and group-level consistency.

This dataset is the first of its kind to consider two special and practical requirements of the portrait photo retouching task, namely Human-Region Priority and Group-Level Consistency. Follow-up research is expected to tackle three main challenges:

  • Flexible and content-adaptive models for such a diverse task regarding both image contents and lighting conditions;
  • Highly efficient models that can process practical resolutions from 4K to 8K;
  • Robust and stable models to meet the requirement of group-level consistency.
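The two requirements named above can be made concrete with a small numerical sketch. The code below is only an illustration under stated assumptions, not the formulation used in the paper or in this repository: `masked_weighted_mse` weights errors inside the human region more heavily (the weight value `human_weight` is hypothetical), and `group_color_spread` measures how much the mean human-region color varies across the photos of one group.

```python
import numpy as np


def masked_weighted_mse(pred, target, mask, human_weight=5.0):
    """Mean squared error with extra weight on human-region pixels.

    pred, target: float arrays of shape (H, W, 3).
    mask: array of shape (H, W), nonzero inside the human region.
    human_weight: hypothetical weight for human pixels (background is 1).
    """
    weights = np.where(mask > 0, human_weight, 1.0)[..., None]  # (H, W, 1)
    sq_err = (pred - target) ** 2
    # Normalize by the total weight so the result stays on the MSE scale.
    return float((weights * sq_err).sum() / (weights.sum() * pred.shape[-1]))


def group_color_spread(images, masks):
    """Variance of the mean human-region color across a group of photos.

    images: list of float arrays of shape (H, W, 3).
    masks: list of arrays of shape (H, W), nonzero inside the human region.
    A value of 0 means all photos share the same mean human-region color.
    """
    means = np.stack([img[m > 0].mean(axis=0) for img, m in zip(images, masks)])
    return float(np.var(means, axis=0).mean())
```

A lower `group_color_spread` indicates a more consistent look across a group; a training objective could penalize it, although how consistency is actually enforced in this code base is not shown here.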

Agreement

  • All files in the PPR10K dataset are available for non-commercial research purposes only.
  • You agree not to reproduce, duplicate, copy, sell, trade, resell or exploit for any commercial purposes, any portion of the images and any portion of derived data.

Overview

All data is hosted on GoogleDrive, OneDrive and Baidu Netdisk (access code: mrwn):

| Path | Size | Files | Format | Description |
| --- | --- | --- | --- | --- |
| PPR10K-dataset | 406 GB | 176,072 | | Main folder |
| ├  raw | 313 GB | 11,161 | RAW | All photos in raw format (.CR2, .NEF, .ARW, etc.) |
| ├  xmp_source | 130 MB | 11,161 | XMP | Default meta-file of the raw photos in CameraRaw, used in our data augmentation |
| ├  xmp_target_a | 130 MB | 11,161 | XMP | CameraRaw meta-file of the raw photos recording the full adjustments by expert a |
| ├  xmp_target_b | 130 MB | 11,161 | XMP | CameraRaw meta-file of the raw photos recording the full adjustments by expert b |
| ├  xmp_target_c | 130 MB | 11,161 | XMP | CameraRaw meta-file of the raw photos recording the full adjustments by expert c |
| ├  masks_full | 697 MB | 11,161 | PNG | Full-resolution human-region masks in binary format |
| ├  masks_360p | 56 MB | 11,161 | PNG | 360p human-region masks for fast training and validation |
| ├  train_val_images_tif_360p | 91 GB | 97,894 | TIF | 360p source (16-bit TIFF, with 5 versions of augmented images) and target (8-bit TIFF) images for fast training and validation |
| ├  pretrained_models | 268 MB | 12 | PTH | Pretrained models for all 3 versions |
| └  hists | 624 KB | 39 | PNG | Overall statistics of the dataset |
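After a download of this size, it can help to confirm that every subfolder is complete. The following is a minimal sketch that compares per-folder file counts against the numbers listed above. It assumes the folders on disk carry exactly the names in the listing and that files may sit in nested subfolders; `check_dataset` and `EXPECTED_COUNTS` are hypothetical names, not part of the repository.

```python
from pathlib import Path

# File counts per subfolder, copied from the listing above.
EXPECTED_COUNTS = {
    "raw": 11161,
    "xmp_source": 11161,
    "xmp_target_a": 11161,
    "xmp_target_b": 11161,
    "xmp_target_c": 11161,
    "masks_full": 11161,
    "masks_360p": 11161,
    "train_val_images_tif_360p": 97894,
    "pretrained_models": 12,
    "hists": 39,
}


def count_files(folder):
    """Count regular files under a folder, including nested subfolders."""
    return sum(1 for p in Path(folder).rglob("*") if p.is_file())


def check_dataset(root):
    """Return {subfolder: (found, expected)} for every listed subfolder."""
    root = Path(root)
    report = {}
    for name, expected in EXPECTED_COUNTS.items():
        folder = root / name
        found = count_files(folder) if folder.is_dir() else 0
        report[name] = (found, expected)
    return report


if __name__ == "__main__":
    for name, (found, expected) in check_dataset("PPR10K-dataset").items():
        status = "ok" if found == expected else "MISMATCH"
        print(f"{name}: {found}/{expected} {status}")
```

The per-folder counts add up to the 176,072 files reported for the main folder, so a mismatch in any row points to an incomplete download of that subfolder.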

You can directly use the provided 360p training and validation files (540x360 or 360x540 resolution, sRGB color space), which include the photos, 5 versions of augmented photos and the corresponding human-region masks. They follow the settings in our paper: train with the first 8,875 files and validate with the last 2,286 files.
Also, see the instructions to customize your data (e.g., augment the training samples regarding illuminations and colors, or get photos with higher or full resolutions).
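If you regenerate the data yourself, the same split can be reproduced with a few lines. This is a minimal sketch under one assumption: that "first" and "last" refer to a fixed, sorted ordering of the file names. The provided 360p files may already be organized into this split, in which case no extra step is needed; `split_train_val` is a hypothetical helper, not part of the repository.

```python
def split_train_val(file_names, num_train=8875):
    """Split a list of file names into training and validation subsets.

    The names are sorted first so that the split is deterministic; the
    first `num_train` entries are used for training, the rest for validation.
    """
    ordered = sorted(file_names)
    return ordered[:num_train], ordered[num_train:]


if __name__ == "__main__":
    # Placeholder names standing in for the 11,161 photos of the dataset.
    names = [f"{i:05d}.tif" for i in range(11161)]
    train, val = split_train_val(names)
    print(len(train), len(val))
```

With 11,161 files and `num_train=8875`, the validation subset contains the remaining 2,286 files, matching the numbers above.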

Training and Validating the PPR using 3DLUT

Installation

  • Clone this repo.
git clone https://github.com/csjliang/PPR10K
cd PPR10K/code_3DLUT/
  • Install dependencies.
pip install -r requirements.txt
  • Build. Set the CUDA path in trilinear_cpp/setup.sh to match your environment, then run:
cd trilinear_cpp
sh setup.sh

Training

  • Train without the HRP (human-region priority) and GLC (group-level consistency) strategies and save the models:
python train.py --data_path [path_to_dataset] --gpu_id [gpu_id] --use_mask False --output_dir [path_to_save_models]
  • Train with HRP and without GLC and save the models:
python train.py --data_path [path_to_dataset] --gpu_id [gpu_id] --use_mask True --output_dir [path_to_save_models]
  • Train without HRP and with GLC and save the models:
python train_GLC.py --data_path [path_to_dataset] --gpu_id [gpu_id] --use_mask False --output_dir [path_to_save_models]
  • Train with both HRP and GLC and save the models:
python train_GLC.py --data_path [path_to_dataset] --gpu_id [gpu_id] --use_mask True --output_dir [path_to_save_models]
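The four configurations above differ only in the script (train.py or train_GLC.py) and the value of --use_mask. As a convenience, the sketch below builds all four command lines from one place. The script names and flags are taken from the commands above; the helper `build_training_commands` and the output folder naming are hypothetical.

```python
import itertools


def build_training_commands(data_path, gpu_id, output_root):
    """Build the four training command lines (with/without HRP and GLC).

    Returns a list of argument lists suitable for subprocess.run.
    The per-run output folder names are an illustrative convention only.
    """
    commands = []
    for use_glc, use_mask in itertools.product([False, True], repeat=2):
        script = "train_GLC.py" if use_glc else "train.py"
        tag = "{}_{}".format(
            "mask" if use_mask else "nomask",
            "glc" if use_glc else "noglc",
        )
        commands.append([
            "python", script,
            "--data_path", str(data_path),
            "--gpu_id", str(gpu_id),
            "--use_mask", str(use_mask),
            "--output_dir", f"{output_root}/{tag}",
        ])
    return commands


if __name__ == "__main__":
    for cmd in build_training_commands("path_to_dataset", 0, "saved_models"):
        print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from inside PPR10K/code_3DLUT/ to launch the runs one after another.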

Evaluation

  • Generate the retouched results:
python validation.py --data_path [path_to_dataset] --gpu_id [gpu_id] --model_dir [path_to_models]
  • Use MATLAB to calculate the measures reported in our paper:
calculate_metrics(source_dir, target_dir, mask_dir)
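For a quick sanity check without MATLAB, a standard full-reference measure such as PSNR can be computed directly in Python. The sketch below is a generic PSNR implementation for 8-bit images. It is not a port of calculate_metrics, and it does not reproduce any mask-based or group-level measures; results for the paper's tables should still come from the MATLAB function.

```python
import numpy as np


def psnr(result, target, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape.

    max_value is the largest possible pixel value (255 for 8-bit images).
    Returns infinity when the two images are identical.
    """
    # Convert to float64 first so that uint8 subtraction cannot wrap around.
    result = np.asarray(result, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((result - target) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))
```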

Pretrained Models

  • Move the downloaded pretrained models into saved_models/:
mv your/path/to/pretrained_models/* saved_models/
  • Specify --model_dir and --epoch (-1) to validate with the pretrained models or to initialize training from them, e.g.,
python validation.py --data_path [path_to_dataset] --gpu_id [gpu_id] --model_dir mask_noglc_a --epoch -1
python train.py --data_path [path_to_dataset] --gpu_id [gpu_id] --use_mask True --output_dir mask_noglc_a --epoch -1

Citation

If you use this dataset or code for your research, please cite our paper.

@inproceedings{jie2021PPR10K,
  title={PPR10K: A Large-Scale Portrait Photo Retouching Dataset with Human-Region Mask and Group-Level Consistency},
  author={Liang, Jie and Zeng, Hui and Cui, Miaomiao and Xie, Xuansong and Zhang, Lei},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2021}
}

Related Projects

3D LUT

Contact

Should you have any questions, please contact me via [email protected].
