Overview

Awesome Image Composition

A curated list of resources including papers, datasets, and relevant links pertaining to image composition.

Contributing

Contributions are welcome. If you wish to contribute, feel free to send a pull request. If you have suggestions for new sections to be included, please raise an issue and discuss before sending a pull request.

Surveys

  • Li Niu, Wenyan Cong, Liu Liu, Yan Hong, Bo Zhang, Jing Liang, Liqing Zhang: "Making Images Real Again: A Comprehensive Survey on Deep Image Composition." arXiv preprint arXiv:2106.14490 (2021). [arXiv]

Papers

Image blending

  • Huikai Wu, Shuai Zheng, Junge Zhang, Kaiqi Huang: "GP-GAN: Towards Realistic High-Resolution Image Blending." ACM MM (2019) [arXiv] [code]
  • Lingzhi Zhang, Tarmily Wen, Jianbo Shi: "Deep Image Blending." WACV (2020) [pdf] [arXiv] [code]

Image harmonization

  • Jun Ling, Han Xue, Li Song, Rong Xie, Xiao Gu: "Region-Aware Adaptive Instance Normalization for Image Harmonization." CVPR (2021) [pdf] [supp] [arXiv] [code]
  • Zonghui Guo, Haiyong Zheng, Yufeng Jiang, Zhaorui Gu, Bing Zheng: "Intrinsic Image Harmonization." CVPR (2021) [pdf] [supp] [code]
  • Wenyan Cong, Li Niu, Jianfu Zhang, Jing Liang, Liqing Zhang: "BargainNet: Background-Guided Domain Translation for Image Harmonization." ICME (2021) [arXiv] [code]
  • Konstantin Sofiiuk, Polina Popenova, Anton Konushin: "Foreground-aware Semantic Representations for Image Harmonization." WACV (2021) [pdf] [supp] [arXiv] [code]
  • Guoqing Hao, Satoshi Iizuka, Kazuhiro Fukui: "Image Harmonization with Attention-based Deep Feature Modulation." BMVC (2020) [pdf] [supp] [code]
  • Wenyan Cong, Jianfu Zhang, Li Niu, Liu Liu, Zhixin Ling, Weiyuan Li, Liqing Zhang: "DoveNet: Deep Image Harmonization via Domain Verification." CVPR (2020) [pdf] [supp] [arXiv] [code]
  • Xiaodong Cun, Chi-Man Pun: "Improving the Harmony of the Composite Image by Spatial-Separated Attention Module." IEEE Trans. Image Process. 29: 4759-4771 (2020) [pdf] [arXiv] [code]
  • Yi-Hsuan Tsai, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, Xin Lu, Ming-Hsuan Yang: "Deep Image Harmonization." CVPR (2017) [pdf] [supp] [arXiv] [code]

Shadow generation

  • Daquan Liu, Chengjiang Long, Hongpan Zhang, Hanning Yu, Xinzhi Dong, Chunxia Xiao: "ARShadowGAN: Shadow Generative Adversarial Network for Augmented Reality in Single Light Scenes." CVPR (2020) [pdf] [code]

  • Shuyang Zhang, Runze Liang, Miao Wang: "ShadowGAN: Shadow Synthesis for Virtual Objects with Conditional Adversarial Networks." Computational Visual Media (2019) [pdf]

  • Fangneng Zhan, Shijian Lu, Changgong Zhang, Feiying Ma, Xuansong Xie: "Adversarial Image Composition with Auxiliary Illumination." ACCV (2020) [pdf]

Object placement and spatial transformation

  • Lingzhi Zhang, Tarmily Wen, Jie Min, Jiancong Wang, David Han, Jianbo Shi: "Learning Object Placement by Inpainting for Compositional Data Augmentation." ECCV (2020) [pdf]

  • Samaneh Azadi, Deepak Pathak, Sayna Ebrahimi, Trevor Darrell: "Compositional GAN: Learning Image-Conditional Binary Composition." International Journal of Computer Vision (2020) [arXiv] [code]

  • Song-Hai Zhang, Zhengping Zhou, Bin Liu, Xi Dong, Peter Hall: "What and Where: A Context-based Recommendation System for Object Insertion." Computational Visual Media (2020) [arXiv]

  • Shashank Tripathi, Siddhartha Chandra, Amit Agrawal, Ambrish Tyagi, James M. Rehg, Visesh Chari: "Learning to Generate Synthetic Data via Compositing." CVPR (2019) [arXiv]

  • Haoshu Fang, Jianhua Sun, Runzhong Wang, Minghao Gou, Yonglu Li, Cewu Lu: "InstaBoost: Boosting Instance Segmentation via Probability Map Guided Copy-Pasting." ICCV (2019) [arXiv] [code]

  • Chen-Hsuan Lin, Ersin Yumer, Oliver Wang, Eli Shechtman, Simon Lucey: "ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing." CVPR (2018) [arXiv] [code]

  • Donghoon Lee, Sifei Liu, Jinwei Gu, Ming-Yu Liu, Ming-Hsuan Yang, Jan Kautz: "Context-Aware Synthesis and Placement of Object Instances." NeurIPS (2018) [arXiv] [code]

  • Fuwen Tan, Crispin Bernier, Benjamin Cohen, Vicente Ordonez, Connelly Barnes: "Where and Who? Automatic Semantic-Aware Person Composition." WACV (2018) [arXiv] [code]

  • Tal Remez, Jonathan Huang, Matthew Brown: "Learning to Segment via Cut-and-Paste." ECCV (2018) [arXiv] [code]

Occlusion

  • Samaneh Azadi, Deepak Pathak, Sayna Ebrahimi, Trevor Darrell: "Compositional GAN: Learning Image-Conditional Binary Composition." IJCV (2020) [arXiv] [code]
  • Fangneng Zhan, Jiaxing Huang, Shijian Lu: "Hierarchy Composition GAN for High-fidelity Image Synthesis." IEEE Transactions on Cybernetics (2021) [arXiv]

Datasets

  • iHarmony4 (image harmonization): It contains four subdatasets: HCOCO, HAdobe5k, HFlickr, Hday2night, with a total of 73,146 pairs of unharmonized images and harmonized images. [pdf] [link]
  • GMSDataset (image harmonization): It contains 183 images at a resolution of 1940×1440. It covers 16 different objects; for each object, one source image and 11 target images are captured in different background scenes and illumination conditions. [pdf] [link] (access code: ekn2)
  • HVIDIT (image harmonization): A dataset for image harmonization built upon the VIDIT (Virtual Image Dataset for Illumination Transfer) dataset. It contains 3,007 images of 276 scenes for training and 329 images of 24 scenes for testing. [pdf] [link]
  • RHHarmony (image harmonization): A rendered image harmonization dataset that contains 15,000 ground-truth rendered images and can be used to generate up to 135,000 composite rendered images. [pdf] [link]
  • Shadow-AR (shadow generation): It contains 3,000 quintuples. Each quintuple consists of five images at 640×480 resolution: a synthetic image without the virtual object shadow, the corresponding image containing the virtual object shadow, a mask of the virtual object, a labeled real-world shadow matting, and its corresponding labeled occluder. [pdf] [link]
  • DESOBA (shadow generation): It contains 840 training images with a total of 2,999 object-shadow pairs and 160 test images with a total of 624 object-shadow pairs. [pdf] [link]
  • OPA (object placement): It contains 62,074 training images and 11,396 test images, and the foregrounds and backgrounds in the training set do not overlap with those in the test set. The training (resp., test) set contains 21,351 (resp., 3,566) positive samples and 40,724 (resp., 7,830) negative samples. [pdf] [link]

Other resources
