Code for the paper "DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks" (ICCV '19)

Overview

DewarpNet

This repository contains the codes for DewarpNet training.

Recent Updates

  • [May, 2020] Added evaluation images and an important note about Matlab SSIM.
  • [Dec, 2020] Added OCR evaluation details.

Training

  • Prepare Data: train.txt & val.txt. Contents should be like:
1/824_8-cp_Page_0503-7Ns0001
1/824_1-cp_Page_0504-2Cw0001
  • Train Shape Network: python trainwc.py --arch unetnc --data_path ./data/DewarpNet/doc3d/ --batch_size 50 --tboard
  • Train Texture Mapping Network: python trainbm.py --arch dnetccnl --img_rows 128 --img_cols 128 --img_norm --n_epoch 250 --batch_size 50 --l_rate 0.0001 --tboard --data_path ./DewarpNet/doc3d

Inference:

  • Run: python infer.py --wc_model_path ./eval/models/unetnc_doc3d.pkl --bm_model_path ./eval/models/dnetccnl_doc3d.pkl --show

Evaluation (Image Metrics):

  • We use the same evaluation code as DocUNet. To reproduce the quantitative results reported in the paper use the images available here.

  • [Important note about Matlab version] We noticed that Matlab 2020a uses a different SSIM implementation which gives a better MS-SSIM score (0.5623). Whereas we have used Matlab 2018b. Please compare the scores according to your Matlab version.

Evaluation (OCR Metrics):

  • The 25 images used for OCR evaluation is /eval/ocr_eval/ocr_files.txt
  • The corresponding ground-truth text is given in /eval/ocr_eval/tess_gt.json
  • For the OCR errors reported in the paper we had used cv2.blur as pre-processing which gives higher error in all the cases. For convenience, we provide the updated numbers (without using blur) in the following table:
Method ED CER ED (no blur) CER (no blur)
DocUNet 1975.86 0.4656(0.263) 1671.80 0.403 (0.256)
DocUNet on Doc3D 1684.34 0.3955 (0.272) 1296.00 0.294 (0.235)
DewarpNet 1288.60 0.3136 (0.248) 1007.28 0.249 (0.236)
DewarpNet (ref) 1114.40 0.2692 (0.234) 812.48 0.204 (0.228)
  • We had used the Tesseract (v4.1.0) default configuration for evaluation with PyTesseract (v0.2.6).

Models:

  • Pre-trained models are available here. These models are captured prior to end-to-end training, thus won't give you the end-to-end results reported in Table 2 of the paper. Use the images provided above to get the exact numbers as Table 2.

Dataset:

  • The doc3D dataset can be downloaded using the scripts here.

More Stuff:

Citation:

If you use the dataset or this code, please consider citing our work-

@inproceedings{SagnikKeICCV2019, 
Author = {Sagnik Das*, Ke Ma*, Zhixin Shu, Dimitris Samaras, Roy Shilkrot}, 
Booktitle = {Proceedings of International Conference on Computer Vision}, 
Title = {DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks}, 
Year = {2019}}   

Acknowledgements:

Owner
[email protected]
Computer Vision Lab at Stony Brook University
<a href=[email protected]">
Virtualdragdrop - Virtual Drag and Drop Using OpenCV and Arduino

Virtualdragdrop - Virtual Drag and Drop Using OpenCV and Arduino

Rizky Dermawan 4 Mar 10, 2022
PSENet - Shape Robust Text Detection with Progressive Scale Expansion Network.

News Python3 implementations of PSENet [1], PAN [2] and PAN++ [3] are released at https://github.com/whai362/pan_pp.pytorch. [1] W. Wang, E. Xie, X. L

1.1k Dec 24, 2022
TensorFlow Implementation of FOTS, Fast Oriented Text Spotting with a Unified Network.

FOTS: Fast Oriented Text Spotting with a Unified Network I am still working on this repo. updates and detailed instructions are coming soon! Table of

Masao Taketani 52 Nov 11, 2022
Qrcode Attendence System with Opencv and Pyzbar

Setup process Creates a virtual environment (Scripts that ensure executed Python code uses the Python interpreter and site packages installed inside t

Ganesh 5 Aug 01, 2022
Drowsiness Detection and Alert System

A countless number of people drive on the highway day and night. Taxi drivers, bus drivers, truck drivers, and people traveling long-distance suffer from lack of sleep.

Astitva Veer Garg 4 Aug 01, 2022
OCR software for recognition of handwritten text

Handwriting OCR The project tries to create software for recognition of a handwritten text from photos (also for Czech language). It uses computer vis

Břetislav Hájek 562 Jan 03, 2023
Repository for playing the computer vision apps: People analytics on Raspberry Pi.

play-with-torch Repository for playing the computer vision apps: People analytics on Raspberry Pi. Tools Tested Hardware RasberryPi 4 Model B here, RA

eMHa 1 Sep 23, 2021
Code for AAAI 2021 paper: Sequential End-to-end Network for Efficient Person Search

This repository hosts the source code of our paper: [AAAI 2021]Sequential End-to-end Network for Efficient Person Search. SeqNet achieves the state-of

Zj Li 218 Dec 31, 2022
基于Paddle框架的PSENet复现

PSENet-Paddle 基于Paddle框架的PSENet复现 本项目基于paddlepaddle框架复现PSENet,并参加百度第三届论文复现赛,将在2021年5月15日比赛完后提供AIStudio链接~敬请期待 AIStudio链接 参考项目: whai362-PSENet 环境配置 本项目

QuanHao Guo 4 Apr 24, 2022
A tool to make dumpy among us GIFS

Among Us Dumpy Gif Maker Made by ThatOneCalculator & Pixer415 With help from Telk, karl-police, and auguwu! Please credit this repository when you use

Kainoa Kanter 535 Jan 07, 2023
governance proposal to make fei redeemable for eth

Feil Proposal 🌲 Abstract Migrate all ETH from Fei protocol-controlled value into Yearn ETH Vault. Allow redemptions of outstanding FEI for yvETH. At

13 Mar 31, 2022
The open source extract transaction infomation by using OCR.

Transaction OCR Mã nguồn trích xuất thông tin transaction từ file scaned pdf, ở đây tôi lựa chọn tài liệu sao kê công khai của Thuy Tien. Mã nguồn có

Nguyen Xuan Hung 18 Jun 02, 2022
OCR, Scene-Text-Understanding, Text Recognition

Scene-Text-Understanding Survey [2015-PAMI] Text Detection and Recognition in Imagery: A Survey paper [2014-Front.Comput.Sci] Scene Text Detection and

Alan Tang 354 Dec 12, 2022
ISI's Optical Character Recognition (OCR) software for machine-print and handwriting data

VistaOCR ISI's Optical Character Recognition (OCR) software for machine-print and handwriting data Publications "How to Efficiently Increase Resolutio

ISI Center for Vision, Image, Speech, and Text Analytics 21 Dec 08, 2021
Official code for :rocket: Unsupervised Change Detection of Extreme Events Using ML On-Board :rocket:

RaVAEn The RaVÆn system We introduce the RaVÆn system, a lightweight, unsupervised approach for change detection in satellite data based on Variationa

SpaceML 35 Jan 05, 2023
~1000 book pages + OpenCV + python = page regions identified as paragraphs, lines, images, captions, etc.

cosc428-structor I had an open-ended Computer Vision assignment to complete, and an out-of-copyright book that I wanted to turn into an ebook. Convent

Chad Oliver 45 Dec 06, 2022
Demo processor to illustrate OCR-D Python API

ocrd_vandalize/ Demo processor to illustrate the OCR-D/core Python API Description :TODO: write docs :) Installation From PyPI pip3 install ocrd_vanda

Konstantin Baierer 5 May 05, 2022
Ackermann Line Follower Robot Simulation.

Ackermann Line Follower Robot This is a simulation of a line follower robot that works with steering control based on Stanley: The Robot That Won the

Lucas Mazzetto 2 Apr 16, 2022
RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition

RepMLP RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition Released the code of RepMLP together with an example o

260 Jan 03, 2023
A curated list of papers and resources for scene text detection and recognition

Awesome Scene Text A curated list of papers and resources for scene text detection and recognition The year when a paper was first published, includin

Jan Zdenek 43 Mar 15, 2022