docstrum

Last update: Dec 13, 2022

Related tags

Computer Vision docstrum

Overview

Docstrum Algorithm

Getting Started

This repo develops an implementation of the Docstrum algorithm presented by O’Gorman (1993).

Disclaimer

This source code is built on top of the work by Chadoliver. The original code is available at https://github.com/chadoliver/cosc428-structor.

Objective

This project aims to segment a document image into meaningful components. The target domain is historical machine-printed and handwritten document images.

Dependencies

Python 2.7
Packages:
- numpy
- cv2

Process

1. Pre-processing (optional, for vertical-line removal)
   - Blurring (bilateral filtering)
   - Otsu's thresholding
   - Morphological erosion & dilation
   - Smoothing (averaging)
   - Static thresholding
2. Nearest-Neighbor Clustering and Docstrum Plot
3. Spacing and Orientation Estimation
4. Determination of Text-lines
5. Structural Block Determination
6. Post-processing
   - TBD
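
The nearest-neighbor and estimation steps can be illustrated with a minimal sketch. It is not this repo's code: the function names (`nearest_neighbor_pairs`, `histogram_peak`, `estimate_layout`), the bin widths, and the 15-degree tolerance are all assumptions chosen for illustration, and it uses plain Python with a brute-force neighbor search rather than numpy or cv2. The input is assumed to be a list of connected-component centroids. The (distance, angle) pairs it collects are the points of the docstrum plot, and the peaks of their histograms give rough estimates of text orientation, within-line spacing, and between-line spacing.

```python
import math
from collections import Counter


def angular_diff(a, b):
    """Smallest difference between two undirected angles, in degrees (0..90)."""
    return abs((a - b + 90.0) % 180.0 - 90.0)


def nearest_neighbor_pairs(centroids, k=5):
    """Return (distance, angle) for every point paired with its k nearest neighbours.

    Angles are in degrees, folded into [-90, 90) so that a pair and its
    reverse count as the same direction. Brute force, for illustration only.
    """
    pairs = []
    for i, (xi, yi) in enumerate(centroids):
        candidates = sorted(
            (math.hypot(xj - xi, yj - yi), xj - xi, yj - yi)
            for j, (xj, yj) in enumerate(centroids)
            if j != i
        )
        for dist, dx, dy in candidates[:k]:
            angle = math.degrees(math.atan2(dy, dx))
            pairs.append((dist, (angle + 90.0) % 180.0 - 90.0))
    return pairs


def histogram_peak(values, bin_width):
    """Centre of the most populated bin, or None if there are no values."""
    if not values:
        return None
    bins = Counter(math.floor(v / bin_width + 0.5) for v in values)
    return max(bins, key=bins.get) * bin_width


def estimate_layout(pairs, angle_bin=2.0, dist_bin=1.0, tolerance=15.0):
    """Estimate (orientation, within-line spacing, between-line spacing).

    Assumes `pairs` is non-empty. The bin widths and tolerance are
    illustrative defaults, not values taken from the repo or the paper.
    """
    orientation = histogram_peak([a for _, a in pairs], angle_bin)
    # Pairs roughly parallel to the text direction give character spacing.
    along = [d for d, a in pairs if angular_diff(a, orientation) <= tolerance]
    # Pairs roughly perpendicular to it give line spacing.
    across = [d for d, a in pairs
              if angular_diff(a, orientation + 90.0) <= tolerance]
    return orientation, histogram_peak(along, dist_bin), histogram_peak(across, dist_bin)


# Synthetic page: 3 text lines of 10 "characters", 10 px apart, lines 25 px apart.
centroids = [(10.0 * col, 25.0 * row) for row in range(3) for col in range(10)]
print(estimate_layout(nearest_neighbor_pairs(centroids)))
```

On the synthetic grid, the estimate should recover a horizontal orientation, a character spacing of 10 px, and a line spacing of 25 px. Real pages need noise filtering and a faster neighbor search, which this sketch leaves out.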

Evaluation

Citing Docstrum

O'Gorman, L., 1993. The document spectrum for page layout analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(11), pp. 1162-1173.

@article{o1993document,
  title={The document spectrum for page layout analysis},
  author={O'Gorman, Lawrence},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={15},
  number={11},
  pages={1162--1173},
  year={1993},
  publisher={IEEE}
}

Notes

How to remove .DS_Store

find . -name '.DS_Store' -type f -delete

Owner

Chulwoo Mike Pack
