Detect handwritten words in a text-line (classic image processing method).

Overview

Word segmentation

Implementation of scale space technique for word segmentation as proposed by R. Manmatha and N. Srimal. Even though the paper is from 1999, the method still achieves good results, is fast, and is easy to implement. The algorithm takes an image of a line as input and outputs the segmented words.

example

Run demo

Go to the src/ directory and run the script python main.py. The images from the data/ directory (taken from IAM dataset) are segmented into words and the results are saved to the out/ directory.

Documentation

An anisotropic filter kernel is applied to the input image to create blobs corresponding to words. After thresholding the blob-image, connected components are extracted which correspond to words.

Parameters

Most of the parameters of the function wordSegmentation deal with the shape of the filter kernel:

  • img: grayscale uint8 image of the text-line to be segmented.
  • kernelSize: size of filter kernel, must be an odd integer.
  • sigma: standard deviation of Gaussian function used for filter kernel.
  • theta: approximated width/height ratio of words, filter function is distorted by this factor.
  • minArea: ignore word candidates smaller than specified area.

The function prepareImg can be used to convert the input image to grayscale and to resize it to a fixed height:

  • img: input image.
  • height: image will be resized to fit specified height.

Algorithm

The illustration below shows how the algorithm works:

  • top left: input image.
  • top right: filter kernel is applied.
  • bottom left: blob image after thresholding.
  • bottom right: bounding boxes around words in original image.

illustration

Results

This algorithm gives good results on datasets with large inter-word-distances and small intra-word-distances like IAM. However, for historical datasets like Bentham or Ratsprotokolle results are not very good and more complex approaches should be used instead (e.g., a neural network based approach as implemented in the WordDetectorNN repository).

Owner
Harald Scheidl
Interested in computer vision, deep learning, C++ and Python.
Harald Scheidl
Read-only mirror of https://gitlab.gnome.org/GNOME/ocrfeeder

================================= OCRFeeder - A Complete OCR Suite ================================= OCRFeeder is a complete Optical Character Recogn

GNOME Github Mirror 81 Dec 23, 2022
Fully-automated scripts for collecting AI-related papers

AI-Paper-Collector Web demo: https://ai-paper-collector.vercel.app/ (recommended) Colab notebook: here Motivation Fully-automated scripts for collecti

772 Dec 30, 2022
TensorFlow Implementation of FOTS, Fast Oriented Text Spotting with a Unified Network.

FOTS: Fast Oriented Text Spotting with a Unified Network I am still working on this repo. updates and detailed instructions are coming soon! Table of

Masao Taketani 52 Nov 11, 2022
PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector

Description This is a PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector. Only RBOX part is implemented. Using dice loss

365 Dec 20, 2022
Hand gesture detection project with aweome UI implementation.

an awesome hand gesture detection project for you to be creative! Imagination is the limit to do with this project.

AR Ashraf 39 Sep 26, 2022
A tool to enhance your old/damaged pictures built using python & opencv.

Breathe Life into your Old Pictures Table of Contents About The Project Getting Started Prerequisites Usage Contact Acknowledgments About The Project

Shah Anwaar Khalid 5 Dec 16, 2021
Polaris is a Face recognition attendance system .

Support Me 🚀 About Polaris 📄 Polaris is a system based on facial recognition with a futuristic GUI design, Can easily find people informations store

XN3UR0N 215 Dec 26, 2022
Demo for the paper "Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation"

Streaming speaker diarization Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation by Juan Manuel Coria, Hervé

Juanma Coria 185 Jan 01, 2023
CNN+LSTM+CTC based OCR implemented using tensorflow.

CNN_LSTM_CTC_Tensorflow CNN+LSTM+CTC based OCR(Optical Character Recognition) implemented using tensorflow. Note: there is No restriction on the numbe

Watson Yang 356 Dec 08, 2022
Virtualdragdrop - Virtual Drag and Drop Using OpenCV and Arduino

Virtualdragdrop - Virtual Drag and Drop Using OpenCV and Arduino

Rizky Dermawan 4 Mar 10, 2022
Optical character recognition for Japanese text, with the main focus being Japanese manga

Manga OCR Optical character recognition for Japanese text, with the main focus being Japanese manga. It uses a custom end-to-end model built with Tran

Maciej Budyś 327 Jan 01, 2023
Automatically resolve RidderMaster based on TensorFlow & OpenCV

AutoRiddleMaster Automatically resolve RidderMaster based on TensorFlow & OpenCV 基于 TensorFlow 和 OpenCV 实现的全自动化解御迷士小马谜题 Demo How to use Deploy the ser

神龙章轩 5 Nov 19, 2021
This repo contains a script that allows us to find range of colors in images using openCV, and then convert them into geo vectors.

Vectorizing color range This repo contains a script that allows us to find range of colors in images using openCV, and then convert them into geo vect

Development Seed 9 Jul 27, 2022
Here use convulation with sobel filter from scratch in opencv python .

Here use convulation with sobel filter from scratch in opencv python .

Tamzid hasan 2 Nov 11, 2021
This Repository contain Opencv Projects in python

Python-Opencv OpenCV OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. OpenCV was

Yash Sakre 2 Nov 06, 2021
2 telegram-bots: for image recognition and for text generation

💻 📱 Telegram_Bots 🔎 & 📖 2 telegram-bots: for image recognition and for text generation. About Image recognition bot: User sends a photo and bot de

Marina Polukoshko 1 Jan 27, 2022
Document Image Dewarping

Document image dewarping using text-lines and line Segments Abstract Conventional text-line based document dewarping methods have problems when handli

Taeho Kil 268 Dec 23, 2022
Hand Detection and Finger Detection on Live Feed

Hand-Detection-On-Live-Feed Hand Detection and Finger Detection on Live Feed Getting Started Install the dependencies $ git clone https://github.com/c

Chauhan Mahaveer 2 Jan 02, 2022
Line based ATR Engine based on OCRopy

OCR Engine based on OCRopy and Kraken using python3. It is designed to both be easy to use from the command line but also be modular to be integrated

948 Dec 23, 2022
Crop regions in napari manually

napari-crop Crop regions in napari manually Usage Create a new shapes layer to annotate the region you would like to crop: Use the rectangle tool to a

Robert Haase 4 Sep 29, 2022