An expandable and scalable OCR pipeline

Related tags

Computer Visionnidaba
Overview

Overview

Nidaba is the central controller for the entire OGL OCR pipeline. It oversees and automates the process of converting raw images into citable collections of digitized texts.

It offers the following functionality:

  • Grayscale Conversion
  • Binarization utilizing Sauvola adaptive thresholding, Otsu, or ocropus's nlbin algorithm
  • Deskewing
  • Dewarping
  • Integration of tesseract, kraken, and ocropus OCR engines
  • Page segmentation from the aforementioned OCR packages
  • Various postprocessing utilities like spell-checking, merging of multiple results, and ground truth comparison.

As it is designed to use a common storage medium on network attached storage and the celery distributed task queue it scales nicely to multi-machine clusters.

Build

To easiest way to install the latest stable(-ish) nidaba is from PyPi:

$ pip install nidaba

or run:

$ pip install .

in the git repository for the bleeding edge development version.

Some useful tasks have external dependencies. A good start is:

# apt-get install libtesseract3 tesseract-ocr-eng libleptonica-dev liblept

Tests

Per default no dictionaries and OCR models necessary to runs the tests are installed. To download the necessary files run:

$ python setup.py download
$ python setup.py nosetests

Tests for modules that call external programs, at the time only tesseract, ocropus, and kraken, will be skipped if these aren't installed.

Running

First edit (the installed) nidaba.yaml and celery.yaml to fit your needs. Have a look at the docs if you haven't set up a celery-based application before.

Then start up the celery daemon with something like:

$ celery -A nidaba worker

Next jobs can be added to the pipeline using the nidaba executable:

$ nidaba batch -b otsu -l tesseract -o tesseract:eng -- ./input.tiff
Preparing filestore             [✓]
Building batch                  [✓]
951c57e5-f8a0-432d-8d77-8a2e27fff53c

Using the return code the current state of the job can be retrieved:

$ nidaba status 25d79a54-9d4a-4939-acb6-8e168d6dbc7c
PENDING

When the job has been processed the status command will return a list of paths containing the final output:

$ nidaba status 951c57e5-f8a0-432d-8d77-8a2e27fff53c
SUCCESS
14.tif → .../input_img.rgb_to_gray_binarize.otsu_ocr.tesseract_grc.tif.hocr

Documentation

Want to learn more? Read the Docs

One Metrics Library to Rule Them All!

onemetric Installation Install onemetric from PyPI (recommended): pip install onemetric Install onemetric from the GitHub source: git clone https://gi

Piotr Skalski 49 Jan 03, 2023
A post-processing tool for scanned sheets of paper.

unpaper Originally written by Jens Gulden — see AUTHORS for more information. Licensed under GNU GPL v2 — see COPYING for more information. Overview u

27 Dec 07, 2022
Code for CVPR'2022 paper ✨ "Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model"

PPE ✨ Repository for our CVPR'2022 paper: Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-

Zipeng Xu 34 Nov 28, 2022
CUTIE (TensorFlow implementation of Convolutional Universal Text Information Extractor)

CUTIE TensorFlow implementation of the paper "CUTIE: Learning to Understand Documents with Convolutional Universal Text Information Extractor." Xiaohu

Zhao,Xiaohui 147 Dec 20, 2022
This project proposes a camera vision based cursor control system, using hand moment captured from a webcam through a landmarks of hand by using Mideapipe module

This project proposes a camera vision based cursor control system, using hand moment captured from a webcam through a landmarks of hand by using Mideapipe module

Chandru 2 Feb 20, 2022
aardio的opencv库

opencv_aardio dll库下载地址:https://github.com/xuncv/opencv-plugin/releases import cv2 img = cv2.imread("./images/Lena.jpg",1) img = cv2.medianBlur(img,5)

71 Dec 31, 2022
The virtual calculator will be above the live streaming from your camera

The virtual calculator is above the live streaming from my camera usb , the program first detect my hand and in each frame calculate the distance between two finger ,if the distance is lower than the

gasbaoui mohammed al amine 5 Jul 01, 2022
🖺 OCR using tensorflow with attention

tensorflow-ocr 🖺 OCR using tensorflow with attention, batteries included Installation git clone --recursive http://github.com/pannous/tensorflow-ocr

646 Nov 11, 2022
Python Computer Vision from Scratch

This repository explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both f

Milaan Parmar / Милан пармар / _米兰 帕尔马 221 Dec 26, 2022
Read-only mirror of https://gitlab.gnome.org/GNOME/ocrfeeder

================================= OCRFeeder - A Complete OCR Suite ================================= OCRFeeder is a complete Optical Character Recogn

GNOME Github Mirror 81 Dec 23, 2022
An expandable and scalable OCR pipeline

Overview Nidaba is the central controller for the entire OGL OCR pipeline. It oversees and automates the process of converting raw images into citable

81 Jan 04, 2023
Text-to-Image generation

Generate vivid Images for Any (Chinese) text CogView is a pretrained (4B-param) transformer for text-to-image generation in general domain. Read our p

THUDM 1.3k Jan 05, 2023
Controlling the computer volume with your hands // OpenCV

HandsControll-AI Controlling the computer volume with your hands // OpenCV Step 1 git clone https://github.com/Hayk-21/HandsControll-AI.git pip instal

Hayk 1 Nov 04, 2021
Developed an AI-based system to control the mouse cursor using Python and OpenCV with the real-time camera.

Developed an AI-based system to control the mouse cursor using Python and OpenCV with the real-time camera. Fingertip location is mapped to RGB images to control the mouse cursor.

Ravi Sharma 71 Dec 20, 2022
Fine tuning keras-ocr python package with custom synthetic dataset from scratch

OCR-Pipeline-with-Keras The keras-ocr package generally consists of two parts: a Detector and a Recognizer: Detector is responsible for creating bound

Eugene 1 Jan 05, 2022
CRAFT-Pyotorch:Character Region Awareness for Text Detection Reimplementation for Pytorch

CRAFT-Reimplementation Note:If you have any problems, please comment. Or you can join us weChat group. The QR code will update in issues #49 . Reimple

453 Dec 28, 2022
An interactive document scanner built in Python using OpenCV

The scanner takes a poorly scanned image, finds the corners of the document, applies the perspective transformation to get a top-down view of the document, sharpens the image, and applies an adaptive

Kushal Shingote 1 Feb 12, 2022
Code for AAAI 2021 paper: Sequential End-to-end Network for Efficient Person Search

This repository hosts the source code of our paper: [AAAI 2021]Sequential End-to-end Network for Efficient Person Search. SeqNet achieves the state-of

Zj Li 218 Dec 31, 2022
It is a image ocr tool using the Tesseract-OCR engine with the pytesseract package and has a GUI.

OCR-Tool It is a image ocr tool made in Python using the Tesseract-OCR engine with the pytesseract package and has a GUI. This is my second ever pytho

Khant Htet Aung 4 Jul 11, 2022
Camelot: PDF Table Extraction for Humans

Camelot: PDF Table Extraction for Humans Camelot is a Python library that makes it easy for anyone to extract tables from PDF files! Note: You can als

Atlan Technologies Pvt Ltd 3.3k Dec 31, 2022