OCR, Scene-Text-Understanding, Text Recognition

Overview

Scene-Text-Understanding

Survey

  • [2015-PAMI] Text Detection and Recognition in Imagery: A Survey paper
  • [2014-Front.Comput.Sci] Scene Text Detection and Recognition: Recent Advances and Future Trends paper
  • [2020-arXiv] Text Recognition in the Wild: A Survey paper

Scene Text Detection

  • [2019-CVPR] Arbitrary Shape Scene Text Detection with Adaptive Text Region Representation [paper]
  • [2019-CVPR] A Multitask Network for Localization and Recognition of Text in Images (end-to-end) [paper]
  • [2019-CVPR] AFDM: Handwriting Recognition in Low-resource Scripts using Adversarial Learning (data augmentation) [paper] [code]
  • [2019-CVPR] CRAFT: Character Region Awareness for Text Detection [paper] [code]
  • [2019-CVPR] Data Extraction from Charts via Single Deep Neural Network(*) [paper]
  • [2019-CVPR] E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text [paper]
  • [2019-arXiv] FACLSTM: ConvLSTM with Focused Attention for Scene Text Recognition [paper]
  • [2019-CVPR] Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes [paper]
  • [2019-CVPR] PSENET: Shape Robust Text Detection with Progressive Scale Expansion Network [paper][tensorflow][Pytorch]
  • [2019-CVPR] PMTD: Pyramid Mask Text Detector [paper] [code]
  • [2019-CVPR] Spatial Fusion GAN for Image Synthesis (word synthesis) [paper](https://arxiv.org/abs/1812.05840) [code]
  • [2019-CVPR] Scene Text Detection with Supervised Pyramid Context Network [paper][keras]
  • [2019-arXiv] TextField: Learning A Deep Direction Field for Irregular Scene Text Detection [paper] [code]
  • [2019-CVPR] Typography with Decor: Intelligent Text Style Transfer [paper] [code]
  • [2019-CVPR] TIOU: Tightness-aware Evaluation Protocol for Scene Text Detection (new evaluation tool) [paper] [code]
  • [2019-arXiv] MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition [paper] [code]
  • [2019-CVPR] Scene Text Magnifier [paper]
  • [2018-CVPR] Pixel-Anchor: A Fast Oriented Scene Text Detector with Combined Networks [paper]
  • [2018-ECCV] Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes [paper] [code]
  • [2018-AAAI] PixelLink: Detecting Scene Text via Instance Segmentation [paper] [code]
  • [2018-CVPR] RRPN: Arbitrary-Oriented Scene Text Detection via Rotation Proposals [paper] [code]
  • [2018-CVPR] Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation [Paper]
  • [2018-arxiv] PixelLink: Detecting Scene Text via Instance Segmentation [Paper]
  • [2018-AAAI] SEE: Towards Semi-Supervised End-to-End Scene Text Recognition [Paper]
  • [2018-arxiv] TextBoxes++: A Single-Shot Oriented Scene Text Detector [Paper]
  • [2017-arxiv] Attention-based Extraction of Structured Information from Street View Imagery [Paper]
  • [2017-ICCV] Single Shot Text Detector with Regional Attention [Paper]
  • [2017-ICCV] WordSup: Exploiting Word Annotations for Character based Text Detection [Paper]
  • [2017-arXiv] R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection [Paper]
  • [2017-CVPR] EAST: An Efficient and Accurate Scene Text Detector [Paper] [Code]
  • [2017-arXiv] Cascaded Segmentation-Detection Networks for Word-Level Text Spotting [Paper]
  • [2017-arXiv] Deep Direct Regression for Multi-Oriented Scene Text Detection [Paper]
  • [2017-CVPR] Detecting Oriented Text in Natural Images by Linking Segments [Paper]
  • [2017-CVPR] Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection [Paper]
  • [2017-arXiv] Arbitrary-Oriented Scene Text Detection via Rotation Proposals [Paper]
  • [2017-AAAI] TextBoxes: A Fast Text Detector with a Single Deep Neural Network [Paper] [Code]
  • [2016-arXiv] Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network [Paper]
  • [2016-arXiv] DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images [Paper] [Data]
  • [2017-PR] TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild [paper] [code]
  • [2016-arXiv] Scene Text Detection via Holistic, Multi-Channel Prediction [Paper]
  • [2016-CVPR] Canny Text Detector: Fast and Robust Scene Text Localization Algorithm [Paper]
  • [2016-CVPR] Synthetic Data for Text Localisation in Natural Images [Paper] [Data] [Code]
  • [2016-ECCV] Detecting Text in Natural Image with Connectionist Text Proposal Network [Paper] [Demo] [Code]
  • [2016-TIP] Text-Attentional Convolutional Neural Networks for Scene Text Detection [Paper]
  • [2016-IJDAR] TextCatcher: a method to detect curved and challenging text in natural scenes [Paper]
  • [2016-CVPR] Multi-oriented Text Detection with Fully Convolutional Networks [Paper]
  • [2015-TPAMI] Real-time Lexicon-free Scene Text Localization and Recognition
  • [2015-CVPR] Symmetry-Based Text Line Detection in Natural Scenes
  • [2015-ICCV] FASText: Efficient Unconstrained Scene Text Detector [Paper] https://github.com/MichalBusta/FASText
  • [2015-D.PhilThesis] Deep Learning for Text Spotting [Paper]
  • [2015-ICDAR] Object Proposals for Text Extraction in the Wild [Paper] https://github.com/lluisgomez/TextProposals
  • [2014-ECCV] Deep Features for Text Spotting [Paper] https://bitbucket.org/jaderberg/eccv2014_textspotting http://gitxiv.com/posts/uB4y7QdD5XquEJ69c/deep-features-for-text-spotting
  • [2014-TPAMI] Word Spotting and Recognition with Embedded Attributes [Paper] http://www.cvc.uab.es/~almazan/index/projects/words-att/index.html https://github.com/almazan/watts
  • [2014-TPAMI] Robust Text Detection in Natural Scene Images
  • [2014-ECCV] Robust Scene Text Detection with Convolution Neural Network Induced MSER Trees [Paper]
  • [2013-ICCV] Photo OCR: Reading Text in Uncontrolled Conditions [Paper]
  • [2012-CVPR] Real-time Scene Text Localization and Recognition [Paper]
  • [2010-CVPR] Detecting Text in Natural Scenes with Stroke Width Transform [Paper]

Scene Text Recognition

PhD Thesis

  • [2016-PhD Thesis] Context Modeling for Semantic Text Matching and Scene Text Detection [Paper]
  • [2015-PhD Thesis] Deep Learning for Text Spotting [Paper]
  • [2012-PhD Thesis] End-to-End Text Recognition with Convolutional Neural Networks [Paper]

Text Detection

  • [2018-arxiv] TextBoxes++: A Single-Shot Oriented Scene Text Detector [Paper]

Dataset

PowerPoint Text Detection and Recognition Dataset 2017

COCO-Text (Computer Vision Group, Cornell) 2016

  • 63,686 images, 173,589 text instances, 3 fine-grained text attributes.
  • Task: text location and recognition

COCO-Text API

Synthetic Data for Text Localisation in Natural Images (VGG) 2016

  • 800 thousand images
  • 8 million synthetic word instances
  • download

Synthetic Word Dataset (Oxford, VGG) 2014

  • 9 million images covering 90k English words
  • Task: text recognition, segmentation
  • download

IIIT 5K-Words 2012

  • 5000 images from scene texts and born-digital sources (2k training and 3k testing images)
  • Each image is a cropped word image of scene text with case-insensitive labels
  • Task: text recognition
  • download
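
Because the IIIT 5K-Words labels are case-insensitive, recognition results on it are usually compared after folding case. The following is a minimal sketch of such a word-accuracy measure; the function name `word_accuracy` and the sample strings are illustrative assumptions, not part of the dataset's tooling.

```python
def word_accuracy(predictions, labels):
    """Fraction of words predicted correctly, ignoring letter case.

    Hypothetical helper for illustration; not an official evaluation script.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    # A word counts as correct only if every character matches after case folding.
    correct = sum(p.lower() == g.lower() for p, g in zip(predictions, labels))
    return correct / len(labels)


if __name__ == "__main__":
    preds = ["Hello", "WORLD", "scene"]   # made-up recognizer outputs
    truth = ["hello", "world", "scent"]   # made-up ground-truth labels
    print(word_accuracy(preds, truth))    # two of the three words match
```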

StanfordSynth (Stanford, AI Group) 2012

  • Small single-character images of 62 characters (0-9, a-z, A-Z)
  • Task: text recognition
  • download
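
The 62-character alphabet (0-9, a-z, A-Z) used by single-character datasets such as this one maps naturally onto 62 class indices. A minimal sketch of that mapping follows; the ordering (digits, then lowercase, then uppercase) and the names `CHARSET`, `encode`, and `decode` are assumptions made for illustration, and the dataset's own label files may use a different order.

```python
import string

# Assumed ordering: digits, lowercase, uppercase (62 classes in total).
CHARSET = string.digits + string.ascii_lowercase + string.ascii_uppercase
CHAR_TO_INDEX = {char: index for index, char in enumerate(CHARSET)}


def encode(text):
    """Convert a string into a list of class indices."""
    return [CHAR_TO_INDEX[char] for char in text]


def decode(indices):
    """Convert a list of class indices back into a string."""
    return "".join(CHARSET[index] for index in indices)


if __name__ == "__main__":
    print(len(CHARSET))            # number of classes
    print(encode("0aZ"))           # indices of the first digit, first lowercase, last uppercase
    print(decode(encode("Text")))  # round trip
```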

MSRA Text Detection 500 Database (MSRA-TD500) 2012

  • 500 natural images (resolutions of the images vary from 1296x864 to 1920x1280)
  • Chinese, English or mixture of both
  • Task: text detection

Street View Text (SVT) 2010

  • 350 high resolution images (average size 1260 × 860) (100 images for training and 250 images for testing)
  • Only word level bounding boxes are provided with case-insensitive labels
  • Task: text location
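
Word-level bounding boxes like these are commonly scored with intersection over union (IoU) between a predicted and a ground-truth box. The sketch below assumes axis-aligned boxes given as `(x1, y1, x2, y2)`. The function name `box_iou` is hypothetical, and the 0.5 match threshold mentioned in the comment is a widespread convention, not something specified by the dataset description above.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0, 0, 10, 10)      # made-up detection
    ground_truth = (5, 0, 15, 10)   # made-up annotation
    iou = box_iou(predicted, ground_truth)
    # Assumed convention: a detection counts as a match when IoU >= 0.5.
    print(iou, iou >= 0.5)
```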

KAIST Scene_Text Database 2010

  • 3000 images of indoor and outdoor scenes containing text
  • Korean, English (Number), and Mixed (Korean + English + Number)
  • Task: text location, segmentation and recognition

Chars74k 2009

  • Over 74K images from natural images, as well as a set of synthetically generated characters
  • Small single-character images of 62 characters (0-9, a-z, A-Z)
  • Task: text recognition

ICDAR Benchmark Datasets

| Dataset | Description | Competition Paper |
| --- | --- | --- |
| ICDAR 2017 | 42618 training images and 9837 testing images | paper link |
| ICDAR 2015 | 1000 training images and 500 testing images | paper link |
| ICDAR 2013 | 229 training images and 233 testing images | paper link |
| ICDAR 2011 | 229 training images and 255 testing images | paper link |
| ICDAR 2005 | 1001 training images and 489 testing images | paper link |
| ICDAR 2003 | 181 training images and 251 testing images (word level and character level) | paper link |

Blogs

Online Service

| Name | Description |
| --- | --- |
| Online OCR | API, free |
| Free OCR | API, free |
| New OCR | API, free |
| ABBYY FineReader Online | No API, free |

Open Resources Code

Hand Writing Recognition

Licence Tag Recognition

Owner
Alan Tang
Interested in Machine Learning, Data Mining, Operation Research, Reinforcement Learning