Implementation of EAST scene text detector in Keras

Overview

EAST: An Efficient and Accurate Scene Text Detector

This is a Keras implementation of EAST based on a Tensorflow implementation made by argman.

The original paper by Zhou et al. is available on arxiv.

  • Only RBOX geometry is implemented
  • Differences from the original paper
    • Uses ResNet-50 instead of PVANet
    • Uses dice loss function instead of balanced binary cross-entropy
    • Uses AdamW optimizer instead of the original Adam

The implementation of AdamW optimizer is borrowed from this repository.

The code should run under both Python 2 and Python 3.

Requirements

Keras 2.0 or higher, and TensorFlow 1.0 or higher should be enough.

The code should run with Keras 2.1.5. If you use Keras 2.2 or higher, you have to remove ZeroPadding2D from the model.py file. Specifically, replace the line containing ZeroPadding2D with x = concatenate([x, resnet.get_layer('activation_10').output], axis=3).

I will add a list of packages and their versions under which no errors should occur later.

Data

You can use your own data, but the annotation files need to conform the ICDAR 2015 format.

ICDAR 2015 dataset can be downloaded from this site. You need the data from Task 4.1 Text Localization.
You can also download the MLT dataset, which uses the same annotation style as ICDAR 2015, there.

Alternatively, you can download a training dataset consisting of all training images from ICDAR 2015 and ICDAR 2013 datasets with annotation files in ICDAR 2015 format here.
You can also get a subset of validation images from the MLT 2017 dataset containing only images with text in the Latin alphabet for validation here.
The original datasets are distributed by the organizers of the Robust Reading Competition and are licensed under the CC BY 4.0 license.

Training

You need to put all of your training images and their corresponding annotation files in one directory. The annotation files have to be named gt_IMAGENAME.txt.
You also need a directory for validation data, which requires the same structure as the directory with training images.

Training is started by running train.py. It accepts several arguments including path to training and validation data, and path where you want to save trained checkpoint models. You can see all of the arguments you can specify in the train.py file.

Execution example

python train.py --gpu_list=0,1 --input_size=512 --batch_size=12 --nb_workers=6 --training_data_path=../data/ICDAR2015/train_data/ --validation_data_path=../data/MLT/val_data_latin/ --checkpoint_path=tmp/icdar2015_east_resnet50/

You can download a model trained on ICDAR 2015 and 2013 here. It achieves 0.802 F-score on ICDAR 2015 test set. You also need to download this JSON file of the model to be able to use it.

Test

The images you want to classify have to be in one directory, whose path you have to pass as an argument. Classification is started by running eval.py with arguments specifying path to the images to be classified, the trained model, and a directory which you want to save the output in.

Execution example

python eval.py --gpu_list=0 --test_data_path=../data/ICDAR2015/test/ --model_path=tmp/icdar2015_east_resnet50/model_XXX.h5 --output_dir=tmp/icdar2015_east_resnet50/eval/

Detection examples

image_1 image_2 image_3 image_4 image_5 image_6 image_7 image_8 image_9

Owner
Jan Zdenek
Jan Zdenek
The CIS OCR PostCorrectionTool

The CIS OCR Post Correction Tool PoCoTo Source code for the Java-based PoCoTo client enabling fast interactive batch corrections of complete OCR error

CIS OCR Group 36 Dec 15, 2022
Papers, Datasets, Algorithms, SOTA for STR. Long-time Maintaining

Scene Text Recognition Recommendations Everythin about Scene Text Recognition SOTA • Papers • Datasets • Code Contents 1. Papers 2. Datasets 2.1 Synth

Deep Learning and Vision Computing Lab, SCUT 197 Jan 05, 2023
PyQT5 app that colorize black & white pictures using CNN(use pre-trained model which was made with OpenCV)

About PyQT5 app that colorize black & white pictures using CNN(use pre-trained model which was made with OpenCV) Colorizor Приложение для проекта Yand

1 Apr 04, 2022
The code of "Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes"

Mask TextSpotter A Pytorch implementation of Mask TextSpotter along with its extension can be find here Introduction This is the official implementati

Pengyuan Lyu 261 Nov 21, 2022
Histogram specification using openCV in python .

histogram specification using openCV in python . Have to input miu and sigma to draw gausssian distribution which will be used to map the input image . Example input can be miu = 128 sigma = 30

Tamzid hasan 6 Nov 17, 2021
Code for CVPR 2022 paper "SoftGroup for Instance Segmentation on 3D Point Clouds"

SoftGroup We provide code for reproducing results of the paper SoftGroup for 3D Instance Segmentation on Point Clouds (CVPR 2022) Author: Thang Vu, Ko

Thang Vu 231 Dec 27, 2022
Code for CVPR'2022 paper ✨ "Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model"

PPE ✨ Repository for our CVPR'2022 paper: Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-

Zipeng Xu 34 Nov 28, 2022
Detect textlines in document images

Textline Detection Detect textlines in document images Introduction This tool performs border, region and textline detection from document image data

QURATOR-SPK 70 Jun 30, 2022
This tool will help you convert your text to handwriting xD

So your teacher asked you to upload written assignments? Hate writing assigments? This tool will help you convert your text to handwriting xD

Saurabh Daware 4.2k Jan 07, 2023
This pyhton script converts a pdf to Image then using tesseract as OCR engine converts Image to Text

Script_Convertir_PDF_IMG_TXT Este script de pyhton convierte un pdf en Imagen luego utilizando tesseract como motor OCR convierte la Imagen a Texto. p

alebogado 1 Jan 27, 2022
Fatigue Driving Detection Based on Dlib

Fatigue Driving Detection Based on Dlib

5 Dec 14, 2022
Ddddocr - 通用验证码识别OCR pypi版

带带弟弟OCR通用验证码识别SDK免费开源版 今天ddddocr又更新啦! 当前版本为1.3.1 想必很多做验证码的新手,一定头疼碰到点选类型的图像,做样本费时

Sml2h3 4.4k Dec 31, 2022
Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd.

Head Detector Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd. The head_detection mod

Ramana Subramanyam 76 Dec 06, 2022
OCR, Scene-Text-Understanding, Text Recognition

Scene-Text-Understanding Survey [2015-PAMI] Text Detection and Recognition in Imagery: A Survey paper [2014-Front.Comput.Sci] Scene Text Detection and

Alan Tang 354 Dec 12, 2022
This project modify tensorflow object detection api code to predict oriented bounding boxes. It can be used for scene text detection.

This is an oriented object detector based on tensorflow object detection API. Most of the code is not changed except for those related to the need of

Dafang He 30 Oct 22, 2022
Dataset and Code for ICCV 2021 paper "Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme"

Dataset and Code for RealVSR Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme Xi Yang, Wangmeng Xiang,

Xi Yang 91 Nov 22, 2022
A Vietnamese personal card OCR website built with Django.

Django VietCardOCR Installation Creation of virtual environments is done by executing the command venv: python -m venv venv That will create a new fol

Truong Hoang Thuan 4 Sep 04, 2021
📷 This repository is focused on having various feature implementation of OpenCV in Python.

📷 This repository is focused on having various feature implementation of OpenCV in Python. The aim is to have a minimal implementation of all OpenCV features together, under one roof.

Aditya Kumar Gupta 128 Dec 04, 2022
This is the open source implementation of the ICLR2022 paper "StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis"

StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image

Meta Research 840 Dec 26, 2022
1st place solution for SIIM-FISABIO-RSNA COVID-19 Detection Challenge

SIIM-COVID19-Detection Source code of the 1st place solution for SIIM-FISABIO-RSNA COVID-19 Detection Challenge. 1.INSTALLATION Ubuntu 18.04.5 LTS CUD

Nguyen Ba Dung 170 Dec 21, 2022