This is a tensorflow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.My blog:

Last update: Dec 30, 2022

Overview

PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network

Introduction

This is a tensorflow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.

Thanks for the author's (@whai362) awesome work!

Installation

Any version of tensorflow version > 1.0 should be ok.
python 2 or 3 will be ok.

Download

trained on ICDAR 2015 (training set) + ICDAR2017 MLT (training set):

baiduyun extract code: pffd

google drive

This model is not as good as article's, it's just a reference. You can finetune on it or you can do a lot of optimization based on this code.

Database	Precision (%)	Recall (%)	F-measure (%)
ICDAR 2015(val)	74.61	80.93	77.64

Train

If you want to train the model, you should provide the dataset path, in the dataset path, a separate gt text file should be provided for each image, and make sure that gt text and image file have the same names.

Then run train.py like:

python train.py --gpu_list=0 --input_size=512 --batch_size_per_gpu=8 --checkpoint_path=./resnet_v1_50/ \
--training_data_path=./data/ocr/icdar2015/

If you have more than one gpu, you can pass gpu ids to gpu_list(like --gpu_list=0,1,2,3)

Note:

right now , only support icdar2017 data format input, like (116,1179,206,1179,206,1207,116,1207,"###"), but you can modify data_provider.py to support polygon format input
Already support polygon shrink by using pyclipper module
this re-implementation is just for fun, but I'll continue to improve this code.
re-implementation pse algorithm by using c++ (if you use python2, just run it, if python3, please replace python-config with python3-config in makefile)

Test

run eval.py like:

python eval.py --test_data_path=./tmp/images/ --gpu_list=0 --checkpoint_path=./resnet_v1_50/ \
--output_dir=./tmp/

a text file and result image will be then written to the output path.

Examples

About issues

If you encounter any issue check issues first, or you can open a new issue.

Reference

Acknowledge

@rkshuai found a bug about concat features in model.py.

If this repository helps you，please star it. Thanks.

This is a tensorflow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.My blog:

Related tags

Overview

PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network

Introduction

Installation

Download

Train

Test

Examples

About issues

Reference

Acknowledge

Owner

Michael liu

Drowsiness Detection and Alert System

A webcam-based 3x3x3 rubik's cube solver written in Python 3 and OpenCV.

OCR powered screen-capture tool to capture information instead of images

Automatically fishes for you while you are afk :)

Framework for the Complete Gaze Tracking Pipeline

CRAFT-Pyotorch：Character Region Awareness for Text Detection Reimplementation for Pytorch

Program created with opencv that allows you to automatically count your repetitions on several fitness exercises.

Image Detector and Convertor App created using python's Pillow, OpenCV, cvlib, numpy and streamlit packages.

A tensorflow implementation of EAST text detector

Automatically resolve RidderMaster based on TensorFlow & OpenCV

With the virtual keyboard, you can write on the real time images by combining the thumb and index fingers on the letter you want.

Code for the ACL2021 paper "Combining Static Word Embedding and Contextual Representations for Bilingual Lexicon Induction"

Some Boring Research About Products Recognition 、Duplicate Img Detection、Img Stitch、OCR

This is a repository to learn and get more computer vision skills, make robotics projects integrating the computer vision as a perception tool and create a lot of awesome advanced controllers for the robots of the future.

A document scanner application for laptops/desktops developed using python, Tkinter and OpenCV.

OCR of Chicago 1909 Renumbering Plan

Deep LearningImage Captcha 2

Source Code for AAAI 2022 paper "Graph Convolutional Networks with Dual Message Passing for Subgraph Isomorphism Counting and Matching"

Programa que viabiliza a OCR (Optical Character Reading - leitura óptica de caracteres) de um PDF.

Dataset and Code for ICCV 2021 paper "Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme"