PSENet - Shape Robust Text Detection with Progressive Scale Expansion Network.

Last update: Dec 24, 2022

Related tags

Overview

News

Python3 implementations of PSENet [1], PAN [2] and PAN++ [3] are released at https://github.com/whai362/pan_pp.pytorch.

[1] W. Wang, E. Xie, X. Li, W. Hou, T. Lu, G. Yu, and S. Shao. Shape robust text detection with progressive scale expansion network. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., pages 9336–9345, 2019.
[2] W. Wang, E. Xie, X. Song, Y. Zang, W. Wang, T. Lu, G. Yu, and C. Shen. Efficient and accurate arbitrary-shaped text detection with pixel aggregation network. In Proc. IEEE Int. Conf. Comp. Vis., pages 8440–8449, 2019.
[3] Paper is in preparation.

Shape Robust Text Detection with Progressive Scale Expansion Network

Requirements

Python 2.7
PyTorch v0.4.1+
pyclipper
Polygon2
OpenCV 3.4 (for c++ version pse)
opencv-python 3.4

Introduction

Progressive Scale Expansion Network (PSENet) is a text detector which is able to well detect the arbitrary-shape text in natural scene.

Training

CUDA_VISIBLE_DEVICES=0,1,2,3 python train_ic15.py

Testing

CUDA_VISIBLE_DEVICES=0 python test_ic15.py --scale 1 --resume [path of model]

Eval script for ICDAR 2015 and SCUT-CTW1500

cd eval
sh eval_ic15.sh
sh eval_ctw1500.sh

Performance (new version paper)

ICDAR 2015

Method	Extra Data	Precision (%)	Recall (%)	F-measure (%)	FPS (1080Ti)	Model
PSENet-1s (ResNet50)	-	81.49	79.68	80.57	1.6	baiduyun(extract code: rxti); OneDrive
PSENet-1s (ResNet50)	pretrain on IC17 MLT	86.92	84.5	85.69	1.6	baiduyun(extract code: aieo); OneDrive
PSENet-4s (ResNet50)	pretrain on IC17 MLT	86.1	83.77	84.92	3.8	baiduyun(extract code: aieo); OneDrive

SCUT-CTW1500

Method	Extra Data	Precision (%)	Recall (%)	F-measure (%)	FPS (1080Ti)	Model
PSENet-1s (ResNet50)	-	80.57	75.55	78.0	3.9	baiduyun(extract code: ksv7); OneDrive
PSENet-1s (ResNet50)	pretrain on IC17 MLT	84.84	79.73	82.2	3.9	baiduyun(extract code: z7ac); OneDrive
PSENet-4s (ResNet50)	pretrain on IC17 MLT	82.09	77.84	79.9	8.4	baiduyun(extract code: z7ac); OneDrive

Performance (old version paper)

ICDAR 2015 (training with ICDAR 2017 MLT)

Method	Precision (%)	Recall (%)	F-measure (%)
PSENet-4s (ResNet152)	87.98	83.87	85.88
PSENet-2s (ResNet152)	89.30	85.22	87.21
PSENet-1s (ResNet152)	88.71	85.51	87.08

ICDAR 2017 MLT

Method	Precision (%)	Recall (%)	F-measure (%)
PSENet-4s (ResNet152)	75.98	67.56	71.52
PSENet-2s (ResNet152)	76.97	68.35	72.40
PSENet-1s (ResNet152)	77.01	68.40	72.45

SCUT-CTW1500

Method	Precision (%)	Recall (%)	F-measure (%)
PSENet-4s (ResNet152)	80.49	78.13	79.29
PSENet-2s (ResNet152)	81.95	79.30	80.60
PSENet-1s (ResNet152)	82.50	79.89	81.17

ICPR MTWI 2018 Challenge 2

Method	Precision (%)	Recall (%)	F-measure (%)
PSENet-1s (ResNet152)	78.5	72.1	75.2

Results

Figure 3: The results on ICDAR 2015, ICDAR 2017 MLT and SCUT-CTW1500

Paper Link

[new version paper] https://arxiv.org/abs/1903.12473

[old version paper] https://arxiv.org/abs/1806.02559

Other Implements

[tensorflow version (thanks @liuheng92)] https://github.com/liuheng92/tensorflow_PSENet

Citation

@inproceedings{wang2019shape,
  title={Shape Robust Text Detection With Progressive Scale Expansion Network},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Hou, Wenbo and Lu, Tong and Yu, Gang and Shao, Shuai},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9336--9345},
  year={2019}
}

PSENet - Shape Robust Text Detection with Progressive Scale Expansion Network.

Related tags

Overview

News

Shape Robust Text Detection with Progressive Scale Expansion Network

Requirements

Introduction

Training

Testing

Eval script for ICDAR 2015 and SCUT-CTW1500

Performance (new version paper)

ICDAR 2015

SCUT-CTW1500

Performance (old version paper)

ICDAR 2015 (training with ICDAR 2017 MLT)

ICDAR 2017 MLT

SCUT-CTW1500

ICPR MTWI 2018 Challenge 2

Results

Paper Link

Other Implements

Citation

Owner

Code for the paper STN-OCR: A single Neural Network for Text Detection and Text Recognition

A Joint Video and Image Encoder for End-to-End Retrieval

Semantic-based Patch Detection for Binary Programs

A simple demo program for using OpenCV on Android

Simple app for visual editing of Page XML files

Rest API Written In Python To Classify NSFW Images.

OpenCVを用いたカメラキャリブレーションのサンプルです。2021/06/21時点でPython実装のある3種類(通常カメラ向け、魚眼レンズ向け(fisheyeモジュール)、全方位カメラ向け(omnidirモジュール))について用意しています。

Rotational region detection based on Faster-RCNN.

Table Extraction Tool

A facial recognition program that plays a alarm (mp3 file) when a person i seen in the room. A basic theif using Python and OpenCV

Detect textlines in document images

PyQT5 app that colorize black & white pictures using CNN(use pre-trained model which was made with OpenCV)

CellProfiler is a open-source application for biological image analysis

A machine learning software for extracting information from scholarly documents

deployment of a hybrid model for automatic weapon detection/ anomaly detection for surveillance applications

When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework (CVPR 2021 oral)

Forked from argman/EAST for the ICPR MTWI 2018 CHALLENGE

Face_mosaic - Mosaic blur processing is applied to multiple faces appearing in the video

Smart computer vision application

pulse2percept: A Python-based simulation framework for bionic vision