A pure PyTorch implementation of OCR, including text detection and recognition

Last update: Dec 30, 2022

Overview

ocr.pytorch

A pure PyTorch implementation of OCR.
Text detection is based on CTPN and text recognition is based on CRNN.
More detection and recognition methods will be supported.

Prerequisite

python-3.5+
pytorch-0.4.1+
torchvision-0.2.1
opencv-3.4.0.14
numpy-1.14.3

All of these can be installed through pip except pytorch and torchvision. Both of those depend on your CUDA version, so follow the installation instructions on PyTorch's official site.
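To confirm that an environment matches the list above before running anything, a small check along these lines may help. This is a minimal sketch, not part of the project: the helper names `version_tuple` and `check_requirements` are hypothetical, and treating every listed version as a lower bound (including the ones the list pins exactly) is an assumption.

```python
import importlib
import re

# Versions taken from the prerequisite list; treating the pinned entries
# as minimums is an assumption made for this sketch.
REQUIRED = {
    "torch": "0.4.1",
    "torchvision": "0.2.1",
    "cv2": "3.4.0",      # cv2 is the import name of the opencv package
    "numpy": "1.14.3",
}


def version_tuple(version):
    """Turn '1.13.1+cu117' into (1, 13, 1), ignoring any local suffix."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def check_requirements(required=REQUIRED):
    """Return a dict mapping module name to 'ok', 'missing' or 'too old'."""
    report = {}
    for name, minimum in required.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        installed = getattr(module, "__version__", "0")
        if version_tuple(installed) >= version_tuple(minimum):
            report[name] = "ok"
        else:
            report[name] = "too old"
    return report
```

Calling `check_requirements()` returns one status per module, so a missing or outdated package can be spotted before the demo fails on import.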

Detection

Detection is based on CTPN. Some code is borrowed from pytorch_ctpn.

Recognition

Recognition is based on CRNN. Some code is borrowed from crnn.pytorch.
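CRNN models are typically trained with a CTC loss, so the per-frame label predictions have to be collapsed into a string. The following is a minimal sketch of greedy CTC decoding in plain Python, not this project's decoder: the function name `greedy_ctc_decode`, the blank index 0, and the mapping of label `i` to `alphabet[i - 1]` are all assumptions chosen for illustration.

```python
def greedy_ctc_decode(frame_labels, alphabet, blank=0):
    """Collapse repeated labels, then drop blanks (greedy CTC decoding).

    frame_labels: one predicted label index per time step.
    alphabet:     string of characters; label i maps to alphabet[i - 1],
                  which assumes the blank label is index 0.
    """
    chars = []
    previous = blank
    for label in frame_labels:
        # Emit a character only when the label is not blank and differs
        # from the previous frame; a blank between two equal labels
        # therefore yields a doubled character.
        if label != blank and label != previous:
            chars.append(alphabet[label - 1])
        previous = label
    return "".join(chars)
```

For example, with `alphabet = "abcdefghijklmnopqrstuvwxyz"`, the frame sequence `[0, 8, 8, 0, 5, 0, 12, 12, 0, 12, 15, 0]` decodes to `hello`: the repeated 8 and 12 collapse, and the blank between the two 12 runs keeps the double "l".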

Test

Download the pretrained models from Baidu Netdisk (extraction code: u2ff) or Google Drive and put the files into the checkpoints directory. Then run

python3 demo.py

The image files in ./test_images will be run through text detection and recognition, and the results will be stored in ./test_result.

If you want to test a single image, run

python3 test_one.py [filename]
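To run the single-image script over a folder of your own images, one option is a thin wrapper around the command above. This is a sketch under stated assumptions: `build_commands` and `run_all` are hypothetical helpers, the set of image suffixes is a guess at what you might want to process, and the only project interface used is `test_one.py [filename]` as documented here.

```python
import subprocess
import sys
from pathlib import Path

# Assumption: which file suffixes count as images for this wrapper.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def build_commands(image_dir, script="test_one.py"):
    """Build one 'python test_one.py <file>' command per image in image_dir."""
    images = sorted(
        path for path in Path(image_dir).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )
    # sys.executable reuses the interpreter running this wrapper.
    return [[sys.executable, script, str(path)] for path in images]


def run_all(image_dir):
    """Run test_one.py on every image, stopping at the first failure."""
    for command in build_commands(image_dir):
        subprocess.run(command, check=True)
```

Call `run_all("path/to/images")` from the repository root so that `test_one.py` resolves; `check=True` makes a failing image raise `subprocess.CalledProcessError` instead of being skipped silently.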

Train

Training code is in the train_code directory.
Train CTPN
Train CRNN

Licence

MIT License

Owner

coura
