Unofficial implementation of "TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images"

Last update: Dec 30, 2022

Related tags

Computer Vision TableNet

Overview

TableNet

Unofficial implementation of the ICDAR 2019 paper "TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images".

Overview

Paper: TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images

TableNet is a deep learning architecture proposed by a team from TCS Research in 2019. The main motivation was to extract information from tables in documents captured with mobile phones or cameras.

The proposed solution first detects the tabular region within an image, then detects and extracts information from the rows and columns of the detected table.
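The two stages described above can be illustrated on toy binary masks. This is a minimal sketch, not the repository's code: it assumes the model output has already been thresholded into a table mask and a column mask (nested lists of 0/1 here), and the helper names `bounding_box` and `column_spans` are hypothetical.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]  # rows of 0/1 pixel values
Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax), inclusive


def bounding_box(mask: Mask) -> Optional[Box]:
    """Return the inclusive bounding box of all nonzero pixels, or None if empty."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not ys:
        return None
    xs = [x for row in mask for x, value in enumerate(row) if value]
    return min(xs), ys[0], max(xs), ys[-1]


def column_spans(column_mask: Mask, box: Box) -> List[Tuple[int, int]]:
    """Return inclusive (x_start, x_end) runs inside `box` that contain column pixels."""
    xmin, ymin, xmax, ymax = box
    spans = []
    start = None
    for x in range(xmin, xmax + 1):
        filled = any(column_mask[y][x] for y in range(ymin, ymax + 1))
        if filled and start is None:
            start = x  # a new column run begins
        elif not filled and start is not None:
            spans.append((start, x - 1))  # the run ended at the previous x
            start = None
    if start is not None:
        spans.append((start, xmax))  # the last run touches the box edge
    return spans


# Toy 8x4 masks: one table containing two columns separated by a gap at x = 3.
table_mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
column_mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 1, 0],
    [0, 1, 1, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

box = bounding_box(table_mask)
print(box)                             # (1, 1, 6, 2)
print(column_spans(column_mask, box))  # [(1, 2), (4, 6)]
```

Each column span, cropped from the original image, could then be passed to an OCR engine to read the cell text.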

Architecture: The architecture is based on the fully convolutional network (FCN) of Long et al., an encoder-decoder model for semantic segmentation. The same encoder-decoder network serves as the FCN for table extraction. Input images are preprocessed and modified using Tesseract OCR.
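To make the encoder-decoder idea concrete, the sketch below only tracks feature-map sizes, not weights. It assumes a VGG-style encoder with five 2x2 pooling stages and an FCN-8s-style decoder that upsamples and fuses with the pool4 and pool3 maps before returning to the input resolution. The 1024-pixel input side and the function names are illustrative assumptions, not taken from this repository.

```python
from typing import List


def encoder_sizes(input_side: int, num_pools: int = 5) -> List[int]:
    """Side length of the feature map after each 2x2, stride-2 pooling stage."""
    sizes = []
    side = input_side
    for _ in range(num_pools):
        side //= 2
        sizes.append(side)
    return sizes


def decoder_sizes(input_side: int) -> List[int]:
    """Side lengths after each decoder upsampling step (x2, x2, x8)."""
    if input_side % 32 != 0:
        raise ValueError("input side must be divisible by 32")
    pools = encoder_sizes(input_side)
    pool3, pool4, pool5 = pools[2], pools[3], pools[4]
    up1 = pool5 * 2  # matches pool4, so the two maps can be fused
    assert up1 == pool4
    up2 = up1 * 2    # matches pool3, so the two maps can be fused
    assert up2 == pool3
    up3 = up2 * 8    # back to the input resolution for a per-pixel mask
    return [up1, up2, up3]


print(encoder_sizes(1024))  # [512, 256, 128, 64, 32]
print(decoder_sizes(1024))  # [64, 128, 1024]
```

The divisibility check matters because an input side that is not a multiple of 32 would leave the upsampled maps misaligned with the skip connections.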

Source: Nanonets

How to run

pip install -r requirements.txt

Download the Marmot Dataset from the link given in the README.
Run data_preprocess/generate_mask.py to generate the table and column masks for the corresponding images.
Follow the TableNet.ipynb notebook to train and test the model.
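The mask generation step can be pictured as rasterizing annotated bounding boxes into binary images. The sketch below is not the code in data_preprocess/generate_mask.py. It assumes column boxes are given as inclusive (xmin, ymin, xmax, ymax) pixel coordinates, that each image holds a single table, and that the table box is the rectangle enclosing all column boxes. All function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax), inclusive pixels
Mask = List[List[int]]


def blank_mask(width: int, height: int) -> Mask:
    """Create a height x width mask filled with zeros."""
    return [[0] * width for _ in range(height)]


def fill_box(mask: Mask, box: Box, value: int = 255) -> None:
    """Set every pixel inside `box` to `value`, clipping to the mask bounds."""
    xmin, ymin, xmax, ymax = box
    height, width = len(mask), len(mask[0])
    for y in range(max(ymin, 0), min(ymax, height - 1) + 1):
        for x in range(max(xmin, 0), min(xmax, width - 1) + 1):
            mask[y][x] = value


def make_masks(width: int, height: int, column_boxes: List[Box]) -> Tuple[Mask, Mask]:
    """Return (table_mask, column_mask) for one image."""
    table_mask = blank_mask(width, height)
    column_mask = blank_mask(width, height)
    for box in column_boxes:
        fill_box(column_mask, box)
    if column_boxes:
        # Assumption: the table is the rectangle enclosing every column box.
        table_box = (
            min(b[0] for b in column_boxes),
            min(b[1] for b in column_boxes),
            max(b[2] for b in column_boxes),
            max(b[3] for b in column_boxes),
        )
        fill_box(table_mask, table_box)
    return table_mask, column_mask


table_mask, column_mask = make_masks(8, 4, [(1, 1, 2, 2), (4, 1, 6, 2)])
print(table_mask[1])   # [0, 255, 255, 255, 255, 255, 255, 0]
print(column_mask[1])  # [0, 255, 255, 0, 255, 255, 255, 0]
```

In practice the masks would be saved as image files alongside the originals so the notebook can load image, table mask, and column mask as one training sample.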

Challenges

Requires a capable system with a good GPU to produce accurate results on high-resolution images.

Dataset

Download the dataset used in the paper: Marmot Dataset.

Owner

Jainam Shah

Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation, CVPR 2020 (Oral)

Drowsiness Detection and Alert System

Msos searcher - A half-hearted attempt at finding a magic square of squares

2 telegram-bots: for image recognition and for text generation

make a better chinese character recognition OCR than tesseract

Image processing using OpenCv

Distort a video using Seam Carving (video) and Vibrato effect (sound)

Total Text Dataset. It consists of 1555 images with more than 3 different text orientations: Horizontal, Multi-Oriented, and Curved, one of a kind.

Maze generator and solver with python

OpenCV library for aardio

A facial recognition program that plays an alarm (mp3 file) when a person is seen in the room. A basic thief detector using Python and OpenCV

A simulated digital oscilloscope based on Qt and OpenCV

Demo for the paper "Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation"

(CVPR 2021) Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds

A simple python program to record security cam footage by detecting a face and body of a person in the frame.

A general list of resources for image text localization and recognition (a collection of papers and implementations for scene text localization and recognition)

graph learning code for ogb

An application of high resolution GANs to dewarp images of perturbed documents

Resizing Canny Contour In Python

Augmenting Anchors by the Detector Itself
