level2-data-annotation_cv-level2-cv-15 created by GitHub Classroom

Overview

[AI Tech 3rd Cohort Level2 P Stage] Text Detection Competition


Team Members

김규리_T3016 박정현_T3094 석진혁_T3109 손정균_T3111 이현진_T3174 임종현_T3182

Overview

OCR (Optical Character Recognition) is a technology that extracts text written by hand or contained in an image and converts it into a form a computer can recognize. It is one of the most widely used technologies in computer vision today.

An OCR task consists of modules such as text detection, text recognition, and a serializer. This competition addresses text detection only.

To encourage a focus on how data is built and used, changing the model-related parts of the provided baseline code is prohibited. Data collection, preprocessing, data augmentation, and optimization choices such as the optimizer and learning rate scheduler may be changed.

  • Input : the full image containing text
  • Output : UFO Format containing bbox coordinates

Evaluation

  • DetEval

    An evaluation method that awards scores while allowing multiple matches between boxes when an image contains several ground-truth boxes and several predicted boxes.

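    1. Area Recall and Area Precision are computed in advance for every pair of ground-truth and predicted boxes.

    2. All ground-truth and predicted boxes are traversed to decide whether each one is matched, so correctness is judged at the box level.

    3. After Recall and Precision have been computed for every image, the final F1-Score is the average of the image-level measurements.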
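    The three steps above can be sketched roughly as follows. This is a simplified illustration, not the competition's `deteval.py`: it assumes axis-aligned boxes given as `(x_min, y_min, x_max, y_max)`, handles only one-to-one matches (the one-to-many and many-to-one cases are omitted), uses thresholds of 0.8 for area recall and 0.4 for area precision as assumed defaults, and reads step 3 as averaging the per-image recall and precision before taking their harmonic mean. All function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed axis-aligned


def area(box: Box) -> float:
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def intersection(a: Box, b: Box) -> float:
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def area_tables(gt: List[Box], pred: List[Box]):
    """Step 1: area recall / area precision for every (ground truth, prediction) pair."""
    recall = [[intersection(g, p) / area(g) if area(g) > 0 else 0.0 for p in pred] for g in gt]
    precision = [[intersection(g, p) / area(p) if area(p) > 0 else 0.0 for p in pred] for g in gt]
    return recall, precision


def image_scores(gt: List[Box], pred: List[Box], tr: float = 0.8, tp: float = 0.4):
    """Step 2: box-level matching for one image (one-to-one matches only in this sketch)."""
    recall_tbl, precision_tbl = area_tables(gt, pred)
    ok = [
        [recall_tbl[i][j] >= tr and precision_tbl[i][j] >= tp for j in range(len(pred))]
        for i in range(len(gt))
    ]
    matched = 0
    for i in range(len(gt)):
        for j in range(len(pred)):
            # A pair counts only if it is the sole qualifying pair in its row and column.
            if ok[i][j] and sum(ok[i]) == 1 and sum(row[j] for row in ok) == 1:
                matched += 1
    recall = matched / len(gt) if gt else 0.0          # empty case simplified to 0.0
    precision = matched / len(pred) if pred else 0.0   # empty case simplified to 0.0
    return recall, precision


def dataset_f1(images: List[Tuple[List[Box], List[Box]]]) -> float:
    """Step 3: average image-level recall and precision, then combine them into F1."""
    scores = [image_scores(gt, pred) for gt, pred in images]
    if not scores:
        return 0.0
    recall = sum(r for r, _ in scores) / len(scores)
    precision = sum(p for _, p in scores) / len(scores)
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)
```

    Calling `dataset_f1` with a list of `(ground_truth_boxes, predicted_boxes)` pairs, one per image, returns a single score between 0 and 1.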


Final Score  🏅

  • Public f1 : 0.6897 → Private f1 : 0.6751
  • Public : 11th of 19 teams → Private : 9th of 19 teams


Archive contents

template
├──code
│  ├──augmentation.py
│  ├──convert_mlt.py
│  ├──dataset.py
│  ├──deteval.py
│  ├──east_dataset.py
│  ├──inference.py
│  ├──loss.py
│  ├──model.py
│  └──train.py
└──input
   └──ICDAR2017_Korean
      └──data
         ├──images
         └──ufo
            ├──train.json
            └──val.json

Dataset

  • ICDAR MLT17 Korean : 536 images ⊆ ICDAR MLT17 : 7,200 images

  • ICDAR MLT19 : 10,000 images

  • ICDAR ArT : 5,603 images

Experiment

Results

No.  Dataset                             Images         LB score (public → private)  Recall           Precision
01   ICDAR17_Korean                      536            0.4469 → 0.4732              0.3580 → 0.3803  0.5944 → 0.6264
02   Camper (before polygon correction)  1288           0.4543 → 0.5282              0.3627 → 0.4349  0.6077 → 0.6727
03   Camper (after polygon correction)   1288           0.4644 → 0.5298              0.3491 → 0.4294  0.6936 → 0.6913
04   ICDAR17_Korean + Camper             1824           0.4447 → 0.5155              0.3471 → 0.4129  0.6183 → 0.6858
05   ICDAR17(859)                        859            0.5435 → 0.5704              0.4510 → 0.4713  0.6837 → 0.7222
06   ICDAR17_MLT                         7200           0.6749 → 0.6751              0.5877 → 0.5887  0.7927 → 0.7912
07   ICDAR19+ArT                         approx. 15000  0.6344 → 0.6404              0.5489 → 0.5607  0.7514 → 0.7465

Requirements

pip install -r requirements.txt

Converting to UFO Format

python convert_mlt.py

SRC_DATASET_DIR = {path to the data before conversion}

DST_DATASET_DIR = {path to the converted data}
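As a rough illustration of what such a conversion involves, the sketch below turns one ground-truth line into a UFO-style word entry. It is not the repository's `convert_mlt.py`. It assumes the ICDAR MLT text format `x1,y1,x2,y2,x3,y3,x4,y4,language,transcription`, with `###` marking an unreadable word; the function name, the key spelling `illegibility`, and the placeholder values for `orientation` and `word_tags` are hypothetical.

```python
def mlt_line_to_ufo_word(line: str) -> dict:
    """Convert one MLT-style ground-truth line into a UFO-style word dict (sketch)."""
    # Split at most 9 times so commas inside the transcription are preserved.
    parts = line.strip().split(",", 9)
    if len(parts) != 10:
        raise ValueError(f"expected 10 comma-separated fields, got {len(parts)}")

    coords = [float(v) for v in parts[:8]]
    points = [[coords[i], coords[i + 1]] for i in range(0, 8, 2)]  # four (x, y) corners
    language, transcription = parts[8], parts[9]

    return {
        "points": points,
        "transcription": transcription,
        "language": [language],
        "illegibility": transcription == "###",  # assumed convention for unreadable text
        "orientation": "Horizontal",             # placeholder value, not read from the file
        "word_tags": None,                       # placeholder value, not read from the file
    }
```

For example, `mlt_line_to_ufo_word("10,20,110,20,110,60,10,60,Korean,###")` yields an entry whose `illegibility` field is `True`.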

UFO Format

File Name
    ├── img_h
    ├── img_w
    └── words
        ├── points
        ├── transcription
        ├── language
        ├── illegibility
        ├── orientation
        └── word_tags
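
To make the tree above concrete, here is a small hand-written annotation and a helper that reads it. The sample values, the top-level `images` key, the word IDs, the key spelling `illegibility`, and the function name `legible_polygons` are assumptions for illustration only; check them against the actual `train.json` before relying on them.

```python
# Hand-written sample that follows the tree above; all values are made up.
SAMPLE = {
    "images": {  # assumed top-level key
        "img_1.jpg": {
            "img_h": 720,
            "img_w": 1280,
            "words": {
                "0": {
                    "points": [[10, 20], [110, 20], [110, 60], [10, 60]],
                    "transcription": "OPEN",
                    "language": ["en"],
                    "illegibility": False,
                    "orientation": "Horizontal",
                    "word_tags": None,
                },
                "1": {
                    "points": [[200, 300], [260, 300], [260, 330], [200, 330]],
                    "transcription": "###",
                    "language": None,
                    "illegibility": True,
                    "orientation": "Horizontal",
                    "word_tags": None,
                },
            },
        }
    }
}


def legible_polygons(annotations: dict, image_name: str) -> list:
    """Return the corner points of every word that is not marked illegible."""
    words = annotations["images"][image_name]["words"]
    return [word["points"] for word in words.values() if not word["illegibility"]]
```

With the sample above, `legible_polygons(SAMPLE, "img_1.jpg")` returns only the polygon of the first word.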

Train.py

python train.py --data_dir {train data path} --val_data_dir {val data path} --name {wandb run name} --exp_name {model name
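
The four flags visible in the command above could be parsed along these lines. This is a hypothetical sketch, not the repository's `train.py`: the help texts are paraphrased from the placeholders in the command, and whether each flag is required or what other options the script accepts is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags shown in the README command (sketch only)."""
    parser = argparse.ArgumentParser(description="Train the text detection model")
    parser.add_argument("--data_dir", type=str, required=True, help="train data path")
    parser.add_argument("--val_data_dir", type=str, required=True, help="val data path")
    parser.add_argument("--name", type=str, default=None, help="wandb run name")
    parser.add_argument("--exp_name", type=str, default=None, help="model name")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```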