A PyTorch implementation of Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks

Last update: Jan 03, 2023

SVHNClassifier-PyTorch

If you're interested in C++ inference, move HERE

Results

| Steps | GPU | Batch Size | Learning Rate | Patience | Decay Step | Decay Rate | Training Speed (FPS) | Accuracy |
|---|---|---|---|---|---|---|---|---|
| 54000 | GTX 1080 Ti | 512 | 0.16 | 100 | 625 | 0.9 | ~1700 | 95.65% |
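
To make the Learning Rate, Decay Step and Decay Rate columns concrete, here is a minimal sketch of a staircase decay schedule. It assumes the rate is multiplied by the decay rate once every decay step; whether `train.py` applies the decay exactly this way is an assumption, and `decayed_learning_rate` is a hypothetical helper, not part of this repository.

```python
def decayed_learning_rate(step, initial_lr=0.16, decay_step=625, decay_rate=0.9):
    """Staircase decay (assumed): scale the rate by decay_rate every decay_step steps."""
    return initial_lr * decay_rate ** (step // decay_step)


# Print the assumed learning rate at a few training steps.
for step in (0, 625, 6250, 54000):
    print(step, decayed_learning_rate(step))
```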

Sample

$ python infer.py -c=./logs/model-54000.pth ./images/test-75.png
length: 2
digits: 7 5 10 10 10

$ python infer.py -c=./logs/model-54000.pth ./images/test-190.png
length: 3
digits: 1 9 0 10 10
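
The samples above suggest that the model always reports five digit slots and that class `10` marks an empty slot. Under that assumption (inferred from the output, not confirmed by the code), the prediction can be turned into a readable number as sketched below; `decode_prediction` is a hypothetical helper.

```python
PAD = 10  # assumed: class index 10 means "no digit at this position"


def decode_prediction(length, digits, pad=PAD):
    """Join the first `length` digit classes into a string, skipping padding."""
    kept = [d for d in digits[:length] if d != pad]
    return "".join(str(d) for d in kept)


print(decode_prediction(2, [7, 5, 10, 10, 10]))  # 75
print(decode_prediction(3, [1, 9, 0, 10, 10]))   # 190
```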

Loss

Requirements

Python 3.6
torch 1.0
torchvision 0.2.1
visdom
```
$ pip install visdom
```

h5py

In Ubuntu:
$ sudo apt-get install libhdf5-dev
$ sudo pip install h5py

protobuf
```
$ pip install protobuf
```
lmdb
```
$ pip install lmdb
```
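
As a quick sanity check before running the scripts, the following sketch reports which of the packages listed above cannot be imported. The module names are assumptions based on the usual import names of these packages (for example `google.protobuf` for protobuf), and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Assumed import names for the packages listed under Requirements.
REQUIRED = ["torch", "torchvision", "visdom", "h5py", "google.protobuf", "lmdb"]


def missing_modules(names=REQUIRED):
    """Return the module names from `names` that cannot be found."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            # Raised when the parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("missing:", missing_modules())
```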

Setup

Clone the source code

$ git clone https://github.com/potterhsu/SVHNClassifier-PyTorch
$ cd SVHNClassifier-PyTorch

Download SVHN Dataset format 1

Extract the archives into the data folder. Your folder structure should now look like this:

SVHNClassifier
    - data
        - extra
            - 1.png 
            - 2.png
            - ...
            - digitStruct.mat
        - test
            - 1.png 
            - 2.png
            - ...
            - digitStruct.mat
        - train
            - 1.png 
            - 2.png
            - ...
            - digitStruct.mat
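
The layout above can be verified before converting the data. This is a minimal sketch that only checks for the three split folders, a `digitStruct.mat` in each, and at least one PNG image; `check_svhn_layout` is a hypothetical helper, not a script shipped with the repository.

```python
from pathlib import Path


def check_svhn_layout(data_dir):
    """Return a list of problems under data_dir; an empty list means it looks right."""
    problems = []
    root = Path(data_dir)
    for split in ("extra", "test", "train"):
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing folder: {split_dir}")
            continue
        if not (split_dir / "digitStruct.mat").is_file():
            problems.append(f"missing file: {split_dir / 'digitStruct.mat'}")
        if not any(split_dir.glob("*.png")):
            problems.append(f"no PNG images in: {split_dir}")
    return problems


if __name__ == "__main__":
    for problem in check_svhn_layout("./data"):
        print(problem)
```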

Usage

(Optional) Take a glance at original images with bounding boxes
Open `draw_bbox.ipynb` in Jupyter

Convert to LMDB format

$ python convert_to_lmdb.py --data_dir ./data

(Optional) Test for reading LMDBs

Open `read_lmdb_sample.ipynb` in Jupyter

Train

$ python train.py --data_dir ./data --logdir ./logs

Retrain from a checkpoint if needed

$ python train.py --data_dir ./data --logdir ./logs_retrain --restore_checkpoint ./logs/model-100.pth
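
When several checkpoints exist, it can help to pick the most recent one to pass to `--restore_checkpoint`. The sketch below assumes checkpoints follow the `model-<step>.pth` naming seen in the commands in this README; `latest_checkpoint` is a hypothetical helper.

```python
import re
from pathlib import Path


def latest_checkpoint(logdir):
    """Return the model-<step>.pth path with the largest step, or None if none exist."""
    pattern = re.compile(r"model-(\d+)\.pth$")
    best_step, best_path = -1, None
    for path in Path(logdir).glob("model-*.pth"):
        match = pattern.match(path.name)
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path


if __name__ == "__main__":
    print(latest_checkpoint("./logs"))
```

Steps are compared as integers, so `model-54000.pth` ranks above `model-9.pth` even though it sorts lower as a string.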

Evaluate

$ python eval.py --data_dir ./data ./logs/model-100.pth
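
For orientation, the sketch below shows one plausible way to score multi-digit predictions: a sample counts as correct only if the predicted length and every digit slot match the label. That `eval.py` computes the reported accuracy exactly this way is an assumption; `sequence_accuracy` and the sample data are hypothetical.

```python
def sequence_accuracy(predictions, labels):
    """Fraction of samples whose (length, digits) prediction equals the label exactly."""
    if not labels:
        return 0.0
    correct = sum(1 for p, t in zip(predictions, labels) if p == t)
    return correct / len(labels)


# Each sample is (length, digits); the second prediction has one wrong digit.
preds = [(2, (7, 5, 10, 10, 10)), (3, (1, 9, 0, 10, 10)), (2, (4, 2, 10, 10, 10))]
truth = [(2, (7, 5, 10, 10, 10)), (3, (1, 9, 8, 10, 10)), (2, (4, 2, 10, 10, 10))]
print(sequence_accuracy(preds, truth))
```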

Visualize

$ python -m visdom.server
$ python visualize.py --logdir ./logs

Infer

$ python infer.py --checkpoint=./logs/model-100.pth ./images/test1.png

Clean

$ rm -rf ./logs
or
$ rm -rf ./logs_retrain


Owner

Potter Hsu
