Repository for experimenting with computer vision apps: people analytics on Raspberry Pi.

Last update: Sep 23, 2021

Overview

play-with-torch

Tools

Tested Hardware

Raspberry Pi 4 Model B, RAM: 4 GB, processor: 4-core @ 1.5 GHz
microSD Card 64 GB
5M USB Retractable Clip 120 Degrees WebCam Web Wide-angle Camera Laptop U7 Mini, or a Raspberry Pi Camera

Tested Software

Ubuntu Desktop 20.10 (aarch64, 64-bit), installed on Raspberry Pi 4
PyTorch: torch 1.6.0 aarch64 and torchvision 0.7.0 aarch64
Python: 3.6 minimum (3.8 recommended)
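
To compare a board against the tested software listed above, a short script along these lines can print the relevant versions. This is a minimal sketch, not part of the repository: the function name `describe_environment` is hypothetical, and `torch` and `torchvision` are reported as `None` when they are not installed or fail to load.

```python
import importlib
import platform
import sys


def describe_environment():
    """Collect the facts listed under Tested Software for this machine."""
    info = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "python_ok": sys.version_info >= (3, 6),  # minimum stated in this README
        "machine": platform.machine(),  # expected to be "aarch64" on the tested setup
    }
    for name in ("torch", "torchvision"):
        try:
            module = importlib.import_module(name)
            info[name] = getattr(module, "__version__", "unknown")
        except (ImportError, OSError):
            info[name] = None  # not installed, or the binary failed to load
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print("{}: {}".format(key, value))
```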

Install the prerequisites

Install packages

$ sudo apt install build-essential make cmake git python3-pip libatlas-base-dev
$ sudo apt install libssl-dev
$ sudo apt install libopenblas-dev libblas-dev m4 python3-yaml
$ sudo apt install libomp-dev

Set the swap space to 2048 MB

$ free -h
$ sudo swapoff -a
$ sudo dd if=/dev/zero of=/swapfile bs=1M count=2048
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile
$ free -h
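
The second `free -h` call confirms the new swap size by eye. The same check can be scripted by reading `SwapTotal` from `/proc/meminfo`. The sketch below assumes a Linux system, and the helper names are hypothetical. The kernel usually reports slightly less than the file size because part of the file holds the swap header, so the value is rounded to the nearest MiB.

```python
def swap_total_mib(meminfo_text):
    """Return SwapTotal in MiB from /proc/meminfo-style text, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # Line format: "SwapTotal:       2097148 kB"
            kib = int(line.split()[1])
            return round(kib / 1024)
    return None


def read_swap_total_mib(path="/proc/meminfo"):
    """Read the given meminfo file (Linux only) and return SwapTotal in MiB."""
    with open(path) as handle:
        return swap_total_mib(handle.read())
```

On the Raspberry Pi, `print(read_swap_total_mib())` should then show a value close to 2048.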

Install torch 1.6.0

$ pip3 install torch-1.6.0a0+b31f58d-cp38-cp38-linux_aarch64.whl
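
The wheel filename encodes what it can be installed on: `cp38` means CPython 3.8 and `linux_aarch64` means 64-bit ARM Linux, so `pip3` will reject this particular file under any other Python version. The sketch below splits the filename into those tags and compares them with the running interpreter. It is a rough illustration with hypothetical helper names, not a replacement for pip's own compatibility check.

```python
import platform
import sys


def parse_wheel_tags(filename):
    """Split a wheel filename into name, version, and its three compatibility tags."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    # Layout: name-version[-build]-python-abi-platform; the last three are the tags.
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def matches_this_interpreter(tags):
    """Rough check: the CPython tag and the CPU architecture both match."""
    expected_python = "cp{}{}".format(*sys.version_info[:2])
    machine = platform.machine()
    return (
        tags["python"] == expected_python
        and machine != ""
        and tags["platform"].endswith(machine)
    )


if __name__ == "__main__":
    tags = parse_wheel_tags("torch-1.6.0a0+b31f58d-cp38-cp38-linux_aarch64.whl")
    print(tags, matches_this_interpreter(tags))
```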

Folder Structure

play-with-torch/
├── config/
│   ├── config.json - holds configuration for training
│   └── parse_config.py - class to handle config file and cli options
│
├── docker/
│   ├── Dockerfile
│   └── requirements.txt
│
├── data/ - default directory for storing input data
│
├── docs/ - for documentation
│   └── play-with-torch.tex
│
├── models/ - models, losses, and metrics
│   ├── model.py
│   ├── metric.py
│   └── loss.py
│
├── samples/
│
├── saved/
│   ├── checkpoints/
│   ├── traced_model/
│   ├── models/ - trained models are saved here
│   └── logs/ - default logdir for tensorboard and logging output
│
├── site
├── templates/ - for serving model on Flask
│   └── index.html
├── tests/
├── utils/ - small utility functions
│   ├── data/
│   └── ...
│
├── inference.py - main script to run inference with a model
├── README.md
├── trace_model.py - main script to convert model
└── train.py - main script to start training
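
The `saved/` subdirectories in the tree above hold generated output, so they may be missing from a fresh clone. Whether the repository's scripts create them automatically is not stated here. As a sketch under that assumption, a hypothetical helper such as the following could create them up front:

```python
from pathlib import Path

# Output directories listed under saved/ in the folder structure above.
SAVED_SUBDIRS = ("checkpoints", "traced_model", "models", "logs")


def ensure_saved_dirs(repo_root):
    """Create saved/<subdir> under repo_root if missing and return the paths."""
    paths = []
    for name in SAVED_SUBDIRS:
        path = Path(repo_root) / "saved" / name
        path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        paths.append(path)
    return paths
```

Calling `ensure_saved_dirs(".")` from the repository root is safe to repeat, because existing directories are left untouched.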

Usage

Run inference

$ git clone https://github.com/mheriyanto/play-with-torch.git
$ cd play-with-torch/
$ python3 inference.py video --config config/nanodet-m.yml --model saved/models/nanodet_m.ckpt --path video.mp4

Convert model

$ python3 trace_model.py --cfg_path config/nanodet-m.yml --model_path saved/models/nanodet_m.ckpt --input_shape 320,320
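
The `--input_shape 320,320` argument is a comma-separated pair of integers. The sketch below shows one way such a value could be parsed with `argparse`. It is a hypothetical stand-in, not the actual parser in `trace_model.py`, and the order of the two numbers (height and width, or the reverse) is not stated in this README.

```python
import argparse


def parse_input_shape(text):
    """Turn a string such as "320,320" into (320, 320), rejecting malformed values."""
    try:
        values = tuple(int(part) for part in text.split(","))
    except ValueError:
        raise argparse.ArgumentTypeError(
            "expected two comma-separated integers, got {!r}".format(text)
        )
    if len(values) != 2 or any(v <= 0 for v in values):
        raise argparse.ArgumentTypeError(
            "expected two positive integers, got {!r}".format(text)
        )
    return values


def build_parser():
    """Hypothetical parser mirroring the flags used in the command above."""
    parser = argparse.ArgumentParser(description="Illustrative CLI sketch")
    parser.add_argument("--cfg_path", required=True)
    parser.add_argument("--model_path", required=True)
    parser.add_argument("--input_shape", type=parse_input_shape)
    return parser
```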

Training

$ python3 train.py config/nanodet_custom_xml_dataset.yml

TO DO

Implement unit tests using Test-Driven Development (TDD)

Credit to

Share PyTorch binaries built for Raspberry Pi

Reference

NanoDet: Super fast and lightweight anchor-free object detection model
Yunjey Choi - PyTorch Tutorial for Deep Learning Researchers
Victor Huang - PyTorch Template Project

Owner

eMHa
