Framework for the Complete Gaze Tracking Pipeline

The figure below shows a general representation of the camera-to-screen gaze tracking pipeline [1]. From left to right, the webcam image is first preprocessed to create normalized images of the eyes and face. These images are fed into a model, which predicts the 3D gaze vector. Once the user’s head pose is known, the predicted gaze vector can be projected onto the screen.
This framework enables real-time prediction of the viewing position on the screen based only on the webcam image.
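
The final projection step can be pictured as a ray-plane intersection: the gaze ray starts at the face (or eye) position and is followed until it hits the screen plane. The snippet below is only a minimal sketch of that geometry, not the repository's implementation. The function name, the coordinate convention (camera coordinates in millimetres, with the screen lying in the plane z = 0), and the example values are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def intersect_gaze_with_screen(
    origin: Sequence[float],
    direction: Sequence[float],
    plane_point: Sequence[float],
    plane_normal: Sequence[float],
    eps: float = 1e-9,
) -> Optional[Vec3]:
    """Return the point where the gaze ray hits the screen plane.

    Hypothetical helper: returns None if the ray is parallel to the
    plane or if the plane lies behind the gaze origin.
    """
    denom = dot(direction, plane_normal)
    if abs(denom) < eps:
        return None  # gaze ray is parallel to the screen plane
    diff = [p - o for p, o in zip(plane_point, origin)]
    t = dot(diff, plane_normal) / denom
    if t < 0:
        return None  # the screen is behind the user
    x, y, z = (o + t * d for o, d in zip(origin, direction))
    return (x, y, z)


# Assumed example: face 500 mm in front of a screen lying in the plane z = 0.
hit = intersect_gaze_with_screen(
    origin=(0.0, -100.0, 500.0),
    direction=(0.1, 0.2, -1.0),
    plane_point=(0.0, 0.0, 0.0),
    plane_normal=(0.0, 0.0, 1.0),
)
print(hit)
```

Converting the resulting point from millimetres to pixel coordinates additionally requires the physical size and resolution of the monitor, which is one reason the screen position calibration mentioned in the steps that follow matters for accuracy.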

1. Install the dependencies: pip install -r requirements.txt
2. If necessary, calibrate the camera using the provided interactive script (python calibrate_camera.py); see Camera Calibration by OpenCV.
3. For higher accuracy, it is also advisable to calibrate the position of the screen as described by Takahashi et al., who provide an OpenCV and MATLAB implementation.
4. To make reliable predictions, the proposed model needs to be calibrated specifically for each user. Software is provided to collect this calibration data.
5. Train a model or download a pretrained model.
6. Once all previous steps are complete, run python main.py --calibration_matrix_path=./calibration_matrix.yaml --model_path=./p00.ckpt and a red "laser pointer" should be visible on the screen. main.py also provides several visualization options:
   1. --visualize_preprocessing to visualize the preprocessed images
   2. --visualize_laser_pointer to show the point on the screen the person is looking at as a red laser pointer dot (see the right monitor in the image below)
   3. --visualize_3d to visualize the head, the screen, and the gaze vector in a 3D scene (see the left monitor in the image below)
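
As background for the camera calibration step above: calibration estimates the camera's intrinsic parameters (focal lengths and principal point), which relate 3D points in camera coordinates to pixel positions in the image. The sketch below illustrates the standard pinhole projection without lens distortion. It is not taken from the repository, and the function name and intrinsic values are hypothetical; real values would come from the calibration result.

```python
from typing import Sequence, Tuple


def project_point(
    point_3d: Sequence[float],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Tuple[float, float]:
    """Project a 3D point (camera coordinates) to pixel coordinates.

    Uses the ideal pinhole model; lens distortion is ignored here.
    """
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    u = fx * x / z + cx  # horizontal pixel coordinate
    v = fy * y / z + cy  # vertical pixel coordinate
    return (u, v)


# Hypothetical intrinsics for a 1280x720 webcam (assumed values).
pixel = project_point(
    (100.0, -50.0, 500.0), fx=1000.0, fy=1000.0, cx=640.0, cy=360.0
)
print(pixel)
```

With inaccurate intrinsics the mapping between image and 3D space is off, and so are the estimated head pose and the resulting gaze point on the screen.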

[1] Amogh Gudi, Xin Li, and Jan van Gemert, “Efficiency in real-time webcam gaze tracking”, in Computer Vision - ECCV 2020 Workshops - Glasgow, UK, August 23-28, 2020, Proceedings, Part I, Adrien Bartoli and Andrea Fusiello, Eds., ser. Lecture Notes in Computer Science, vol. 12535, Springer, 2020, pp. 529–543. DOI: 10.1007/978-3-030-66415-2_34. [Online]. Available: https://doi.org/10.1007/978-3-030-66415-2_34

Owner

Pascal
