GBIM (Gesture-Based Interaction Map)

Python 3.6 PaddleX License

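GBIM (Gesture-Based Interaction Map) is an interactive map driven by vision-based deep neural networks. It watches the user's hand gestures through the computer's webcam and uses them to control a map for simple interactions. The networks are the lightweight PPYOLO Tiny and MobileNet V3 small models provided by PaddleX, so the models total roughly 10 MB and can locate and recognize gestures quickly even on a CPU.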

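Gestures

Gesture  Interaction   Gesture  Interaction   Gesture  Interaction
(image)  Swipe up      (image)  Swipe left    (image)  Zoom in
(image)  Swipe down    (image)  Swipe right   (image)  Zoom out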

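Roadmap

Basic

  • Decide on the gestures used for interaction.
  • Use det_acq.py to collect hand-pose images from the computer's webcam.
  • Annotate the data and train the hand detection model.
  • Crop the detected hand, use clas_acq.py to collect hand images, annotate them, and train the gesture classification model.
  • Validate gesture detection and gesture recognition together.
  • Open the web version of Baidu Maps and interact with it through simulated key presses.
  • Combine the components and verify the basic functionality.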
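
The last two steps, turning a recognized gesture into a simulated key press, can be sketched as below. This is a minimal illustration, not the code in demo.py: the class labels other than "up" and "other" are hypothetical, and the assumption that the map page responds to the arrow keys and to "+" / "-" is ours. Only `pynput.keyboard.Controller` and `Key` are real pynput names.

```python
# Hypothetical class labels: only "up" and "other" appear in the dataset layout.
# The key choices are an assumption about how the map page reacts to the keyboard.
GESTURE_TO_KEY = {
    "up": "up",
    "down": "down",
    "left": "left",
    "right": "right",
    "zoom_in": "+",
    "zoom_out": "-",
}


def key_for_gesture(label):
    """Return the key name for a gesture label, or None for "other"/unknown labels."""
    return GESTURE_TO_KEY.get(label)


def press_key(key_name):
    """Tap one key with pynput (imported lazily so the mapping works without it)."""
    from pynput.keyboard import Controller, Key

    keyboard = Controller()
    # Arrow keys are members of pynput's Key enum; "+" and "-" are plain characters.
    key = getattr(Key, key_name, key_name)
    keyboard.press(key)
    keyboard.release(key)
```

A caller would look up `key_for_gesture(label)` for each prediction and call `press_key` only when the result is not None, so that "other" never triggers an interaction.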

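Advanced

  • Replace single-image classification with image-sequence classification to make gesture recognition smoother and more accurate.
  • Re-collect and re-annotate the data, then tune hyperparameters and retrain the models.
  • Build a map whose parameters can be adjusted.
  • Integrate, tidy, and polish the interface.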
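
Until a true sequence classifier is in place, a much simpler stand-in for the first item is to smooth per-frame predictions with a sliding-window majority vote. The sketch below shows that substitute technique only; it is not sequence classification, and the class name, window size, and vote threshold are illustrative assumptions.

```python
from collections import Counter, deque


class GestureSmoother:
    """Majority vote over the most recent per-frame labels."""

    def __init__(self, window=5, min_votes=3):
        self.history = deque(maxlen=window)  # oldest labels drop out automatically
        self.min_votes = min_votes

    def update(self, label):
        """Add one frame's label and return the smoothed label."""
        self.history.append(label)
        winner, votes = Counter(self.history).most_common(1)[0]
        # Report a gesture only once it has enough support in the window.
        return winner if votes >= self.min_votes else "other"


smoother = GestureSmoother(window=5, min_votes=3)
stream = ["up", "other", "up", "up", "left", "up"]
smoothed = [smoother.update(label) for label in stream]
# smoothed == ["other", "other", "other", "up", "up", "up"]
```

The single "left" frame is absorbed by the vote, which is the kind of flicker a sequence model would also be expected to suppress.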

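Datasets & Models

Gesture detection

  • The dataset was captured with the webcam of a Lenovo Xiaoxin laptop and annotated in VOC format with labelImg, 1011 images in total. Its scenes, environment, and subjects are all uniform, so it is for testing only and is not available for download. The data is organized following the PascalVOC layout used by PaddleX.
  • The model is the ultra-lightweight PPYOLO Tiny, under 4 MB in size. It was trained for 100 epochs without tuning, and the best_model checkpoint was kept as the test model. Because of the limited dataset and the untuned training, the default detection quality is currently poor.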
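
Once the detector has run, the hand region has to be cut out before classification. The helper below is a minimal sketch of that step. It assumes each detection is a dict with a "score" and a "bbox" given as [xmin, ymin, width, height], which is our reading of the PaddleX detector output; the function name and the 0.5 score threshold are hypothetical.

```python
def best_hand_box(detections, img_w, img_h, min_score=0.5):
    """Return (x1, y1, x2, y2) of the highest-scoring detection, or None."""
    candidates = [d for d in detections if d["score"] >= min_score]
    if not candidates:
        return None
    # Assumed bbox layout: [xmin, ymin, width, height].
    x, y, w, h = max(candidates, key=lambda d: d["score"])["bbox"]
    # Clip the box to the image so the crop never indexes out of bounds.
    x1 = max(0, int(x))
    y1 = max(0, int(y))
    x2 = min(img_w, int(x + w))
    y2 = min(img_h, int(y + h))
    if x2 <= x1 or y2 <= y1:
        return None
    return x1, y1, x2, y2


detections = [
    {"category": "hand", "bbox": [600.0, 50.0, 100.0, 80.0], "score": 0.9},
    {"category": "hand", "bbox": [10.0, 10.0, 20.0, 20.0], "score": 0.3},
]
box = best_hand_box(detections, img_w=640, img_h=480)
# box == (600, 50, 640, 130)
```

With an OpenCV frame, the crop would then be `frame[y1:y2, x1:x2]`, since image arrays are indexed by row first.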

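Gesture classification

  • The dataset was captured with the same Lenovo Xiaoxin laptop webcam. Hand images were extracted with the gesture detection model and manually sorted into 7 classes: the 6 interaction gestures plus "other", 1102 images in total. The dataset is small and covers few hand shapes and gestures, so it is for testing only and is not available for download. The data is organized as follows: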
dataset
	├-- Images
	|     ├-- up
	┆     ┆    └-- xxx.jpg
	|     └-- other
	┆          └-- xxx.jpg
	├-- labels.txt
	├-- train_list.txt
	└-- val_list.txt
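  • The model is the ultra-lightweight MobileNet V3 small, under 7 MB in size. Because the dataset is very small, it was trained for only 20 epochs without tuning, and the best_model checkpoint was kept as the test model. Classification quality is currently poor.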
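
The three text files in the layout above can be generated from the Images folder. The sketch below is one way to do it, under the assumption that each list line is "relative/path class_id" separated by a single space and that labels.txt holds one class name per line; the function name, the 80/20 split, and the fixed seed are illustrative choices, not the project's actual script.

```python
import os
import random


def write_split_files(dataset_dir, val_ratio=0.2, seed=0):
    """Write labels.txt, train_list.txt and val_list.txt under dataset_dir."""
    images_dir = os.path.join(dataset_dir, "Images")
    # One sub-folder per class; sorting keeps class ids stable between runs.
    classes = sorted(
        name for name in os.listdir(images_dir)
        if os.path.isdir(os.path.join(images_dir, name))
    )
    samples = []
    for class_id, name in enumerate(classes):
        for fname in sorted(os.listdir(os.path.join(images_dir, name))):
            if fname.lower().endswith((".jpg", ".jpeg", ".png")):
                samples.append("Images/{}/{} {}".format(name, fname, class_id))
    random.Random(seed).shuffle(samples)
    n_val = int(len(samples) * val_ratio)
    outputs = {
        "labels.txt": classes,
        "val_list.txt": samples[:n_val],
        "train_list.txt": samples[n_val:],
    }
    for fname, lines in outputs.items():
        with open(os.path.join(dataset_dir, fname), "w", encoding="utf-8") as f:
            f.write("\n".join(lines) + "\n")
    return classes, len(samples) - n_val, n_val
```

Because the path and the class id are separated by a space, file names containing spaces would need to be renamed first.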

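The model files are uploaded with Git LFS, so LFS must be installed before pulling; see the LFS documentation. Larger and more varied datasets will be collected and annotated later, and better tuning will be used to obtain more accurate recognition models.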

Demo

Gesture recognition

Map interaction

*The Capture window is not shown.

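Usage

  1. Clone this project and install the dependencies listed in requirements.txt: opencv, paddlex, and pynput. PaddleX needs a matching PaddlePaddle, so install the latest version; because the models are lightweight, the CPU build is sufficient. Use the command below, and see the official website for details.
    python -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
  2. Open demo.py and change the browser path to that of your own browser:
    web_path = '"D:/Twinkstar/Twinkstar Browser/twinkstar.exe"'  # path to your own browser
  3. Run demo.py to start the program:
    cd GBIM
    python demo.py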

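FAQ

  1. Q: Cloning the project hangs.

    A: First confirm that LFS is installed as described in the documentation. If it is, the cause is most likely the network. You can wait a while, or skip the LFS files during the clone and pull them separately with the git commands below: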

    # Skip downloading LFS files during the clone
    git lfs install --skip-smudge
    # Clone this project
    git clone "current project"
    # Enter the project and pull the LFS files separately
    cd "current project"
    git lfs pull
    # Restore the default LFS settings
    git lfs install --force
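  2. Q: Pressing q or making gestures has no effect.

    A: Check which window currently has focus. If the Capture window has focus, it accepts q to quit; if the browser has focus, the recognized gestures drive the map in the browser.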

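  3. Q: Installing PaddleX fails with an error about Microsoft Visual C++.

    A: If the error occurs while installing the COCO tools on Windows, Microsoft Visual C++ is probably missing. Download and install it from the official Microsoft download page, then restart.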

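  4. Q: The program runs without errors, but no data is saved locally.

    A: Check whether the path contains Chinese characters; cv2.imwrite cannot save images to a path that contains Chinese characters.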
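
For the last question, a small guard around the save call makes the failure visible, because cv2.imwrite returns False instead of raising an error. The sketch below checks whether a path is pure ASCII and otherwise falls back to a commonly used workaround: encode the image in memory with cv2.imencode and write the bytes with NumPy's tofile. The helper names are hypothetical, and the fallback is our suggestion rather than something the project's scripts are known to do.

```python
import os


def is_ascii_path(path):
    """Return True if the path contains only ASCII characters."""
    try:
        path.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


def save_image(path, image):
    """Save an image and return True on success, False otherwise."""
    import cv2  # imported lazily so the path check works without OpenCV

    if is_ascii_path(path):
        return bool(cv2.imwrite(path, image))
    # Fallback for non-ASCII paths: encode in memory, then write the raw bytes.
    ext = os.path.splitext(path)[1]
    ok, buffer = cv2.imencode(ext, image)
    if ok:
        buffer.tofile(path)
    return bool(ok)
```

Checking the return value of `save_image` in the acquisition scripts would turn a silent failure into something that can be logged.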

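References

  1. "Tired of mini-games? Play games in new ways with Paddle gesture recognition!" (original title: 玩腻了小游戏?Paddle手势识别玩转游戏玩出新花样!)
  2. https://github.com/PaddlePaddle/PaddleX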

Contact and feedback

Email: [email protected]
