Training a machine learning model for gesture recognition with the MediaPipe framework and the k-nearest neighbors algorithm (KNeighborsClassifier).
The purpose of this project is to explore some Machine Learning algorithms along with the Google Mediapipe framework.
The "Hand Tracking" feature is used, which consists of recognition of only one hand. The c ++ file located in here, has been changed to instead return a new mp4 video with Mediapipe on, it will return a .csv file that contains the coordinates of the landmarks. In total there will be 21 landmarks, as they are distributed by hand in full.
Standard MediaPipe output:
Modified MediaPipe output in a plot:
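As a rough illustration of what the modified output contains, the sketch below loads a landmarks .csv and plots the 21 (x, y) points of a single frame with matplotlib. The file name and the `x0..x20` / `y0..y20` column layout are assumptions made for this example; adapt them to the file actually produced by the modified C++ code.

```python
# Hypothetical sketch: plot the 21 hand landmarks of one frame from the
# .csv produced by the modified MediaPipe pipeline.
# Assumes columns named x0..x20 and y0..y20 (illustrative layout only).
import pandas as pd
import matplotlib.pyplot as plt

landmarks = pd.read_csv("output/landmarks.csv")  # hypothetical path
frame = landmarks.iloc[0]                        # first frame of the video

xs = [frame[f"x{i}"] for i in range(21)]
ys = [frame[f"y{i}"] for i in range(21)]

plt.scatter(xs, ys)
plt.gca().invert_yaxis()  # image coordinates grow downward
plt.title("Hand landmarks for one frame")
plt.show()
```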
The k-nearest neighbors (KNN) algorithm is a simple, easy-to-implement supervised machine learning algorithm that can be used to solve both classification and regression problems. It assumes that similar things exist in close proximity: in other words, similar samples lie near each other in feature space.
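For intuition, here is a minimal, self-contained example of KNN classification using scikit-learn's KNeighborsClassifier, the same family of model used in this project. The 2-D toy points and labels are invented purely for illustration.

```python
# Minimal KNN example: classify 2-D points by the majority label of their
# k nearest neighbours. Toy data for illustration only.
from sklearn.neighbors import KNeighborsClassifier

X_train = [[0.0, 0.0], [0.1, 0.2], [0.9, 1.0], [1.0, 0.8]]
y_train = ["open hand", "open hand", "rock", "rock"]

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

print(knn.predict([[0.05, 0.1]]))  # -> ['open hand']
print(knn.predict([[0.95, 0.9]]))  # -> ['rock']
```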
To set up this project, a few things need to be configured on your machine.
- Download and install Python 3.7+ and the pip package manager, following the instructions for your operating system on the official website.
- Create a Python virtual environment for the project using Virtualenv. This keeps the project's dependencies isolated from your operating system. Once the environment is created, activate it before proceeding to the next steps, e.g. you should see
(env)your-user-name:$
in the terminal.
- Run
$ pip install -r requirements.txt
to install the dependencies.
- Clone the Mediapipe repository;
- Install Mediapipe as explained here;
- Copy the mediapipe folder (~/mediapipe/mediapipe/) to ~/training-mediapipe-model/mediapipe/;
For the model to be able to classify gestures, at least two classes are required, that is, two datasets with different gestures, each containing .mp4 video files.
The model's input data is the landmark coordinates provided by MediaPipe. The preprocessing step extracts this data; you will need to run it for every dataset:
- At ~/training-mediapipe-model/ run the following command, with the first parameter indicating the path to your dataset and the second the classification (label) for that dataset:
$ python src/preprocess.py --input_dataset_path /path/to/dataset/ --classification NameOfTheLabel
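If you want to sanity-check the preprocessing output, one quick way is to load the generated .csv with pandas and confirm that each row holds the landmark coordinates of one frame. The file name and layout below are assumptions; adjust them to whatever preprocess.py actually writes.

```python
# Hypothetical sanity check of a preprocessed dataset (assumed file name):
# each row should contain the landmark coordinates of one frame.
import pandas as pd

df = pd.read_csv("datasets/rock.csv")  # assumed output of preprocess.py
print(df.shape)   # (number_of_frames, number_of_coordinate_columns)
print(df.head())  # inspect a few rows of landmark coordinates
```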
To train the model, all the .csv files from every classification need to be combined into a single file, which is what the build script does.
- At ~/training-mediapipe-model/ run the following command, with the first parameter indicating (separated by commas) which datasets will be fed to the model:
$ python src/build.py --datasets_compile "dataset01,dataset02,dataset003"
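Conceptually, building the training file amounts to concatenating the per-class .csv files while keeping a label column, roughly as sketched below. The file names and the label column name are assumptions; build.py is the authoritative implementation.

```python
# Rough sketch of the "build" step: concatenate per-class landmark .csv
# files and attach a class label to every frame. Names are assumptions.
import pandas as pd

classes = {"rock": "datasets/rock.csv", "open_hand": "datasets/open_hand.csv"}

frames = []
for label, path in classes.items():
    df = pd.read_csv(path)
    df["classification"] = label  # label every frame with its gesture class
    frames.append(df)

pd.concat(frames, ignore_index=True).to_csv("datasets/combined.csv", index=False)
```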
To use the model (classify a gesture recorded in a video), just perform the following steps:
- At ~/training-mediapipe-model/ run the following command, with the first parameter providing the input (videos separated by commas):
$ python src/predict.py --input_video_path "the-input-here"
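Under the hood, training and predicting with a KNN model on the combined landmark file could look roughly like the sketch below. The column names, file paths, and the majority-vote step are assumptions for illustration; predict.py is the reference implementation.

```python
# Hypothetical sketch of fitting KNN on the combined landmark file and
# classifying the frames of a new, already-preprocessed video.
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

data = pd.read_csv("datasets/combined.csv")
X = data.drop(columns=["classification"])  # landmark coordinates per frame
y = data["classification"]                 # gesture label per frame

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X, y)

new_frames = pd.read_csv("datasets/new_video.csv")  # assumed same columns as X
predictions = knn.predict(new_frames)
print(pd.Series(predictions).value_counts().idxmax())  # majority-vote gesture
```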
It is now possible to use this model together with the API I made earlier. Make a request to the /recognition/gesture/ endpoint with the .mp4 file, and it will return a classification. Check Mediapipe-API for more details.
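As an example of calling that endpoint from Python, a request could look like the sketch below. Only the /recognition/gesture/ path comes from the API project; the host, port, and form field name are placeholders.

```python
# Example request to the gesture endpoint using the requests library.
# Host/port and the "file" field name are placeholders, not the API contract.
import requests

with open("gesture.mp4", "rb") as video:
    response = requests.post(
        "http://localhost:8000/recognition/gesture/",  # placeholder host
        files={"file": ("gesture.mp4", video, "video/mp4")},
    )

print(response.status_code)
print(response.json())  # expected to contain the predicted classification
```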
- The model recognizes only 4 classes: rock, open hand, ok, and no hand;
- Feel free to change the model (algorithm and/or training data) and integrate it into the project;
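If you do swap the algorithm, any scikit-learn classifier with the same fit/predict interface is a drop-in replacement for KNN. The helper below is a hypothetical illustration of that idea, not part of the project.

```python
# Illustrative helper: KNN and other scikit-learn classifiers share the same
# fit/predict interface, so swapping algorithms leaves the pipeline unchanged.
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

def make_classifier(kind="knn"):
    """Return an untrained classifier; 'knn' matches the current project."""
    if kind == "knn":
        return KNeighborsClassifier(n_neighbors=5)
    return SVC(kernel="rbf")  # one of many possible alternatives

clf = make_classifier("knn")
```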