Gesture Recognition with Mediapipe

Training a machine learning model for gesture recognition with the Mediapipe Framework and the K-Nearest Neighbors (KNeighborsClassifier) algorithm.

Introduction

The purpose of this project is to explore some Machine Learning algorithms along with the Google Mediapipe framework.

The "Hand Tracking" feature is used, which recognizes a single hand. The C++ file located here has been changed so that, instead of returning a new mp4 video with the Mediapipe overlay drawn on it, it returns a .csv file containing the coordinates of the landmarks. In total there are 21 landmarks, distributed across the whole hand.

Standard Mediapipe output: normal output from Mediapipe.
Modified Mediapipe output in a plot: Landmarks - Hands Open.

KNN Algorithm

The k-nearest neighbors (KNN) algorithm is a simple, easy-to-implement supervised machine learning algorithm that can be used to solve both classification and regression problems. It rests on the assumption that similar things exist in close proximity, i.e. samples of the same class tend to be near each other in feature space.
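
A quick, hedged sketch of how Scikit-Learn's KNeighborsClassifier works; the toy 2-D points and labels below are purely illustrative, not project data:

from sklearn.neighbors import KNeighborsClassifier

# Toy feature vectors and class labels, just to show the fit/predict flow.
X = [[0.0, 0.0], [0.1, 0.2], [0.9, 1.0], [1.0, 0.8]]
y = ["open hand", "open hand", "rock", "rock"]

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X, y)
print(model.predict([[0.95, 0.9]]))  # -> ['rock'], the majority class among the 3 nearest points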

Preparing environment

To set up this project, a few things need to be configured on your machine.

  1. Download and install Python version 3.7+ and the Pip package manager. Follow the instructions (according to your operating system) on the official website of the distributor.
  2. Create a Python virtual environment for the project using Virtualenv. This keeps the project dependencies isolated from your operating system. Once you create the environment, activate it before proceeding to the next steps (see the example commands after this list); you should see (env)your-user-name:$ in the terminal.
  3. Run $ pip install -r requirements.txt to install dependencies.
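
For example, a typical setup (assuming Virtualenv is already installed) could look like:

$ virtualenv --python=python3.7 env
$ source env/bin/activate
(env)your-user-name:$ pip install -r requirements.txt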

Mediapipe Framework

  1. Clone the Mediapipe repository;
  2. Install Mediapipe as explained here;
  3. Copy the mediapipe folder (~/mediapipe/mediapipe/) to ~/training-mediapipe-model/mediapipe/ (see the example commands after this list);
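
As a sketch, assuming Mediapipe is cloned into your home directory (the clone URL shown is the upstream Google repository), steps 1 and 3 could look like:

$ git clone https://github.com/google/mediapipe.git ~/mediapipe
$ cp -r ~/mediapipe/mediapipe/ ~/training-mediapipe-model/mediapipe/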

Datasets

For the model to be able to classify a gesture, at least two classes are needed, that is, two datasets with different gestures, each containing mp4 video files.
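
For illustration only (folder and file names here are hypothetical, not the project's required layout), a pair of datasets could be organized like this:

datasets/
  open_hand/
    sample01.mp4
    sample02.mp4
  rock/
    sample01.mp4
    sample02.mp4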

Running

Pre-process

The input data for the model are the landmark coordinates provided by Mediapipe. This preprocessing step extracts that data, and it must be run once for every dataset:

  1. At ~/training-mediapipe-model/, run the script with the first parameter indicating the path to your dataset and the second giving the label (classification) for that dataset:
$ python src/preprocess.py --input_dataset_path /path/to/dataset/ --classification NameOfTheLabel
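
As a rough illustration (the output file name and column layout are assumptions, not the script's documented format), each row of the resulting CSV would carry the coordinates of the 21 landmarks for one frame plus the label passed via --classification. For example, with pandas:

import pandas as pd

# Hypothetical output file produced by the preprocessing step.
df = pd.read_csv("NameOfTheLabel.csv")
print(df.shape)  # one row per processed frame, one column per landmark coordinate plus the label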

Build

To train the model, the CSV files of all classifications need to be combined into a single file; that is what the build script does (a minimal sketch of the idea follows the command below).

  1. At ~/training-mediapipe-model/, run the script with the first parameter (dataset names separated by commas) indicating which datasets will be fed to the model:
$ python src/build.py --datasets_compile "dataset01,dataset02,dataset003"
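
A minimal sketch of the idea behind this step (not the project's actual build.py), assuming each preprocessing run produced one CSV per class:

import glob
import pandas as pd

# Concatenate every per-class CSV into a single training file (paths are illustrative).
frames = [pd.read_csv(path) for path in glob.glob("*.csv")]
pd.concat(frames, ignore_index=True).to_csv("training_data.csv", index=False)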

Try the model

To use the model (classify a gesture recorded in a video), just perform the following steps:

  1. At ~/training-mediapipe-model/, run the script with the first parameter providing the path to the input video:
$ python src/predict.py --input_video_path "the-input-here"
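
For intuition, the classification step behind this boils down to something like the following sketch (file and column names are assumptions, not the project's actual predict.py):

import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

data = pd.read_csv("training_data.csv")              # combined file from the build step (name assumed)
X, y = data.drop(columns=["label"]), data["label"]   # "label" column name is an assumption

model = KNeighborsClassifier()
model.fit(X, y)
# new_landmarks would be the coordinates Mediapipe extracts from the input video:
# print(model.predict([new_landmarks]))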

Using with the Mediapipe-API

It is now possible to use this model together with the API I made earlier. Make a request to the /recognition/gesture/ endpoint with the .mp4 file and it will return a classification. Check Mediapipe-API for more details.
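
As a rough sketch (host, port, and form field name are assumptions; check the Mediapipe-API repository for the real contract), such a request could be made with the requests library:

import requests

with open("gesture.mp4", "rb") as video:
    response = requests.post(
        "http://localhost:5000/recognition/gesture/",  # host and port are assumptions
        files={"file": video},                          # form field name is an assumption
    )
print(response.json())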

Notes

  1. There are only 4 classes that the model recognizes: rock, open hand, ok, and no hand;
  2. Feel free to change the model (algorithm and/or training data) and integrate it into the project;

Reference

  1. Mediapipe;
  2. Scikit-Learn Documentation;
  3. Sentdex;
