GitHub - ggalmeida0/vector_hand_recognition: A program to make the robot anki vector count fingers in one hand

Anki Vector Hand Gesture Recognition

This repo is a step-by-step explanation of my process for making the Anki Vector robot recognize and read out loud, in real time, the number of fingers you're holding up on one hand.

Click Here for a video demo

Installations

Anki Vector SDK - https://developer.anki.com/vector/docs/initial.html
OpenCV 4.1.0 - https://pypi.org/project/opencv-python/
NumPy and scikit-learn (the easiest way is to install Anaconda) - https://docs.anaconda.com/anaconda/install/

How to use

On top of the dependencies above, you need to authenticate your Anki Vector with your computer. Here's how to do this on Linux, Windows, Mac.

Then just clone the repository and run hand_recognition.py with python3.

Select the ROI (Region of Interest) and wait a couple of seconds for Vector to calibrate the background. During this process the background and Vector need to stay still.

After that, just put your hand in the ROI and he'll say how many fingers you have up.

How it Works

Getting the Camera Feed

First we access Vector's camera and set his head angle properly.

import anki_vector
from anki_vector.util import degrees

with anki_vector.Robot() as robot:
  robot.camera.init_camera_feed()
  robot.behavior.set_head_angle(degrees(25.0))
  robot.behavior.set_lift_height(0.0)
  while robot.camera.image_streaming_enabled():
    # All per-frame code goes in here
    pass

Pre-processing

Vector's images are in RGB, but OpenCV expects BGR when it displays an image, so we convert each frame for display purposes. It's also good to flip the image horizontally so that it isn't a mirror image.

img = cv2.cvtColor(np.array(robot.camera.latest_image.raw_image),cv2.COLOR_RGB2BGR)
img = cv2.flip(img,1)
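
For intuition, both calls above amount to simple array reorderings. Here is a minimal NumPy-only sketch, assuming a tiny made-up 1x2 RGB image (the pixel values are illustrative and do not come from Vector's camera):

```python
import numpy as np

# Hypothetical 1x2 RGB image: a red pixel followed by a blue pixel.
rgb = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps R and B, which is what
# cv2.cvtColor(..., cv2.COLOR_RGB2BGR) does for a 3-channel image.
bgr = rgb[:, :, ::-1]

# Reversing the column axis flips the image left to right,
# which is what cv2.flip(img, 1) does.
flipped = bgr[:, ::-1]

print(bgr.tolist())
print(flipped.tolist())
```

In the real program the OpenCV calls are the better choice. The slicing version is only meant to show what happens to the pixels.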

For processing we convert the image to grayscale. A grayscale image has only one channel, while a color image has three, so every later step has less data to work through. To reduce noise we also apply a Gaussian blur.

# img was converted to BGR above, so use the BGR conversion code here
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (7,7), 0)

Background Subtraction / Recognizing Hands

The method I used to recognize the hand was background subtraction + thresholding. In a nutshell, we build a model of the background from the first 40 frames, then subtract that background from every frame after the first 40. On top of that we apply a binary threshold to every pixel of the result, leaving us with a black and white image of the hand. Finally we extract the contour of the thresholded image.

Click here to read more about this method
This video is also excellent

We can get the average value of the background in the following way:

def get_background(img):
    global background
    if background is None:
        background = img.copy().astype('float32')
        return  
    cv2.accumulateWeighted(img.astype(background.dtype), background, 0.5)
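
To see what the running average does, here is a NumPy-only sketch of the update rule that cv2.accumulateWeighted applies, dst = (1 - alpha) * dst + alpha * src. The helper name update_background and the constant-valued frames are assumptions made for illustration. They are not part of the repository.

```python
import numpy as np

def update_background(background, frame, alpha=0.5):
    """Return the updated running average (hypothetical helper)."""
    frame = frame.astype("float32")
    if background is None:
        # First frame: the model starts as a copy of the frame.
        return frame.copy()
    # Same rule as accumulateWeighted: dst = (1 - alpha) * dst + alpha * src
    return (1.0 - alpha) * background + alpha * frame

background = None
for value in (10, 20, 40):
    # Made-up 2x2 frames filled with a single gray level.
    frame = np.full((2, 2), value, dtype=np.uint8)
    background = update_background(background, frame)

print(background)
```

With alpha set to 0.5 the most recent frames dominate the average, so the model mostly reflects the last few calibration frames.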

Let's also run this function only on a specific ROI (Region of Interest), so that we don't waste processing power on the entire image.

if num_frames < 40:
    if num_frames == 0:
        # selectROI returns (x, y, width, height), so x2 and y2 hold the width and height
        x1,y1,x2,y2 = cv2.selectROI('ROI selector', img, False)
        cv2.destroyWindow('ROI selector')
        print('Calibrating Background...\nDo not move the camera')
    gray = gray[int(y1):int(y1+y2), int(x1):int(x1+x2)]
    get_background(gray)
    num_frames += 1
    continue
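
The crop in the snippet above relies on two conventions: cv2.selectROI returns the box as (x, y, width, height), and NumPy indexes rows (y) before columns (x). A small sketch with a made-up frame and made-up ROI values shows the slicing on its own:

```python
import numpy as np

# Hypothetical 6x8 grayscale frame whose pixel values encode their position.
frame = np.arange(48, dtype=np.uint8).reshape(6, 8)

# Stand-in for the tuple cv2.selectROI would return: (x, y, width, height).
x, y, w, h = 2, 1, 4, 3

# Rows (y) come first, then columns (x).
roi = frame[y:y + h, x:x + w]

print(roi.shape)
print(roi)
```

The result has h rows and w columns, and its top-left pixel is the one at row y, column x of the full frame.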

Now we can perform the subtraction using the function below. The function also finds the largest contour of the thresholded image, which we assume to be the hand.

def background_sub(img):
    diff = cv2.absdiff(img.astype(background.dtype), background)
    mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)[1]
    contours = cv2.findContours(mask.astype('uint8'),cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)[0]
    if len(contours) > 0:
        hand_contour = max(contours,key=cv2.contourArea)
        return mask, hand_contour
    return mask, None
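
As a rough illustration of the subtraction and threshold steps, the sketch below reproduces them with NumPy alone on a made-up 3x3 frame. The array values are assumptions chosen so that a single pixel stands out. Contour extraction is left out because it needs OpenCV.

```python
import numpy as np

# Hypothetical background model and a new frame with one bright "hand" pixel.
background = np.full((3, 3), 100.0, dtype="float32")
frame = np.full((3, 3), 110, dtype=np.uint8)
frame[1, 1] = 200

# Absolute per-pixel difference, as cv2.absdiff computes it.
diff = np.abs(frame.astype("float32") - background)

# Binary threshold: differences above 25 become 255, everything else 0.
mask = np.where(diff > 25, 255, 0).astype(np.uint8)

print(mask)
```

The small lighting change of 10 gray levels stays below the threshold and disappears, while the pixel that changed by 100 survives in the mask.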

Counting the fingers

To count the fingers on a hand, I used the algorithm described in this paper and also in this tutorial. Here's how it works:

1) Get the extreme points of the contour (top, bottom, left, right).
2) Find the center of the hand using those extreme points.
3) Draw a circle around that center with a radius that is 70% of the distance between the center and the farthest extreme point.
4) Find the contours of the parts where the circle and the hand overlap.
5) Count those contours, excluding the one at the bottom, which should be the wrist.

The following function does steps 1-4 of the algorithm:

def get_finger_contours(hand_contour):
    global img
    cv2.drawContours(img,[hand_contour+(x1,y1)],-1,(0,0,255))
    chull = cv2.convexHull(hand_contour)
    extreme_top    = tuple(chull[chull[:, :, 1].argmin()][0])
    extreme_bottom = tuple(chull[chull[:, :, 1].argmax()][0])
    extreme_left   = tuple(chull[chull[:, :, 0].argmin()][0])
    extreme_right  = tuple(chull[chull[:, :, 0].argmax()][0])
    x_bar = int((extreme_left[0] + extreme_right[0]) / 2)
    y_bar = int((extreme_top[1] + extreme_bottom[1]) / 2)
    distance = pairwise.euclidean_distances(X=[(x_bar, y_bar)], Y=[extreme_left, extreme_right, extreme_top, extreme_bottom])[0]
    max_distance = distance[distance.argmax()]
    radius = int(0.7 * max_distance)
    circumference = 2 * np.pi * radius
    circular_roi = np.zeros(mask.shape,dtype='uint8')
    cv2.circle(circular_roi,(x_bar,y_bar),radius,(255,255,255))
    circular_roi = cv2.bitwise_and(mask,mask,mask=circular_roi)
    cv2.imshow('circ',circular_roi)
    return cv2.findContours(circular_roi.astype('uint8'), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[0], y_bar, circumference
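
To make the geometry concrete, here is the center and radius arithmetic worked through in plain Python. The four extreme points are made-up values, and math.hypot stands in for the scikit-learn distance helper used above.

```python
import math

# Hypothetical extreme points (x, y) of a hand contour, in ROI pixel coordinates.
extreme_left, extreme_right = (10, 60), (90, 50)
extreme_top, extreme_bottom = (50, 5), (55, 115)

# Center: midpoint of the horizontal extent and of the vertical extent.
x_bar = (extreme_left[0] + extreme_right[0]) // 2
y_bar = (extreme_top[1] + extreme_bottom[1]) // 2

# Euclidean distance from the center to each extreme point.
points = [extreme_left, extreme_right, extreme_top, extreme_bottom]
distances = [math.hypot(px - x_bar, py - y_bar) for px, py in points]

# The circle uses 70% of the largest distance as its radius.
radius = int(0.7 * max(distances))
circumference = 2 * math.pi * radius

print((x_bar, y_bar), radius, round(circumference, 1))
```

With these values the center is (50, 60), the farthest extreme point is the bottom one at about 55.2 pixels, and the radius comes out to 38 pixels.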

Finally, this code counts the number of contours and makes Vector read it out loud:

count = 0
for c in finger_contours:
    (x, y, w, h) = cv2.boundingRect(c)
    # Skip contours that reach below the wrist line or whose point count exceeds 25% of the circumference
    if ((y_bar + (y_bar * 0.25)) > (y + h)) and ((circumference * 0.25) > c.shape[0]):
        count += 1
if previous_count == count and count < 6:
    count_accumulator += 1
    # Have Vector read out the number only if it sees the number for 30 consecutive frames
    if count_accumulator == 30:
        print(count)
        robot.behavior.say_text(str(count))
        count_accumulator = 0
else:
    previous_count = count
    # The count changed, so restart the streak
    count_accumulator = 0
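
The "say it only after N stable frames" idea can also be tested away from the robot. The sketch below is a standalone variant and not the repository's code: the StableCount class and its required parameter are hypothetical names, and here the streak always restarts when the count changes.

```python
class StableCount:
    """Report a count only after it repeats for `required` frames (hypothetical helper)."""

    def __init__(self, required=30):
        self.required = required
        self.previous = None
        self.streak = 0

    def update(self, count):
        """Feed one per-frame count; return it once the streak is long enough, else None."""
        if count == self.previous:
            self.streak += 1
        else:
            # A different count starts a new streak.
            self.previous = count
            self.streak = 1
        if self.streak == self.required:
            self.streak = 0
            return count
        return None

# Use a short streak of 3 so the behavior is easy to follow.
stable = StableCount(required=3)
results = [stable.update(c) for c in (2, 2, 5, 2, 2, 2)]
print(results)
```

In this run the single frame that reads 5 interrupts the first streak, so the value 2 is only reported on the last frame, after three matching readings in a row.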
