rework of densepose alphalab content #620

Merged: 13 commits, Sep 13, 2023
152 changes: 42 additions & 110 deletions src/alpha-lab/dense-pose.md
---
title: "Map your gaze onto body parts with DensePose"
description: "Using densepose to map gaze onto body parts. To be or not to be? proclaims Prince Hamlet while holding a skull in his hand. But, where is the audience looking? At the hand, at the arm, or the face?"

permalink: /alpha-lab/dense-pose/
meta:
  - name: twitter:card
    content: player
  - name: twitter:image
    content: "https://i.ytimg.com/vi/nt_zNSBMJWI/maxresdefault.jpg"
  - name: twitter:player
    content: "https://www.youtube.com/embed/nt_zNSBMJWI"
  - name: twitter:width
    content: "1280"
  - name: twitter:height
    content: "720"
  - property: og:image
    content: "https://i.ytimg.com/vi/nt_zNSBMJWI/maxresdefault.jpg"
tags: [Pupil Invisible, Neon, Cloud]
---
# Map gaze onto body parts using DensePose

<TagLinks />
<Youtube src="nt_zNSBMJWI"/>


**Act 3, Scene 1:** *"To be or not to be?"* proclaims Prince Hamlet while holding a skull in his hand. But where is the audience looking? At the hand, the arm, or the face?<br>

::: tip
Have you ever wondered which body parts we gaze upon while conversing with others? Where does a professional basketball player look just before passing? Does hand movement play a role when delivering a speech? This guide will show you how to get data that can be used to answer these questions!
:::

## Understanding visual behaviour on body parts


Understanding which body parts people look at during interactions, whether visual or otherwise, is an important topic in
fields ranging from sports science to psycholinguistics. This guide shows you how to use Neon or Pupil Invisible eye
tracking with [DensePose](https://github.com/facebookresearch/DensePose) (Github repository of [Dense Human Pose Estimation In The Wild](https://arxiv.org/abs/1802.00434))
to characterise gaze behaviour on body parts that appear in the scene video, as shown above.


## What tools enable this?

Pupil Cloud currently offers a [Face Mapper enrichment](/enrichments/face-mapper/), which tracks faces in scene video and determines whether they were gazed at. However, tracking the rest of the body is currently not available in Cloud. This is where our guide comes in. By following our instructions, you will be able to automatically detect and track body parts of people visible in the scene video of your recordings, and map gaze onto tracked body parts. This approach can enable a deeper understanding of how subjects (wearing the eye tracker) visually interacted with other people.
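
To make the idea concrete, the sketch below shows what such a mapping boils down to: DensePose assigns a body-part label to each pixel of a detected person, and the gazed body part is simply the label under the gaze position. This is a simplified illustration with made-up inputs, not the tool's actual code:

```python
# Simplified illustration of mapping a gaze point onto a per-pixel body-part label map.
# The array shape, label IDs, and gaze coordinates are made up for this example.
import numpy as np

part_labels = np.zeros((1080, 1088), dtype=np.uint8)  # per-pixel body-part IDs, 0 = background
gaze_x, gaze_y = 544.0, 620.0                          # gaze position in scene-camera pixels

px, py = int(round(gaze_x)), int(round(gaze_y))
if 0 <= py < part_labels.shape[0] and 0 <= px < part_labels.shape[1]:
    gazed_part_id = int(part_labels[py, px])           # 0 means no body part was gazed at
    print("Gazed body-part ID:", gazed_part_id)
```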

Until recently, existing methods for tracking body parts in eye tracking scene video were not very robust or accurate. However, [DensePose](https://github.com/facebookresearch/DensePose) promises to solve many of these problems, so we have decided to incorporate it into our guide.

## Steps

1. Download a [Raw data export](/export-formats/recording-data/neon/) from your project in Pupil Cloud. The downloaded folder contains one subfolder for every recording in the project.
2. Upload the (uncompressed) recording subfolder that you're interested in to Google Drive. Each subfolder contains the scene video, `gaze.csv`, and other files.

3. Access our [Google Colab Notebook](https://colab.research.google.com/drive/1s6mBNAhcnxhJlqxeaQ2IZMk_Ca381p25?usp=sharing) and carefully follow the instructions.


<div class="mb-4" style="display:flex;justify-content:center;">
<a href="https://colab.research.google.com/drive/1s6mBNAhcnxhJlqxeaQ2IZMk_Ca381p25#forceEdit=true&sandboxMode=true" target="_blank">
<a href="https://colab.research.google.com/drive/1s6mBNAhcnxhJlqxeaQ2IZMk_Ca381p25?usp=sharing" target="_blank">
<img style="width:180px" src="https://img.shields.io/static/v1?label=&message=Open%20in%20Google%20Colab&color=blue&labelColor=grey&logo=Google%20Colab&logoColor=#F9AB00" alt="colab badge">

</a>
</div>

::: tip
The notebook contains a lot of cells, which can seem overwhelming. It needn't be – it's actually really simple. Just click on each of the play buttons in order.
:::
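
For reference, the first cells in a Colab workflow like this one typically mount your Google Drive so that the notebook can read the recording folder you uploaded in step 2. The notebook already contains the equivalent code, so the sketch below is purely illustrative and the folder path is a placeholder:

```python
# Illustrative only – the guide's notebook already handles this step.
from google.colab import drive

drive.mount("/content/drive")  # authorise access to your Drive when prompted

# The uploaded recording folder is then available under /content/drive/MyDrive/
recording_dir = "/content/drive/MyDrive/<your-recording-subfolder>"  # placeholder path
```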

## Results

After executing the code, new files will be generated. Check the new DensePoseColab folder for the results:


1. A video showing a bounding box that delimits each detected person, a blue shaded mask over their body parts, a yellow highlight on a body part when it is gazed at, and the typical red circle for the gaze position.
2. An image showing the body segments and the number of frames in which each was gazed at, as shown below:

<div class="mb-4" style="display:flex;justify-content:center;">
<v-img class="rounded" :src="require(`../media/alpha-lab/densepose-result.png`)" width="100%" />

</div>

3. Two CSV files are also stored (see the sketch below for one way to explore them):
   - `parts_count.csv` - the number of times each body part was gazed at
   - `densepose.csv` - similar in structure to `gaze.csv`, but with an additional column indicating the gazed body part
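
If you want to take the analysis further, here is a minimal pandas sketch for loading these files. The exact file locations and column names are not documented here, so the paths and the body-part column below are assumptions – inspect the files in your own DensePoseColab folder and adjust accordingly:

```python
# Minimal sketch: exploring the DensePose output CSVs with pandas.
# Paths and the body-part column are assumptions – check your own
# DensePoseColab folder and adjust them as needed.
import pandas as pd

densepose = pd.read_csv("DensePoseColab/densepose.csv")
parts_count = pd.read_csv("DensePoseColab/parts_count.csv")

print(parts_count)                      # gaze counts per body part, as in the image above
print(densepose.columns.tolist())       # inspect the exact column names first

body_part_col = densepose.columns[-1]   # assuming the added column is the last one
print(densepose[body_part_col].value_counts())  # frames per gazed body part
```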

## Running locally

<details><summary>Learn about running this tool locally</summary><br>
<!-- This is collapsed -->
<b>Feeling brave?</b> Here's how to run everything on your local machine.

<b>Note:</b> This option is only available for Linux and MacOS users, as detectron2 does not support Windows 😕.<br>

If you don't have a GPU on your computer, we strongly recommend sticking with the Google Colab notebook. More detailed instructions can be found in the <a href="https://github.com/pupil-labs/densepose-module">Github repository</a> and the associated <a href="https://densepose-module.readthedocs.io/">read the docs</a>.

### Requirements

- Hardware: Linux or MacOS; we recommend a CUDA-capable GPU
- Python 3.7 or higher
- Dependencies

You only need to install two packages as we have put almost all the dependencies within a single package. Just run the following command:

```
python -m pip install torch 'git+https://github.com/pupil-labs/densepose-module.git'
```

Then, run the following command in your command prompt to get the DensePose output:

```
pl-densepose --no-vis --device "cuda"
```

You can also check out the arguments below.

</details>
<!-- empty line -->

## Optional Arguments

<details><summary>Learn about additional command line arguments</summary><br>
<!-- This is collapsed -->
We can't build a shoe that fits everyone, so we also allow you to pass arguments to the code:<br><br>

**- Device**
The device on which to run the DensePose model. You can choose between `cpu` and `cuda`. The default is `--device "cpu"`, but this can be a bit slow, so we recommend using `cuda` if you have a GPU with CUDA support.

::: warning
Even running on `cuda` can be slow – just be aware that we estimate inference speed at around 2.5 FPS on an Nvidia RTX 3060.
:::

**- Visualize**
Use the flag `--vis` to enable live visualization of the output. By default, the visualisation is turned off to save resources, but even with this off, you'll still get the final video output.

**- Input and output paths**
Specify the input and output paths using `--input_path` and `--output_path`. If none are given, a UI will open to select them. The input path should be the subdirectory of the raw download containing the video, world, and gaze data. The output path should be the directory where the output files will be saved.

**- Confidence threshold**
The default confidence is 0.7. You can change this value to tune the confidence threshold by using `--confidence` followed by a number between 0 and 1.

**- Start and end**
If you want to run it only on one specific section of your recording, you can pass the start and end event annotations to be used, like this: `--start "recording.begin" --end "recording.end"`.
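
For illustration, here is what a full invocation combining several of these arguments could look like. The paths are placeholders – substitute your own recording subfolder and output directory:

```
pl-densepose --device "cuda" --vis \
    --input_path "/path/to/raw-export/recording-subfolder" \
    --output_path "/path/to/output" \
    --confidence 0.8 \
    --start "recording.begin" --end "recording.end"
```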

</details>
<!-- empty line -->

## Behind the scenes

<details>
<summary>Model details & FAQ</summary><br>
<!-- This is collapsed -->
- Model: DensePose is based on the Mask R-CNN architecture and is trained on the COCO dataset. You can read all the details in the paper <a href="https://arxiv.org/abs/1802.00434">DensePose: Dense Human Pose Estimation In The Wild</a>.<br><br>
- Model weights:<br>
<code>densepose_rcnn_R_50_FPN_DL_s1x</code><br><br>
- Why is there no gaze recorded on the back of the head, hands, or feet?<br>
There is no definition for those parts in DensePose. Likewise, the frontal view of the arms in the picture refers to the inside of the arms, not the front.<br>
<br>
</details>
::: tip
Need assistance implementing your own DensePose + gaze tracking application? Reach out to us via email at [[email protected]](mailto:[email protected]), on our [Discord server](https://pupil-labs.com/chat/), or visit our [Support Page](https://pupil-labs.com/products/support/) for formal support options.
:::