Merge branch 'develop' of https://github.com/interaction-lab/HARMONI into feature/local-speechbot

# Conflicts:
#	dockerfiles/harmoni/kinetic/base/dockerfile
#	dockerfiles/harmoni/noetic/base/dockerfile

Signed-off-by: Emily Zhou <[email protected]>
emilyxzhou committed Jul 21, 2021
2 parents 30b2a2e + 9c88019 commit a1b229d
Showing 31 changed files with 1,360 additions and 351 deletions.
7 changes: 5 additions & 2 deletions dockerfiles/harmoni/kinetic/base/dockerfile
@@ -91,12 +91,15 @@ RUN \
google-api-python-client \
# Rasa
rasa==2.7.1 \
# Local STT
deepspeech \
# Testing
mock \
# TTS
gdown \
inflect \
sounddevice \
&& rm -rf -- /var/lib/apt/lists/*

# ==================================================================
# Install Ros Kinetic
27 changes: 27 additions & 0 deletions dockerfiles/harmoni/kinetic/full/dockerfile
@@ -76,6 +76,33 @@ RUN ARCH= && dpkgArch="$(dpkg --print-architecture)" \
RUN mkdir -p /root/local_mount/ \
&& ln -vs /root/harmoni_catkin_ws/src/HARMONI /root/local_mount/

# ==================================================================
# Download default models for local STT, TTS services
# ------------------------------------------------------------------
WORKDIR $ROS_WS/src/HARMONI/harmoni_models

# STT
RUN mkdir stt && cd stt \
&& wget "https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.pbmm" \
&& wget "https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.scorer"

# TTS
RUN cd .. \
&& mkdir -p tts && cd tts \
&& gdown --id 1dntzjWFg7ufWaTaFy80nRz-Tu02xWZos -O tts_model.pth.tar \
&& gdown --id 18CQ6G6tBEOfvCHlPqP8EBI4xWbrr9dBc -O config.json \
&& gdown --id 1Ty5DZdOc0F7OTGj9oJThYbL5iVu_2G0K -O vocoder_model.pth.tar \
&& gdown --id 1Rd0R_nRCrbjEdpOwq6XwZAktvugiBvmu -O config_vocoder.json \
&& gdown --id 11oY3Tv0kQtxK_JPgxrfesa99maVXHNxU -O scale_stats.npy

WORKDIR $ROS_WS/src/HARMONI/harmoni_actuators/harmoni_tts

RUN sudo apt-get update && sudo apt-get install espeak -y \
&& git clone https://github.com/coqui-ai/TTS \
&& cd TTS \
&& git checkout b1935c97 \
&& pip install -r requirements.txt \
&& python setup.py install

# ==================================================================
# For convenience add a source script to bashrc and update without clearing
4 changes: 4 additions & 0 deletions dockerfiles/harmoni/noetic/base/dockerfile
@@ -68,6 +68,10 @@ RUN \
google-api-python-client \
# Rasa
rasa==2.7.1 \
# Local STT
deepspeech \
# Testing
mock \
# TTS
gdown \
inflect \
27 changes: 27 additions & 0 deletions dockerfiles/harmoni/noetic/full/dockerfile
@@ -70,6 +70,33 @@ RUN ARCH= && dpkgArch="$(dpkg --print-architecture)" \
RUN mkdir -p /root/local_mount/ \
&& ln -vs /root/harmoni_catkin_ws/src/HARMONI /root/local_mount/

# ==================================================================
# Download default models for local STT, TTS services
# ------------------------------------------------------------------
WORKDIR $ROS_WS/src/HARMONI/harmoni_models

# STT
RUN mkdir stt && cd stt \
&& wget "https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.pbmm" \
&& wget "https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.scorer"

# TTS
RUN cd .. \
&& mkdir -p tts && cd tts \
&& gdown --id 1dntzjWFg7ufWaTaFy80nRz-Tu02xWZos -O tts_model.pth.tar \
&& gdown --id 18CQ6G6tBEOfvCHlPqP8EBI4xWbrr9dBc -O config.json \
&& gdown --id 1Ty5DZdOc0F7OTGj9oJThYbL5iVu_2G0K -O vocoder_model.pth.tar \
&& gdown --id 1Rd0R_nRCrbjEdpOwq6XwZAktvugiBvmu -O config_vocoder.json \
&& gdown --id 11oY3Tv0kQtxK_JPgxrfesa99maVXHNxU -O scale_stats.npy

WORKDIR $ROS_WS/src/HARMONI/harmoni_actuators/harmoni_tts

RUN sudo apt-get update && sudo apt-get install espeak -y \
&& git clone https://github.com/coqui-ai/TTS \
&& cd TTS \
&& git checkout b1935c97 \
&& pip install -r requirements.txt \
&& python setup.py install

# ==================================================================
# For convenience add a source script to bashrc and update without clearing
169 changes: 166 additions & 3 deletions harmoni_actuators/harmoni_face/README.md
@@ -3,14 +3,177 @@
HARMONI provides a wrapper around the CoRDial face, which can express speech and emotion.

We provide a fork of the face of [CoRDial](https://github.com/ndennler/cordial-public) which was created by the Interaction Lab. Although it started as one face, we have implemented separate services for the eyes and mouth, allowing them to be controlled independently.

![packages](../images/screen_demo.png)


The face service is split into three services: the mouth, the nose, and the eyes.
All the services can handle:
- Action Units (AUs): the facial action units corresponding to facial muscle movements; the coding system is described [here](https://imotions.com/blog/facial-action-coding-system)
- Facial expressions (combinations of action units)

The mouth service also handles:
- Visemes (the mouth shapes that accompany speech sounds)


The eyes service also handles:
- Gaze direction (where the face is looking)

## Usage

## Parameters
The following documentation covers facial expression and action unit requests, which are handled by all the services. These requests can be sent to any face service: mouth, eyes, or nose.

The API for Facial Expression and Action Units has:
- Request Name: ActionType: DO
- Body: data(str)
- Response:
  - response (int): SUCCESS, or FAILURE
  - message (str): empty string because no response is provided for a DO action

The body string is a list of objects with the following items:

| Key | Definition | Value Type |
|----------------------|------------|--------|
|start | time offset from the beginning of the request at which the facial expression or action unit starts | int [seconds] |
|type | type of request for the face (e.g., action, viseme, gaze) | str, "action" or "au" |
|id | name of the facial expression or action unit (e.g., "happy_face", "au1") | str, "au$number", "$name_face" |
|pose | intensity of the action unit (not needed for facial expressions) | int [0 - 1] |


The Action Units are the following: "au1", "au2", "au4", "au5", "au6", "au43", "au9", "au38", "au39", "au10", "au12", "au13", "au14", "au15", "au16", "au17", "au18", "au20", "au23", "au24", "au25", "au26", "au27".

The available facial expressions are listed in the [resource folder](https://github.com/interaction-lab/HARMONI/blob/feature/cordialface-request/harmoni_actuators/harmoni_face/src/harmoni_face/resource/cordial_face_expression.json).

Here are two example requests, one for facial expressions and one for action units:

- Facial expression

data: str([ {'start': 1, 'type': 'action', 'id': 'saucy_face'}, {'start': 2, 'type': 'action', 'id': 'breath_face'}])

- Action Units

data:str([{'start': 0, 'type': 'au', 'id': 'au13', 'pose': 1}])
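
As a rough illustration of the two examples above, the following Python sketch builds the `data` string and checks the fields described in the table. The helper name `make_face_request` is hypothetical and not part of HARMONI, and reading the string back with `ast.literal_eval` is only an assumption about how a receiver might decode it.

```python
import ast

# Action unit ids as listed in this README.
ACTION_UNITS = (
    "au1", "au2", "au4", "au5", "au6", "au43", "au9", "au38", "au39",
    "au10", "au12", "au13", "au14", "au15", "au16", "au17", "au18",
    "au20", "au23", "au24", "au25", "au26", "au27",
)


def make_face_request(items):
    """Hypothetical helper: validate request items and return the body string."""
    for item in items:
        if item["type"] not in ("action", "au"):
            raise ValueError("type must be 'action' or 'au'")
        if item["type"] == "au":
            if item["id"] not in ACTION_UNITS:
                raise ValueError("unknown action unit: %s" % item["id"])
            # 'pose' is only required for action units and must lie in [0, 1].
            if "pose" not in item or not 0 <= item["pose"] <= 1:
                raise ValueError("'pose' must be between 0 and 1")
    # The README examples use str() on a list of dicts, so do the same here.
    return str(items)


expression_data = make_face_request([
    {"start": 1, "type": "action", "id": "saucy_face"},
    {"start": 2, "type": "action", "id": "breath_face"},
])
au_data = make_face_request([
    {"start": 0, "type": "au", "id": "au13", "pose": 1},
])

# Assumption: the string can be decoded as a Python literal on the other side.
decoded = ast.literal_eval(au_data)
```

Because `str()` produces single-quoted keys, the resulting string is a Python literal rather than valid JSON, which is why the sketch decodes it with `ast.literal_eval` instead of `json.loads`.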


### Mouth service

The API for Visemes has:
- Request Name: ActionType: DO
- Body: data(str)
- Response:
  - response (int): SUCCESS, or FAILURE
  - message (str): empty string because no response is provided for a DO action

The body string is a list of objects with the following items:

| Key | Definition | Value Type |
|----------------------|------------|--------|
|start | time offset of the viseme from the beginning of the request | float [seconds] |
|type | type of request for the face (e.g., action, viseme, gaze) | str, "viseme" |
|id | name of the viseme (e.g., "POSTALVEOLAR") | str, "$viseme" |

Here is the list of viseme ids ($viseme):
- "BILABIAL"
- "LABIODENTAL"
- "INTERDENTAL"
- "DENTAL_ALVEOLAR"
- "POSTALVEOLAR"
- "VELAR_GLOTTAL"
- "CLOSE_FRONT_VOWEL"
- "OPEN_FRONT_VOWEL"
- "MID_CENTRAL_VOWEL"
- "OPEN_BACK_VOWEL"
- "CLOSE_BACK_VOWEL"
- "IDLE"

This is an example of a viseme request:
- Viseme

data: str([{'start': 0.075,'type': 'viseme', 'id': 'POSTALVEOLAR'}])
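
A viseme request usually carries several timed items, so a small builder can help. The sketch below is only an illustration: `make_viseme_request` is a hypothetical helper, not a HARMONI function, and it assumes the ids listed above are the only valid ones.

```python
# Viseme ids as listed in this README.
VISEME_IDS = (
    "BILABIAL", "LABIODENTAL", "INTERDENTAL", "DENTAL_ALVEOLAR",
    "POSTALVEOLAR", "VELAR_GLOTTAL", "CLOSE_FRONT_VOWEL",
    "OPEN_FRONT_VOWEL", "MID_CENTRAL_VOWEL", "OPEN_BACK_VOWEL",
    "CLOSE_BACK_VOWEL", "IDLE",
)


def make_viseme_request(timed_visemes):
    """Hypothetical helper: turn (start_seconds, viseme_id) pairs into a body string."""
    items = []
    for start, viseme_id in timed_visemes:
        if viseme_id not in VISEME_IDS:
            raise ValueError("unknown viseme: %s" % viseme_id)
        items.append({"start": start, "type": "viseme", "id": viseme_id})
    # Same serialization as the README example: str() of a list of dicts.
    return str(items)


viseme_data = make_viseme_request([
    (0.075, "POSTALVEOLAR"),
    (0.2, "IDLE"),
])
```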


### Eyes service
The API for the Gaze Direction has:
- Request Name: ActionType: DO
- Body: data(str)
- Response:
  - response (int): SUCCESS, or FAILURE
  - message (str): empty string because no response is provided for a DO action

The body string is a list of objects with the following items:

| Key | Definition | Value Type |
|----------------------|------------|--------|
|start | timing of the gaze after the beginning of the request | int [seconds] |
|type | type of request for the face (e.g., action, viseme, gaze) | str, "gaze" |
|id | name of the target (e.g., "target") | str, "$name_target" |
|point | coordinates (x, y, z) of the gaze direction | list [x, y, z], with the following ranges (never zero!): x: {-7,+7}, y: {-5,+5}, z: {-10,+10} |



This is an example of a gaze direction request:
- Gaze direction

data: str([{'start': 0, 'type': 'gaze', 'id':'target', 'point': [1,5 ,10]}])
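
The range constraints on `point` are easy to violate by accident, so the sketch below validates them before building the request. It is a minimal sketch under two assumptions: that "never zero" means no individual coordinate may be 0, and that the ranges are inclusive. `make_gaze_request` is a hypothetical helper, not part of HARMONI.

```python
# Allowed ranges per axis, taken from the table above (assumed inclusive).
GAZE_RANGES = {"x": (-7, 7), "y": (-5, 5), "z": (-10, 10)}


def make_gaze_request(start, target_id, point):
    """Hypothetical helper: validate a gaze point and return the body string."""
    if len(point) != 3:
        raise ValueError("point must have exactly three coordinates (x, y, z)")
    for axis, value in zip("xyz", point):
        low, high = GAZE_RANGES[axis]
        # Assumption: 'never zero' applies to each coordinate individually.
        if value == 0 or not low <= value <= high:
            raise ValueError(
                "%s=%s is outside the allowed non-zero range [%s, %s]"
                % (axis, value, low, high)
            )
    item = {"start": start, "type": "gaze", "id": target_id, "point": list(point)}
    return str([item])


gaze_data = make_gaze_request(0, "target", [1, 5, 10])
```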


## Parameters
The input parameter for the face service is:

| Parameters | Definition | Values |
|----------------------|------------|--------|
|timer_interval | Time interval for facial expression and AUs (in seconds) | 1 |

### Mouth service
The AUs for the mouth are: AU10, AU12, AU13, AU14, AU15, AU16, AU17, AU18, AU20, AU23, AU24, AU25, AU26, AU27

The input parameters for the mouth service are:

| Parameters | Definition | Values |
|----------------------|------------|--------|
|min_duration_viseme | Minimum duration of visemes (in seconds) |0.05 |
|speed_viseme | Speed of the visemes (in milliseconds) |10 |
|timer_interval | Time interval for visemes (in seconds) | 0.01 |

### Eyes service
The AUs for the eyes are:
- Brows: AU1, AU2, AU4
- Eyelid: AU5, AU6, AU43


The input parameters for the eyes service are:

| Parameters | Definition | Values |
|----------------------|------------|--------|
|gaze_speed | Speed of the gaze (in milliseconds) |10 |
|timer_interval | Time interval for visemes (in seconds) | 0.01 |

### Nose service
The AUs for the nose are:
- Nose wrinkle: AU9
- Nose width: AU38, AU39

The input parameters for the nose service are:

| Parameters | Definition | Values |
|----------------------|------------|--------|
|gaze_speed | Speed of the gaze (in milliseconds) |10 |
|timer_interval | Time interval for visemes (in seconds) | 0.01 |
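
The per-service AU lists in this section can be collected into a single lookup, which is handy when deciding which service an action unit belongs to. This is only a sketch: `SERVICE_AUS` and `service_for_au` are hypothetical names, and the lowercase ids follow the "au$number" form used in the request examples.

```python
# Action units grouped by service, as listed in this README.
SERVICE_AUS = {
    "mouth": ["au10", "au12", "au13", "au14", "au15", "au16", "au17",
              "au18", "au20", "au23", "au24", "au25", "au26", "au27"],
    "eyes": ["au1", "au2", "au4", "au5", "au6", "au43"],
    "nose": ["au9", "au38", "au39"],
}


def service_for_au(au_id):
    """Hypothetical helper: return the service name that lists the given AU."""
    for service, au_ids in SERVICE_AUS.items():
        if au_id.lower() in au_ids:
            return service
    raise ValueError("unknown action unit: %s" % au_id)
```

For example, `service_for_au("AU43")` returns `"eyes"` under this grouping.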


## Testing

To test the service, run:

```
rostest harmoni_face face.test
```

Then open a browser at http://172.18.3.4:8081/ and wait for the face to appear.
The face will perform three sequences of expressions, one per service:
1. Eyes service: action units, facial expressions, and gaze
2. Mouth service: action units, facial expressions, and visemes
3. Nose service: action units and facial expressions

## References
[Documentation](https://harmoni.readthedocs.io/en/latest/packages/harmoni_face.html)

10 changes: 4 additions & 6 deletions harmoni_actuators/harmoni_face/config/configuration.yaml
@@ -1,15 +1,13 @@
# Configuration file for the microphone
face:
default_param:
min_duration_viseme: 0.05 #s
speed_viseme: 10 #ms
timer_interval: 0.01 #s
gaze_speed: 10 #ms
timer_interval: 1 #s
mouth:
min_duration_viseme: 0.05 #s
speed_viseme: 10 #ms
timer_interval: 0.01 #s
eyes:
gaze_speed: 10 #ms

#shell_command: "firefox http://127.0.0.1:8080/index.html"
timer_interval: 0.01 #s
nose:
timer_interval: 0.01 #s
1 change: 1 addition & 0 deletions harmoni_actuators/harmoni_face/msg/FaceRequest.msg
@@ -31,6 +31,7 @@ int32 IDLE_ON=2
# if retarget_gaze is false, ignore gaze_target
bool retarget_gaze
geometry_msgs/Point gaze_target
## Gaze target range (never zero!): x: {-7,+7}, y:{-5, +5}, z:{-10, +10}

# velocity to move gaze, in rad/s
float64 gaze_vel