Food-on-Fork Detection #169

Merged 19 commits on Mar 23, 2024
Changes shown from 15 commits
14 changes: 14 additions & 0 deletions .pylintrc
@@ -197,24 +197,38 @@ good-names=a,
b,
c,
d,
f,
i,
j,
k,
m,
n,
p,
ps,
x,
x0,
x1,
X,
y,
y0,
y1,
z,
u,
v,
w,
h,
r,
S,
S_inv,
rc,
ax,
ex,
hz,
kw,
ns,
Run,
train_X,
test_X,
_

# Good variable names regexes, separated by a comma. If names match any regex,
1 change: 1 addition & 0 deletions ada_feeding_msgs/CMakeLists.txt
@@ -19,6 +19,7 @@ find_package(rosidl_default_generators REQUIRED)
rosidl_generate_interfaces(${PROJECT_NAME}
"msg/AcquisitionSchema.msg"
"msg/FaceDetection.msg"
"msg/FoodOnForkDetection.msg"
"msg/Mask.msg"

"action/AcquireFood.action"
17 changes: 17 additions & 0 deletions ada_feeding_msgs/msg/FoodOnForkDetection.msg
@@ -0,0 +1,17 @@
# A message with the results of food on fork detection on a single frame.

# The header for the image the detection corresponds to
std_msgs/Header header

# The status of the food-on-fork detector.
int32 status
int32 SUCCESS=1
int32 ERROR_TOO_FEW_POINTS=-1
int32 UNKNOWN_ERROR=-99

# A probability in [0,1] that indicates the likelihood that there is food on the
# fork in the image. Only relevant if status == FoodOnForkDetection.SUCCESS
float64 probability

# Contains more details of the result, including any error messages that were encountered
string message
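
For reference, a minimal sketch of a node that consumes this message is below. The topic name matches `/food_on_fork_detection` from the README changes later in this PR; the node name, queue depth, and log formatting are illustrative and not part of this PR.

```
#!/usr/bin/env python3
"""Minimal sketch of a FoodOnForkDetection subscriber (illustrative only)."""
import rclpy
from rclpy.node import Node

from ada_feeding_msgs.msg import FoodOnForkDetection


class FoodOnForkListener(Node):
    """Logs each detection, skipping frames where the detector reported an error."""

    def __init__(self) -> None:
        super().__init__("food_on_fork_listener")
        self.create_subscription(
            FoodOnForkDetection, "/food_on_fork_detection", self.callback, 1
        )

    def callback(self, msg: FoodOnForkDetection) -> None:
        # `probability` is only meaningful when the detector succeeded.
        if msg.status == FoodOnForkDetection.SUCCESS:
            self.get_logger().info(f"P(food on fork) = {msg.probability:.2f}")
        else:
            self.get_logger().warn(f"Detection error {msg.status}: {msg.message}")


def main() -> None:
    rclpy.init()
    rclpy.spin(FoodOnForkListener())


if __name__ == "__main__":
    main()
```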
32 changes: 32 additions & 0 deletions ada_feeding_perception/README.md
@@ -91,3 +91,35 @@ Launch the web app along with all the other nodes (real or dummy) as documented
- `offline.images` (list of strings, required): The paths, relative to `install/ada_feeding_perception/share/ada_feeding_perception`, to the images to test.
- `offline.point_xs` (list of ints, required): The x-coordinates of the seed points. Must be the same length as `offline.images`.
- `offline.point_ys` (list of ints, required): The y-coordinates of the seed points. Must be the same length as `offline.images`.

## Food-on-Fork Detection

Our eye-in-hand Food-on-Fork Detection node and its training/testing infrastructure were designed to make it easy to substitute in and compare other food-on-fork detectors. Below are instructions on how to do so.

1. **Developing a new food-on-fork detector**: Create a subclass of `FoodOnForkDetector` that implements all of its abstract methods (a rough, hypothetical skeleton is sketched after this list). Note that, as of now, a model does not have access to a real-time TF buffer at test time; hence, **all transforms that the model relies on must be static**.
2. **Gather the dataset**: Because this node uses the eye-in-hand camera, it is sensitive to the relative pose between the camera and the fork. If you are using PRL's robot, [the dataset collected in early 2024](https://drive.google.com/drive/folders/1hNciBOmuHKd67Pw6oAvj_iN_rY1M8ZV0?usp=drive_link) may be sufficient. Otherwise, you should collect your own dataset:
1. The dataset should consist of a series of ROS2 bags, each recording the following: (a) the aligned depth to color image topic; (b) the color image topic; (c) the camera info topic (we assume it is the same for both); and (d) the TF topic(s).
2. We recorded three types of bags: (a) bags where the robot was going through the motions of feeding without food on the fork and without the fork nearing a person or plate; (b) the same as above but with food on the fork; and (c) bags where the robot was acquiring and feeding a bite to someone. We used the first two types of bags for training, and the third type of bag for evaluation.
3. All ROS2 bags should be in the same directory, with a file `bags_metadata.csv` at the top-level of that directory.
4. `bags_metadata.csv` contains the following columns: `rosbag_name` (str), `time_from_start` (float), `food_on_fork` (0/1), and `arm_moving` (0/1). The file only needs rows at the timestamps where `food_on_fork` and/or `arm_moving` change; between consecutive rows, the values are assumed to stay the same. (An example file is shown after this list.)
5. To generate `bags_metadata.csv`, we recommend launching RVIZ, adding your depth and/or RGB image topic, and playing back the bag, e.g.:
1. `ros2 run rviz2 rviz2 --ros-args -p use_sim_time:=true`
2. `ros2 bag play 2024_03_01_two_bites_3 --clock`
3. Pause and resume the rosbag playback when food goes on/off the fork and when the arm starts/stops moving, and populate `bags_metadata.csv` accordingly (the elapsed time since the start of the bag is visible at the bottom of RVIZ2).
3. **Train/test the model on offline data**: We provide a flexible Python script, `food_on_fork_train_test.py`, to train, test, and/or compare one or more food-on-fork models. To use it, first ensure you have built and sourced your workspace and that you are in the directory that contains the script (e.g., `cd ~/colcon_ws/src/ada_feeding/ada_feeding_perception/ada_feeding_perception`). To enable flexible use, the script has **many** command-line arguments; we recommend reading their descriptions with `python3 food_on_fork_train_test.py -h`. For reference, the command we used to train our model is below:
```
python3 food_on_fork_train_test.py --model-classes '{"distance_no_fof_detector_with_filters": "ada_feeding_perception.food_on_fork_detectors.FoodOnForkDistanceToNoFOFDetector"}' --model-kwargs '{"distance_no_fof_detector_with_filters": {"camera_matrix": [614.5933227539062, 0.0, 312.1358947753906, 0.0, 614.6914672851562, 223.70831298828125, 0.0, 0.0, 1.0], "min_distance": 0.001}}' --lower-thresh 0.25 --upper-thresh 0.75 --train-set-size 0.5 --crop-top-left 344 272 --crop-bottom-right 408 336 --depth-min-mm 310 --depth-max-mm 340 --rosbags-select 2024_03_01_no_fof 2024_03_01_no_fof_1 2024_03_01_no_fof_2 2024_03_01_no_fof_3 2024_03_01_no_fof_4 2024_03_01_fof_cantaloupe_1 2024_03_01_fof_cantaloupe_2 2024_03_01_fof_cantaloupe_3 2024_03_01_fof_strawberry_1 2024_03_01_fof_strawberry_2 2024_03_01_fof_strawberry_3 2024_02_29_no_fof 2024_02_29_fof_cantaloupe 2024_02_29_fof_strawberry --seed 42 --temporal-window-size 5 --spatial-num-pixels 10
```
Note that we trained our model on data where the fork either had food or had no food for the entire bag, and never came near other objects (e.g., the plate or the user's mouth). (Also, note that not all of the above ROS2 bags are necessary; we have trained accurate detectors with half of them.) We then ran an offline evaluation of the model on bags of actual feeding data:
```
python3 food_on_fork_train_test.py --model-classes '{"distance_no_fof_detector_with_filters": "ada_feeding_perception.food_on_fork_detectors.FoodOnForkDistanceToNoFOFDetector"}' --model-kwargs '{"distance_no_fof_detector_with_filters": {"camera_matrix": [614.5933227539062, 0.0, 312.1358947753906, 0.0, 614.6914672851562, 223.70831298828125, 0.0, 0.0, 1.0], "min_distance": 0.001}}' --lower-thresh 0.25 --upper-thresh 0.75 --train-set-size 0.5 --crop-top-left 308 248 --crop-bottom-right 436 332 --depth-min-mm 310 --depth-max-mm 340 --rosbags-select 2024_03_01_two_bites 2024_03_01_two_bites_2 2024_03_01_two_bites_3 2024_02_29_two_bites --seed 42 --temporal-window-size 5 --spatial-num-pixels 10 --no-train
```
4. **Test the model on online data**: First, copy the parameters you used when training your model, as well as the filename of the saved model, to `config/food_on_fork_detection.yaml`. Re-build and source your workspace.
1. **Live Robot**:
1. Launch the robot as usual; the `ada_feeding_perception` launch file will launch food-on-fork detection.
2. Toggle food-on-fork detection on: `ros2 service call /toggle_food_on_fork_detection std_srvs/srv/SetBool "{data: true}"`
3. Echo the output of food-on-fork detection: `ros2 topic echo /food_on_fork_detection`
2. **ROS2 bag data**:
1. Launch perception: `ros2 launch ada_feeding_perception ada_feeding_perception.launch.py`
2. Toggle food-on-fork detection on and echo the output of food-on-fork detection, as documented above.
3. Launch RVIZ and play back a ROS2 bag, as documented above.
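
To make step 1 above more concrete, below is a rough, hypothetical skeleton of a new detector. The import path follows the module referenced by the training commands (`ada_feeding_perception.food_on_fork_detectors`), but the method names and signatures (`fit`, `predict_proba`) are placeholders rather than the actual abstract methods; consult the `FoodOnForkDetector` base class for the real interface.

```
# Hypothetical sketch only: the methods below are placeholders. See the
# FoodOnForkDetector base class in ada_feeding_perception/food_on_fork_detectors.py
# for the actual abstract methods a detector must implement.
import numpy as np
import numpy.typing as npt

from ada_feeding_perception.food_on_fork_detectors import FoodOnForkDetector


class NumPixelsDetector(FoodOnForkDetector):
    """Toy detector: predicts food-on-fork if enough depth pixels survive the crop/filters."""

    def __init__(self, min_num_pixels: int = 50, **kwargs) -> None:
        super().__init__(**kwargs)
        self.min_num_pixels = min_num_pixels

    def fit(self, X: npt.NDArray, y: npt.NDArray) -> None:
        # This toy detector has no learned parameters.
        pass

    def predict_proba(self, X: npt.NDArray) -> npt.NDArray:
        # Assume X stacks one cropped depth image per row; count non-zero pixels.
        num_pixels = np.count_nonzero(X.reshape(X.shape[0], -1), axis=1)
        return (num_pixels >= self.min_num_pixels).astype(np.float64)
```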
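
As an illustration of the `bags_metadata.csv` layout described in step 2.4, here is an example file. The bag names come from the training command in step 3, but the timestamps and labels are invented for illustration.

```
rosbag_name,time_from_start,food_on_fork,arm_moving
2024_03_01_no_fof,0.0,0,0
2024_03_01_no_fof,2.5,0,1
2024_03_01_no_fof,35.0,0,0
2024_03_01_fof_cantaloupe_1,0.0,1,0
2024_03_01_fof_cantaloupe_1,3.0,1,1
2024_03_01_fof_cantaloupe_1,41.5,1,0
```

Each row records the values that hold from `time_from_start` until the next row for the same bag.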
@@ -7,11 +7,13 @@
from typing import Callable

# Third-party imports
from builtin_interfaces.msg import Time
import cv2 as cv
from cv_bridge import CvBridge
import numpy as np
import numpy.typing as npt
from sensor_msgs.msg import Image
from std_msgs.msg import Header


def create_mask_post_processor(
@@ -58,7 +60,10 @@ def mask_post_processor(msg: Image) -> Image:

# Get the new img message
masked_msg = bridge.cv2_to_imgmsg(masked_img)
masked_msg.header = msg.header
masked_msg.header = Header(
stamp=Time(sec=msg.header.stamp.sec, nanosec=msg.header.stamp.nanosec),
frame_id=msg.header.frame_id,
)

return masked_msg

@@ -124,7 +129,10 @@ def temporal_post_processor(msg: Image) -> Image:

# Get the new img message
masked_msg = bridge.cv2_to_imgmsg(masked_img)
masked_msg.header = msg.header
masked_msg.header = Header(
stamp=Time(sec=msg.header.stamp.sec, nanosec=msg.header.stamp.nanosec),
frame_id=msg.header.frame_id,
)

return masked_msg

@@ -176,7 +184,10 @@ def spatial_post_processor(msg: Image) -> Image:

# Get the new img message
masked_msg = bridge.cv2_to_imgmsg(masked_img)
masked_msg.header = msg.header
masked_msg.header = Header(
stamp=Time(sec=msg.header.stamp.sec, nanosec=msg.header.stamp.nanosec),
frame_id=msg.header.frame_id,
)

return masked_msg

@@ -234,7 +245,10 @@ def threshold_post_processor(msg: Image) -> Image:

# Get the new img message
masked_msg = bridge.cv2_to_imgmsg(masked_img)
masked_msg.header = msg.header
masked_msg.header = Header(
stamp=Time(sec=msg.header.stamp.sec, nanosec=msg.header.stamp.nanosec),
frame_id=msg.header.frame_id,
)

return masked_msg

@@ -56,6 +56,8 @@ class FaceDetectionNode(Node):
let the client decide which face to use.
"""

# pylint: disable=duplicate-code
# Much of the logic of this node mirrors FoodOnForkDetection. This is fine.
# pylint: disable=too-many-instance-attributes
# Needed for multiple model loads, publisher, subscribers, and shared variables
def __init__(
@@ -305,10 +307,6 @@ def toggle_face_detection_callback(
the face detection on or off depending on the request.
"""

# pylint: disable=duplicate-code
# We follow similar logic in any service to toggle a node
# (e.g., face detection)

self.get_logger().info(f"Incoming service request. data: {request.data}")
response.success = False
response.message = f"Failed to set is_on to {request.data}"
@@ -563,6 +561,7 @@ def get_mouth_depth(
f"Corresponding RGB image message received at {rgb_msg.header.stamp}. "
f"Time difference: {min_time_diff} seconds."
)
# TODO: This should use the ros_msg_to_cv2_image helper function
image_depth = self.bridge.imgmsg_to_cv2(
closest_depth_msg,
desired_encoding="passthrough",
@@ -651,6 +650,7 @@ def run(self) -> None:
continue

# Detect the largest face in the RGB image
# TODO: This should use the ros_msg_to_cv2_image helper function
image_bgr = cv2.imdecode(
np.frombuffer(rgb_msg.data, np.uint8), cv2.IMREAD_COLOR
)