Food-on-Fork Detection #169

Merged · 19 commits · Mar 23, 2024
18 changes: 18 additions & 0 deletions .pylintrc
@@ -197,24 +197,42 @@ good-names=a,
b,
c,
d,
f,
i,
j,
k,
m,
M,
n,
p,
ps,
x,
x0,
x1,
X,
y,
y0,
y1,
z,
u,
us,
v,
vs,
w,
h,
r,
rc,
S,
S_inv,
t,
ax,
ex,
hz,
kw,
ns,
Run,
train_X,
test_X,
_

# Good variable names regexes, separated by a comma. If names match any regex,
1 change: 1 addition & 0 deletions ada_feeding_msgs/CMakeLists.txt
@@ -19,6 +19,7 @@ find_package(rosidl_default_generators REQUIRED)
rosidl_generate_interfaces(${PROJECT_NAME}
"msg/AcquisitionSchema.msg"
"msg/FaceDetection.msg"
"msg/FoodOnForkDetection.msg"
"msg/Mask.msg"

"action/AcquireFood.action"
18 changes: 18 additions & 0 deletions ada_feeding_msgs/msg/FoodOnForkDetection.msg
@@ -0,0 +1,18 @@
# A message with the results of food on fork detection on a single frame.

# The header for the image the detection corresponds to
std_msgs/Header header

# The status of the food-on-fork detector.
int32 status
int32 SUCCESS=1
int32 ERROR_TOO_FEW_POINTS=-1
int32 ERROR_NO_TRANSFORM=-2
int32 UNKNOWN_ERROR=-99

# A probability in [0,1] that indicates the likelihood that there is food on the
# fork in the image. Only relevant if status == FoodOnForkDetection.SUCCESS
float64 probability

# Contains more details of the result, including any error messages that were encountered
string message
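
For context, below is a minimal sketch of a downstream node that consumes this message. The topic name matches the `/food_on_fork_detection` topic referenced in the README instructions below; the 0.75 threshold is illustrative (it mirrors the `--upper-thresh` used in the README's training command), not a fixed part of the API.

```
# Minimal sketch of a subscriber that interprets FoodOnForkDetection messages.
# Topic name and threshold are illustrative; adjust to your configuration.
import rclpy
from rclpy.node import Node

from ada_feeding_msgs.msg import FoodOnForkDetection


class FoodOnForkListener(Node):
    def __init__(self) -> None:
        super().__init__("food_on_fork_listener")
        self.create_subscription(
            FoodOnForkDetection, "/food_on_fork_detection", self.callback, 1
        )

    def callback(self, msg: FoodOnForkDetection) -> None:
        if msg.status != FoodOnForkDetection.SUCCESS:
            self.get_logger().warn(f"Detection error ({msg.status}): {msg.message}")
            return
        # The probability is only meaningful when status == SUCCESS.
        if msg.probability >= 0.75:
            self.get_logger().info(f"Food likely on fork (p={msg.probability:.2f})")


def main() -> None:
    rclpy.init()
    rclpy.spin(FoodOnForkListener())
    rclpy.shutdown()


if __name__ == "__main__":
    main()
```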
32 changes: 32 additions & 0 deletions ada_feeding_perception/README.md
@@ -91,3 +91,35 @@ Launch the web app along with all the other nodes (real or dummy) as documented
- `offline.images` (list of strings, required): The paths, relative to `install/ada_feeding_perception/share/ada_feeding_perception`, to the images to test.
- `offline.point_xs` (list of ints, required): The x-coordinates of the seed points. Must be the same length as `offline.images`.
- `offline.point_ys` (list of ints, required): The y-coordinates of the seed points. Must be the same length as `offline.images`.

## Food-on-Fork Detection

Our eye-in-hand Food-on-Fork Detection node and the accompanying training/testing infrastructure were designed to make it easy to substitute and compare other food-on-fork detectors. Below are instructions on how to do so.

1. **Developing a new food-on-fork detector**: Create a subclass of `FoodOnForkDetector` that implements all of its abstract methods (see the sketch after this list). Note that, as of now, a model does not have access to a real-time TF buffer at test time; hence, **all transforms that the model relies on must be static**.
2. **Gather the dataset**: Because this node uses the eye-in-hand camera, it is sensitive to the relative pose between the camera and the fork. If you are using PRL's robot, [the dataset collected in early 2024](https://drive.google.com/drive/folders/1hNciBOmuHKd67Pw6oAvj_iN_rY1M8ZV0?usp=drive_link) may be sufficient. Otherwise, you should collect your own dataset:
1. The dataset should consist of a series of ROS2 bags, each recording the following: (a) the aligned depth to color image topic; (b) the color image topic; (c) the camera info topic (we assume it is the same for both); and (d) the TF topic(s).
2. We recorded three types of bags: (a) bags where the robot was going through the motions of feeding without food on the fork and without the fork nearing a person or plate; (b) the same as above but with food on the fork; and (c) bags where the robot was acquiring and feeding a bite to someone. We used the first two types of bags for training, and the third type of bag for evaluation.
3. All ROS2 bags should be in the same directory, with a file `bags_metadata.csv` at the top-level of that directory.
4. `bags_metadata.csv` contains the following columns: `rosbag_name` (str), `time_from_start` (float), `food_on_fork` (0/1), and `arm_moving` (0/1). The file only needs rows for the timestamps at which one or both of the latter two columns change; at intermediate timestamps, the values are assumed to stay the same (see the example after this list).
5. To generate `bags_metadata.csv`, we recommend launching RVIZ, adding your depth and/or RGB image topic, and playing back the bag, e.g.:
1. `ros2 run rviz2 rviz2 --ros-args -p use_sim_time:=true`
2. `ros2 bag play 2024_03_01_two_bites_3 --clock`
3. Pause and resume rosbag playback when food goes on/off the fork and when the arm starts/stops moving, and populate `bags_metadata.csv` accordingly (the elapsed time since the bag started should be visible at the bottom of RVIZ2).
3. **Train/test the model on offline data**: We provide a flexible Python script, `food_on_fork_train_test.py`, to train, test, and/or compare one or more food-on-fork models. To use it, first ensure you have built and sourced your workspace and that you are in the directory containing the script (e.g., `cd ~/colcon_ws/src/ada_feeding/ada_feeding_perception/ada_feeding_perception`). To enable flexible use, the script has **many** command-line arguments; we recommend reading their descriptions with `python3 food_on_fork_train_test.py -h`. For reference, the command we used to train our model is below:
```
python3 food_on_fork_train_test.py --model-classes '{"distance_no_fof_detector_with_filters": "ada_feeding_perception.food_on_fork_detectors.FoodOnForkDistanceToNoFOFDetector"}' --model-kwargs '{"distance_no_fof_detector_with_filters": {"camera_matrix": [614.5933227539062, 0.0, 312.1358947753906, 0.0, 614.6914672851562, 223.70831298828125, 0.0, 0.0, 1.0], "min_distance": 0.001}}' --lower-thresh 0.25 --upper-thresh 0.75 --train-set-size 0.5 --crop-top-left 344 272 --crop-bottom-right 408 336 --depth-min-mm 310 --depth-max-mm 340 --rosbags-select 2024_03_01_no_fof 2024_03_01_no_fof_1 2024_03_01_no_fof_2 2024_03_01_no_fof_3 2024_03_01_no_fof_4 2024_03_01_fof_cantaloupe_1 2024_03_01_fof_cantaloupe_2 2024_03_01_fof_cantaloupe_3 2024_03_01_fof_strawberry_1 2024_03_01_fof_strawberry_2 2024_03_01_fof_strawberry_3 2024_02_29_no_fof 2024_02_29_fof_cantaloupe 2024_02_29_fof_strawberry --seed 42 --temporal-window-size 5 --spatial-num-pixels 10
```
Note that we trained our model on data where the fork either had food on it or did not for the entire bag, and never came near other objects (e.g., the plate or the user's mouth). (Also, note that not all of the above ROS2 bags are necessary; we have trained accurate detectors with half of them.) We then ran an offline evaluation of the model on bags of actual feeding data:
```
python3 food_on_fork_train_test.py --model-classes '{"distance_no_fof_detector_with_filters": "ada_feeding_perception.food_on_fork_detectors.FoodOnForkDistanceToNoFOFDetector"}' --model-kwargs '{"distance_no_fof_detector_with_filters": {"camera_matrix": [614.5933227539062, 0.0, 312.1358947753906, 0.0, 614.6914672851562, 223.70831298828125, 0.0, 0.0, 1.0], "min_distance": 0.001}}' --lower-thresh 0.25 --upper-thresh 0.75 --train-set-size 0.5 --crop-top-left 308 248 --crop-bottom-right 436 332 --depth-min-mm 310 --depth-max-mm 340 --rosbags-select 2024_03_01_two_bites 2024_03_01_two_bites_2 2024_03_01_two_bites_3 2024_02_29_two_bites --seed 42 --temporal-window-size 5 --spatial-num-pixels 10 --no-train
```
4. **Test the model on online data**: First, copy the parameters you used when training your model, as well as the filename of the saved model, to `config/food_on_fork_detection.yaml`. Re-build and source your workspace.
1. **Live Robot**:
1. Launch the robot as usual; the `ada_feeding_perception` launch file will launch food-on-fork detection.
2. Toggle food-on-fork detection on: `ros2 service call /toggle_food_on_fork_detection std_srvs/srv/SetBool "{data: true}"`
3. Echo the output of food-on-fork detection: `ros2 topic echo /food_on_fork_detection`
2. **ROS2 bag data**:
1. Launch perception: `ros2 launch ada_feeding_perception ada_feeding_perception.launch.py`
2. Toggle food-on-fork detection on and echo the output of food-on-fork detection, as documented above.
3. Launch RVIZ and play back a ROS2 bag, as documented above.
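
For illustration, a new detector (step 1 above) might look like the sketch below. The method names (`fit`, `predict_proba`, `save`, `load`) are hypothetical; consult the `FoodOnForkDetector` base class in `ada_feeding_perception.food_on_fork_detectors` for the actual abstract methods.

```
# Hypothetical sketch of a custom detector. The abstract-method names used here
# (fit, predict_proba, save, load) are assumptions; check the FoodOnForkDetector
# base class for the real interface.
import numpy as np
import numpy.typing as npt

from ada_feeding_perception.food_on_fork_detectors import FoodOnForkDetector


class MeanDepthFoodOnForkDetector(FoodOnForkDetector):
    """Toy detector that thresholds the mean depth of the cropped depth image."""

    def fit(self, X: npt.NDArray, y: npt.NDArray) -> None:
        # X: (n_samples, H, W) cropped depth images in mm; y: (n_samples,) 0/1 labels.
        self.no_fof_mean_ = X[y == 0].mean()

    def predict_proba(self, X: npt.NDArray) -> npt.NDArray:
        # Map each image's deviation from the "no food on fork" mean depth to [0, 1].
        deviation = np.abs(X.mean(axis=(1, 2)) - self.no_fof_mean_)
        return np.clip(deviation / 50.0, 0.0, 1.0)

    def save(self, path: str) -> str:
        np.savez_compressed(path, no_fof_mean=self.no_fof_mean_)
        return path + ".npz"

    def load(self, path: str) -> None:
        self.no_fof_mean_ = float(np.load(path)["no_fof_mean"])
```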
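
To make the metadata format from step 2 concrete, here is an illustrative `bags_metadata.csv`. The bag name comes from the playback example above; the timestamps and labels are made up.

```
rosbag_name,time_from_start,food_on_fork,arm_moving
2024_03_01_two_bites_3,0.0,0,0
2024_03_01_two_bites_3,2.5,0,1
2024_03_01_two_bites_3,10.2,1,1
2024_03_01_two_bites_3,35.8,0,1
2024_03_01_two_bites_3,49.0,0,0
```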
@@ -7,11 +7,13 @@
from typing import Callable

# Third-party imports
from builtin_interfaces.msg import Time
import cv2 as cv
from cv_bridge import CvBridge
import numpy as np
import numpy.typing as npt
from sensor_msgs.msg import Image
from std_msgs.msg import Header


def create_mask_post_processor(
@@ -58,7 +60,10 @@ def mask_post_processor(msg: Image) -> Image:

# Get the new img message
masked_msg = bridge.cv2_to_imgmsg(masked_img)
masked_msg.header = msg.header
masked_msg.header = Header(
stamp=Time(sec=msg.header.stamp.sec, nanosec=msg.header.stamp.nanosec),
frame_id=msg.header.frame_id,
)

return masked_msg

@@ -124,7 +129,10 @@ def temporal_post_processor(msg: Image) -> Image:

# Get the new img message
masked_msg = bridge.cv2_to_imgmsg(masked_img)
masked_msg.header = msg.header
masked_msg.header = Header(
stamp=Time(sec=msg.header.stamp.sec, nanosec=msg.header.stamp.nanosec),
frame_id=msg.header.frame_id,
)

return masked_msg

@@ -176,7 +184,10 @@ def spatial_post_processor(msg: Image) -> Image:

# Get the new img message
masked_msg = bridge.cv2_to_imgmsg(masked_img)
masked_msg.header = msg.header
masked_msg.header = Header(
stamp=Time(sec=msg.header.stamp.sec, nanosec=msg.header.stamp.nanosec),
frame_id=msg.header.frame_id,
)

return masked_msg

@@ -234,7 +245,10 @@ def threshold_post_processor(msg: Image) -> Image:

# Get the new img message
masked_msg = bridge.cv2_to_imgmsg(masked_img)
masked_msg.header = msg.header
masked_msg.header = Header(
stamp=Time(sec=msg.header.stamp.sec, nanosec=msg.header.stamp.nanosec),
frame_id=msg.header.frame_id,
)

return masked_msg

@@ -56,6 +56,8 @@ class FaceDetectionNode(Node):
let the client decide which face to use.
"""

# pylint: disable=duplicate-code
# Much of the logic of this node mirrors FoodOnForkDetection. This is fine.
# pylint: disable=too-many-instance-attributes
# Needed for multiple model loads, publisher, subscribers, and shared variables
def __init__(
@@ -305,10 +307,6 @@ def toggle_face_detection_callback(
the face detection on or off depending on the request.
"""

# pylint: disable=duplicate-code
# We follow similar logic in any service to toggle a node
# (e.g., face detection)

self.get_logger().info(f"Incoming service request. data: {request.data}")
response.success = False
response.message = f"Failed to set is_on to {request.data}"
@@ -563,6 +561,7 @@ def get_mouth_depth(
f"Corresponding RGB image message received at {rgb_msg.header.stamp}. "
f"Time difference: {min_time_diff} seconds."
)
# TODO: This should use the ros_msg_to_cv2_image helper function
image_depth = self.bridge.imgmsg_to_cv2(
closest_depth_msg,
desired_encoding="passthrough",
@@ -651,6 +650,7 @@ def run(self) -> None:
continue

# Detect the largest face in the RGB image
# TODO: This should use the ros_msg_to_cv2_image helper function
image_bgr = cv2.imdecode(
np.frombuffer(rgb_msg.data, np.uint8), cv2.IMREAD_COLOR
)