Skip to content

Launch doesn't stop running nodes when exception thrown #747

Open
@jackiescanlon

Description

@jackiescanlon

Bug report

Required Info:

  • Operating System:
    Ubuntu 22.04
  • Installation type:
    Binary
  • Version or commit hash:
    Iron 2.0.2
  • DDS implementation:
    Default
  • Client library (if applicable):
    N/A

Steps to reproduce issue

Create launch file my_launch.py

import launch
import launch_ros.actions
from launch import LaunchDescription


def generate_launch_description() -> LaunchDescription:
    return launch.LaunchDescription(
        [
            # Real nodes - these get left running when launch file throws exception
            launch_ros.actions.Node(
                package="examples_rclcpp_minimal_publisher",
                executable="publisher_member_function",
            ),
            launch_ros.actions.Node(
                package="examples_rclcpp_minimal_subscriber",
                executable="subscriber_member_function",
            ),
            # Fake node - this will cause an exception to be thrown.
            launch_ros.actions.Node(
                package="examples_rclcpp_minimal_subscriber",
                executable="fake_node",
            ),
        ]
    )

Launch it

$ ros2 launch my_launch.py

Expected behavior

Launch catches exception and stops, including stopping any nodes that were started before the exception was caught, i.e., ros2 node list returns nothing afterwards.

Actual behavior

Launch catches exception and stops, but doesn't kill any of the already started nodes.

$ ros2 launch my_launch.py
[INFO] [launch]: All log files can be found below /home/jackie/.ros/log/2023-12-05-14-26-16-767994-stealth-126383
[INFO] [launch]: Default logging verbosity is set to INFO
[ERROR] [launch]: Caught exception in launch (see debug for traceback): executable 'fake_node' not found on the libexec directory '/opt/ros/iron/lib/examples_rclcpp_minimal_subscriber' 
$ ros2 node list
/minimal_publisher
/minimal_subscriber

Additional information

Ideally of course the launch file is correctly written such that no exceptions are thrown in the first place. However, when in the process of writing a new launch file (especially one much larger than this example), it is hard to remember, unexpected and tedious to individually kill every node that was started before the exception was thrown.

In my own troubleshooting I looked at what happens when the launch file is Ctrl+C'd vs. what happens when an exception is caught. Perhaps there's something in there (namely, some behavior difference when LaunchService._shutdown() is passed due_to_sigint=True and force_sync=True)?

def _on_sigint(signum):
nonlocal sigint_received
base_msg = 'user interrupted with ctrl-c (SIGINT)'
if not sigint_received:
self.__logger.warning(base_msg)
ret = self._shutdown(
reason='ctrl-c (SIGINT)', due_to_sigint=True, force_sync=True
)
assert ret is None, ret
sigint_received = True
else:
self.__logger.warning('{} again, ignoring...'.format(base_msg))

except Exception as exc:
msg = 'Caught exception in launch (see debug for traceback): {}'.format(exc)
self.__logger.debug(traceback.format_exc())
self.__logger.error(msg)
ret = self._shutdown(reason=msg, due_to_sigint=False)
if ret is not None:
ret = await ret
assert ret is None, ret
return_code = 1
# keep running to let things shutdown properly
continue

Metadata

Metadata

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions