Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for preemption in actions #284

Open
naiveHobo opened this issue May 27, 2020 · 39 comments
Open

Add support for preemption in actions #284

naiveHobo opened this issue May 27, 2020 · 39 comments
Labels
backlog enhancement New feature or request

Comments

@naiveHobo
Copy link

Currently, there's no predefined way to handle preemption in rclcpp_actions. When a client sends a new goal when a previous goal is running, the old goal's state is set to ABORTED. This means that there's no way to differentiate between an abort caused by a true failure in the server and an abort caused by a request for preemption.

A client should be able to distinguish between the result codes received on true abort and abort due to preemption.

There are some ongoing discussions on the tickets and PRs listed below:
ros2/rclcpp#1104
ros2/rclcpp#1117
ros2/rcl_interfaces#105
ros-navigation/navigation2#1652

@naiveHobo
Copy link
Author

@jacobperron can we move the discussion here?

@jacobperron
Copy link
Member

Yes, this is a good place to consolidate discussion. Thanks!

@jacobperron jacobperron added the enhancement New feature or request label May 27, 2020
@SteveMacenski
Copy link
Contributor

@naiveHobo @jacobperron any progress on this?

@jacobperron
Copy link
Member

@naiveHobo @jacobperron any progress on this?

I don't have bandwidth to work on this, but maybe sometime for G-turtle.

It would be good to get input from some of the original authors of the design document of actions. cc/ @gbiggs @sloretz

@SteveMacenski
Copy link
Contributor

That would be helpful, this represents a pretty substantial bug in navigation2 that we need to resolve far before g-turtle. Ideally, I'd like a proposed solution with an implementation in review before the end of June. I'm also open to getting directly involved here myself but I think I'm probably more useful after the design discussion takes place and helping with the implementations.

@jacobperron
Copy link
Member

Patches like ros2/rclcpp#1117 (and the connected PRs) change API and so I don't think adding them to Foxy is likely to happen. I haven't had a chance to take a closer look into the situation, but if it can't be worked around in navigation2 or patched in rclcpp_action in an API (preferably ABI) compatible way for Foxy, then it will have to wait for G-turtle.

@jacobperron
Copy link
Member

it will have to wait for G-turtle.

To clarify, I'm proposing we work on this change for G-turtle.

@SteveMacenski
Copy link
Contributor

Ah yes, sorry, I interpreted that as working on it around g-turtle. I'm OK with it being in g-turtle's release.

Though, I think this is a breaking change that is worth making mid-distribution (assuming it doesn't change default behavior). I'd usually agree we shouldn't change it, but this particular issue at least on the nav side is pretty serious and there's no feasible work around. But lets work through the issue first and then we can figure it out later. Because this is LTS is the reason I want this in since this will represent a really serious bug for years to come after its been fixed. If it was non-LTS, I wouldn't care as much since it would go out of scope relatively quickly.

@naiveHobo
Copy link
Author

naiveHobo commented Jun 11, 2020

The proposal is to add a preempt branch to the existing goal state machine. This would allow all the existing action servers to keep running without breaking anything but also give the ability for an action server to initiate a preemption when a previous goal is being handled and a new goal arrives. preempt can be implemented the same way as abort with the only difference being the terminal state of the goal handle.

Would this change then be classified as a breaking change? It looks to me that this is just an addition to the existing capabilities.

RCLActionDesign

I'd love your inputs to this design proposal.

@jacobperron
Copy link
Member

@naiveHobo Thanks for providing an updated diagram (visuals are useful!).

After more thought, I'm not sure that we want to add the proposed "preempted" state.

Taking a step back, the only difference in the proposed "preempted" state and the "aborted' state is semantics (ie. why did the action server stop the goal?). It appears to me that there could be many reasons that a server stops the execution of a goal, and we certainly can't be expected to capture specific cases for all applications in this general state machine.

In the context of ros-navigation/navigation2#1652, the navigation2 behavior tree relies on the difference of an action goal terminating because it was overridden by a higher priority goal (e.g. a new path from the planner) versus some other reason (e.g. a path could not be found). If it were some other case, I would say that the result message should contain information to communicate the reason for an aborted goal. But, I see how having a layer of abstraction on top of ROS 2 actions complicates things; the simple action server would need to be adapted to communicate the desired terminal state information, regardless of the type of action.

On the other hand, it might be worth adding the new "preempt" goal state to make implementing N-goal action servers (like the simple action server) more straight-forward. I don't think I've thought enough about adapting the simple action server to resolve ros-navigation/navigation2#1652.

Would this change then be classified as a breaking change? It looks to me that this is just an addition to the existing capabilities.

It depends on how it is implemented. One of our goals is to maintain ABI compatibility within a ROS distribution. While we may be able to maintain ABI compatibility, I'm wary about breaking application code that may be relying on the goal state (e.g. what happens when an application receives an unexpected goal state "preempted"?). As a concrete example, the command line tool ros2action should be updated to be aware of new states. Admittedly, it would be a trivial change in the case of ros2action, but I would expect many similar changes to be needed in other application code. IMO,the risk of a more severe break is too great. Perhaps I'm wrong, but we should keep this in mind as we move forward. E.g. it may be the case that we update the state machine during the Galactic cycle and decide the risk of backporting to Foxy is acceptable.

@SteveMacenski
Copy link
Contributor

SteveMacenski commented Jun 12, 2020

Small thing, but I think in your diagram the arrows with "preempt" should be "preempting" to be in line with the other ones, but not a big deal.

I don't think that there should be a transition between cancelling and preempt. If its in the middle of canceling, we shouldn't be able to preempt it, that's probably bad form (and also if its being canceled, that's at odds with the semantic meaning of preemption). I think the only state to transition to preempted is from executing.

Taking a step back, the only difference in the proposed "preempted" state and the "aborted' state is semantics (ie. why did the action server stop the goal?).

Just to flip that around a bit, the only difference between "aborted" and "canceled" is also semantics (ei. why did the action stop the goal?). The way it exits is semantically important to represent as you have already identified by separating the concepts of a failure to process and a request to stop processing. Creating a separate concept of stop processing because superseded completes the exit criteria. This would also be usable for not just 1-goal-actions but also N goal actions where you have a limited number of workers to process goals and you want the most recent ones to go at the expense of older ones. When you exit, then its good to know that you didn't fail to process, or request it, but rather something else was more important so it dropped that goal.

If it were some other case, I would say that the result message should contain information to communicate the reason for an aborted goal.

The result code of the action is what we're looking at and what we should look at. If you suggest that we rather include all that same information in the result of the action server, then what's the point at all of having the rclcpp_action::ResultCode that implements every single other return state? What you'd be suggesting would be to completely duplicate this logic. Could that be done? Sure, but that's a really poor solution. Its also not general so it doesn't help the N other single-goal cases where this is important information (and makes any attempt at a generic action server wrapper fall flat).

While preemption has clear uses outside of the 1-goal-at-a-time action servers, ignoring all that for a moment, a super common use of rclcpp_action is for 1-goal-at-a-time tasks. Lets narrow into that a bit; navigate to a position with a physical asset, move a robot arm to a pose for a pick, move a forklift bed up, move a bulldozer bud down, etc. The return semantics for preemption is really important to be handled generally. The suggestion above to have the return codes in the result is non-general and would require then all of these common applications to have to reinvent the existing wheel per-action-message. To me, that defeats the purpose of then using this general rclcpp_actions library and the really smart design decision to have rclcpp_action::ResultCode. The role of a generic action library is to be able to use it to do generic actions. If we cannot enable a really common application of this library generically, rclcpp_action really isn't all that useful.

Anyhow, I think you get my point, don't mean to harp too much on it. I think the other (bad) option would be to remove the rclcpp_action::ResultCode completely then and make it the action interface's problem to enumerate and communicate the valid return codes per action.

@fujitatomoya
Copy link
Collaborator

If it were some other case, I would say that the result message should contain information to communicate the reason for an aborted goal.

since there is a clear difference between abort and preempt including use cases, I am not really comfortable with this either. for out robot products, preempt actually happens much more than abort.

@sloretz
Copy link
Contributor

sloretz commented Jun 18, 2020

I'm having trouble understanding the motivation for a preempt state. It looks like nav2_behavior_tree::BtActionNode is behavior tree node that sends a goal to an action server, waits for the result of the goal, and sets an appropriate status for the tree. If the action is successful or canceled by another client, it returns success. If the action is aborted by the server, it seems like it would forever get stuck in the BT::NodeStatus::RUNNING state.

As far as I understand the motivation is to enable BtActionNode to assume the action server aborts currently executing goals to make room for new goals, and enable it to do something when that happens to its own goal. It can't make that assumption now because there's no generic way to tell why the server aborted the goal.

After reading ros-navigation/navigation2#1652 , I can't tell what it is the BtActionNode is supposed to do if it could tell its goal was preempted. I doubt getting stuck in the RUNNING state forever is desired. Maybe the motivation is to make the BtActionNode resend the goal to preempt the goal that preempted it? I doubt that's the motivation either because two BtActionNode would get locked prempting each other forever. What does a preempt state enable BtActionNode to do?

Stepping back, I wonder whether BtActionNode should assume action servers implement ROS 1 SimpleActionServer style preemption. What if the action server can process multiple goals at once (probably not true of an action server in the context of navigating a robot)? What if the action server chose to reject new goals until the current one finished? What if the action server chose to queue goals, holding them in state ACCEPTED while in the queue and them moving to EXECUTING when they're done?


Just to flip that around a bit, the only difference between "aborted" and "canceled" is also semantics (ei. why did the action stop the goal?).

The result code of the action is what we're looking at and what we should look at. If you suggest that we rather include all that same information in the result of the action server, then what's the point at all of having the rclcpp_action::ResultCode that implements every single other return state?

This is an interesting way to look at it. In the spirit of flipping things around 🙃 , what if instead of adding more states, we removed existing ones?

Say the state machine for a goal only has three states: Accepted, Executing, and Done.

 +--------+          +---------+
 |Accepted+--------->+Executing|
 +--------+          +---------+
      |                   |
      |                   |
      |                   v
      |                +----+
      +--------------->+Done|
                       +----+

Canceling is no longer a state in the state machine. When the goal is Done, the server includes a ResultCode explaining why the goal was moved to Done with some common values defined:

  • Success: the server moved the goal to Done because it was completed as asked
  • Canceled: the server moved the goal to Done because a client asked to cancel it
  • Impossible: the server moved the goal to Done because the goal is no longer achievable as described
  • Preempted: The server moved the goal to Done because it chose to execute a newer goal instead
  • Timed out: The server moved the goal to Done because it ran out of time to complete it
  • Malfunction: The server moved the goal to Done because it became unable to complete goals, maybe it's not getting odometry or the system has run out of memory
  • Unknown: The server moved the goal to Done, and it's too hard to explain why

This would enable others to define their own result code values without waiting on the action design to change, such that they can be proved and added to a list of enums in the next release.

@SteveMacenski
Copy link
Contributor

SteveMacenski commented Jun 19, 2020

I don't really want to discuss the specifics of BtActionNode since its off topic and I think it will digress this discussion to unimportant nit-picking. The specific way we handle the action server isn't really important for this request, I don't think. Regardless of implementation details, the concept of supporting preemption was important in ROS1 and missing in ROS2 & its useful context.


This is an interesting way to look at it. In the spirit of flipping things around, what if instead of adding more states, we removed existing ones?

I suppose we could do that, but I don't see what that buys us, other than making the action server implementation take more responsibility over communicating state knowledge. That might be nice for power users, but probably not nice for other users. If the reason you mention that is because you don't want to deal with more changes in the future, I don't know that after preemption there is any other serious ask where we might want to extend this, given ~10 years of ROS1 actionlib existing. If we didn't come up with more in that phase, I don't expect anything more to come in the next ~10 years of ROS2 actions, this request is just to match feature parity.

Reducing the action state machine I don't think is a good or bad idea intrinsically, it just is a shift in perspective of responsibility. However if you reduce the state machine but keep the ResultCode the same, then its actively inhibiting again. You'd need to essentially remove the concept of a ResultCode (or extend it like we're asking for) and just return the action and let the action server implementation be responsible for fully communicating result information in its result fields. I think the ResultCode was a great idea in ROS2 to let us abstract away the result information from the specific implementation, but it just needs to be complete and on parity with actionlib common uses. Your suggestion above includes Preemption, so I'm OK with it either way. I don't know that reducing the state machine but including an extended ResultCode definition buys you much if any simplification. But I think we'd be open for that direction if you believed in it strongly.

@sloretz
Copy link
Contributor

sloretz commented Jul 1, 2020

I suppose we could do that, but I don't see what that buys us, other than making the action server implementation take more responsibility over communicating state knowledge.

Hopefully it makes it easier to implement an action server because it means fewer functions to be aware of.
The design proposed by @naiveHobo means someone implementing an action server must choose to call one of:

Simplifying the state machine means there would only be one method instead of those four.

  • done(goal_status, result_msg)

Simplifying might also mean the name of is_canceling() would change to client_requested_cancelation() since canceling would no longer be a state in the state machine.


Regardless of implementation details, the concept of supporting preemption was important in ROS1 and missing in ROS2 & its useful context.

Action servers can already implement preemption in ROS 2 by aborting the current goal when they accept a new goal.
If I understand correctly this issue is about making action clients aware that preemption is the reason for their goal being aborted.

I don't really want to discuss the specifics of BtActionNode since its off topic and I think it will digress this discussion to unimportant nit-picking. The specific way we handle the action server isn't really important for this request, I don't think.

Some amount of discussion of BtActionNode is relevant for justifying the design change.
Why does BtActionNode want to know if an action server preempted its goal?
What will BtActionNode do that it wouldn't do in any other kind of abort?

Maybe BtActionNode doesn't need to be discussed if it's justified by existing use cases in ROS 1, so I went looking for how preemption was used client side.
It looks like ROS 1 action clients get this info from the PREEMPTED enum in the GoalStatus message or the member in Python.
I used the prerelease script to check out all repos that depend on actionlib (depth 1) in ROS Melodic and grep'd for PREEMPTED.
Here's what I found ignoring unit tests and c++ functions that translate action status to a human readable string.

The only potentially interesting places to check are moveit_controller_manager::ExecutionStatus::PREEMPTED or smach_ros outcome 'preempted'.
I didn't dig into their dependencies though.
Maybe someone familiar with those can chime in how preemption is used?
All the other cases above seem to be burdened by the PREEMPTED status.
It's either an extra terminal state to check, or it's another state to pass through from one action server to another.

@gbiggs
Copy link
Member

gbiggs commented Jul 1, 2020

It appears to me that there could be many reasons that a server stops the execution of a goal, and we certainly can't be expected to capture specific cases for all applications in this general state machine.

I don't think I agree with this, although I do agree with your point that we don't want to handle all the application reasons for stopping a goal in the action state machine.

The reasons why the server stopped a goal are going to be quite specific. It was either cancelled externally, or it was cancelled internally. The latter (internal cancellation) is going to be due to not being able to achieve the goal any more, or the goal being superseded if the server is implemented with the capability to stop one goal in favour of another. The former (external cancellation) are going to be application-specific and we cannot do anything more to support them, other than perhaps providing a pass-through string that allows a client cancelling a goal to say why it's cancelling it, for other clients to know (I do not think this is a good idea).

Preemption, as a concept, may fall under external cancellation or internal cancellation, depending on your point of view. For example (non-exhaustive list):

  1. A client sends a goal. The server can handle only one goal at a time. Later on, that same client decides it wants to send a new goal, stopping the execution of its current goal.
  2. A client sends a goal. The server can handle only one goal at a time. Later on, a different client decides it wants to send a goal, stopping the execution of the current goal.
  3. A client sends a goal. The server can handle only one goal at a time. Later on, a different client decides it wants to send a goal. It doesn't care what the server is currently doing, just whether or not its goal was accepted and when it begins executing.
  4. The server can handle multiple goals. Several clients send goals, such that the server is now executing its maximum number of goals.

Depending on your application design, all of the above cases could be treated as "cancel the/a current goal, then start this goal" (external cancellation) or "I've received a new goal so I'll dump this existing one" (internal cancellation).

The way that preemption can be read differently depending on the application design makes me think that adding it as an explicit state to the actions state machine is not a good idea. The semantics of what it means are too application-dependent, in my opinion.

However, I do agree that there is benefit in providing a way for an action server to communicate to clients that a goal was stopped because another goal is going to be executed, whether the reason for that was a server-side decision or a client-side decision. It helps when there are multiple clients of a single server (if you have one client and one server, you can track this on the client side easily enough, but that's not the case we should design for).

The proposal to drop down to one terminal state and make more use of the result value sounds good, on the face of it. It both simplifies the state machine and makes it easier to provide additional reasons for why the goal reached that state in the future. However I think there is value in having separate states for "goal finished" and "goal not finished". It makes the distinction between successful completion and not completing (in a good way or a bad way) clear.

@gbiggs
Copy link
Member

gbiggs commented Jul 1, 2020

for out robot products, preempt actually happens much more than abort

This makes me want to know: are you preempting goals a lot, or is it that you want to know they were preempted?

Preemption-from-the-client is possible in the current design by cancelling the current goal. Knowing a goal was preempted is also possible if that is the only time your application cancels goals.

So I wonder if what you, @fujitatomoya, are really asking for is that the server automatically preempts running goals (also possible now) or if you want to distinguish between preempted and cancelled-for-other-reasons? It's this latter one that is the issue here, I think,

@SteveMacenski
Copy link
Contributor

done(goal_status, result_msg)

@sloretz so you're just proposing we change the API so that there's only 1 exit function now (and then we extend the goal_status message to have preempt or whatever else we might want). Yeah, I'd be OK with that. We send the goal code / result code and then the result message rather than letting the implied succeeded ... do it under the hood. To verify, all the existing functionality would still be around, just on the user to specify with this result / goal code rather than the API method used? We'd still add preemption to the result code / goal code?

Action servers can already implement preemption in ROS 2 by aborting the current goal when they accept a new goal. If I understand correctly this issue is about making action clients aware that preemption is the reason for their goal being aborted.

That assumes that the goal when returns a result has access to state information about the new goal, which I think breaks some important encapsulation. Goal 17 shouldn't rely on or require any knowledge about Goal 23. It also assumes that there's only 1 client talking to the server for there to be some internal state tracking that Goal 17 and 23 came from the same place for Goal 17 to be told "ok, you ended because I just sent 23".

From the long posts, what I'm still getting from it is a general agreement on the done() proposal, as long as the result / goal codes add preempt to the enum list.

@SteveMacenski
Copy link
Contributor

This makes me want to know: are you preempting goals a lot, or is it that you want to know they were preempted?

Probably preempting because the target pose has changed or refined over time.

or if you want to distinguish between preempted and cancelled-for-other-reasons?

I think that's what we're all discussing here

@sloretz
Copy link
Contributor

sloretz commented Jul 1, 2020

Action servers can already implement preemption in ROS 2 by aborting the current goal when they accept a new goal. If I understand correctly this issue is about making action clients aware that preemption is the reason for their goal being aborted.

That assumes that the goal when returns a result has access to state information about the new goal, which I think breaks some important encapsulation. Goal 17 shouldn't rely on or require any knowledge about Goal 23. It also assumes that there's only 1 client talking to the server for there to be some internal state tracking that Goal 17 and 23 came from the same place for Goal 17 to be told "ok, you ended because I just sent 23".

@SteveMacenski I don't understand. Are you talking about the action server or action client? The action server must know about all of the goals it is executing, otherwise how would it decide to preempt one?

@fujitatomoya
Copy link
Collaborator

are you preempting goals a lot, or is it that you want to know they were preempted?

both, as @SteveMacenski mentioned, it is for pose, behavior and so on.

you want to distinguish between preempted and cancelled-for-other-reasons?

yes, at least the client has to be notified that the goal is preempted (not aborted).

@SteveMacenski
Copy link
Contributor

@sloretz sorry, maybe a full example is required.

Action servers can already implement preemption in ROS 2 by aborting the current goal when they accept a new goal. If I understand correctly this issue is about making action clients aware that preemption is the reason for their goal being aborted.

Action servers can abort old goal when a new one comes in, lets call the old goal Goal 17 and the new one Goal 23.

For the situation with 1 action client: When Goal 17 result comes back, you're then requiring the action client application to be aware of the state of every recent goal so that it returns a failed state code, that its because a new one was processed. You're asking for the application to have to retain state about the nature of many goals because the state returned by the action server is incomplete. I think that breaks some important concepts of encapsulation of individual goals as independent actions. It also basically requires an application to have to maintain its own separate state for goals because it can't trust the actual results, which is pretty hacky.

For the situation with multiple action clients: when Goal 23 preempts Goal 17, and they were each applied by a different client, there's really no way for 17 to know why it was cancelled / failed. We'd then have to have each of these clients linked together to transmit information to let each other know what's happening so that they can properly handle the incomplete failed terminal code. Same arguments for encapsulation and hacky solutions.

@gbiggs
Copy link
Member

gbiggs commented Jul 3, 2020

From the long posts, what I'm still getting from it is a general agreement on the done() proposal, as long as the result / goal codes add preempt to the enum list.

Yes, in general I am supportive of having either a single done state or a pair of completed and terminated states, and relying on the return code to know why the goal terminated early.

@jacobperron
Copy link
Member

I also support the idea to simplify the goal states, and/or allowing the user to pass their own return code. I do think that there is still value in communicating "succeeded" and "aborted" states (or "completed" and "terminated", whatever we decide to call them). Having this pair of states will let tools (like ros2action) know the general outcome of a goal; success or failure.


It might be good at this point to discuss the API we imagine with a single done state (or pair completed and terminated states).

I'm thinking we want to add an additional "user" return code to GoalStatus.

Here's a couple example definitions we can poke at:

With single terminated state

# Indicates status has not been properly set.
int8 STATUS_UNKNOWN   = 0

# The goal has been accepted and is awaiting execution.
int8 STATUS_ACCEPTED  = 1

# The goal is currently being executed by the action server.
int8 STATUS_EXECUTING = 2

# The client has requested that the goal be canceled and the action server has
# accepted the cancel request.
int8 STATUS_CANCELING = 3

# The goal finished. This could mean the goal was completed or was terminated early.
int8 STATUS_DONE = 4

# Goal info (contains ID and timestamp).
GoalInfo goal_info

# Action goal state-machine status.
int8 status

# User provided return code. Useful for communicating application specific information.
int8 return_code

With "succeeded" and "aborted" states

# Indicates status has not been properly set.
int8 STATUS_UNKNOWN   = 0

# The goal has been accepted and is awaiting execution.
int8 STATUS_ACCEPTED  = 1

# The goal is currently being executed by the action server.
int8 STATUS_EXECUTING = 2

# The client has requested that the goal be canceled and the action server has
# accepted the cancel request.
int8 STATUS_CANCELING = 3

# The goal was achieved successfully by the action server.
int8 STATUS_SUCCEEDED = 4

# The goal was terminated by the action server without an external request.
int8 STATUS_ABORTED   = 5

# Goal info (contains ID and timestamp).
GoalInfo goal_info

# Action goal state-machine status.
int8 status

# User provided return code. Useful for communicating application specific information.
int8 return_code

@gbiggs
Copy link
Member

gbiggs commented Jul 9, 2020

I am against having a user defined return code if there is only one "finished" state. It would make it hard for tools to know in a generic way whether a goal actually finished or not.

If we have succeeded and aborted states, then I see value in having a user-defined return code, but I don't understand why it can't just be in the action definition. That's where application-specific stuff goes, isn't it? Even if we do have one at the API level, I think it should be in addition to the built-in return code, not overwrite it.

@SteveMacenski
Copy link
Contributor

SteveMacenski commented Jul 9, 2020

I'm thinking we want to add an additional "user" return code to GoalStatus.

To back up a little, in this case, do you imagine GoalStatus and ResultCode are identical? They seem to service a very similar, if not same, purpose. I'm just wanting to make sure as we're talking about GoalStatus that this is also being given to the clients in the form of result codes as well when they receive a "finished" notification on the on_done callbacks.

In this case, the ING states aren't terminal states (terminal I mean completed, no implication on success or a form of failure) that are included here. It seems like we really should be chatting about ResultCode not GoalStatus, or at least just talk about completed states. The ING states I don't imagine change, the feedback should still tell the client if it was accepted / processing.

Both of these examples lack a preemption status. I don't understand what is achieved here.


I am against having a user defined return code if there is only one "finished" state. It would make it hard for tools to know in a generic way whether a goal actually finished or not.

I think we need to be really explicit on what we're talking about. When I'm thinking about 1 "finished" [thing], what I'm talking about is 1 finished API for the server, e.g. action_server_->terminate(code). I do not support having a single finished state in the form of the codes and results. The codes and results should be the same + preemption, but leave it to the action server to tell the terminate() function how it terminates (success, cancel, abort, preemption) with an ActionT::Result for a function signature of void terminate(const ResultCode & code, const ActionT::Result result = ActionT::Result()).

@jacobperron
Copy link
Member

do you imagine GoalStatus and ResultCode are identical?

To clarify, a goal result is made up of two pieces: 1) the GoalStatus message and 2) the user-defined result message. So, when the user is accessing the result for a finished goal, they are ultimately getting the GoalStatus (which should hopefully be in a terminal state 🤞). Concretely, in the rclcpp_action implementation, GoalStatus.status is cast to a rclcpp_action::ResultCode

When I'm thinking about 1 "finished" [thing], what I'm talking about is 1 finished API for the server, e.g. action_server_->terminate(code). I do not support having a single finished state in the form of the codes and results. The codes and results should be the same + preemption, but leave it to the action server to tell the terminate() function how it terminates (success, cancel, abort, preemption) with an ActionT::Result for a function signature of void terminate(const ResultCode & code, const ActionT::Result result = ActionT::Result()).

Sorry, I think I misinterpreted the above discussion. I was thinking about reducing the number of hard-coded terminal states and use the "return code" as a way to pass additional information. But, I believe @sloretz's proposal was to relax the possible number of terminal states (which makes way more sense). I.e. we can keep the existing definition of GoalStatus and change the implementation to not enforce that the status field be one of the existing enum values. Then for example, navigation2 could introduce it's own status code for "preempt", while we decide if it's worth adding as a general terminal state.

@jacobperron
Copy link
Member

Regardless of implementation details, the concept of supporting preemption was important in ROS1 and missing in ROS2 & its useful context.

I'm still confused by this point. When we designed actions for ROS 2, we kept the "PREEMPTED" and "PREEMPTING" states from ROS 1 (ROS 1 defintion), but chose to rename them "CANCELED" and "CANCELING" as a preference.

I guess maybe the real difference boils down to in ROS 1, the action server could trigger the transition to PREEMPTING, whereas in ROS 2 it must be triggered by an action client request?

@gbiggs
Copy link
Member

gbiggs commented Jul 10, 2020

An action server is free to cancel a currently-executing goal if it wants to. What is lacking is the ability to report to clients that a goal was cancelled because a new goal is being executed instead. In other words, a new goal preempted a currently-executing goal.

@SteveMacenski
Copy link
Contributor

Exactly what @gbiggs stated.

OK GoalStatus == ResultCode, just making sure.

There are 3 terminal failure statuses in my opinion that we've been discussing:

  • Failed because it failed due to internal processing (abort)
  • Failed due to external user asking it to stop (cancel)
  • Failed due to server stopping to process due to non-processing related cases (preempt)

@fujitatomoya
Copy link
Collaborator

Failed because it failed due to internal processing (abort)
Failed due to external user asking it to stop (cancel)
Failed due to server stopping to process due to non-processing related cases (preempt)

That is exactly our requirement!

@jacobperron
Copy link
Member

jacobperron commented Jul 13, 2020

Failed due to external user asking it to stop (cancel)
Failed due to server stopping to process due to non-processing related cases (preempt)

I guess I'm missing how these two failure modes were communicated differently in ROS 1 actionlib.

Edit: I guess this feature doesn't exist in ROS 1.

@sloretz
Copy link
Contributor

sloretz commented Jul 17, 2020

There has been a lot of good discussion so far about implementation. I don't see a description of the use case yet. Why does the Action client need to know its goal was preempted?

Existing uses in ROS 1 haven't answered that question. While action clients had a preempted state in ROS 1, I didn't find any ROS 1 client side use of this knowledge that was different from the aborted state.

you want to distinguish between preempted and cancelled-for-other-reasons?

yes, at least the client has to be notified that the goal is preempted (not aborted).

@fujitatomoya in your use case, why does the action client need to know the goal was preempted? What does the client do when the goal is preempted that is different from what it would do if it was aborted?

Action servers can already implement preemption in ROS 2 by aborting the current goal when they accept a new goal. If I understand correctly this issue is about making action clients aware that preemption is the reason for their goal being aborted.

Action servers can abort old goal when a new one comes in, lets call the old goal Goal 17 and the new one Goal 23.

For the situation with 1 action client: When Goal 17 result comes back, you're then requiring the action client application to be aware of the state of every recent goal so that it returns a failed state code, that its because a new one was processed. You're asking for the application to have to retain state about the nature of many goals because the state returned by the action server is incomplete. I think that breaks some important concepts of encapsulation of individual goals as independent actions. It also basically requires an application to have to maintain its own separate state for goals because it can't trust the actual results, which is pretty hacky.

@SteveMacenski I'm not sure we're talking about the same thing. It's true action action clients in ROS 2 aren't able to tell if their goal was replaced by another goal on the server. I was only saying action servers can implement preemption. The server may choose to abort 17 when it accepts 23, and the client for 17 would get notice that its goal was aborted. Why does BTActionNode need client side knowledge that the server accepted a newer goal?

Failed due to external user asking it to stop (cancel)
Failed due to server stopping to process due to non-processing related cases (preempt)

I guess I'm missing how these two failure modes were communicated differently in ROS 1 actionlib.

@jacobperron Good point, I does look like they're indistinguishable once the goal became active. Once the goal reaches the active state in ROS 1 it does not seem possible to tell the difference between the action client canceling the goal and the action server preempting the goal because they both get reported as preempted.

@SteveMacenski
Copy link
Contributor

SteveMacenski commented Jul 18, 2020

I'll just respond to my tagged element. It needs to know because in the case that it was cancelled due to failure, we will enter a recovery state which would involve probably stopping the robot if not already stopped and clearing state or taking aggressive measures to resolve. If its just a preemption because there's a new goal, then we don't want to do anything and keep chugging along.

Our architecture makes substantial use of action servers preempting each other with new information when required. Every time a new plan comes from the planner server, the controller server is preempted and given it to now track. That means without it, every 1 hz or something the robot would stop which is really bad. Without specifying preemption vs failure, its then not possible for us to process outright failures properly because the failures are currently being assumed to only be from preemption because the APIs are making us misuse them.

@fujitatomoya
Copy link
Collaborator

@sloretz

thanks for checking on this,

in your use case, why does the action client need to know the goal was preempted? What does the client do when the goal is preempted that is different from what it would do if it was aborted?

  • abort
    application assumes that failure happens on server. this becomes error state for clients, and the clients will check the system status from the other nodes to recover.
  • preempt
    this can be expected to happen with priority, if preemption happens, that means higher priority request needs to run. in that case, the client will check attention level and behavior status, then wait until it can request again.

as application perspective, above is really different i guess.

@fujitatomoya
Copy link
Collaborator

@sloretz

friendly ping;

@SteveMacenski
Copy link
Contributor

Another friendly ping :-)

@gavanderhoorn
Copy link
Contributor

And a friendly ping from me.

@doisyg
Copy link

doisyg commented Jul 14, 2023

And another warm ping from me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backlog enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

9 participants