Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design actions API and implementation #183

Closed
wants to merge 3 commits into from
Closed

Design actions API and implementation #183

wants to merge 3 commits into from

Conversation

gbiggs
Copy link
Member

@gbiggs gbiggs commented Aug 1, 2018

This is a continuation of #143, but without the messed up git history.

I have cleaned up the formatting and revised some of the information, but there is still a lot of work to be done on design. My hope is that we can fill it out based on prototypes and the needs of the navstack2 work.

@esteve esteve added the in review Waiting for review (Kanban column) label Aug 1, 2018
@paulbovbel
Copy link

paulbovbel commented Aug 1, 2018

A grab-bag of opinions based on my experience with actionlib(1):


  1. From (Start article defining actions #143 (comment)), I'm not sure if it needs to go as deep as rmw, but actions should probably live at the https://github.com/ros2/rcl layer. One of the biggest quality of life improvements in ROS2 is that a new client library doesn't have to be a ground-up port, and can take advantage of a central implementation. Actions being such a big part of what makes ROS useable (long-running pre-emptable RPC), while also being fairly complex - it would make sense for the implementation to be shared between rcl* libraries.

  1. ROS2 is designed to run on unreliable networks, and this is something that wasn't really accounted for in ROS1 actionlib.

There's quite a few interesting cases that come up when the link is not reliable:

  • server never ACKs client goal (already addressed by making goal submission a service rather than a topic)
  • client receives a result but not a status, or vice versa
  • client loses comms with the server entirely for X cycles and reappears, long after status/result has been broadcast

It may be worthwhile to consider how actionlib(2) should work in these situations. Even if a base ActionServer implementation just has a large queue and a heavy disclaimer 'this will lose goals, sometimes', it would be helpful to expose the appropriate interfaces so that a user could extend/compose a base ROS2 ActionServer into a PersistentActionServer, where an absentee client could determine a goalstatus long after the server has completed it.


  1. Something that comes up occasionally is the idea of pausing/resuming an action (Extending ActionServer with Pausing capabilities ros/actionlib#109), which IMO makes sense for a long-running thing.
  • while I'd hate to make the action state machine spec even more complicated, maybe this could fall under a general purpose mechanism? Could we allow clients to issue a goal 'update' under an existing goal_id, where a 'paused' boolean flag could be toggled and handled in user code?

  1. If you're going to implement long-running pre-emptible callbacks running on a spinner, it would be seems useful to be able to trigger those from client code without having to go through RPC, period, using non-message types as arguments and bypassing serialization cost.
  • To that end it would be great if the 'action' business logic was useable in a ROS2 node, without dealing with action servers/clients. Something submitting a piece of work to a node and getting a 'pre-emptible' future back.
  • Perhaps between type masquerading and intra-process comms, this is irrelevant?

@mkhansenbot
Copy link

I agree with point 1 in your list @paulbovbel That was basically what I was thinking in my "option 2" in the #143 thread. I don't know of any benefit of plumbing down to the rmw layer, but would also want @wjwwood or @tfoote to comment if they know of a better way to implement it.

@dirk-thomas
Copy link
Member

The action feature doesn't require any knowledge in the rmw layer. It is basically "syntactic sugar" which provides a combination of a preemptable service with a feedback topic. So this should be implement in the client libraries and not touch rmw.

@wjwwood
Copy link
Member

wjwwood commented Aug 1, 2018

I agree, rcl is ok, but rmw is too low level for actions.

@davetcoleman
Copy link
Contributor

I've reviewed this design proposal and, aside from some missing sections, it seems okay to me. A few thoughts:

Can you add a section specifically calling out how actionlib2 will be different than 1? Aside from using ROS2 under the hood of course? It appears to be the same from my initial read, which I think is a good thing. I see it will be treated as a "first class citizen" but how does that actually pan out? I've never seen any issues using actionlib1 in ROS as a second class citizen except that its missing some of the introspection tools that messages and services have.

I'd like to mention one thing I didn't like about actionlib1 is that it had two different server interfaces - the SimpleActionServer and the not simple one. This always felt hacky and complicated and I'd prefer just one approach.

@paulbovbel
Copy link

paulbovbel commented Aug 1, 2018

I'd like to mention one thing I didn't like about actionlib1 is that it had two different server interfaces - the SimpleActionServer and the not simple one. This always felt hacky and complicated and I'd prefer just one approach.

+1

This re-opened some memories. I feel like the SimpleActionClient/Server came baked in with some surprises, due to the implicit background thread and goal preemption handling.

IMO what would be a much 'simpler' experience for a ROS beginner is a small tutorial on 'How to write an action server callback to only have one goal active at a time', along the lines of:

active_goal_handle = None

void callback(goal_handle)
{
  if active_goal_handle_ is not None:
    active_goal_handle.preempt()
    active_goal_handle = goal_handle
  // do work
}

treated as a "first class citizen" but how does that actually pan out

I always interpreted this as a commitment to having actions in the rcl common libraries alongside topics and services, so as to avoid actions being reimplemented (see rosjava, rosnodejs)

@mkhansenbot
Copy link

So if I understand correctly the implementation of an Action service would be to create action structs at the rcl layer that contain at least one service for the async call / response, and at least one feedback topic. How would the cancellation work? Is that via the same service interface (a null for the goal_handle in the callback example given above) or is there a third connection needed to cancel a pending request?

@wjwwood
Copy link
Member

wjwwood commented Aug 2, 2018

Either is possible, I had always imagined having a separate service for canceling, perhaps just one per node, no matter how many action servers there were or perhaps one for each action server.

But you could also "inject" an "action" enum into the request message, so it looked something like (roughly .msg pseudo code):

## Start "inserted" control fields and enums.
# Make an initial action request (user defined request payload).
ACTION_REQUEST=1
# Cancel an on going action request (must set request_id).
ACTION_CANCEL=2

# Controls purpose of service call
# (must be set, as default value of 0 will produce an error)
uint8 actionlib_control_signal = 0
# Only used when canceling.
RequestID actionlib_request_id
## End "inserted" control fields and enums.

## Start user defined request fields.
Pose goal
# ...
## End user defined request fields.

----

# User defined feedback.

----

## Start "inserted" control fields and enums.
# Used to signal success or failure for canceling.
bool actionlib_success = false
# Optional text reason for success of failure of action.
string actionlib_reason = ""
## End "inserted" control fields and enums.

## Start user defined request fields.
## End user defined request fields.

Ideally we would never do this, since each Service type would ideally only deal with a single task, because that makes the types easier to understand and easier to reuse. It also reduces the inserted "magic" on top of what the user expressed in their request/feedback/response definition.

However, we may be compelled to do this for performance reasons (yet to be seen how much overhead each Service causes).

@mkhansenbot
Copy link

So an action server would have 2 services, one for executing the action, one for cancelling it, and one feedback topic that it publishes.

@wjwwood
Copy link
Member

wjwwood commented Aug 2, 2018

@mkhansen-intel Yes, that's how I imagined it.

But as I said, we might have to be pragmatic if we run into scaling issues or something.

In the past we had thought perhaps that all Services would be preemptible, but that was before we decided to delegate to DDS based RPC calls for the first implementation. In which case actions would have just been one (preemptible) service and a topic.

@mkhansenbot
Copy link

So what should the behavior be if a second command is issued while one is in progress, should it pre-empt the first, same as cancelling? Or should it queue the second command? Or should that be a configurable option for each action?

@mkhansenbot
Copy link

Also, I'm not very familiar with the message generation / parsing code.

I think the changes needed for parsing would be in https://github.com/ros2/rosidl/blob/master/rosidl_parser/rosidl_parser/__init__.py

  • Something like the "class ServiceSpecification" code would need to be added, like "class ActionSpecification" and "parse_action_string" to parse the action files.

And for generation, changes would be here: https://github.com/ros2/rosidl/blob/master/rosidl_generator_cpp/rosidl_generator_cpp/__init__.py

  • generate_cpp would need to be updated

Am I missing more that would need to happen to generate the messages?

What about tools changes?

@mkhansenbot
Copy link

As an alternative to what I said above, I guess we wouldn't have to define an action message type at all, it could be kept separate as 2 services and a message. It wouldn't have the syntactic niceness that the current ActionLib has of defining it all in one .action file though. Thoughts?

@mikaelarguedas
Copy link
Member

Also, I'm not very familiar with the message generation / parsing code.

I think the changes needed for parsing would be in https://github.com/ros2/rosidl/blob/master/rosidl_parser/rosidl_parser/__init__.py

Something like the "class ServiceSpecification" code would need to be added, like "class ActionSpecification" and "parse_action_string" to parse the action files.

This sounds about right 👍

One thing to figure out is how we expect users to interact with the generated datastructures (in what namespace the generated structure should live etc).
In ROS 1, an action definition results in a set of messages (no services) so they all end up (in Python) in the namespace <MY_PKG>.msg with message names like" <ActionName>Goal, <ActionName>Feedback....
If in ROS 2 these will be based on services for some parts and topics for others, should some be generated in the <MY_PKG>.msg and some in <MY_PKG>.srv? or all in a new namespace <MY_PKG>.action? or something else?

Am I missing more that would need to happen to generate the messages?

The other generators should be updated accordingly (rosidl_generator_c and rosidl_generator_py).

@wjwwood
Copy link
Member

wjwwood commented Aug 7, 2018

So what should the behavior be if a second command is issued while one is in progress, should it pre-empt the first, same as cancelling? Or should it queue the second command? Or should that be a configurable option for each action?

I would say it should be possible to have multiple actions in flight simultaneously and have an option to only have one at a time. You should also consider if that's requests from all requesters or from individual requesters.

In ROS 1, it's possible to have more than one at a time (you basically just provide callbacks for when a goal is received and when a goal is canceled):

https://docs.ros.org/api/actionlib/html/classactionlib_1_1ActionServer.html

But the simple action server class (which some people complained about being a separate thing) provides the "single goal at a time" behavior which is desirable in many cases, and maybe should even be the default, but I'm not sure:

https://wiki.ros.org/actionlib#SimpleActionServer_Goal_Policies

@gbiggs
Copy link
Member Author

gbiggs commented Aug 7, 2018

Thanks for all the feedback. I'm going to update the article over the next couple of days.

@mkhansen-intel Do you have a repository where prototyping work is being done?

@mkhansenbot
Copy link

@gbiggs - no I don't have a repo for this. I am mainly trying to figure out how to spur it along to keep it from blocking the ros-planning/navigation2 project. I'd be willing to do some of the work but not sure I can take it all on myself, especially since I haven't made changes in most of these layers before, I'm used to using the rclcpp layer interface.

My thinking right now is that to provide what the navigation2 project needs in the short term, I would probably implement the rclcpp node interface, with ActionServer and ActionClient classes that wrap the Service and Client classes at that level, and leave the other work for OSRF to implement when they have time. I know that's not ideal in the long term.

Alternately, if someone else was willing to do the rcl layer work, I would be willing to do the rclcpp layer to use that instead, and we could work together on a fork until it was ready for submitting the PRs for both.

I'm just not confident I could do everything in a timely fashion, as I'm not an expert in these layers.

@mkhansenbot
Copy link

So, how compatible with the existing ROS ActionLib interface should this be? Does it need to be an exact match?

@wjwwood
Copy link
Member

wjwwood commented Aug 23, 2018

So, how compatible with the existing ROS ActionLib interface should this be? Does it need to be an exact match?

Do you mean at the API level or on the wire? For the API level, similar is good but I wouldn't hesitate to change the API if it makes sense (sometimes you have to do so, especially where assumptions about the node singleton in ROS 1 are involved).

We should also give some thought to how this could be mapped over the bridge. We don't need to implement that now, but if actions are conceptually different in ROS 2 somehow, then it might become impossible to bridge them across the ros1_bridge and that would be unfortunate. So we should have a good reason if we decide to do something with that consequence.

@mkhansenbot
Copy link

@wjwwood - Here is a branch where I've added a node->create_action_server() and node->create_action_client(). https://github.com/mkhansen-intel/rclcpp/tree/actions

And here is a branch for examples for both. https://github.com/mkhansen-intel/examples/tree/action_examples

These are both purely in the rclcpp layer and not plumbed through to the rcl layer.

Also, I've not implemented the tools changes to generate the actions message types from an actions file. So for right now, the API requires a Service (ie. AddTwoInts) and a Feedback message type (ie. std_msgs/String). See the examples. Later this could be combined into one template argument.

I think this is enough to start with for our team doing Navigation2, with the expectation that when the tools changes could be implemented and *.action files supported, we could easily change the template interface to one .

In order to send a request to the server, it's using a client->send_async_request() and to cancel it is client->cancel_request(). Right now cancel_request() is not returning any status, if we want it to have a status, we would need to talk about how to do that.

Also, I haven't implemented any state machine for the client. This is all purely callbacks, and the expectation right now is that the node itself is implementing some state machine if it is needed. Do we want a state machine? If so wouldn't that be implemented in the rcl layer?

Please take a look and let me know your thoughts and whether this is a useful starting point. If you want I can submit a PR to allow for comments there too.

@mkhansenbot
Copy link

@wjwwood, @mikaelarguedas @gbiggs - please take a look at my post above and provide some feedback.

@mkhansenbot
Copy link

All, I want to continue work on this but need some guidance / direction. I know that ActionLib implemented state machines for the Client and Server: http://wiki.ros.org/actionlib/DetailedDescription Is the design goal to make Actions in ROS2 fully equivalent to the ActionLib Actions? If so, these state machines are needed, right? Also, what about the APIs? I see this: http://docs.ros.org/lunar/api/actionlib/html/classactionlib_1_1SimpleActionClient.html
and this:
http://docs.ros.org/lunar/api/actionlib/html/classactionlib_1_1SimpleActionServer.html

Are we aiming for API compatability with equivalent functions for all of these in ROS2?

@davetcoleman
Copy link
Contributor

I might not be the best person to answer these questions, but here are my thoughts:

Is the design goal to make Actions in ROS2 fully equivalent to the ActionLib Actions?

I would grealy prefer this so that current usages of actionlib (e.g. everywhere in MoveIt!) are easy to port to ROS2

If so, these state machines are needed, right?

Yes, even being able to re-use a lot of this documentation would be great

Are we aiming for API compatability with equivalent functions for all of these in ROS2?

I don't think API compatibility is necessary... it should instead look and feel like the new ROS2 API for services and messages. Hopefully we can find-replace a lot of the old API to port to the new API

@wjwwood
Copy link
Member

wjwwood commented Sep 12, 2018

@wjwwood - Here is a branch where I've added a node->create_action_server() and node->create_action_client(). https://github.com/mkhansen-intel/rclcpp/tree/actions

And here is a branch for examples for both. https://github.com/mkhansen-intel/examples/tree/action_examples

Those look reasonable on their own. I haven't tried comparing them to ROS 1's examples for actions to see how they differ.

Also, I've not implemented the tools changes to generate the actions message types from an actions file. So for right now, the API requires a Service (ie. AddTwoInts) and a Feedback message type (ie. std_msgs/String). See the examples. Later this could be combined into one template argument.

I guess that's ok, but you could also go ahead and create a structure which contains the composition of the two message/service types as an "action". Then in the future you'd just pass a different struct (provided by the generated code).

@wjwwood
Copy link
Member

wjwwood commented Sep 12, 2018

I think this is enough to start with for our team doing Navigation2, with the expectation that when the tools changes could be implemented and *.action files supported, we could easily change the template interface to one .

Hmm, I'm concerned about this because you're not going to be able to use the existing action definitions and instead will have to create new messages and services based on them in the meantime. I suppose that's fine, but it would obviously be better to get action file generation going.

Right now cancel_request() is not returning any status, if we want it to have a status, we would need to talk about how to do that.

It also appears to be the same type as the request/response service... I would have thought it would be a standard Type like https://github.com/ros2/common_interfaces/blob/master/std_srvs/srv/Trigger.srv.

Is the design goal to make Actions in ROS2 fully equivalent to the ActionLib Actions?

Equivalent in usage and functionality, yes, but not necessarily in implementation details.

This is all purely callbacks, and the expectation right now is that the node itself is implementing some state machine if it is needed.

If so, these state machines are needed, right?

I don't think we will need the state machine as-is from actionlib in ROS 1 because we're using services in ROS 2 whereas in ROS 1 actionlib used topics for everything (IIRC). But I think we might need a state machine of some sort. I can't answer this without digging into the details (http://wiki.ros.org/actionlib/DetailedDescription) and figuring out what applies to our approach in ROS 2 and what doesn't anymore. Someone will need to do that and ideally them summarize their conclusions in the design document.

Perhaps there's a reason buried in the details description of actionlib which gives a reason that they went with all topics and a state machine rather than using two services (request/reply and cancel). Honestly using topics for everything just like it was done in ROS 1 is still on the table for me, but naively it appears the services would make it simpler.

Are we aiming for API compatability with equivalent functions for all of these in ROS2?

As @davetcoleman mentioned, we should have equivalent API's so that porting is straight forward, but updating them to match the style of ROS 2 or to take advantage of a new feature is fine.

We already discussed above that there's a "simple" and full API for actions in ROS 1, but I honestly don't know if we need the same in ROS 2. My initial thought is to implement the normal API and at that point we can consider whether or not we need to have something equivalent to the simple API, because we should be able to add on the simpler API at any point since my impression is that it's only syntactic sugar.

@mkhansenbot
Copy link

Thanks for the input, I can easily make the change to the cancel to use the Trigger service type, I wasn't aware of that one. I think we need more clarity of the design, who is leading this from OSRF?

@wjwwood
Copy link
Member

wjwwood commented Sep 19, 2018

No one is spending time on actions right now. Originally we planned to have 4-ish people on Actions in October, but I don't know if that's still the plan or not.

@mkhansenbot
Copy link

Maybe we can discuss in person at ROSCon?

@gbiggs
Copy link
Member Author

gbiggs commented Sep 25, 2018

I'm sorry for leaving this for so long. My time is not (yet) my own. I'd like to discuss at ROSCon and come up with a plan of attack so we can deal with actions on all fronts in a coordinated way: the design document, the rclc layer, and the client libraries built on top of that.

@mkhansenbot
Copy link

@gbiggs - It would be great to meet at ROSCon and discuss, let me know when and where

@gbiggs
Copy link
Member Author

gbiggs commented Sep 25, 2018

Let's set up a birds of a feather meeting during the time allocated for those.

@mkhansenbot
Copy link

Unfortunately there are no BOF sessions this year. All I saw was this post on discourse: https://discourse.ros.org/t/roscon-2018-informal-meetings-of-special-interest-groups/6151

We could potentially meet during the lunch hour somewhere.

@gbiggs
Copy link
Member Author

gbiggs commented Sep 26, 2018

Lunch on the first day seems open.

This was referenced Oct 9, 2018
@sloretz
Copy link
Contributor

sloretz commented Nov 8, 2018

@gbiggs What would you like to see happen to this PR and #193? Here are a couple options for your consideration, there may be more.

@gbiggs
Copy link
Member Author

gbiggs commented Nov 11, 2018

I'm happy for this to be closed and #193 called a continuation of it.

@sloretz
Copy link
Contributor

sloretz commented Nov 12, 2018

Understood, I'll close it then. Continued in #193

@sloretz sloretz closed this Nov 12, 2018
@sloretz sloretz removed the in review Waiting for review (Kanban column) label Nov 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants