Adding extra time counter to exit while loop so that controller manager doesn't freeze #458

Merged: 12 commits, Jun 16, 2021


Contributor

@krishnachaitanya7 krishnachaitanya7 commented May 25, 2020

This PR addresses issue #443. When I switch controllers, the controller manager sometimes gets stuck in an unending while loop. The problem I am facing is similar to #265. The loop exits only when one of the following happens:

  • switch_params_.do_switch becomes false
  • ROS core exits
  • the timeout expires

In the first case, switch_params_.do_switch becomes false once switching succeeds, which is the ideal case.

ROS core exiting means that either the simulation was shut down or ROS core exited with a non-zero exit code.

If a timeout is specified, switching is abandoned once the timeout elapses and the response returned is false. The timeout is measured in ROS time. When I ran my simulation in Gazebo, ROS time froze, so the timeout could never expire. As a result, the code stays stuck in the aforementioned while loop and never comes out, which amounts to a denial of service: the simulation appears stuck and no further controller switching happens. The controller manager cannot even list which controllers are stopped or running, because its lock is never released. The only way out is to restart the entire simulation.

Hence I have added a time check to the while loop that uses system time instead of ROS time, so that if something like this happens the code can at least exit safely rather than hang. The changes in this PR still give ROS time every opportunity to function, but they also monitor the timeout against the system clock: if the while loop has not exited after timeout+1 seconds, it is forcibly broken out of. When the loop is exited this way, false is returned, signaling to the end user that the controller switch was not successful.
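
As a rough illustration of the idea (not the actual diff in this PR), the loop with a system-clock fallback could look like the sketch below. It leaves ROS out entirely so it can be compiled on its own: waitForSwitch, switch_pending, sim_now, and wall_grace_s are hypothetical names standing in for the wait loop, the switch_params_.do_switch flag, ROS time, and the extra one second. Applying the fallback only when a timeout is set is also an assumption.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical sketch: waits until switch_pending() returns false.
// Returns true if the switch completed, false if either timeout fired.
//   switch_pending : stand-in for reading switch_params_.do_switch
//   sim_now        : stand-in for ROS time in seconds (may freeze in simulation)
//   timeout_s      : requested timeout; <= 0 means "no timeout"
//   wall_grace_s   : extra wall-clock allowance (the "+1 second" described above)
bool waitForSwitch(const std::function<bool()>& switch_pending,
                   const std::function<double()>& sim_now,
                   double timeout_s,
                   double wall_grace_s = 1.0)
{
  using WallClock = std::chrono::steady_clock;

  const double sim_start = sim_now();
  const auto wall_start = WallClock::now();
  const std::chrono::duration<double> wall_limit(timeout_s + wall_grace_s);

  while (switch_pending())
  {
    if (timeout_s > 0.0)
    {
      // Primary timeout, measured on the simulated clock.
      if (sim_now() - sim_start > timeout_s)
      {
        return false;
      }
      // Fallback, measured on the system clock. It still fires when the
      // simulated clock has stopped advancing.
      if (WallClock::now() - wall_start > wall_limit)
      {
        return false;
      }
    }
    std::this_thread::sleep_for(std::chrono::microseconds(100));
  }
  return true;
}
```

With a sim_now that always returns the same value (a frozen clock) and a switch_pending that never clears, this function returns false after roughly timeout_s + wall_grace_s seconds of wall time instead of spinning forever.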

@krishnachaitanya7 krishnachaitanya7 changed the title from "Fix ROS Control Denial of Service Issue" to "Adding extra time counter to exit while 1 loop so that controller manager doesn't freeze" Jun 28, 2020
@krishnachaitanya7
Contributor Author

Hi @bmagyar! Could you please take a look at this PR? You commented on the issue I posted earlier, which this PR fixes. Thank you, and kindly let me know!

@peasant98

@bmagyar can you let us know the status of this? This is a known bug in the controller manager.

@krishnachaitanya7 krishnachaitanya7 changed the title from "Adding extra time counter to exit while 1 loop so that controller manager doesn't freeze" to "Adding extra time counter to exit while loop so that controller manager doesn't freeze" Jul 10, 2020
@peasant98

peasant98 commented Oct 23, 2020

@bmagyar -- still no updates on this. Have you been able to verify that this bug occurs on your end?

@bmagyar bmagyar changed the base branch from melodic-devel to noetic-devel June 16, 2021 09:08
@bmagyar bmagyar changed the base branch from noetic-devel to melodic-devel June 16, 2021 09:14
@bmagyar bmagyar merged commit 317dcc9 into ros-controls:melodic-devel Jun 16, 2021
@bmagyar
Member

bmagyar commented Jun 16, 2021

Thanks for the fix, I'm releasing this now to melodic and noetic!

@krishnachaitanya7
Contributor Author

Thank you so much @bmagyar!


3 participants