Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ODrive Node Does Not Feed Watchdog Timer Fast Enough #32

Open
ansarid opened this issue Oct 26, 2024 · 2 comments
Open

ODrive Node Does Not Feed Watchdog Timer Fast Enough #32

ansarid opened this issue Oct 26, 2024 · 2 comments

Comments

@ansarid
Copy link

ansarid commented Oct 26, 2024

I'm using the ODrive node in an application where I want to set the watchdog timeout to something around 100ms. But no matter how quickly I send control messages the axis state gets set back to IDLE. I'm currently sending control messages at 100Hz and the timeout is set to 0.1s. The only way I can prevent the ODrive from leaving the CLOSED_LOOP_CONTROL state is by setting the watchdog timeout to something larger than 1.0s. Anything lower than 1.0s and for some reason the watchdog does not stay fed even if I'm sending control messages at 100Hz. Not sure if this is a related issue but I'm suspicious it may be. When I continuously send control messages and then set the requested axis state to CLOSED_LOOP_CONTROL the odrive node will print many "Failed to send CAN frame" messages. I can kind of fix this by setting the QoS for the control message subscriber to only keep the last message instead of keeping all the messages as it is written right now. But this does not fix the watchdog issue and I'm not totally sure the implications of modifying the message queue length.

@ansarid ansarid changed the title ODrive Node Does Not Feed Watchdog Fast Enough ODrive Node Does Not Feed Watchdog Timer Fast Enough Oct 26, 2024
@yushi-minemura
Copy link
Contributor

yushi-minemura commented Nov 1, 2024

@ansarid

auto call_time = std::chrono::steady_clock::now();
fresh_heartbeat_.wait(guard, [this, &call_time]() {
bool complete = (this->ctrl_stat_.procedure_result != 1) && // make sure procedure_result is not busy
(std::chrono::steady_clock::now() - call_time >= std::chrono::seconds(1)); // wait for minimum one second
return complete;
}); // wait for procedure_result

/request_axis_service includes a 1-second wait process in service callback. I think it is timing out because the subscriber is not receiving the speed command topics while waiting.

related to #26 .

@samuelsadok
Copy link
Member

Yeah from looking at it I think "/request_axis_service" is probably the reason for the watchdog timeout. The ROS-side of the node runs on a single thread, so when ODriveCanNode::service_callback() is invoked and blocks for 1 second, it prevents any other incoming ROS messages from being handled.
The "Failed to send CAN frame" messages might be because lots of "ControlMessage" queued up while the ROS thread was blocked, and once it unblocks, trying to send them all at once over CAN made the SocketCAN queue fill up and reject some of the CAN frames.

The solution is to make service_callback() asynchronous, that is, return immediately and send the response later from a different function (e.g. from a timer or from the heartbeat RX handler).

Here's a draft how it can be done: https://github.com/odriverobotics/ros_odrive/compare/async-service
However I don't have the time to finish and test it right now (see TODO). If someone wants to have a go at it, you're welcome to open a PR.

(An alternative solution would be to use a rclcpp::executors::MultiThreadedExecutor, but I prefer to avoid unnecessary threads to keep reasoning about synchronization simple.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants