Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set received_timestamp to system_clock::now() in message_info #491

Conversation

@MichaelOrlov
Copy link
Contributor Author

cc: @clalancette

Copy link
Contributor

@clalancette clalancette left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One really minor fix, then this looks great to me.

rmw_cyclonedds_cpp/src/rmw_node.cpp Outdated Show resolved Hide resolved
Co-authored-by: Chris Lalancette <[email protected]>
Signed-off-by: Michael Orlov <[email protected]>
Copy link
Contributor

@clalancette clalancette left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me with green CI! Thanks for doing this.

@MichaelOrlov
Copy link
Contributor Author

Pulls: #491
Gist: https://gist.githubusercontent.com/MichaelOrlov/ff54ec8c3c5bc15af4efb81f790fa4e3/raw/10e2986bf8ca550de03356ae87e0366ae5fcea20/ros2.repos
BUILD args: --packages-above-and-dependencies rmw_cyclonedds ros2bag rosbag2_transport rosbag2_tests
TEST args: --packages-above rmw_cyclonedds ros2bag rosbag2_transport rosbag2_tests
ROS Distro: rolling
Job: ci_launcher
ci_launcher ran: https://ci.ros2.org/job/ci_launcher/13833

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

rmw_cyclonedds_cpp/src/rmw_node.cpp Outdated Show resolved Hide resolved
@MichaelOrlov
Copy link
Contributor Author

MichaelOrlov commented May 7, 2024

Pulls: #491
Gist: https://gist.githubusercontent.com/MichaelOrlov/b4eb11ed0ac1fce5e8295432d7428ffe/raw/10e2986bf8ca550de03356ae87e0366ae5fcea20/ros2.repos
BUILD args: --packages-above-and-dependencies rmw_cyclonedds_cpp ros2bag rosbag2_transport rosbag2_tests
TEST args: --packages-above rmw_cyclonedds_cpp ros2bag rosbag2_transport rosbag2_tests
ROS Distro: rolling
Job: ci_launcher
ci_launcher ran: https://ci.ros2.org/job/ci_launcher/13834

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@clalancette
Copy link
Contributor

Ah, so the tests failing in CI are related.

The problem is that now that we have timestamp support, we need to update https://github.com/ros2/rcl/blob/f19067726b9bc2d0b90ac0d0f246d1d7821d41ee/rcl/test/CMakeLists.txt#L298-L316 to enable RMW_TIMESTAMPS_SUPPORTED for rmw_cyclonedds_cpp. I can open a PR for that in a little while here.

@clalancette
Copy link
Contributor

Here is new CI with both this and ros2/rcl#1156:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@MichaelOrlov MichaelOrlov changed the title Set received_timestamp to steady_clock::now() in message_info Set received_timestamp to system_clock::now() in message_info May 8, 2024
@clalancette
Copy link
Contributor

CI revealed that we also have to update the receive_timestamp for services. That should fix some of the failing tests in CI, but @MichaelOrlov there are quite a number of failing tests for rosbag2 as well. Can you take a look at those?

@MichaelOrlov
Copy link
Contributor Author

@clalancette I will take a look

@MichaelOrlov
Copy link
Contributor Author

@clalancette It is a bit unexpected error in rosbag2 player

[ERROR] [1715177950.046068203] [rosbag2_player]: Failed to publish message on 'loop_test_topic' topic. 
9: Error: failed to borrow loaned msg: error not set
9: [ERROR] [1715177950.046081143] [rosbag2_player]: Failed to publish message on 'loop_test_topic' topic. 
9: Error: failed to borrow loaned msg: error not set

@MichaelOrlov
Copy link
Contributor Author

In rosbag2 Player, we use rclcpp::GenericPublisher::publish_as_loaned_msg if the publisher allows to loan messages
https://github.com/ros2/rosbag2/blob/91468782eccd6b11a99e2da80f313fb7ea722d40/rosbag2_transport/src/rosbag2_transport/player.cpp#L242-L253

    explicit PlayerPublisher(
      std::shared_ptr<rclcpp::GenericPublisher> pub,
      bool disable_loan_message)
    : publisher_(std::move(pub))
    {
      using std::placeholders::_1;
      if (disable_loan_message || !publisher_->can_loan_messages()) {
        publish_func_ = std::bind(&rclcpp::GenericPublisher::publish, publisher_, _1);
      } else {
        publish_func_ = std::bind(&rclcpp::GenericPublisher::publish_as_loaned_msg, publisher_, _1);
      }
    }

It seems, for some reason, the .borrow_loaned_message() inside rclcpp::GenericPublisher::publish_as_loaned_msg fails. However, I have no idea why for a while.

@MichaelOrlov
Copy link
Contributor Author

@clalancette I've tried to checkout changes from this PR and from dependent ros2/rcl#1156 and build everything from scratch on my local machine. All rosbag2 tests passing on my local setup either with rmw_cyclonddds and rmw_fastrps.
May be we are missing some flags for CI run?

@clalancette
Copy link
Contributor

@clalancette I've tried to checkout changes from this PR and from dependent ros2/rcl#1156 and build everything from scratch on my local machine. All rosbag2 tests passing on my local setup either with rmw_cyclonddds and rmw_fastrps.
May be we are missing some flags for CI run?

Hm, that's weird. On my local setup, I can reproduce the errors with no problem.

That said, we definitely have a bug in rclcpp_lifecycle caused by ros2/rclcpp#2450 . In particular, this is causing errors to happen in CI ever since it has been merged:

      Start  7: test_lifecycle_publisher
 
7: Test command: /usr/bin/python3 "-u" "/home/jenkins-agent/workspace/nightly_linux_release/ws/install/ament_cmake_test/share/ament_cmake_test/cmake/run_test.py" "/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/test_results/rclcpp_lifecycle/test_lifecycle_publisher.gtest.xml" "--package-name" "rclcpp_lifecycle" "--output-file" "/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/ament_cmake_gtest/test_lifecycle_publisher.txt" "--command" "/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/test_lifecycle_publisher" "--gtest_output=xml:/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/test_results/rclcpp_lifecycle/test_lifecycle_publisher.gtest.xml"
7: Working Directory: /home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle
7: Test timeout computed to be: 60
7: -- run_test.py: invoking following command in '/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle':
7:  - /home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/test_lifecycle_publisher --gtest_output=xml:/home/jenkins-agent/workspace/nightly_linux_release/ws/build/rclcpp_lifecycle/test_results/rclcpp_lifecycle/test_lifecycle_publisher.gtest.xml
7: Running main() from /home/jenkins-agent/workspace/nightly_linux_release/ws/install/gtest_vendor/src/gtest_vendor/src/gtest_main.cc
7: [==========] Running 4 tests from 1 test suite.
7: [----------] Global test environment set-up.
7: [----------] 4 tests from PerTimerType/TestLifecyclePublisher
7: [ RUN      ] PerTimerType/TestLifecyclePublisher.publish_managed_by_node/wall_timer
7: [WARN] [1715154999.943244122] [LifecyclePublisher]: Trying to publish message on the topic '/topic', but the publisher is not activated
7: [ERROR] [1715154999.943921662] [node]: Unable to start transition 6 from current state shuttingdown: Could not publish transition: publisher's context is invalid, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl/src/rcl/publisher.c:423, at /home/jenkins-agent/workspace/nightly_linux_release/ws/src/ros2/rcl/rcl_lifecycle/src/rcl_lifecycle.c:368
7: [WARN] [1715154999.943982879] [rclcpp_lifecycle]: Shutdown error in destruction of LifecycleNode: final state(inactive)

I think this happens because the LifecycleNode destructor cannot assume that the publisher is still valid at this point. @fujitatomoya FYI, I may propose reverting that change for now.

@clalancette
Copy link
Contributor

In any case, since we are getting really weird behavior here, I'm going to try another CI to see what happens with my latest change:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@MichaelOrlov
Copy link
Contributor Author

@clalancette I figured out that I have a bit outdated versions of the other packages (about 3 weeks old). It might explain why test passes on my local setup.

I am trying to update the rclcpp package and reproduce the failure again. Perhaps I will be able to dissect with which PR we interfere if the issue starts reproducing on my local setup.

@clalancette
Copy link
Contributor

I'll also point out eProsima/Fast-DDS#4681 , which is suspiciously in this same area. I suspect maybe that one has a problem; if we fail again in CI, I'm going to try going earlier than that commit and seeing if that is the problem.

@MichaelOrlov
Copy link
Contributor Author

No luck in reproducing failures with the latest rclcpp on my local setup.
@clalancette BTW I think that eProsima/Fast-DDS#4681 is unrelated because you have a pinned 2.14.x version of the Fast-DDS in your gist

 eProsima/Fast-CDR:
    type: git
    url: https://github.com/eProsima/Fast-CDR.git
    version: 2.2.x
  eProsima/Fast-DDS:
    type: git
    url: https://github.com/eProsima/Fast-DDS.git
    version: 2.14.x

And i have the same version in my local setup.

@MichaelOrlov
Copy link
Contributor Author

Hm, from another hand, the rosbag2 tests failed on CI only with rmw_fastrtps_cpp

@clalancette
Copy link
Contributor

@clalancette BTW I think that eProsima/Fast-DDS#4681 is unrelated because you have a pinned 2.14.x version of the Fast-DDS in your gist

That's not pinned; it takes the latest from the 2.14.x branch.

@MichaelOrlov
Copy link
Contributor Author

@clalancette Ok this is certainly the Fast-DDS package. I've just updated it locally to the latest master and got the same errors as on CI.

@clalancette
Copy link
Contributor

clalancette commented May 8, 2024

All right, I'm going to try backing up to an earlier version of Fast-DDS in my repos file. Here's another CI try with that:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@MichaelOrlov
Copy link
Contributor Author

@clalancette It seems you were right about eProsima/Fast-DDS#4681.
I've just checked-out locally revision c2a4523a21a94302298bc995057719546b72df2d which is right before eProsima/Fast-DDS#4681 and issues gone on my lcoal setup.
Now building with eProsima/Fast-DDS#4681 to verify that it is a source of the failures.

@MichaelOrlov
Copy link
Contributor Author

@clalancette I confirm that eProsima/Fast-DDS#4681 is a source of errors in rosbag2 Player with messages:

 [ERROR] [1715197871.773078660] [rosbag2_player]: Failed to publish message on 'topic1' topic. 
2: Error: failed to borrow loaned msg: error not set
2: [ERROR] [1715197871.872948366] [rosbag2_player]: Failed to publish message on 'topic1' topic. 
2: Error: failed to borrow loaned msg: error not set

We can start preparing reverting PR. Don't need to wait until CI will finish.

@clalancette
Copy link
Contributor

Here's a PR to pin us back to a known working version of Fast-DDS: ros2/ros2#1560

MichaelOrlov added a commit to ros2/rosbag2 that referenced this pull request May 8, 2024
- The fix will be done on the `rmw_cyclonedds` layer in the scope of the
ros2/rmw_cyclonedds#491

Signed-off-by: Michael Orlov <[email protected]>
@clalancette
Copy link
Contributor

All right, CI in #491 (comment) was successful, and we've now pinned that version in both Rolling and Jazzy. With that, this is approved alongside ros2/rcl#1156 , so I'm going to merge both of them and backport both to jazzy. Thanks everyone for the work and reviews here.

@clalancette clalancette merged commit 76c9d8f into ros2:rolling May 9, 2024
2 checks passed
@clalancette
Copy link
Contributor

@Mergifyio backport jazzy

Copy link

mergify bot commented May 9, 2024

backport jazzy

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request May 9, 2024
* Set received_timestamp to steady_clock::now() in message_info

* Use 'system_clock' instead of 'steady_clock'

* Also update receive_timestamp for services.

Signed-off-by: Michael Orlov <[email protected]>
Signed-off-by: Chris Lalancette <[email protected]>
(cherry picked from commit 76c9d8f)
MichaelOrlov added a commit to ros2/rosbag2 that referenced this pull request May 9, 2024
… recording (#1641)

* Workaround for zeros in a message's received timestamp with CycloneDDS

- CycloneDDS doesn't support 'received_timestamps' and settling up
zeros in message_info for that field.
- Fallback to the timestamp from node clock in case of the CycloneDDS.

Signed-off-by: Michael Orlov <[email protected]>

* Fix typo in rmw_has_received_timestamp_support

Signed-off-by: Michael Orlov <[email protected]>

* Add one more check in tests that "recv_timestamp >= send_timestamp"

- Also fixed typo in 'rwm_has_send_timestamp_support' in test

Signed-off-by: Michael Orlov <[email protected]>

* Revert workaround for `rmw_cyclonedds` in recorder.cpp

- The fix will be done on the `rmw_cyclonedds` layer in the scope of the
ros2/rmw_cyclonedds#491

Signed-off-by: Michael Orlov <[email protected]>

---------

Signed-off-by: Michael Orlov <[email protected]>
clalancette pushed a commit that referenced this pull request May 10, 2024
…493)

* Set received_timestamp to steady_clock::now() in message_info

* Use 'system_clock' instead of 'steady_clock'

* Also update receive_timestamp for services.

Signed-off-by: Michael Orlov <[email protected]>
Signed-off-by: Chris Lalancette <[email protected]>
(cherry picked from commit 76c9d8f)

Co-authored-by: Michael Orlov <[email protected]>
mergify bot pushed a commit to ros2/rosbag2 that referenced this pull request May 23, 2024
… recording (#1641)

* Workaround for zeros in a message's received timestamp with CycloneDDS

- CycloneDDS doesn't support 'received_timestamps' and settling up
zeros in message_info for that field.
- Fallback to the timestamp from node clock in case of the CycloneDDS.

Signed-off-by: Michael Orlov <[email protected]>

* Fix typo in rmw_has_received_timestamp_support

Signed-off-by: Michael Orlov <[email protected]>

* Add one more check in tests that "recv_timestamp >= send_timestamp"

- Also fixed typo in 'rwm_has_send_timestamp_support' in test

Signed-off-by: Michael Orlov <[email protected]>

* Revert workaround for `rmw_cyclonedds` in recorder.cpp

- The fix will be done on the `rmw_cyclonedds` layer in the scope of the
ros2/rmw_cyclonedds#491

Signed-off-by: Michael Orlov <[email protected]>

---------

Signed-off-by: Michael Orlov <[email protected]>
(cherry picked from commit b7d9b5b)
MichaelOrlov added a commit to ros2/rosbag2 that referenced this pull request May 25, 2024
… recording (#1641) (#1673)

* Workaround for zeros in a message's received timestamp with CycloneDDS

- CycloneDDS doesn't support 'received_timestamps' and settling up
zeros in message_info for that field.
- Fallback to the timestamp from node clock in case of the CycloneDDS.

Signed-off-by: Michael Orlov <[email protected]>

* Fix typo in rmw_has_received_timestamp_support

Signed-off-by: Michael Orlov <[email protected]>

* Add one more check in tests that "recv_timestamp >= send_timestamp"

- Also fixed typo in 'rwm_has_send_timestamp_support' in test

Signed-off-by: Michael Orlov <[email protected]>

* Revert workaround for `rmw_cyclonedds` in recorder.cpp

- The fix will be done on the `rmw_cyclonedds` layer in the scope of the
ros2/rmw_cyclonedds#491

Signed-off-by: Michael Orlov <[email protected]>

---------

Signed-off-by: Michael Orlov <[email protected]>
(cherry picked from commit b7d9b5b)

Co-authored-by: Michael Orlov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Rosbag2 does not work at all with CycloneDDS on Jazzy
4 participants