Enhancement: allow dynamic scan message size to reduce latency #94

drwnz · 2023-11-16T06:49:56Z

Description

Currently, the scan message containing raw UDP packets is sized to be the same as the number of packets in the scan.
Together with adding nebula_messages that has a generic udp packet message and packets message (currently "scan" message), enabling an arbitrary number of packets per packets message would allow tuning to optimize throughput.

Purpose

Dynamic number of packets in each packets message would allow a potential reduction in latency
Tuning can be performed to maximize based on a particular users bandwidth/network/ros configuration

Details

The reasoning behind this change is as follows:

Currently, the decoder has to wait for a whole scan's worth of packets to be collected before it starts decoding them
By reducing the number of packets in a scan or "packets" message, the latency could be decreased drastically as the decoder is always busy rather than waiting
Depending on network load, whether recording packets to ROSbag or not, and overhead from pub/sub there may be an optimal number of packets per packets message which is probably >1 (as a single packet message will reduce latency, but may increase message handling overhead)
The implementation of each decoder will have to be modified such that the decoder is entirely responsible for splitting scans at the appropriate place

Possible approaches

Create generic nebula_messages with nebula_packet and nebula_packets, which allow any number of packets (but keep track of the number of packets contained in the message with a field)
Add a parameter to the ros hardware interface wrappers to include number of packets per packets message
Modify the decoder ros wrappers to be agnostic to the number of packets in each received packets message

The text was updated successfully, but these errors were encountered:

mojomex · 2023-11-16T08:37:01Z

In my experimentation, sending 1 PandarScanMsg for each single PandarPacket reduced the latency between the final packet of a scan arriving in the HW interface, and the decoder wrapper publishing the pointcloud, from ≈9.0 ms to ≈1.4 ms for AT128 with around 70k output points per pointcloud.

At least for Hesai, the decoder and decoder wrapper are already agnostic to number of packets.

xmfcx · 2023-12-04T14:24:45Z

I recommend getting rid of ScanMsg altogether.
Maybe it can be published additionally, just for specific logging purposes.
But it shouldn't be a part of the runtime processing pipeline.

mojomex · 2023-12-05T03:09:42Z

I agree, this would additionally allow us to send smaller packets without padding (currently, PandarPacket etc. are always MTU_SIZE in length, even if the packet itself is smaller).

Using the scan message for logging would require us to implement a mechanism to replay the scan contents in a timing-accurate manner, i.e. we would additionally need to store packet timestamps (the ones in the packets themselves could be used but are vendor-specific and thus a pain to parse in the HW interface).

(also see discussion autowarefoundation#4024)

mojomex · 2024-03-21T15:29:51Z

I'm currently working on this along with refactoring Nebula towards being a single node:

refactor(hesai)!: combine Hesai ROS wrappers into a single node #127

mojomex · 2024-09-26T02:50:28Z

Fixed by #127, #147, #148.

Packets are now decoded on arrival, one-by-one. Scan messages are still containing one scan's worth of packets as recording with ros2 bag is still using sqlite3 on some systems, which has huge overhead. Once everyone moved to mcap, we can scale the scan messages to 1 packet per message.

drwnz added the enhancement New feature or request label Nov 16, 2023

drwnz assigned mojomex and drwnz Nov 16, 2023

xmfcx mentioned this issue Dec 4, 2023

perf: unpack velodyne packets in parallel #62

Closed

5 tasks

xmfcx mentioned this issue Dec 18, 2023

Improve the overall performance #109

Closed

mojomex mentioned this issue Apr 23, 2024

refactor(hesai)!: combine Hesai ROS wrappers into a single node #127

Merged

5 tasks

This was referenced May 14, 2024

refactor(robosense)!: combine Robosense ROS wrappers into a single node #147

Merged

refactor(velodyne)!: combine Velodyne ROS wrappers into a single node #148

Merged

mojomex mentioned this issue Jun 12, 2024

Performance Enhancement: Parallelized Unpacking of Velodyne Packets #61

Closed

mojomex closed this as completed Sep 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enhancement: allow dynamic scan message size to reduce latency #94

Enhancement: allow dynamic scan message size to reduce latency #94

drwnz commented Nov 16, 2023

mojomex commented Nov 16, 2023 •

edited

Loading

xmfcx commented Dec 4, 2023 •

edited

Loading

mojomex commented Dec 5, 2023 •

edited

Loading

mojomex commented Mar 21, 2024

mojomex commented Sep 26, 2024 •

edited

Loading

Enhancement: allow dynamic scan message size to reduce latency #94

Enhancement: allow dynamic scan message size to reduce latency #94

Comments

drwnz commented Nov 16, 2023

Description

Purpose

Details

Possible approaches

mojomex commented Nov 16, 2023 • edited Loading

xmfcx commented Dec 4, 2023 • edited Loading

mojomex commented Dec 5, 2023 • edited Loading

mojomex commented Mar 21, 2024

mojomex commented Sep 26, 2024 • edited Loading

mojomex commented Nov 16, 2023 •

edited

Loading

xmfcx commented Dec 4, 2023 •

edited

Loading

mojomex commented Dec 5, 2023 •

edited

Loading

mojomex commented Sep 26, 2024 •

edited

Loading