Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SENS: SF45: Improved Datahandling and fixes #23918

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

Conversation

Claudio-Chies
Copy link
Contributor

@Claudio-Chies Claudio-Chies commented Nov 11, 2024

General Improvement to the Sampling Rate and the Data handling of the Driver.

Basic Structure
We updated the internal logic as follows:

  • If during a sweep of the sensor, a bin gets skipped, we assume the last measurement to be valid for that bin as well.
  • If multiple measurements have been made in a bin, the lowest one is selected.
  • To avoid issues with the message processing structure in CollisionPrevention, the sensor publishes all bins he has measured in the past 0.1s. This is because the sensor is publishing messages at a faster rate than CollisionPrevention is reading them, which might cause lost readings.

Other notable changes:

  • greatly improved the Sampling Rate (tested up to 500hz)
  • Fixed boot up issues
  • Added Timeout functionality to measurements.
  • removed the publishing to distance_sensor

Important note
During testing we came across some power issues, if you are having boot up issues, make sure to power the Sensor separately. And/or shorten the shipped cable. See docs for more info.

Changelog Entry

For release notes:

Bugfix SF45 rotary lidar driver for Collision Prevention
Documentation: docs.px4.io

Relevant PR's

Test coverage

Note:
If there is a move towards a callback based internal obstacle map in collision prevention, then it would be best to publish a distance_sensor message instead of a obstacle_distance one.
here is a branch where this is already implemented:
https://github.com/PX4/PX4-Autopilot/tree/SF45-Rangefinder_approach

@Claudio-Chies
Copy link
Contributor Author

@dirksavage88 Would be nice if you could also do a quick review.

@Claudio-Chies
Copy link
Contributor Author

@dirksavage88
I added logic to remove stale values. Previously, all data was published with the current timestamp, so if a sector had an impossible reading, it would persist indefinitely (which has caused issues in the past). Now, there's an internal array that tracks timeouts for each sector and overwrites any data that's older than 0.5 seconds. While this threshold might seem high, it matches the timeout value used in Collision Prevention.

Copy link

github-actions bot commented Nov 20, 2024

🔎 FLASH Analysis

px4_fmu-v5x [Total VM Diff: 64 byte (0 %)]
    FILE SIZE        VM SIZE    
--------------  -------------- 
+0.0%     +64  +0.0%     +64    .text
  +0.9%     +48  +0.9%     +48    ../../src/modules/logger/logged_topics.cpp
  +0.0%     +13  +0.0%     +13    [section .text]
  +0.2%      +3  +0.2%      +3    ../../src/systemcmds/ver/ver.cpp
+0.0%     +56  [ = ]       0    .debug_abbrev
   +11%     +56  [ = ]       0    ../../src/lib/version/version.c
-0.0%      -8  [ = ]       0    .debug_aranges
  -5.0%      -8  [ = ]       0    ../../src/lib/version/version.c
+0.0%      +4  [ = ]       0    .debug_frame
+0.0%    +205  [ = ]       0    .debug_info
  -0.2%      -4  [ = ]       0    ../../src/lib/version/version.c
  +0.7%    +209  [ = ]       0    ../../src/modules/logger/logged_topics.cpp
+0.0%     +35  [ = ]       0    .debug_line
  -1.3%     -25  [ = ]       0    ../../src/lib/version/version.c
  +0.9%     +54  [ = ]       0    ../../src/modules/logger/logged_topics.cpp
  +0.5%      +6  [ = ]       0    task/task_cancelpt.c
+0.0%    +139  [ = ]       0    .debug_loc
  +1.4%    +139  [ = ]       0    ../../src/modules/logger/logged_topics.cpp
+0.0%     +49  [ = ]       0    .debug_ranges
  -2.6%      -8  [ = ]       0    ../../src/lib/version/version.c
  +4.0%     +56  [ = ]       0    ../../src/modules/logger/logged_topics.cpp
  +1.5%      +1  [ = ]       0    task/task_cancelpt.c
-0.4%     -64  [ = ]       0    [Unmapped]
+0.0%    +480  +0.0%     +64    TOTAL

px4_fmu-v6x [Total VM Diff: 56 byte (0 %)]
    FILE SIZE        VM SIZE    
--------------  -------------- 
+0.0%     +56  +0.0%     +56    .text
  +0.9%     +48  +0.9%     +48    ../../src/modules/logger/logged_topics.cpp
  +0.0%      +5  +0.0%      +5    [section .text]
  +0.2%      +3  +0.2%      +3    ../../src/systemcmds/ver/ver.cpp
+0.0%     +56  [ = ]       0    .debug_abbrev
   +11%     +56  [ = ]       0    ../../src/lib/version/version.c
-0.0%      -8  [ = ]       0    .debug_aranges
  -5.0%      -8  [ = ]       0    ../../src/lib/version/version.c
+0.0%      +4  [ = ]       0    .debug_frame
+0.0%    +205  [ = ]       0    .debug_info
  -0.2%      -4  [ = ]       0    ../../src/lib/version/version.c
  +0.7%    +209  [ = ]       0    ../../src/modules/logger/logged_topics.cpp
+0.0%     +27  [ = ]       0    .debug_line
  -1.3%     -25  [ = ]       0    ../../src/lib/version/version.c
  +0.9%     +54  [ = ]       0    ../../src/modules/logger/logged_topics.cpp
  -0.2%      -2  [ = ]       0    task/task_cancelpt.c
+0.0%    +139  [ = ]       0    .debug_loc
  +1.4%    +139  [ = ]       0    ../../src/modules/logger/logged_topics.cpp
+0.0%     +49  [ = ]       0    .debug_ranges
  -2.6%      -8  [ = ]       0    ../../src/lib/version/version.c
  +4.0%     +56  [ = ]       0    ../../src/modules/logger/logged_topics.cpp
  +1.5%      +1  [ = ]       0    task/task_cancelpt.c
-0.1%     -56  [ = ]       0    [Unmapped]
+0.0%    +472  +0.0%     +56    TOTAL

Updated: 2024-12-13T12:28:50

@PX4 PX4 deleted a comment from github-actions bot Nov 21, 2024
@PX4 PX4 deleted a comment from github-actions bot Nov 21, 2024
@PX4 PX4 deleted a comment from github-actions bot Nov 21, 2024
@dirksavage88
Copy link
Contributor

@Claudio-Chies these changes look good, but I have not had a chance to test on the sensor.

Also I am not an approving reviewer. Maybe @alexcekay would be willing to since he has looked through this code before.

@Claudio-Chies
Copy link
Contributor Author

another issue i found, which i'll fix in the next commit.
There can be bins which have no measurements as the sampling rate is too slow (here at 100hz, which seems currently be the limit the driver still properly works at according to @alexcekay.
to not run into the issue that the drone can not move into that direction, i'm filling in all bins between the last bin and the current one.
The problem can be seen below where bin 50, 49 and others are missing, but bin 46 gets 3 measurements. (point of change of rotation)

measurement happend
scaled_yaw:     -68,    current_bin:    58,     distance:         0.6900
measurement happend
scaled_yaw:     -77,    current_bin:    57,     distance:         0.6500
measurement happend
scaled_yaw:     -94,    current_bin:    53,     distance:         1.1900
measurement happend
scaled_yaw:     -103,   current_bin:    51,     distance:         2.8700
measurement happend
scaled_yaw:     -118,   current_bin:    48,     distance:         1.2900
measurement happend
measurement happend
scaled_yaw:     -123,   current_bin:    47,     distance:         1.2000
measurement happend
measurement happend
measurement happend
scaled_yaw:     -129,   current_bin:    46,     distance:         0.9800
measurement happend
measurement happend
scaled_yaw:     -126,   current_bin:    47,     distance:         1.0200

@Claudio-Chies
Copy link
Contributor Author

Claudio-Chies commented Nov 22, 2024

Screencast.from.11-22-2024.11.25.34.AM.webm

Attached we can see a replay of a log, with low and high angle set to -130, 130 respectively, and at 100hz.
The obstacle_distance message publishes the smallest measurement (if multiple exist) for all sectors between the last and current sector (if no measurements were made for certain sectors)

Copy link

FLASH Analysis

px4_fmu-v5x
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +0.0%    +209  [ = ]       0    .debug_info
    +0.7%    +209  [ = ]       0    ../../src/modules/logger/logged_topics.cpp
  +0.0%    +139  [ = ]       0    .debug_loc
    +1.4%    +125  [ = ]       0    ../../src/modules/logger/logged_topics.cpp
    +0.0%     +14  [ = ]       0    [section .debug_loc]
  +0.0%     +72  +0.0%     +72    .text
    +0.9%     +48  +0.9%     +48    ../../src/modules/logger/logged_topics.cpp
    +0.0%     +24  +0.0%     +24    [section .text]
  +0.0%     +56  [ = ]       0    .debug_ranges
    +4.0%     +56  [ = ]       0    ../../src/modules/logger/logged_topics.cpp
  +0.0%     +54  [ = ]       0    .debug_line
    +0.9%     +54  [ = ]       0    ../../src/modules/logger/logged_topics.cpp
  +0.0%      +4  [ = ]       0    .debug_frame
  -0.3%     -74  [ = ]       0    [Unmapped]
  +0.0%    +460  +0.0%     +72    TOTAL

px4_fmu-v6x
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +0.0%    +209  [ = ]       0    .debug_info
    +0.7%    +209  [ = ]       0    ../../src/modules/logger/logged_topics.cpp
  +0.0%    +139  [ = ]       0    .debug_loc
    +1.4%    +125  [ = ]       0    ../../src/modules/logger/logged_topics.cpp
    +0.0%     +14  [ = ]       0    [section .debug_loc]
  +0.0%     +72  +0.0%     +72    .text
    +0.9%     +48  +0.9%     +48    ../../src/modules/logger/logged_topics.cpp
    +0.0%     +24  +0.0%     +24    [section .text]
  +0.0%     +56  [ = ]       0    .debug_ranges
    +4.0%     +56  [ = ]       0    ../../src/modules/logger/logged_topics.cpp
  +0.0%     +54  [ = ]       0    .debug_line
    +0.9%     +54  [ = ]       0    ../../src/modules/logger/logged_topics.cpp
  +0.0%      +4  [ = ]       0    .debug_frame
  -0.1%     -70  [ = ]       0    [Unmapped]
  +0.0%    +464  +0.0%     +72    TOTAL

@dirksavage88
Copy link
Contributor

That makes sense. Yeah at the rotating point where it shifts to the opposite direction I do notice there are more samples
lightware_snip

@Claudio-Chies Claudio-Chies force-pushed the sf45_uart_fix branch 2 times, most recently from 1d4dcbf to 457e67c Compare November 29, 2024 12:39
@alexcekay
Copy link
Member

alexcekay commented Dec 3, 2024

Due to some reports of the driver not starting up in some cases and the update rate being too slow, I have made the following changes:

  • Use non-blocking UART reads. Prior to this change, the driver used V_TIME / V_MIN to achieve this. This is not supported on all platforms. For example, on a v5x / v6x, V_TIME / V_MIN won't have any effect. This can cause the driver to lock up completely in some cases.
  • Checking the response of the Product Name, Update Rate, Distance Output messages to avoid being out of sync with the real sensor state.
  • Improved the parser to handle cases where the read does not return the exact number of bytes requested with the start byte at index 0. This was necessary to achieve a stable, higher parsing rate for sensor data.
  • Increased the output rate to avoid gaps in the bins.
  • Changed the timeout logic from counter based to time based. This is useful as the counter increases faster/slower depending on the number of reads. The timeout-time does not depend on the update rate.

I did test the changes on the bench

  • The driver started correctly every time in a test of 300 v5x starts
  • The bin data looks good

It would be great to test the changes in the field.

@Claudio-Chies
Copy link
Contributor Author

Ok, so after having quite some issues with data not getting properly into the collision prevention obstacle map, i figured out the issue is that collision prevention has a sampling rate compared to this driver, causing messages to get lost, and by this some sectors would remain empty.
to fix this i went back with the timeout approach whereby i only discard measured sectors after a timeout which is currently hardcoded to 0.1s. this way we can avoid sectors getting lost.
tested on hardware, and looks good.

@dirksavage88
Copy link
Contributor

Ok, so after having quite some issues with data not getting properly into the collision prevention obstacle map, i figured out the issue is that collision prevention has a sampling rate compared to this driver, causing messages to get lost, and by this some sectors would remain empty. to fix this i went back with the timeout approach whereby i only discard measured sectors after a timeout which is currently hardcoded to 0.1s. this way we can avoid sectors getting lost. tested on hardware, and looks good.

Was the SF45 update rate too fast for collision prevention ?

I'm also working on an offboard-based ST Lidar array (4 i2c sensors) that publishes over microdds to /fmu/in/obstacle_distance and was experimenting with having each sensor be it's own publisher to the obstacle distance array, and how each sensor publishes to a specific set of bins on the obstacle array (for only it's specific zonal coverage), but not sure if collision prevention can handle the varying bin-filled obstacle messages (depending on the timing and each sensor's FoV)

@Claudio-Chies
Copy link
Contributor Author

Claudio-Chies commented Dec 5, 2024

e SF45 update rate too fast for collision prevention ?

Yes, seems like it. Collision prevention updates the obstacle map each time it gets called by StickAccelerationXY::generateSetpoints and uses the last published message.
it might be an idea to split out the update of the internal obstacle map into a separate functionality which gets called with each sensor update. If you have 4 distance sensors in an array, it might be best to either merge them into one obstacle map, where you can also add some timeout functionality, or publish them as distance sensors. There you'll have to add distance sensors to dds_topics.yaml, maybe more configuration is necessary, but there I'm not certain.
image
The image above is the SF45 publishing as a DistanceSensor at 500hz, and you can see the timing difference between it and obstacle_distance_fused

below you can see how the SF45 now published the data

Screencast.from.12-05-2024.09.41.40.AM.webm

@alexcekay
Copy link
Member

it might be an idea to split out the update of the internal obstacle map into a separate functionality which gets called with each sensor update

I think this is generally a good solution. Because now there is a lot of logic in the sensor (like the timeout logic for the obstacle map) that will most likely be needed by any kind of rotating LIDAR that wants to publish an obstacle map. From my point of view, the whole obstacle map is not really directly usable with a rotating LIDAR.

There it might be better to publish the obstacle map directly with only one array entry updated (as before) and have some intermediate module that builds a usable obstacle map for collision detection (and other modules that might need it as input). Or use the DistanceSensor message and its quaternion q instead (as you also mentioned some time ago 👍).

@Claudio-Chies
Copy link
Contributor Author

@alexcekay yes, I absolutely agree, some module which updates the internal map of collision prevention which gets called based on new measurements would be ideal, then sending the SF45 Data as Distance_Sensor messages would be the perfect solution.
for future reference: the branch with the DistanceSensor approach
https://github.com/PX4/PX4-Autopilot/tree/SF45-Rangefinder_approach

Copy link
Contributor

@sfuhrer sfuhrer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks generally very clean and reasonable.
Would appreciate if you could update the PR description. I think it's quite outdated, especially with Alex's contributions. Including the release note section.

} else {
_parsed_state = 3;
while (restart_flag != true) {
switch (_parsed_state) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could _parsed_state be an enum?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcekay could you maybe help here? I'm not sure what the different states actually are

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parsed state could be an enum yes. The different states are really the different fields of the sf45 packet that px4 receives and performs a crc on the bytes in each field and processes the data bytes in the payload field field (the raw data bytes, which includes distance and direction)

E.g.:
Header | Payload | Checksum

| Start | Flags Low | Flags High | ID | Data 0 .. N | CRC Low | CRC High

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dirksavage88 could you quickly the latest commit to check if that is how you meant bfcb08d

Copy link
Member

@alexcekay alexcekay Dec 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Claudio-Chies, just a quick note:
This expression SF45_PARSED_STATE::FLG_LOW; and the similar ones are not assignements, they basically do nothing. You need something of the following type: _parsed_state = <xyz>, when you want to modify the parsed state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ups, that was a Find and Replace mistake... pushed the commit with the fix

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The enum looks correct

Claudio-Chies and others added 2 commits December 13, 2024 13:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Sensors All the sensors!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants