Improve the performance of computing sin and cos functions #234
base: melodic-devel
This is awesome, thanks a lot. We will test this soon. |
I just noticed that there are some other places that use |
6cd9cc7 to abbdbee
Sorry for taking so long to get back to this PR. I have exhaustively tested the changes and decided to enable |
Just commenting here as a user of TEB. |
Yes and no. There is a benefit in computing sin and cos at the same time, so I must take them as references. I then used the same interface for computing only the sine just for consistency.
What do you have in mind?
I have not benchmarked since the change cleared the performance bottleneck altogether. |
To me the API is not comfortable and very prone to errors.
Just reasoning on which approximation to use. For instance, the implementation linked above seems very compact - also with respect to the number of operations performed. But not sure on the performance. |
It is really compact. Thanks for the reference, by the way. I have tested the precision here and it is better: the maximum error is 0.001 with the approximation you suggested, compared to 0.003 with the one I implemented. Besides, the error is zero at zero, which I assume is desirable for numerical conditioning. I'll compare the performance just for the sake of it, but it should be equally good (if not better).
I have changed the approximation method to the one suggested by @RainerKuemmerle. Although the maximum absolute approximation error is smaller, I've noticed a more pronounced deterioration in convergence when the approximation is used in the edges. Since I have disabled this and the PR uses the approximations only for distance calculations, I think it doesn't really matter. Besides, the API is the same as the one in the standard library, as pointed out by @RainerKuemmerle.
Have you experienced any overhead when using your wrapper for std::sin and std::cos compared to the plain implementation? |
The overhead is unnoticeable given that the other methods called while solving the optimization are much more expensive. Profiling shows no difference as well. |
I forgot about this one for a while. Would this PR be interesting? Should I consider modifications to improve it (besides resolving the conflicts)? |
If it delivers the performance improvement you describe, then yes, it is definitely interesting.
…ove-sin-cos-performance
9dd56fe to 6dd3639
@@ -376,6 +383,9 @@ class TebConfig
recovery.oscillation_recovery_min_duration = 10;
recovery.oscillation_filter_duration = 10;

// Recovery
- // Recovery
+ // Performance
grp_performance.add(
    "use_sin_cos_approximation",
    bool_t,
    0,
    "Use sin and cos approximations to improve performance. The maximum absolute error for these approximations is 1e-3.",
    False
)
keep same format as for other params
I didn't notice any speedup in computeVelocityCommands. These are the times for 100 consecutive calls without and with fast sin/cos on the same route: std: fast. I also didn't notice any reduction in CPU usage, but I measured that with top, which is obviously not a sound method. Where does the speedup show up? PS: sorry, I broke the PR again with the latest merge 😞
If I remember it correctly, the
I have left the changes out of the edge calculations because adding them there caused the optimization to oscillate. The macro is also an idea but it would
That's alright =). I will see if I can make the changes asap. The full disclaimer is that I am not compiling or running since my company uses ROS2 now. |
Once again, this PR affects quite a lot of files, but the change is actually not that big; besides, it is optional. 🤓
TL;DR
For me, this PR saves 40% CPU without any noticeable impact on convergence. The maximum absolute error between the analytic sin and cos functions and the approximations is 0.003.
Why
In the application I work on, the major performance bottleneck we faced when using TEB was computing three simple functions: sin, cos, and pow. Initially, that came as a surprise to me, but it makes sense, since these functions are used all over the code that computes the graph edges.
Summary
There are basically two ways of speeding up the computation of trigonometric functions: look-up tables and mathematical approximations. I did not benchmark any of the alternatives, but I am confident that the approximation I used is more than enough to eliminate the performance bottleneck caused by the trigonometric functions.
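For reference, the look-up-table route mentioned above could look roughly like the sketch below. It is not part of this PR and was not benchmarked; the names `lut_sin`, `lut_cos`, and the table size of 1024 are assumptions chosen only for illustration.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

namespace lut_sketch {

constexpr double kTwoPi = 6.283185307179586;
constexpr std::size_t kSize = 1024;  // assumed table resolution

// Table of sin() sampled uniformly over [0, 2*pi], built once on first use.
inline const std::array<double, kSize + 1>& sine_table() {
  static const std::array<double, kSize + 1> table = [] {
    std::array<double, kSize + 1> t{};
    for (std::size_t i = 0; i <= kSize; ++i) {
      t[i] = std::sin(kTwoPi * static_cast<double>(i) / kSize);
    }
    return t;
  }();
  return table;
}

// Hypothetical helper: sine via table look-up with linear interpolation.
inline double lut_sin(double angle) {
  // Wrap the angle into [0, 2*pi).
  double x = std::fmod(angle, kTwoPi);
  if (x < 0.0) x += kTwoPi;

  const double pos = x * (kSize / kTwoPi);
  std::size_t i = static_cast<std::size_t>(pos);
  if (i >= kSize) i = kSize - 1;  // guard against rounding at the upper end
  const double frac = pos - static_cast<double>(i);

  const auto& t = sine_table();
  return t[i] + frac * (t[i + 1] - t[i]);
}

// Cosine reuses the same table through cos(x) = sin(x + pi/2).
inline double lut_cos(double angle) { return lut_sin(angle + kTwoPi / 4.0); }

}  // namespace lut_sketch
```

The trade-off compared to a polynomial is memory traffic: the table has to stay in cache to be fast, whereas a low-order polynomial only needs a few multiplications.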
Comparison
As an example, this is the typical flame graph I get without the trigonometric approximations (sorry, but I had to blur the image where proprietary code is called).
And this one is with approximations for sin and cos.
So, for me, the improvement was roughly 40% less CPU usage.
Parameters
use_sin_cos_approximation under a new tab called "Performance". This toggles the approximations and defaults to false.
Formula
It is a pretty simple approximation: I take a second-order polynomial approximation of sin and cos on the interval from 0 to 45 degrees, and then use trigonometric identities to extend the result to the rest of the circle. For comparison, these are the analytic functions and their approximations.
And these are the absolute errors.
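To make the scheme concrete, here is a minimal sketch of the idea, not the code in this PR. The exact coefficients are not given in the description, so the sketch assumes quadratics that interpolate sin and cos at 0, 22.5, and 45 degrees. For these nodes the standard interpolation error bound gives a worst case of roughly 4e-3, which is in the same range as the 0.003 quoted above but not identical to it. The names `sincos_approx`, `Quadratic`, and `interpolate` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

namespace approx_sketch {

constexpr double kPi = 3.14159265358979323846;
constexpr double kHalfPi = kPi / 2.0;
constexpr double kQuarterPi = kPi / 4.0;

// Quadratic a*x^2 + b*x + c, evaluated with Horner's rule.
struct Quadratic {
  double a, b, c;
  double operator()(double x) const { return (a * x + b) * x + c; }
};

// Quadratic through (0, f0), (h, f1), (2h, f2). Assumption: interpolation
// stands in for the PR's coefficients, which the description does not list.
inline Quadratic interpolate(double f0, double f1, double f2, double h) {
  assert(h > 0.0);
  const double a = (f2 - 2.0 * f1 + f0) / (2.0 * h * h);
  const double b = (f1 - f0) / h - a * h;
  return Quadratic{a, b, f0};
}

// Hypothetical interface: returns sin and cos together through references.
inline void sincos_approx(double angle, double& s, double& c) {
  static const double h = kQuarterPi / 2.0;
  static const Quadratic sin_q =
      interpolate(0.0, std::sin(h), std::sin(2.0 * h), h);
  static const Quadratic cos_q =
      interpolate(1.0, std::cos(h), std::cos(2.0 * h), h);

  // Wrap the angle into [0, 2*pi) and split it into quadrant + remainder.
  double x = std::fmod(angle, 2.0 * kPi);
  if (x < 0.0) x += 2.0 * kPi;
  int quadrant = static_cast<int>(x / kHalfPi);
  if (quadrant > 3) quadrant = 3;  // guard against rounding at 2*pi
  const double r = x - quadrant * kHalfPi;  // r is in [0, pi/2]

  // Polynomials are only valid on [0, pi/4]; above that, use the
  // co-function identities sin(r) = cos(pi/2 - r), cos(r) = sin(pi/2 - r).
  double s0, c0;
  if (r <= kQuarterPi) {
    s0 = sin_q(r);
    c0 = cos_q(r);
  } else {
    s0 = cos_q(kHalfPi - r);
    c0 = sin_q(kHalfPi - r);
  }

  // Rotate the first-quadrant result into the actual quadrant.
  switch (quadrant) {
    case 0: s = s0; c = c0; break;
    case 1: s = c0; c = -s0; break;
    case 2: s = -s0; c = -c0; break;
    default: s = -c0; c = s0; break;
  }
}

}  // namespace approx_sketch
```

Both values come out of a single range reduction, which is the benefit of computing sin and cos together that was discussed in the conversation.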