Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(ci): run tarpaulin in release mode #4139

Closed
wants to merge 2 commits into from
Closed

Conversation

bnjbvr
Copy link
Member

@bnjbvr bnjbvr commented Oct 16, 2024

This is a small experiment: it turns out compiling is relatively quick, on a PR I just did (3 minutes), but running the tarpaulin build took around 8 minutes. I'm wondering if increasing the compile times and lowering the run time is worth it, hence this experiment.

Copy link

codecov bot commented Oct 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.00%. Comparing base (30f3a3c) to head (3082fc8).
Report is 60 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4139      +/-   ##
==========================================
- Coverage   84.72%   78.00%   -6.73%     
==========================================
  Files         269      268       -1     
  Lines       28779    24773    -4006     
==========================================
- Hits        24384    19325    -5059     
- Misses       4395     5448    +1053     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@bnjbvr
Copy link
Member Author

bnjbvr commented Oct 17, 2024

Results of the experiment:

  • baseline:
    • incremental build: ~3min30s
    • run: ~7min
  • compiling with O1:
    • non-incremental build: 15min12s
    • run: ~5min24s
  • compiling with O2:
    • non-incremental build: 29min19s
    • run: ~5min30s

Since the build times are not fair comparisons (this PR could only have non-incremental build times, while for the baseline we get incremental compile times, I'm curious if others would like to enable the O1 level on main for a day or two, compare how it fares in terms of incremental compile times, and revert it if it's definitely not an improvement? cc @poljar

@poljar
Copy link
Contributor

poljar commented Oct 17, 2024

Results of the experiment:

* baseline:
  
  * incremental build: ~3min30s
  * run: ~7min

* compiling with O1:
  
  * non-incremental build: 15min12s
  * run: ~5min24s

* compiling with O2:
  
  * non-incremental build: 29min19s
  * run: ~5min30s

Since the build times are not fair comparisons (this PR could only have non-incremental build times, while for the baseline we get incremental compile times, I'm curious if others would like to enable the O1 level on main for a day or two, compare how it fares in terms of incremental compile times, and revert it if it's definitely not an improvement? cc @poljar

Did you measure this locally or on CI? If we enable this on the CI how do we measure and compare this to our previous way, surely not via eyeballing?

@bnjbvr
Copy link
Member Author

bnjbvr commented Oct 17, 2024

Eyeballing the wall clock times reported in the log statements 👀 Even if not precise at the very second, these give us the right orders of magnitude IMO. What I'd really want is likely to get an idea of the cost of an incremental build, and that could be done locally if someone were very bored…

@bnjbvr
Copy link
Member Author

bnjbvr commented Oct 22, 2024

Ran some things during launch, to have a rough idea of how this would behave on my machine, in terms of compile times at least:

Task Wall-clock time User-time
tarpaulin -O0 initial 116 seconds 15.8 minutes
tarpaulin -O0 rebuild 20.23 seconds 65.85 seconds
tarpaulin -O1 initial 272 seconds 72.44 minutes
tarpaulin -O1 rebuild 73 seconds 15.44 minutes

So, while this doesn't tell us what kind of compile-time slowdown would be caused on CI (especially since my machine has lots of CPU cores, and I suspect CI doesn't), it tells us a rebuild might be between > 3 and 14 times slower than it is right now, and I suspect this would cancel the effects of having a faster runtime.

I'd still be inclined to try and merge this in production for a short while (a few days), rather than making back-of-the-envelope calculation like this, if others are interested. Otherwise I think it's ok to close this PR, there are other build-time optimizations to chase.

@bnjbvr bnjbvr requested a review from poljar October 22, 2024 14:01
This is a small experiment: it turns out compiling is relatively quick,
on a PR I just did (3 minutes), but running the tarpaulin build took
around 8 minutes. I'm wondering if increasing the compile times and
lowering the run time is worth it, hence this experiment.
@bnjbvr bnjbvr force-pushed the bnjbvr/tarpaulin-release branch from 4f90b01 to 1659038 Compare October 22, 2024 14:02
@bnjbvr bnjbvr marked this pull request as ready for review October 22, 2024 14:02
@bnjbvr bnjbvr requested a review from a team as a code owner October 22, 2024 14:02
@bnjbvr
Copy link
Member Author

bnjbvr commented Oct 22, 2024

We chatted briefly about this, and we think it's an easy revert in case it's a regression. Let's find out :)

@bnjbvr bnjbvr enabled auto-merge (rebase) October 22, 2024 14:14
@bnjbvr
Copy link
Member Author

bnjbvr commented Oct 22, 2024

A NEW RACE CONDITION APPEARS!

thread 'tests::timeline::test_enabling_backups_retries_decryption' panicked at testing/matrix-sdk-integration-testing/src/tests/timeline.rs:375:10:
We should be able to send a message to our new room: UnknownError("EncryptionNotEnabled")

@bnjbvr bnjbvr disabled auto-merge October 23, 2024 15:54
@bnjbvr
Copy link
Member Author

bnjbvr commented Oct 24, 2024

  1. The bug observed in the codecov task was an Heisenbug, since as soon as I added logging for debug, it disappeared.
  2. The regression in terms of coverage is because we start losing track of some lines that are effectively tested. Here's one example: https://app.codecov.io/gh/matrix-org/matrix-rust-sdk/pull/4139/blob/crates/matrix-sdk-base/src/rooms/normal.rs#L1420 It can't be that line 1421 be tested without the other lines now marked as missed being reached (we have tests that will exercise these code paths indirectly, e.g. here).

Because of (2), I think we should close this. It's possible that enabling O1 enables some extra inlining that confuses the codecov instrumentation, thus losing track of locations in the coverage. It's a bit of an unfortunate outcome, but at least we learned a thing 🥲

@bnjbvr bnjbvr closed this Oct 24, 2024
@bnjbvr bnjbvr deleted the bnjbvr/tarpaulin-release branch November 5, 2024 11:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants