Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(utd_hook): Fix regression causing retry to report false late decrypt #4252

Merged
merged 5 commits into from
Nov 19, 2024

Conversation

BillCarsonFr
Copy link
Member

@BillCarsonFr BillCarsonFr commented Nov 12, 2024

There has been a recent change on Decryptor#decrypt_event_impl causing the function to return an TimelineEvent of kind unable to decrypt instead of failing with an error.

The late_decrypt detection code was not changed, causing any retry to mark UTDs as late decrypt.

  • Public API changes documented in changelogs (optional)

Signed-off-by:

@BillCarsonFr BillCarsonFr requested a review from richvdh November 12, 2024 16:57
Copy link

codecov bot commented Nov 12, 2024

Codecov Report

Attention: Patch coverage is 77.77778% with 2 lines in your changes missing coverage. Please review.

Project coverage is 84.91%. Comparing base (9dd2d5e) to head (c45f039).
Report is 45 commits behind head on main.

Files with missing lines Patch % Lines
...rates/matrix-sdk-ui/src/timeline/controller/mod.rs 77.77% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4252      +/-   ##
==========================================
- Coverage   84.95%   84.91%   -0.04%     
==========================================
  Files         274      274              
  Lines       29768    29772       +4     
==========================================
- Hits        25288    25282       -6     
- Misses       4480     4490      +10     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

@BillCarsonFr BillCarsonFr marked this pull request as ready for review November 12, 2024 17:21
@BillCarsonFr BillCarsonFr requested a review from a team as a code owner November 12, 2024 17:21
@richvdh
Copy link
Member

richvdh commented Nov 12, 2024

According to @BillCarsonFr: this bug should have caused the UTD rate reported to Posthog to (incorrectly) drop significantly. We haven't actually observed that -- possibly because the relevant change f4e8222 (#4070) hasn't rolled out very far yet?

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks correct to me; sorry for missing this before.

It would be good if we had a regression test.

@BillCarsonFr
Copy link
Member Author

BillCarsonFr commented Nov 13, 2024

According to @BillCarsonFr: this bug should have caused the UTD rate reported to Posthog to (incorrectly) drop significantly. We haven't actually observed that -- possibly because the relevant change f4e8222 (#4070) hasn't rolled out very far yet?

Actually there is no grace period on rust sdk UTD reporting like on other platform; On other platforms, an event decrypted in less than 4s is graced out and not reported. It is not the case in rust sdk, so the UTD would be reported immediatly but as a temporary UTD (with timeToDecryptMillis set to a value).
So we shouldn't see a drop in posthog. What could be expected is a raise of temporary UTD.

@BillCarsonFr
Copy link
Member Author

This change looks correct to me; sorry for missing this before.

It would be good if we had a regression test.

Added a regression test

@BillCarsonFr BillCarsonFr requested a review from richvdh November 15, 2024 16:29
Comment on lines +305 to +313
match olm_machine
.try_decrypt_room_event(raw.cast_ref(), room_id, &decryption_settings)
.await?
{
RoomEventDecryptionResult::Decrypted(decrypted) => Ok(decrypted.into()),
RoomEventDecryptionResult::UnableToDecrypt(utd_info) => {
Ok(TimelineEvent::new_utd_event(raw.clone(), utd_info))
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change needed, out of curiosity? This is for the test timeline; is it operating that differently from the non-test timeline?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's needed to make the regression test ~"fails correctly".
If not the test never detects the behavior change because it is using the mock Decryptor that continues to fail with an error when there is an UTD instead of return a OK UTDEvent type.

The underlying problem is that we use a big catch everything Error for all possible errors of the SDK. And here the change that created the regression should have resulted in a change of signature of the trait. It is not anymore Result<, ALLERRORS> but a Result <,SubsetOfErrors>?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I'm slightly scared that we might be reproducing an error only in testing conditions, but this code is likely going away at some point…

@bnjbvr
Copy link
Member

bnjbvr commented Nov 18, 2024

(Great catch and thanks for fixing, btw)

@BillCarsonFr BillCarsonFr requested a review from bnjbvr November 19, 2024 10:32
Copy link
Member

@bnjbvr bnjbvr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Comment on lines +305 to +313
match olm_machine
.try_decrypt_room_event(raw.cast_ref(), room_id, &decryption_settings)
.await?
{
RoomEventDecryptionResult::Decrypted(decrypted) => Ok(decrypted.into()),
RoomEventDecryptionResult::UnableToDecrypt(utd_info) => {
Ok(TimelineEvent::new_utd_event(raw.clone(), utd_info))
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I'm slightly scared that we might be reproducing an error only in testing conditions, but this code is likely going away at some point…

@bnjbvr bnjbvr merged commit 6b80055 into main Nov 19, 2024
40 checks passed
@bnjbvr bnjbvr deleted the valere/fix_late_utd_report_regression branch November 19, 2024 15:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants