Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Don't waste time processing backfill events that we never show to the client #15632

Open
MadLittleMods opened this issue May 18, 2023 · 2 comments
Labels
A-Messages-Endpoint /messages client API endpoint (`RoomMessageListRestServlet`) (which also triggers /backfill) T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.

Comments

@MadLittleMods
Copy link
Contributor

MadLittleMods commented May 18, 2023

As discovered while working on #15617 and a natural extension of #15585 to process previously failed backfill events in the background (follow-up to #13623),

It turns out, when we make a /messages request and trigger a backfill, Synapse will happily waste time trying hard to process events (_process_pulled_events) in the foreground that we end up never showing to clients. The one particularly bad culprit is the org.matrix.dummy_event which Synapse automatically puts in the DAG to resolve outstanding forward extremities. Since it has 10x random prev_events, it takes a while to fetch the state for each prev_event and persist. We determine whether some events should be shown to clients in the _check_filter_send_to_client(...) function

def _check_filter_send_to_client(
event: EventBase,
clock: Clock,
retention_policy: RetentionPolicy,
sender_ignored: bool,
) -> _CheckFilter:
"""Apply checks for sending events to client
Returns:
True if might be allowed to be sent to clients, False if definitely not.
"""
if event.type == EventTypes.Dummy:
return _CheckFilter.DENIED
if not event.is_state() and sender_ignored:
return _CheckFilter.DENIED
# Until MSC2261 has landed we can't redact malicious alias events, so for
# now we temporarily filter out m.room.aliases entirely to mitigate
# abuse, while we spec a better solution to advertising aliases
# on rooms.
if event.type == EventTypes.Aliases:
return _CheckFilter.DENIED
# Don't try to apply the room's retention policy if the event is a state
# event, as MSC1763 states that retention is only considered for non-state
# events.
if not event.is_state():
max_lifetime = retention_policy.max_lifetime
if max_lifetime is not None:
oldest_allowed_ts = clock.time_msec() - max_lifetime
if event.origin_server_ts < oldest_allowed_ts:
return _CheckFilter.DENIED
return _CheckFilter.MAYBE_ALLOWED

org.matrix.dummy_event can be particularly bad because they usually include a lot of disparate prev_events which take a while for us to work out the state for (long _get_state_groups_from_groups times).

Potential solution

We should use this same _check_filter_send_to_client(...) criteria when determining whether we should backfill a given event in the background or foreground.

We should be mindful an event that is filtered from the client, might still be used as a prev_event from another event in the backfill response that we will show to clients. And if we kick off the function to process org.matrix.dummy_event in the background, while the other event in the foreground, we will probably end up duplicating the work depending on the race.

We don't really have these same race -> duplicate work concerns with #15585 (events which previously failed to backfill) because it's a "fool me once" sort of situation and they are already potentially disconnected from what we were trying to backfill anyway given they failed before.


Also of interest, filter_events_for_server(...)

@MadLittleMods MadLittleMods added the A-Messages-Endpoint /messages client API endpoint (`RoomMessageListRestServlet`) (which also triggers /backfill) label May 18, 2023
@MadLittleMods MadLittleMods added the T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. label May 18, 2023
@reivilibre
Copy link
Contributor

I can't say I follow this issue — maybe it would help elaborating on what you mean by 'process' as I'm not familiar with the code in question and process could mean so many things :).
If you're intending to tackle this issue yourself soon then there's not really any need, but otherwise I can only note that personally I'd find it hard to say what this issue was getting at.

@MadLittleMods
Copy link
Contributor Author

@reivilibre I've updated the issue to link to _process_pulled_events

/messages -> /backfill -> _process_pulled_events

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Messages-Endpoint /messages client API endpoint (`RoomMessageListRestServlet`) (which also triggers /backfill) T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.
Projects
None yet
Development

No branches or pull requests

2 participants