Performance issue, 100% CPU for some time related to _process_new_pulled_events_with_failed_pull_attempts #16635
Comments
What version of Synapse did you upgrade from? That particular error occurs while backfilling events and comes from: synapse/synapse/handlers/federation_event.py Lines 1736 to 1748 in 43d1aa7
It looks like all those events are from the room
I upgraded from 1.92.3 to 1.95.0, then later to 1.95.1.
How did you know which room these events are related to? After a quick look, I won't be sad if I have to purge this room from the server.
I have had it enabled for a long time, yes, since Tue Aug 10 18:00:55 2021 exactly. And I was also afraid that it might be related.
Event IDs are globally unique, so I did: SELECT room_id FROM events WHERE event_id = '...';
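When the logs mention thousands of event IDs, the same lookup can be repeated for each ID and tallied per room. The following is only a minimal sketch: it uses an in-memory SQLite table as a stand-in for the real database (the homeserver in this issue runs PostgreSQL), and it assumes nothing about the `events` table beyond the `event_id` and `room_id` columns used in the query above. The function name `rooms_for_events` and the sample IDs are hypothetical.

```python
import sqlite3
from collections import Counter


def rooms_for_events(conn, event_ids):
    """Count, per room, how many of the given event IDs are stored locally."""
    counts = Counter()
    for event_id in event_ids:
        row = conn.execute(
            "SELECT room_id FROM events WHERE event_id = ?", (event_id,)
        ).fetchone()
        # IDs with no row in the events table are simply skipped.
        if row is not None:
            counts[row[0]] += 1
    return counts


# Stand-in data: a tiny table with only the two columns the query needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_id TEXT PRIMARY KEY, room_id TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("$a", "!room1:example.org"),
        ("$b", "!room1:example.org"),
        ("$c", "!room2:example.org"),
    ],
)

result = rooms_for_events(conn, ["$a", "$b", "$c", "$missing"])
print(result.most_common())
```

If one room dominates the tally, that room is the likely source of the backfill work.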
There were known bugs in the feature until Synapse ~1.94, so I think this might be the cause. You could purge the room entirely and then rejoin it. I know that's not an ideal solution though.
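One way to purge a room is Synapse's Delete Room admin API. The sketch below only builds the HTTP request and does not send it. It assumes the version 2 endpoint (`DELETE /_synapse/admin/v2/rooms/<room_id>`) with a JSON body containing `purge` and `block`, and an access token belonging to a server admin; check the admin API documentation for your Synapse version before relying on it. The base URL, room ID, token, and the helper name `build_purge_request` are placeholders.

```python
import json
import urllib.parse
import urllib.request


def build_purge_request(base_url, room_id, access_token):
    """Build (but do not send) a room-deletion request for the admin API."""
    # Room IDs contain '!' and ':', so percent-encode them for the URL path.
    path = "/_synapse/admin/v2/rooms/" + urllib.parse.quote(room_id, safe="")
    # purge=True removes the room from the database; block=False allows rejoining.
    body = json.dumps({"purge": True, "block": False}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        method="DELETE",
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
    )


# Placeholder values, for illustration only.
req = build_purge_request("http://localhost:8008", "!abc:example.org", "ADMIN_TOKEN")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would actually submit it, so run it only against a server and room you intend to purge.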
But if the event has been dropped as the log suggests, can I still try this query on my own homeserver?
That's what I gathered from loosely following the project on GitHub.
Thank you for your advice, I just purged the room. We only had a single user from club1.fr in this room, so that's not a big loss.
Feel free to shout if you see it come back!
Description
Since the upgrade to Synapse 1.95 and 1.95.1, we sometimes have around 20 minutes of unresponsiveness, due to the CPU usage going up to 100%. We have a small instance with no workers and have enabled a retention policy.
In the logs we have a LOT of these messages when this happens (thousands):
Not only from matrix.org, but from a lot of different servers.
Steps to reproduce
Sorry I really don't know what to add in my case.
Homeserver
club1.fr
Synapse Version
1.95.1
Installation Method
Debian packages from packages.matrix.org
Database
PostgreSQL single server
Workers
Single process
Platform
Configuration
I just disabled presence but it was enabled until now.
Retention policy is ON:
Relevant log output
Anything else that would be useful to know?
(the purple one is synapse- _process_new_pulled_events_with_failed_pull_attempts)