This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
Could not serialize access due to concurrent DELETE from presence_stream
#15827
Labels
A-Database: DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db
A-Presence
O-Uncommon: Most users are unlikely to come across this or unexpected workflow
S-Tolerable: Minor significance, cosmetic issues, low or no impact to users.
T-Defect: Bugs, crashes, hangs, security vulnerabilities, or other reported issues.
So this is annoying. And while it doesn't always occur so regularly, it does seem to come in batches. I took a little while to investigate this and found out:

- `PresenceStore.update_presence()` takes between 35-47ms to complete when this happens, but only 5-25ms when it doesn't (on my two-user workerized homeserver). This appears to be due to the rollback of the transaction before retrying. (These numbers should not be considered standardized, as variance across several factors can influence them; e.g. monolith Postgres would sometimes take 11-95ms without any contention. The point is that a rolled-back transaction can almost double the time it takes to complete, or more if it has to retry multiple times.)
- This `PresenceStore` function is called from:
  - `_on_shutdown()` - Fairly self-evident what this does.
  - `_persist_unpersisted_changes()` - used like a dragnet to make sure anything that still needs to be written to the database gets done, as otherwise it will stack up and affect shutdown times. Run as a looping call every 60 seconds.
  - `handle_presence_timeouts` (which is actually `_handle_timeouts()`) -> `_update_states()` -> `_persist_and_notify()` - which is the main point of entry for changes in presence states, and is responsible for sending changes over federation and persisting those changes. Run as a looping call every 5 seconds.

The main reason this seems to occur is that `_persist_unpersisted_changes()` and `_persist_and_notify()` sometimes run at the same time, due to overlapping timeouts. The process of persisting presence changes includes running a `DELETE` before the `INSERT`, in order to clean up previous stream IDs. When `_persist_unpersisted_changes()` runs at the same time as `_persist_and_notify()`, it has a lower stream ID, which seems to be confusing the database/transaction. The solution seems to be to deal with this in a way that allows the transaction to serialize.
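One way to picture "allowing the transaction to serialize" is to make the two persistence paths take turns, so their `DELETE` + `INSERT` transactions never overlap. The sketch below is only an illustration under stated assumptions, not what Synapse does: Synapse runs on Twisted, whereas this uses `asyncio` to stay self-contained. `PresencePersister`, `_persist()`, and the sleep that stands in for the database round trip are all hypothetical. It also assumes both paths run in the same process, since an in-process lock cannot help otherwise.

```python
import asyncio


class PresencePersister:
    """Hypothetical stand-in for the two persistence paths described above."""

    def __init__(self):
        # One lock shared by every caller that persists presence changes.
        self._lock = asyncio.Lock()
        self.in_flight = 0
        self.max_in_flight = 0
        self.log = []

    async def _persist(self, caller):
        # Hold the lock for the whole (simulated) DELETE + INSERT so that
        # two callers can never have overlapping transactions.
        async with self._lock:
            self.in_flight += 1
            self.max_in_flight = max(self.max_in_flight, self.in_flight)
            await asyncio.sleep(0.01)  # stands in for the database round trip
            self.log.append(caller)
            self.in_flight -= 1

    async def persist_and_notify(self):
        # Assumed to be driven by the 5-second looping call.
        await self._persist("persist_and_notify")

    async def persist_unpersisted_changes(self):
        # Assumed to be driven by the 60-second looping call.
        await self._persist("persist_unpersisted_changes")


async def demo():
    persister = PresencePersister()
    # Fire both paths at the same moment, as happens when the timers overlap.
    await asyncio.gather(
        persister.persist_and_notify(),
        persister.persist_unpersisted_changes(),
    )
    return persister


persister = asyncio.run(demo())
```

With the lock in place, `persister.max_in_flight` never exceeds 1 even though both calls start together. The trade-off is that the second caller waits for the first instead of failing and retrying after a rollback.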