Reduce replication traffic due to reflected cache stream POSITION #16557
synapse/replication/tcp/resource.py
Outdated
# echo each write, and b) nothing cares if a given
# worker's caches stream position lags.
Does this mean that workers don't need to track other workers' caches stream positions?
The difference is that nothing uses the global "persisted upto position" for the caches stream: everything that uses the caches stream only cares about each stream's position, not any sort of "combined" position. The reason we care about the combined position for other streams is that some code still assumes every stream can be treated as a single integer (e.g. the federation and application sending code).
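To illustrate the distinction, here is a minimal sketch, assuming the "combined" position is the minimum of the per-writer positions. The names `writer_positions` and `combined_position` are hypothetical and are not Synapse's API.

```python
from typing import Dict


def combined_position(writer_positions: Dict[str, int]) -> int:
    """Collapse per-writer positions into one integer.

    Assumption: the combined position is the smallest per-writer position,
    i.e. the point up to which every writer is known to have persisted.
    """
    return min(writer_positions.values())


# Hypothetical per-writer positions for one stream.
writer_positions = {"worker1": 12, "worker2": 7, "worker3": 9}

# Code that treats a stream as a single integer needs this value, so one
# lagging writer holds it back for everyone.
combined = combined_position(writer_positions)

# Writers sitting exactly at the combined position are the ones holding it back.
behind = sorted(w for w, pos in writer_positions.items() if pos == combined)
```

A consumer that only looks up `writer_positions["worker1"]` is unaffected by `worker2` lagging, which is the situation described for the caches stream.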
> global "persisted upto position" for the caches stream

Just to refresh my memory -- this is the minimum persisted position across all workers, correct?

> everything that uses the caches stream only cares about each stream's position

Because the caches stream gets blasted to everyone and they just clear their caches, but we don't really care about tracking it? Do we even care about each stream's position in it?
(Note I've just pushed a commit that changes things a bit)
> global "persisted upto position" for the caches stream
> Just to refresh my memory -- this is the minimum persisted position across all workers, correct?

Yup.

> everything that uses the caches streams only care about each streams position
> Because the caches stream gets blasted to everyone and they just clear their caches but we don't really care about tracking it? Do we even care about each stream's position in it?

There are two usages of the caches stream:
- Reliably receiving updates about cache invalidations that happened on other workers. We use the stream position here purely to detect when gaps have occurred.
- We sometimes want to wait for the caches stream to reach a particular position (this is where we make an HTTP request and want to wait for the action to replicate to the requesting worker after receiving a response).
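The two usages above can be sketched roughly as follows. This is an illustrative model only, not Synapse's implementation: the `CachesStreamTracker` class, its method names, and the assumption that each update names the token it follows (`prev_token`) are all hypothetical, and it uses `asyncio` rather than Twisted to stay self-contained.

```python
import asyncio
from typing import Dict, List, Tuple


class CachesStreamTracker:
    """Hypothetical per-writer position tracker for a caches-like stream."""

    def __init__(self) -> None:
        self._positions: Dict[str, int] = {}
        self._waiters: List[Tuple[str, int, "asyncio.Future[None]"]] = []

    def on_update(self, writer: str, prev_token: int, new_token: int) -> bool:
        """Record an update and return True if a gap was detected.

        Assumption: each update says which token it follows. If we are
        behind that token, we missed one or more updates from this writer.
        """
        current = self._positions.get(writer, prev_token)
        gap = current < prev_token
        self._positions[writer] = max(current, new_token)

        # Wake anything waiting for this writer to reach a position.
        still_waiting = []
        for w, target, fut in self._waiters:
            if w == writer and self._positions[writer] >= target:
                if not fut.done():
                    fut.set_result(None)
            else:
                still_waiting.append((w, target, fut))
        self._waiters = still_waiting
        return gap

    async def wait_for_position(self, writer: str, target: int) -> None:
        """Block until `writer` has reached at least `target`."""
        if self._positions.get(writer, 0) >= target:
            return
        fut = asyncio.get_running_loop().create_future()
        self._waiters.append((writer, target, fut))
        await fut


async def demo() -> Tuple[bool, bool, int]:
    tracker = CachesStreamTracker()
    no_gap = tracker.on_update("worker1", prev_token=0, new_token=1)

    # E.g. an HTTP response told us the write landed at position 5.
    waiter = asyncio.ensure_future(tracker.wait_for_position("worker1", 5))
    await asyncio.sleep(0)  # let the waiter register itself

    # We are at 1 but this update follows 3, so updates were missed.
    gap = tracker.on_update("worker1", prev_token=3, new_token=5)
    await waiter
    return no_gap, gap, tracker._positions["worker1"]
```

Note that neither usage needs a combined position across writers: both only ever consult the position of a single writer.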
4d8735a to 22b2cca
I'm fairly certain this is OK.
Follow on from / actually correctly does #16557
This was exacerbated by #16473