vdire: plug the remaining vcl temperature change races #4259
The context for this change is #4142. This PR builds upon it, but is causally unrelated to it. This PR replaces #4205.
Problem
To motivate this change, let's look at vcl_set_state() as of before the patch, and consider some non-issues and issues:
transition to COLD
non-issues:
At point A, backends could get added. For VCL_TEMP_COOLING, this fails (see VRT_AddDirector()). For VCL_TEMP_COLD, the state is consistent, because backends get created cold and that's it.
At point A, backends could get removed. vdire_resign() from #4142 ensures that the temperature passed to vcldir_retire() is consistent with the events the backends have received.
transition to WARM (success)
issues:
(1) In region B, backends could get removed. They will not have received a WARM event, but will be called with a WARM temperature, so they will receive a bogus COLD event.
(2) Also in region B, backends could get added. They will implicitly receive a WARM event (see VDI_event() in VRT_AddDirector()), and then another from vcl_BackendEvent().
transition to WARM (failed)
issues:
(3) Backends added in region B will have received the implicit WARM event, and thus need a COLD event for the "cancel".
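To make issue (2) concrete, here is a minimal single-threaded toy model of the double WARM event. It is only a sketch: every `toy_*` name is hypothetical and not part of Varnish, the director list is assumed to be a simple append-only array, and the race is replayed as a fixed sequence of steps rather than with real threads.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of the toy_* names exist in Varnish. */
enum toy_temp { TOY_COLD, TOY_WARM };

struct toy_backend {
    int warm_events;    /* number of WARM events received */
};

#define TOY_MAX 8

struct toy_vcl {
    enum toy_temp temp;
    struct toy_backend *dirs[TOY_MAX];
    size_t ndirs;
};

/* Adding to a warm VCL delivers an implicit WARM event. */
void
toy_add_director(struct toy_vcl *vcl, struct toy_backend *be)
{
    assert(vcl->ndirs < TOY_MAX);
    vcl->dirs[vcl->ndirs++] = be;
    if (vcl->temp == TOY_WARM)
        be->warm_events++;
}

/* Pre-patch style notification: every listed backend gets the event. */
void
toy_backend_event_warm(struct toy_vcl *vcl)
{
    size_t i;

    for (i = 0; i < vcl->ndirs; i++)
        vcl->dirs[i]->warm_events++;
}

/* Replays issue (2); returns the WARM events seen by the late backend. */
int
toy_replay_issue_2(void)
{
    struct toy_vcl vcl = { .temp = TOY_COLD };
    struct toy_backend early = { 0 }, late = { 0 };

    toy_add_director(&vcl, &early);  /* created cold, no event */
    vcl.temp = TOY_WARM;             /* region B starts here */
    toy_add_director(&vcl, &late);   /* implicit WARM event */
    toy_backend_event_warm(&vcl);    /* second WARM event for "late" */
    assert(early.warm_events == 1);
    return (late.warm_events);
}
```

In this model the backend added before the temperature change sees exactly one WARM event, while the one added in region B sees two.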
Solution
To solve these issues, we need to do two things:
There needs to be some kind of transaction which begins with the temperature change and ends when all backends have been notified appropriately. Backends cannot get deleted while the transaction is in progress.
We need a notion of "backends from before the temperature change" and "backends from after".
The first part is already delivered by #4142: The vdire facility already provides the transaction during which backends do not actually get deleted and it ensures that the right temperature gets passed when the deletion is carried out. So for this part, we only need to use vdire.
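The deferred deletion described above can be pictured with a small toy model. This is a sketch under assumptions, not the vdire code from #4142: all `toy_*` names are hypothetical, locking is omitted, and a resigned backend is assumed to stay notifiable until the transaction ends.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of the toy_* names exist in Varnish. */
enum toy_temp { TOY_COLD, TOY_WARM };

struct toy_backend {
    enum toy_temp last_event;   /* last event delivered to the backend */
    int retired;
    enum toy_temp retired_at;   /* temperature passed on retirement */
};

#define TOY_MAX 8

struct toy_vdire {
    enum toy_temp temp;
    int iterating;              /* > 0 while a transaction is open */
    struct toy_backend *resigning[TOY_MAX];
    size_t nresigning;
};

void
toy_retire(struct toy_backend *be, enum toy_temp temp)
{
    be->retired = 1;
    be->retired_at = temp;
}

void
toy_start_iter(struct toy_vdire *v)
{
    v->iterating++;
}

/* Outside a transaction, retire at once; inside one, only queue. */
void
toy_resign(struct toy_vdire *v, struct toy_backend *be)
{
    if (v->iterating == 0) {
        toy_retire(be, v->temp);
        return;
    }
    assert(v->nresigning < TOY_MAX);
    v->resigning[v->nresigning++] = be;
}

/* Closing the last transaction retires everything that was queued. */
void
toy_end_iter(struct toy_vdire *v)
{
    size_t i;

    assert(v->iterating > 0);
    if (--v->iterating > 0)
        return;
    for (i = 0; i < v->nresigning; i++)
        toy_retire(v->resigning[i], v->temp);
    v->nresigning = 0;
}

/* Replays issue (1) with deferral; returns 1 if the outcome is consistent. */
int
toy_replay_issue_1(void)
{
    struct toy_vdire v = { .temp = TOY_COLD };
    struct toy_backend be = { .last_event = TOY_COLD };
    int deferred;

    toy_start_iter(&v);
    v.temp = TOY_WARM;
    toy_resign(&v, &be);         /* removal requested in region B */
    deferred = !be.retired;      /* not retired yet */
    be.last_event = TOY_WARM;    /* the WARM event is still delivered */
    toy_end_iter(&v);
    return (deferred && be.retired && be.retired_at == be.last_event);
}
```

The point of the model is the ordering: the retirement happens only after the event has been delivered, so the temperature passed on retirement matches what the backend has seen.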
But issues (2) and (3) are not yet covered. For these, we add a checkpoint, such that we know which directors from the list are the "base" from before the temperature change and which are the "delta" added after it.
That's this patch.
vdire_start_event() atomically (under the vcl_mtx) sets the checkpoint and the new temperature.
vdire_end_event() just uses the existing vdire_end_iter() and clears the checkpoint.
vcl_BackendEvent() gets split into two:
backend_event_base() notifies backends from before the checkpoint.
backend_event_delta() atomically sets a new temperature and notifies backends from after the checkpoint (but not from after its temperature change).
Fixes #4199
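As a rough illustration of the base/delta split, the following toy model replays a WARM transition in both its successful and its failed variant. It is a sketch under assumptions rather than the code of this patch: all `toy_*` names are hypothetical, the director list is assumed to be an append-only array so the checkpoint can be a plain index, locking is left out, and which of the two notification functions runs on which path is one plausible reading of issues (2) and (3).

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: the toy_* names are hypothetical, not Varnish API. */
enum toy_temp { TOY_COLD, TOY_WARM };

struct toy_backend {
    int warm_events;
    int cold_events;
};

#define TOY_MAX 8
#define TOY_NO_CHECKPOINT ((size_t)-1)

struct toy_vdire {
    enum toy_temp temp;
    struct toy_backend *dirs[TOY_MAX];  /* assumed append-only */
    size_t ndirs;
    size_t checkpoint;                  /* first index of the "delta" */
};

static void
toy_notify(struct toy_backend *be, enum toy_temp ev)
{
    if (ev == TOY_WARM)
        be->warm_events++;
    else
        be->cold_events++;
}

/* Adding to a warm list delivers an implicit WARM event. */
void
toy_add(struct toy_vdire *v, struct toy_backend *be)
{
    assert(v->ndirs < TOY_MAX);
    v->dirs[v->ndirs++] = be;
    if (v->temp == TOY_WARM)
        toy_notify(be, TOY_WARM);
}

/* Set checkpoint and temperature together (one step in this model). */
void
toy_start_event(struct toy_vdire *v, enum toy_temp temp)
{
    assert(v->checkpoint == TOY_NO_CHECKPOINT);
    v->checkpoint = v->ndirs;
    v->temp = temp;
}

/* Notify only the backends from before the checkpoint. */
void
toy_event_base(struct toy_vdire *v, enum toy_temp ev)
{
    size_t i;

    assert(v->checkpoint != TOY_NO_CHECKPOINT);
    for (i = 0; i < v->checkpoint; i++)
        toy_notify(v->dirs[i], ev);
}

/* Change temperature again, notify backends added in between. */
void
toy_event_delta(struct toy_vdire *v, enum toy_temp ev, enum toy_temp temp)
{
    size_t i, end;

    assert(v->checkpoint != TOY_NO_CHECKPOINT);
    end = v->ndirs;     /* later additions start at the new temperature */
    v->temp = temp;
    for (i = v->checkpoint; i < end; i++)
        toy_notify(v->dirs[i], ev);
}

void
toy_end_event(struct toy_vdire *v)
{
    v->checkpoint = TOY_NO_CHECKPOINT;
}

/* Returns 1 if every backend ends up with a consistent event history. */
int
toy_warm_transition(int vcl_event_fails)
{
    struct toy_vdire v = { .temp = TOY_COLD, .checkpoint = TOY_NO_CHECKPOINT };
    struct toy_backend early = { 0, 0 }, late = { 0, 0 };

    toy_add(&v, &early);                /* base: created cold */
    toy_start_event(&v, TOY_WARM);
    toy_add(&v, &late);                 /* delta: implicit WARM */
    if (vcl_event_fails)
        toy_event_delta(&v, TOY_COLD, TOY_COLD);    /* cancel the delta */
    else
        toy_event_base(&v, TOY_WARM);   /* delta is already warm */
    toy_end_event(&v);
    if (vcl_event_fails)
        return (early.warm_events == 0 && early.cold_events == 0 &&
            late.warm_events == 1 && late.cold_events == 1);
    return (early.warm_events == 1 && late.warm_events == 1 &&
        early.cold_events + late.cold_events == 0);
}
```

In the successful case each backend receives exactly one WARM event. In the failed case only the backend added after the checkpoint receives the cancelling COLD event, because the earlier one was never warmed.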