The Failover Procedure waits 40 s when a Datanode is unreachable. If the unreachable Datanode restarts at, say, the 23rd second, it reopens the old failed region, because the table route has not been updated since the failure occurred. Its RegionAliveKeeper then waits up to 20 s for the first lease to arrive and, if none arrives, closes the region (with a flush). That window lasts until about the 43rd second, which is past the 40 s wait, so the procedure may meanwhile step into the next stage and open the failed region on another Datanode. The region is then open on two Datanodes at once, and the newer Datanode overwrites the existing manifest, which can cause us to lose the index of the flushed files.
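The timing in this scenario can be checked with a little arithmetic. The sketch below is only an illustration, not GreptimeDB code: the two constants come from the description above, while the function name and the assumption that the procedure reopens the region exactly when its wait expires are hypothetical.

```python
from typing import Optional, Tuple

# Values taken from the issue description.
FAILOVER_WAIT_S = 40      # how long the Failover Procedure waits for the unreachable Datanode
FIRST_LEASE_WAIT_S = 20   # how long RegionAliveKeeper waits for the first lease before closing


def double_open_window(restart_at_s: int) -> Optional[Tuple[int, int]]:
    """Return the (start, end) seconds during which two Datanodes may hold the region.

    Assumptions (hypothetical): the procedure opens the region on another
    Datanode exactly when its wait expires, and no lease ever reaches the
    restarted Datanode.
    """
    old_node_closes_at = restart_at_s + FIRST_LEASE_WAIT_S
    new_node_opens_at = FAILOVER_WAIT_S
    # The overlap exists only if the old node reopened the region before the
    # new node opened it and has not yet closed it by then.
    if restart_at_s < new_node_opens_at < old_node_closes_at:
        return (new_node_opens_at, old_node_closes_at)
    return None


print(double_open_window(23))  # (40, 43)
print(double_open_window(5))   # None
```

Under these assumptions, a restart at the 23rd second leaves a three-second window in which both Datanodes have the region open, whereas an early restart (for example at the 5th second) closes the region well before the procedure moves on.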
Implementation challenges
We may need to introduce intermediate states.
Remove the failed region from the TableRoute
Deactivate the failed region
Update TableRoute
Activate the region
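One way to picture these intermediate states is as an ordered sequence that the procedure can only walk through one step at a time. This is a minimal sketch under that assumption; the enum, its member names, and the helper functions are hypothetical and do not come from the GreptimeDB code base.

```python
from enum import Enum


class FailoverStage(Enum):
    # Hypothetical stage names mirroring the four steps listed above.
    REMOVE_FAILED_REGION_FROM_ROUTE = 1
    DEACTIVATE_FAILED_REGION = 2
    UPDATE_TABLE_ROUTE = 3
    ACTIVATE_REGION = 4
    DONE = 5


def next_stage(stage: FailoverStage) -> FailoverStage:
    """Advance exactly one step, so stages can never be skipped or reordered."""
    if stage is FailoverStage.DONE:
        raise ValueError("failover already finished")
    return FailoverStage(stage.value + 1)


def run(stage: FailoverStage = FailoverStage.REMOVE_FAILED_REGION_FROM_ROUTE) -> list:
    """Walk from `stage` to DONE and return the names of the stages visited.

    Starting from a persisted stage models resuming the procedure after a crash.
    """
    visited = []
    while stage is not FailoverStage.DONE:
        visited.append(stage.name)
        stage = next_stage(stage)
    return visited


print(run(FailoverStage.UPDATE_TABLE_ROUTE))
# ['UPDATE_TABLE_ROUTE', 'ACTIVATE_REGION']
```

The point of the ordering is that the region is activated on the new Datanode only after the failed region has been removed from the route and deactivated.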
Until a node has been granted permission (its first lease), it should be prohibited from accessing shared resources.
The problem now is that the restarted node accesses the shared resource (the region) without permission, which is the wrong order. The correct approach is to obtain permission first (receive the first lease) and only then access the shared resource.
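The "permission first, access second" rule could look roughly like the following. It is a hedged sketch: `LeaseGatedRegion` and its methods are hypothetical names invented for illustration, and time is passed in explicitly to keep the example deterministic.

```python
class LeaseGatedRegion:
    """Hypothetical model of a region that refuses to open until a lease is held."""

    def __init__(self) -> None:
        self.lease_deadline = None  # no lease has been granted yet
        self.is_open = False

    def grant_lease(self, now: float, duration: float) -> None:
        """Record a lease that stays valid until now + duration."""
        self.lease_deadline = now + duration

    def has_valid_lease(self, now: float) -> bool:
        return self.lease_deadline is not None and now < self.lease_deadline

    def open(self, now: float) -> bool:
        """Open the region only if a valid lease is held; report success."""
        if not self.has_valid_lease(now):
            return False
        self.is_open = True
        return True


region = LeaseGatedRegion()
print(region.open(now=23.0))   # False: restarted node holds no lease yet
region.grant_lease(now=25.0, duration=20.0)
print(region.open(now=26.0))   # True: lease obtained before access
```

With this gate in place, a Datanode that restarts during failover would not touch the region or its manifest unless it is explicitly granted a lease.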
What type of enhancement is this?
Tech debt reduction