[Feature Request] Auto-recovery of cluster after hardware failure w/ remote store #11921
Labels
enhancement
Enhancement or improvement to existing feature or request
Storage:Durability
Issues and PRs related to the durability framework
Is your feature request related to a problem? Please describe
Today, auto-restore cannot recover the cluster automatically because of cases such as isolated primaries (#3706), especially in no-replica configurations, where we need a robust mechanism to ensure we don't end up with divergent writes.
Describe the solution you'd like
One such mechanism to support zero replicas is to use an empty replica that hosts no data, only the shard's metadata, so that it does not incur additional storage costs. This replica would perform continuous no-op replication on every indexing request and, on failure of the primary, could be promoted to primary after the data has been synced from S3. This simplifies problems with isolated writers and makes the replication protocol easy to reason about.
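To make the idea concrete, here is a minimal toy sketch of the proposed flow. It is not OpenSearch code: the class and method names (`RemoteStore`, `MetadataReplica`, `Primary`, `replicate_noop`, `promote`) are hypothetical, the remote store is an in-memory list standing in for S3, and the use of a primary term to fence the old primary is my assumption about how "no divergent writes" could be enforced.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class RemoteStore:
    """In-memory stand-in for the remote store (e.g. S3). Hypothetical."""
    ops: List[Tuple[int, str]] = field(default_factory=list)

    def upload(self, seq_no: int, doc: str) -> None:
        self.ops.append((seq_no, doc))


@dataclass
class MetadataReplica:
    """Replica that stores no documents, only shard metadata."""
    primary_term: int = 1
    max_seq_no: int = -1

    def replicate_noop(self, term: int, seq_no: int) -> bool:
        # Reject operations from a primary with a stale term (assumed fencing rule).
        if term < self.primary_term:
            return False
        self.primary_term = term
        self.max_seq_no = max(self.max_seq_no, seq_no)
        return True

    def promote(self, remote: RemoteStore) -> "Primary":
        # Bump the term first so the old primary is fenced from now on,
        # then sync the data from the remote store.
        self.primary_term += 1
        new_primary = Primary(term=self.primary_term, remote=remote)
        new_primary.docs = dict(remote.ops)
        new_primary.next_seq_no = self.max_seq_no + 1
        return new_primary


@dataclass
class Primary:
    term: int
    remote: RemoteStore
    replica: Optional[MetadataReplica] = None  # None: no replica attached yet
    docs: Dict[int, str] = field(default_factory=dict)
    next_seq_no: int = 0

    def index(self, doc: str) -> bool:
        seq_no = self.next_seq_no
        # No-op replication: the replica acknowledges metadata only.
        if self.replica is not None and not self.replica.replicate_noop(self.term, seq_no):
            return False  # fenced: a newer primary exists, so do not write
        self.remote.upload(seq_no, doc)
        self.docs[seq_no] = doc
        self.next_seq_no += 1
        return True


# Usage: the old primary indexes two docs, then is presumed failed.
remote = RemoteStore()
replica = MetadataReplica()
old_primary = Primary(term=1, remote=remote, replica=replica)
old_primary.index("a")
old_primary.index("b")

new_primary = replica.promote(remote)    # data is restored from the remote store
rejected = not old_primary.index("c")    # the isolated old primary is fenced
```

In this sketch the replica is consulted before the upload, so an isolated primary never reaches the remote store once a promotion has happened. A real implementation would also have to handle the race between the acknowledgement and the upload, which this toy model ignores.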
Related component
Storage:Durability
Describe alternatives you've considered
No response
Additional context
No response