evaluate new ES-ILM backup / redundancy strategy #235
Comments
Notes from meeting:
- sounds doable via an API call (to test)
- perhaps use 90 days as a rough time limit
- validate how easy/hard it is to restore an archived index
- double-check the shard size spec
- make sure the changes for this don't require re-indexing
We can do the ILM policy update via the ILM put API. When the policy is updated, the changes will take effect on our current index. As per this image, our current max shard size is 17.5GB, so we should anticipate rollover once we ingest about triple our current data (this should arrive sooner than the alternate rollover condition of 365 days). A sketch of the policy update call follows.
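A minimal sketch of that policy update, calling Elasticsearch's REST API via Python `requests`. The cluster URL, the policy name `mediacloud-ilm-policy`, and the `50gb` threshold (~3x the current 17.5GB max shard size) are illustrative assumptions, not values from this issue.

```python
import requests

ES = "http://localhost:9200"  # assumed cluster address

policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    "rollover": {
                        # roll over when any primary shard reaches this size...
                        "max_primary_shard_size": "50gb",  # assumed ~3x current 17.5GB
                        # ...or when the index reaches this age, whichever comes first
                        "max_age": "365d",
                    }
                }
            }
        }
    }
}

# PUT _ilm/policy/<name> updates the policy in place; per the note above,
# the managed index picks up the change without re-indexing.
r = requests.put(f"{ES}/_ilm/policy/mediacloud-ilm-policy", json=policy)
r.raise_for_status()
print(r.json())  # expect {"acknowledged": true}
```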
My proposal for setting up backups is:
Closing this; I'll set up separate issues to capture these two tasks, to be done at different times. Together they support a two-pronged overall strategy for catastrophic index failure recovery: back up each rolled-over index off-site, and recreate the latest (not-yet-backed-up) index from WARC files. A sketch of the backup prong follows.
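A minimal sketch of the backup prong, again via Python `requests`. The repository name `offsite_backups`, the S3 bucket, and the index/snapshot names are hypothetical; an `s3` repository type also requires the repository-s3 plugin on the cluster (a shared-filesystem `fs` repository would work the same way).

```python
import requests

ES = "http://localhost:9200"  # assumed cluster address

# 1. Register an off-site snapshot repository (PUT _snapshot/<repo>).
repo = {"type": "s3", "settings": {"bucket": "mc-es-backups"}}  # hypothetical bucket
requests.put(f"{ES}/_snapshot/offsite_backups", json=repo).raise_for_status()

# 2. Snapshot just the rolled-over index (PUT _snapshot/<repo>/<snapshot>).
snap = {"indices": "mc_search-000042", "include_global_state": False}  # hypothetical index
r = requests.put(
    f"{ES}/_snapshot/offsite_backups/mc_search-000042-snap",
    params={"wait_for_completion": "true"},
    json=snap,
)
r.raise_for_status()
```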
With the ILM ES index architecture, @philbudne raised a question about reconsidering our redundancy approach. We now know that we can restore 2-3 months of content from WARC files in ~2 days. What if we roll over via ILM to a new index every 2 months, and immediately back up the rolled-over index off-site? Then if we crash, restoration is ~2 days of downloading indexes and recreating the latest (not-yet-backed-up) index from WARC files. I think this is an acceptable downtime, and we can always later add some kind of "hot" duplicate of the latest index if we want. The task here is to consider how to design an implementation for this, whether it would really work, and to make sure it is a good idea. See the restore sketch below the related issues.
Related to #157, #231, #54.
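A sketch of the recovery path, using the same hypothetical names as above: pull each archived index back from the off-site repository, while the latest, not-yet-backed-up index is rebuilt from WARC files by the regular ingest path (not shown).

```python
import requests

ES = "http://localhost:9200"  # assumed cluster address

# POST _snapshot/<repo>/<snapshot>/_restore brings an archived index back.
# The index must not already exist (or must be deleted/closed) before restoring.
body = {"indices": "mc_search-000042", "include_global_state": False}
r = requests.post(
    f"{ES}/_snapshot/offsite_backups/mc_search-000042-snap/_restore",
    params={"wait_for_completion": "true"},
    json=body,
)
r.raise_for_status()
```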