Autoscaling Ingesters #10607
Unanswered
Meydan-Ozeri
asked this question in
Help and support
Hi all, I am trying to implement autoscaling for Ingesters.
For now I am trying to do it with the HPA. (Later I may switch to KEDA if it turns out to be better, since I see it being used here instead of the normal k8s HPA.)
Before I get to what I need help with: I am aware of the experimental ingest storage that uses Kafka and makes Ingesters stateless. I don't want to deploy another component just to be able to autoscale the Ingesters. I also prefer Apache Pulsar over Kafka, since it is better suited to k8s and to scaling in the cloud, and it is already deployed in my cluster independently of Mimir (so maybe consider adding ingest storage with Pulsar as well, and not only Kafka?).
From what I understand of the docs, in order to downscale Ingesters I need to put the Ingester into read-only mode, wait for the duration of query_store_after (default is 12h), then flush blocks, and only then shut down.
So I added a sidecar to the Ingesters that detects termination and performs this process. Meanwhile a lifecycle preStop hook blocks the main container from shutting down for the length of the wait period. Once the process is done, both containers shut down and the Ingester is terminated.
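The sequence described above can be sketched roughly as follows. This is a minimal illustration, not the actual sidecar: the two endpoint paths are assumptions that should be checked against the Mimir HTTP API reference for the version in use, and `post` is a hypothetical callable that sends an HTTP POST to the Ingester, injected so the ordering can be exercised without a running cluster.

```python
import time
from typing import Callable, List

# Assumed endpoint paths; verify them against the HTTP API docs of your Mimir version.
READ_ONLY_PATH = "/ingester/prepare-instance-ring-downscale"
FLUSH_PATH = "/ingester/flush"


def drain_ingester(
    post: Callable[[str], None],
    wait_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Run the downscale steps in order and return the names of the steps done.

    post:         hypothetical helper that POSTs to the given path on the Ingester.
    wait_seconds: how long to wait in read-only mode (query_store_after in the post).
    sleep:        injectable so tests do not really wait.
    """
    steps = []

    # 1. Put the Ingester into read-only mode so it stops receiving new samples.
    post(READ_ONLY_PATH)
    steps.append("read_only")

    # 2. Wait until queriers no longer rely on this Ingester for recent data.
    sleep(wait_seconds)
    steps.append("waited")

    # 3. Flush in-memory series to blocks before the pod is allowed to stop.
    post(FLUSH_PATH)
    steps.append("flushed")

    return steps
```

In a real sidecar, `post` would wrap an HTTP client pointed at the Ingester's HTTP port, and the preStop hook on the main container would block until this function returns.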
This works as intended, with no issues.
The problem I am facing is that 12h (the default query_store_after) is too long to wait for termination. I am trying to reduce it to 10m, so that when the HPA downscales a pod, the pod finishes after 10m rather than 12h.
All the default values are tuned to work together, so I cannot just reduce query_store_after from 12h to 10m without changing a bunch of other settings too. I changed sync_interval and cleanup_interval to 5m, and after reading further I see that many other parameters will also need to change (block_ranges, compaction_interval, deletion_delay, query_ingesters_within, and maybe some others I am not aware of?).
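To make the coupling between these settings concrete, here is a small sanity-check sketch. It is not taken from the Mimir docs. Rule 1 follows from how the two querier windows are described (data older than query_ingesters_within is not requested from Ingesters, data newer than query_store_after is not requested from storage). Rule 2 is a back-of-the-envelope assumption that a sample only becomes queryable from storage after its block is cut and the next sync has picked it up; it ignores upload time. The function name and the 13h and 2h values in the usage lines are illustrative only.

```python
from datetime import timedelta
from typing import List


def check_timings(
    query_store_after: timedelta,
    query_ingesters_within: timedelta,
    block_range: timedelta,
    sync_interval: timedelta,
) -> List[str]:
    """Return warnings for the two rules above; an empty list means neither fired."""
    warnings = []

    # Rule 1: if Ingesters stop being queried before storage starts being
    # queried, samples in between are served by neither.
    if query_ingesters_within < query_store_after:
        warnings.append("gap between query_ingesters_within and query_store_after")

    # Rule 2 (assumption): storage cannot serve a sample until its block has
    # been cut (up to block_range) and discovered (up to sync_interval).
    if query_store_after < block_range + sync_interval:
        warnings.append("query_store_after shorter than block_range + sync_interval")

    return warnings


# 12h query_store_after with illustrative values for the other settings.
print(check_timings(timedelta(hours=12), timedelta(hours=13),
                    timedelta(hours=2), timedelta(minutes=15)))

# The 10m target with a 5m sync_interval but an unchanged 2h block range.
print(check_timings(timedelta(minutes=10), timedelta(hours=13),
                    timedelta(hours=2), timedelta(minutes=5)))
```

Under these assumptions the second call reports that rule 2 fired, which is one way to see why lowering query_store_after alone is not enough and block_ranges has to shrink with it.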
I would really appreciate it if someone could point me in the right direction or help me understand how to tune these settings correctly for my needs. Unfortunately the docs didn't help me understand how to do it.
Thank you in advance.