Include some optional prometheus alert rules in the chart #221
Comments
@sunng87 I did some research on this topic. Most of the critical issues, such as high CPU and memory usage, missing nodes, and crashing pods, are already covered by the Prometheus alert rules that come with the kube-prometheus-stack. The only GreptimeDB-specific alerts that make sense are therefore use-case specific. In my case I would like alerts when no new rows are ingested or when no Prometheus queries (via the Prometheus-compatible API) are made against the cluster. But I don't rely on the MySQL endpoint or the native PromQL HTTP endpoint, for example, so alerts for them would make no sense for me. Shipping them anyway would cause alerts to fire in my monitoring solution. I think the better approach is either to make each individual alert rule toggleable through the values.yaml or to provide an easy way to add custom alert rules. An example of such alert rules is attached:
@Stephan3555 I agree that most system-level rules can be covered by Kubernetes-level alerts. For GreptimeDB, the traffic pattern varies between use cases. Since we want to offer a convenient way for users to add alerting, there are several levels
I think we can take Level 2 as a starting point, but it still requires significant effort to set up. Let's see if there are levels between
@sunng87 I can provide the Helm chart component that dynamically generates the necessary ConfigMaps. It would be nice if the rules themselves came from the Greptime team.
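To make the two options discussed above concrete (per-rule toggles and user-supplied custom rules), here is a minimal sketch of how the values.yaml could look. It is only an illustration: every key name below (`prometheusRule`, `rules`, `additionalRules`) and the metric `greptime_ingested_rows_total` are hypothetical and are not taken from the chart or from GreptimeDB's actual metrics.

```yaml
# values.yaml -- hypothetical layout, not the chart's actual schema
prometheusRule:
  # Master switch: render no alert rules at all when false
  enabled: true

  # Pre-built rules, each individually toggleable so that unused
  # endpoints (e.g. MySQL) do not produce firing alerts
  rules:
    noIngestion:
      enabled: true
      for: 15m
      severity: warning
    noPrometheusQueries:
      enabled: true
      for: 30m
      severity: warning
    mysqlEndpoint:
      enabled: false

  # Free-form rules in standard Prometheus alerting rule syntax,
  # appended as-is to the generated rule group
  additionalRules:
    - alert: GreptimeDBCustomIngestionStalled
      # Hypothetical metric name; replace with a metric the cluster exposes
      expr: sum(rate(greptime_ingested_rows_total[5m])) == 0
      for: 15m
      labels:
        severity: critical
      annotations:
        summary: No new rows ingested for 15 minutes
```

With a layout like this, the chart template would only need to loop over `rules`, skip entries with `enabled: false`, and append `additionalRules` unchanged, so the defaults stay useful without forcing alerts for endpoints a user does not rely on.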
It would be nice to have some pre-built alert rules in this chart so that users get GreptimeDB-specific alerts by default.