diff --git a/charts/istio-alerts/Chart.yaml b/charts/istio-alerts/Chart.yaml
index da659fb7..1e88fd0e 100644
--- a/charts/istio-alerts/Chart.yaml
+++ b/charts/istio-alerts/Chart.yaml
@@ -2,7 +2,7 @@ apiVersion: v2
 name: istio-alerts
 description: A Helm chart that provisions a series of alerts for istio VirtualServices
 type: application
-version: 0.3.2
+version: 0.4.0
 appVersion: 0.0.1
 maintainers:
   - name: diranged
diff --git a/charts/istio-alerts/README.md b/charts/istio-alerts/README.md
index 043598c5..78a4530e 100644
--- a/charts/istio-alerts/README.md
+++ b/charts/istio-alerts/README.md
@@ -1,6 +1,6 @@
 # istio-alerts
 
-![Version: 0.3.2](https://img.shields.io/badge/Version-0.3.2-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 0.0.1](https://img.shields.io/badge/AppVersion-0.0.1-informational?style=flat-square)
+![Version: 0.4.0](https://img.shields.io/badge/Version-0.4.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 0.0.1](https://img.shields.io/badge/AppVersion-0.0.1-informational?style=flat-square)
 
 A Helm chart that provisions a series of alerts for istio VirtualServices
 
@@ -13,6 +13,15 @@ A Helm chart that provisions a series of alerts for istio VirtualServices
 
 ## Upgrade Notes
 
+### 0.3.x -> 0.4.x
+
+**BREAKING: http5XXMonitor no longer alerts per source client workload.**
+
+In version 0.2.x, there was a change to the default `http5XXMonitor` which
+introduced calculation of the error rate per source workload. This 0.4.x
+release reverts this behavior by default while allowing you to opt in to custom
+selectors via the `monitorGroupingLabels` option.
+
 ### 0.2.x -> 0.3.x
 
 **BREAKING: The DestinationServiceSelectorValidity alert rule requires kube-state-metrics.**
@@ -22,6 +31,12 @@ you do not have kube-state-metrics installed, you will need to disable the alert
 `serviceRules.destinationServiceSelectorValidity.enabled` to `false`. This alert is used to detect
 if the destinationServiceSelector is actually selecting series for a service that exists.
 
+### 0.2.x
+
+**BREAKING: http5XXMonitor now calculcates the 5XX error rate for each client**
+source workload using the `source_workload` label, and will alert if any
+`source_workload`'s error rate exceeds the specified `threshold`.
+
 ## Values
 
 | Key | Type | Default | Description |
@@ -48,10 +63,10 @@ if the destinationServiceSelector is actually selecting series for a service tha
 | serviceRules.highRequestLatency.percentile | float | `0.95` | Which percentile to monitor - should be between 0 and 1. Default is 95th percentile. |
 | serviceRules.highRequestLatency.severity | string | `"warning"` | Severity of the latency monitor |
 | serviceRules.highRequestLatency.threshold | float | `0.5` | The threshold for considering the latency monitor to be alarming. This is in seconds. |
-| serviceRules.http5XXMonitor | object | `{"enabled":true,"for":"5m","monitorGroupingLabels":["destination_service_name","reporter","source_workload"],"severity":"critical","threshold":0.0005}` | Configuration related to the 5xx monitor for the VirtualService. |
+| serviceRules.http5XXMonitor | object | `{"enabled":true,"for":"5m","monitorGroupingLabels":["destination_service_name","reporter"],"severity":"critical","threshold":0.0005}` | Configuration related to the 5xx monitor for the VirtualService. |
 | serviceRules.http5XXMonitor.enabled | bool | `true` | Whether to enable the monitor on 5xxs returned by the VirtualService. |
 | serviceRules.http5XXMonitor.for | string | `"5m"` | How long to evaluate the rate of 5xxs over. |
-| serviceRules.http5XXMonitor.monitorGroupingLabels | list | `["destination_service_name","reporter","source_workload"]` | The set of labels to use when evaluating the ratio of the 5XX. |
+| serviceRules.http5XXMonitor.monitorGroupingLabels[0] | string | `"destination_service_name"` | The set of labels to use when evaluating the ratio of the 5XX. |
 | serviceRules.http5XXMonitor.severity | string | `"critical"` | Severity of the 5xx monitor |
 | serviceRules.http5XXMonitor.threshold | float | `0.0005` | The threshold for considering the 5xx monitor to be alarming. Default is 0.05% error rate, i.e 99.95% reliability. |
 
diff --git a/charts/istio-alerts/README.md.gotmpl b/charts/istio-alerts/README.md.gotmpl
index 8878aca0..0026d281 100644
--- a/charts/istio-alerts/README.md.gotmpl
+++ b/charts/istio-alerts/README.md.gotmpl
@@ -8,6 +8,15 @@
 
 ## Upgrade Notes
 
+### 0.3.x -> 0.4.x
+
+**BREAKING: http5XXMonitor no longer alerts per source client workload.**
+
+In version 0.2.x, there was a change to the default `http5XXMonitor` which
+introduced calculation of the error rate per source workload. This 0.4.x
+release reverts this behavior by default while allowing you to opt in to custom
+selectors via the `monitorGroupingLabels` option.
+
 ### 0.2.x -> 0.3.x
 
 **BREAKING: The DestinationServiceSelectorValidity alert rule requires kube-state-metrics.**
@@ -17,6 +26,12 @@ you do not have kube-state-metrics installed, you will need to disable the alert
 `serviceRules.destinationServiceSelectorValidity.enabled` to `false`. This alert is used to detect
 if the destinationServiceSelector is actually selecting series for a service that exists.
 
+### 0.2.x
+
+**BREAKING: http5XXMonitor now calculcates the 5XX error rate for each client**
+source workload using the `source_workload` label, and will alert if any
+`source_workload`'s error rate exceeds the specified `threshold`.
+
 {{ template "chart.requirementsSection" . }}
 
 {{ template "chart.valuesSection" . }}
diff --git a/charts/istio-alerts/values.yaml b/charts/istio-alerts/values.yaml
index 80666fb2..47eec170 100644
--- a/charts/istio-alerts/values.yaml
+++ b/charts/istio-alerts/values.yaml
@@ -99,11 +99,10 @@ serviceRules:
     # -- Severity of the 5xx monitor
     severity: critical
 
-    # -- The set of labels to use when evaluating the ratio of the 5XX.
     monitorGroupingLabels:
+      # -- The set of labels to use when evaluating the ratio of the 5XX.
       - destination_service_name
       - reporter
-      - source_workload
 
   # -- Configuration related to the latency monitor for the VirtualService.
   highRequestLatency: