Sloth APP

This is the Giant Swarm managed app for the Sloth SLO framework.

Sloth generates understandable, uniform and reliable Prometheus SLOs for any kind of service, using a simple SLO spec that results in multiple metrics and multi-window, multi-burn-rate alerts.
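
To make "multi-window, multi-burn-rate" concrete, here is a sketch of the underlying arithmetic, following the approach described in the Google SRE Workbook. The 30-day period and the burn rate of 14.4 are illustrative assumptions, not values read from this app's configuration.

```latex
\begin{align*}
% Error budget for an SLO with objective O, e.g. O = 0.99 (99%)
\text{error budget} &= 1 - O = 1 - 0.99 = 0.01 \\
% Burn rate over a window w: observed error ratio relative to the budget
\text{burn rate}(w) &= \frac{\text{error ratio}(w)}{1 - O} \\
% Share of a 30-day (720 h) budget consumed in 1 h at burn rate 14.4
14.4 \times \frac{1\,\text{h}}{720\,\text{h}} &= 0.02 \quad (2\%\ \text{of the budget})
\end{align*}
```

A burn rate of 1 means the budget is used up exactly at the end of the period. A multi-window alert fires only when both a long window (for example 1h) and a short window (for example 5m) exceed the same burn rate, so that an alert stops firing soon after the errors stop.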

Visualize the SLOs

Sloth-app provides Grafana dashboards for simple and quick visualization of the defined SLOs. These dashboards are defined in the customizations/templates folder.

For more detailed information about the Sloth SLO dashboards, check the official documentation.

Example

Part of a Sloth SLO dashboard:

Part of a Sloth SLO Overview dashboard:

Rule management

Sloth can create PrometheusRule CRs as well as plain Prometheus rules. It can run as an operator in a Kubernetes cluster (for the PrometheusRule CRs), and it also provides a CLI tool for plain Prometheus rules.
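
For comparison with the Kubernetes CR, a plain spec for the CLI might look like the sketch below. It assumes the upstream `prometheus/v1` spec format; the service name, metric names, label values and alert name are hypothetical and only illustrate the shape of the file.

```yaml
# Plain (non-Kubernetes) Sloth spec; all names below are hypothetical.
version: "prometheus/v1"
service: "myservice"
labels:
  owner: "myteam"
slos:
  - name: "requests-availability"
    objective: 99.9
    description: "Availability of HTTP requests served by myservice."
    sli:
      events:
        # {{.window}} is filled in by Sloth for each alerting window.
        error_query: sum(rate(http_request_duration_seconds_count{job="myservice",code=~"5.."}[{{.window}}]))
        total_query: sum(rate(http_request_duration_seconds_count{job="myservice"}[{{.window}}]))
    alerting:
      name: MyServiceHighErrorRate
      labels:
        category: "availability"
      page_alert:
        labels:
          severity: page
      ticket_alert:
        labels:
          severity: slack
```

Upstream documents `sloth generate -i <file>` for turning such a spec into Prometheus rules; check the exact flags against the Sloth version in use.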

Rule configuration

Sloth rules are highly configurable. For example, you can keep an alert from paging under certain conditions by adding inhibition labels in its alerting section.

Sloth rule example:

apiVersion: sloth.slok.dev/v1
kind: PrometheusServiceLevel
metadata:
  name: kaas-phoenix-controller-manager-latency
  namespace: monitoring
  labels:
    release: prometheus
spec:
  service: "controller-manager"
  labels:
    component: "controller-manager"
  slos:
    - name: "latency"
      objective: 99
      description: Reconciliation time for each resource controlled by controller manager
      sli:
        events:
          errorQuery: |-
            clamp_min(sum(rate(workqueue_queue_duration_seconds_count{}[{{.window}}])) by (cluster_id) - sum(rate(workqueue_queue_duration_seconds_bucket{le="10"}[{{.window}}])) by (cluster_id), 0)
          totalQuery: |-
            sum(rate(workqueue_queue_duration_seconds_count{}[{{.window}}])) by (cluster_id)
      alerting:
        name: ControllerManagerReconciliationLatencyTooHigh
        labels:
          team: phoenix
          area: kaas
        annotations: {}
        pageAlert:
          labels:
            cancel_if_cluster_status_creating: "true"
            cancel_if_cluster_status_deleting: "true"
            severity: page
            team: phoenix
        ticketAlert:
          labels:
            severity: "slack"
            slack_channel: "#responsible-team"
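
The `cancel_if_*` labels on the page alert above do nothing by themselves. They only suppress pages if Alertmanager has inhibition rules that match them. A minimal sketch of such a rule follows, assuming a source alert that carries a hypothetical `cluster_status_creating="true"` label and that both alerts share a `cluster_id` label; the actual rules in a given installation may differ.

```yaml
# Alertmanager configuration fragment (illustrative, label names are assumptions).
inhibit_rules:
  # While an alert labelled cluster_status_creating="true" is firing ...
  - source_matchers:
      - 'cluster_status_creating="true"'
    # ... mute alerts that opted in through the matching cancel_if_* label ...
    target_matchers:
      - 'cancel_if_cluster_status_creating="true"'
    # ... but only for the same cluster.
    equal:
      - cluster_id
```

Keeping the opt-in on the alert side means each SLO chooses which cluster states should silence its page alert, without changing the Alertmanager configuration.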

Update to the latest version

Run:

bash bin/import_upstream_chart
