Mimir testing #3578

Rotfuks · 2024-07-15T15:46:36Z

Motivation

In order to raise our confidence in the stability of our observability platform, and to be sure that our ongoing work and releases won't negatively impact its operations, we need extensive tests that give us early feedback. As Mimir is one of our core components, we should make sure it's thoroughly tested.

Todo

Investigate some good initial test cases that give us feedback in CI on the stability of a Mimir release. These can be:
- Validate Helm chart templating
- Deploy chart on Kind (single instance)
Implement those CI test cases
Investigate some good initial test cases that give us feedback on the stability of a release before releasing. These can be:
- Deploy chart on AWS, Azure, CAPI (later)
- Integration test: verify all components work together
- Use a canary to generate traffic
Implement those test cases

Make sure to stay with a minimal but valuable set of test cases, nothing too detailed or fancy.
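
As a rough sketch of the two CI test cases named above (chart templating and a single-instance Kind deployment), a workflow could look something like the following. This assumes GitHub Actions as the CI system and a chart living under `./helm/mimir`; both are assumptions for illustration, and the workflow, job, release and namespace names are hypothetical.

```yaml
# Hypothetical workflow file, e.g. .github/workflows/chart-tests.yaml
name: chart-tests
on: [pull_request]

jobs:
  template-and-install:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Installs the helm CLI on the runner.
      - uses: azure/setup-helm@v4

      # Test case 1: validate Helm chart templating.
      # ./helm/mimir is an assumed chart path.
      - name: Validate Helm chart templating
        run: |
          helm dependency build ./helm/mimir
          helm lint ./helm/mimir
          helm template mimir ./helm/mimir > /dev/null

      # Test case 2: deploy the chart on a single-node Kind cluster.
      - uses: helm/kind-action@v1
      - name: Deploy chart on Kind (single instance)
        run: |
          helm install mimir ./helm/mimir \
            --namespace mimir --create-namespace \
            --wait --timeout 10m
```

`helm install --wait` only checks that the workloads become ready, which keeps the test minimal; whether the components actually work together would be left to the pre-release integration tests.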

Outcome

We have early, automatic feedback about the impact of a release of Mimir before releasing.

QuentinBisson · 2024-08-29T08:47:38Z

I think it would be nice to run Mimir continuous testing (similar to the Loki canary) for e2e tests

QuantumEnigmaa · 2024-08-29T08:52:26Z

Yeah that's a nice idea :)

QuentinBisson · 2024-09-04T21:19:14Z

I am running mimir continuous_test on grizzly with the following config:

mimir:
  continuous_test:
    enabled: true
    auth:
      tenant: anonymous

and this renders the following metrics:

and we could just use those alerts https://github.com/grafana/mimir/blob/f52911d917c8c52e0da6a59348a64dd7f7622072/operations/mimir-mixin-compiled/alerts.yaml#L1097

The only downside is that we need to wait for the next minor Helm chart release or use a weekly version, because grafana/mimir#8654 is not yet released
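
For illustration, an alert on continuous test write failures could be sketched roughly as below. This is not copied from the upstream mixin: the metric name `mimir_continuous_test_writes_failed_total` is taken from my recollection of the continuous test documentation and should be checked against what the tool actually exposes, and the group name, rate window, `for` duration and severity are assumed values.

```yaml
groups:
  - name: mimir-continuous-test   # hypothetical group name
    rules:
      - alert: MimirContinuousTestFailingOnWrites
        # Assumed metric name; fires when any continuous test
        # reports failed writes over the last 5 minutes.
        expr: |
          sum by (test) (
            rate(mimir_continuous_test_writes_failed_total[5m])
          ) > 0
        for: 1h                   # assumed duration
        labels:
          severity: warning       # assumed severity
        annotations:
          description: Mimir continuous test {{ $labels.test }} is failing on writes.
```

A long `for` duration keeps the alert from firing on short blips such as a rollout, at the cost of slower detection.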

QuantumEnigmaa · 2024-09-09T13:57:18Z

IMO the best plan of action is to wait for continuous testing to be part of the default config for our Mimir before doing anything else.
In the meantime, I'll create a dashboard using the metrics from the mixin rules and, if it's good enough, I'll think about pushing it upstream as a mixin dashboard.

QuentinBisson · 2024-09-09T22:28:06Z

I think so, yes. Maybe we can have a PR ready with the alerts? The mixin contains some that could be useful

QuentinBisson · 2024-10-10T13:46:40Z

@QuantumEnigmaa we decided in retro to use chart version rc0 for now but keep the old Mimir 2.13 image

QuantumEnigmaa · 2024-10-10T13:59:07Z

All good with me 👍

QuentinBisson · 2024-10-15T08:22:01Z

We can start this again once we're done with multi-tenancy :)

QuentinBisson · 2024-11-06T14:56:36Z

Taken over the dashboard PR giantswarm/dashboards#624 to close the epic

QuentinBisson · 2024-11-07T08:23:26Z

Blocked waiting for reviews

QuentinBisson · 2024-11-14T17:13:56Z

Continuous test is enabled on all MCs

Added chart testing:

https://github.com/giantswarm/mimir-app/pull/133

Dashboard PR has been merged:

add mimir continous test dashboard dashboards#624

Alert based on failures under review:

add MimirContinuousTestFailingOnWrites and MimirContinuousTestFailing… prometheus-rules#1355

Test procedure tbd:

https://github.com/giantswarm/mimir-app/pull/138

QuentinBisson · 2024-11-15T11:06:30Z

All is done. Thanks @hervenicol for the reviews

Rotfuks mentioned this issue Jul 15, 2024

Early Feedback Loops #3565

Closed

github-project-automation bot added this to Roadmap Jul 15, 2024

github-project-automation bot moved this to Inbox 📥 in Roadmap Jul 15, 2024

Rotfuks added the team/atlas Team Atlas label Jul 15, 2024

QuantumEnigmaa self-assigned this Aug 27, 2024

This was referenced Sep 10, 2024

add MimirContinuousTestFailingOnWrites and MimirContinuousTestFailing… giantswarm/prometheus-rules#1355

Merged

add mimir continous test dashboard giantswarm/dashboards#624

Merged

QuentinBisson self-assigned this Nov 6, 2024

QuentinBisson closed this as completed Nov 15, 2024

github-project-automation bot moved this from Inbox 📥 to Done ✅ in Roadmap Nov 15, 2024
