Define a curated list of priority classes for Giant Swarm workload #3483

QuentinBisson · 2024-06-05T12:50:25Z

We came to the conclusion in @giantswarm/sig-architecture that we need to define a curated list of priority classes for Giant Swarm workloads. We should not have too many (let's not have one per app, as we do right now with crossplane and flux), but enough to make sure that highly critical (kyverno, prometheus-agents), critical (promtail) and slightly less critical components (fluent-bit) are scheduled with higher priority than other workloads.

We currently have the following classes:

Workload clusters

NAME                      VALUE        GLOBAL-DEFAULT
giantswarm-critical       1000000000   false
system-cluster-critical   2000000000   false
system-node-critical      2000001000   false

Management clusters

NAME                              VALUE        GLOBAL-DEFAULT
crossplane-critical               600000000    false
flux-giantswarm-flux-giantswarm   1000000000   false
giantswarm-critical               1000000000   false
prometheus                        500000000    false
system-cluster-critical           2000000000   false
system-node-critical              2000001000   false
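
The gap between the two listings can be checked mechanically. A minimal sketch in Python, with names and values copied from the listings above; the helper names `only_in` and `shared_values` are made up for illustration:

```python
# Priority classes as listed above (name -> value).
WORKLOAD = {
    "giantswarm-critical": 1_000_000_000,
    "system-cluster-critical": 2_000_000_000,
    "system-node-critical": 2_000_001_000,
}
MANAGEMENT = {
    "crossplane-critical": 600_000_000,
    "flux-giantswarm-flux-giantswarm": 1_000_000_000,
    "giantswarm-critical": 1_000_000_000,
    "prometheus": 500_000_000,
    "system-cluster-critical": 2_000_000_000,
    "system-node-critical": 2_000_001_000,
}


def only_in(a, b):
    """Return the class names present in a but missing from b, sorted."""
    return sorted(set(a) - set(b))


def shared_values(classes):
    """Group class names by value, keeping only values used more than once."""
    by_value = {}
    for name, value in classes.items():
        by_value.setdefault(value, []).append(name)
    return {v: sorted(names) for v, names in by_value.items() if len(names) > 1}


if __name__ == "__main__":
    print("only on management clusters:", only_in(MANAGEMENT, WORKLOAD))
    print("same value on management clusters:", shared_values(MANAGEMENT))
```

With the data above, the three app-specific classes (crossplane-critical, flux-giantswarm-flux-giantswarm, prometheus) exist only on management clusters, and flux-giantswarm-flux-giantswarm carries the same value as giantswarm-critical.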

The goals of this issue are to:

Establish a curated set of priority classes for all apps
Ensure we have the same priority classes on management and workload clusters and that they are deployed the same way (giantswarm-critical being deployed by the chart-operator)

@giantswarm/sig-architecture Do you have a list of the components that we run in CAPI clusters, so I can create a table with their priority classes?

piontec · 2024-06-06T06:48:46Z

OK, so it seems flux-giantswarm should just use giantswarm-critical. I think we need something like giantswarm-high: lower prio than critical, but still for, well, important stuff, like crossplane.
Maybe to get started we keep the 2 system*, as they don't really apply to "normal" apps, I believe, but really critical system components. Then, for important apps, we could use something like:

giantswarm-critical
giantswarm-very-high
giantswarm-high

WDYT?

QuentinBisson · 2024-06-06T09:14:09Z

I would be more inclined to go with:

giantswarm-critical
giantswarm-high
giantswarm-medium

That way we can add giantswarm-low if we need to. I'm not sure I would add anything in between those.

We could instead use the following (yes the migration effort will take time so we could have giantswarm-critical = giantswarm-high):

giantswarm-high
giantswarm-medium
giantswarm-low

Now I'm not sure which components would go in each, though. Are we fine with prometheus being in the high priority class?

QuentinBisson · 2024-07-24T12:37:00Z

It seems this issue does not have a lot of traction @piontec :D

QuentinBisson · 2024-09-03T14:47:11Z

@giantswarm/sig-architecture does anyone have thoughts on this? I don't think it's a really huge topic :)

JosephSalisbury · 2024-09-20T15:27:27Z

@QuentinBisson want to throw this on the sig arch agenda for next week, and we can just thrash it out?

QuentinBisson · 2024-09-20T17:09:22Z

Hey, I'm always up for closing issues

JosephSalisbury · 2024-09-25T12:14:49Z

from sig architecture:

we're generally fine with:

system-cluster-critical - upstream cluster critical
system-node-critical - upstream node critical
giantswarm-critical - stuff that Giant Swarm adds and that the cluster absolutely requires to run (e.g. kyverno)
giantswarm-high - stuff that should preempt customer workloads
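
For illustration, the two Giant Swarm classes in this list could be rendered as PriorityClass manifests along these lines. This is only a sketch: the value for giantswarm-high is a placeholder that was not agreed in this issue, the description strings paraphrase the list above, and the helper names are hypothetical. The two system-* classes are built into Kubernetes, so they are not generated here.

```python
import json

# Highest value Kubernetes allows for user-defined priority classes;
# larger values are reserved for the built-in system-* classes.
MAX_USER_PRIORITY = 1_000_000_000

# giantswarm-critical keeps its current value. The value for giantswarm-high
# is a placeholder chosen for this sketch, not something agreed in this issue.
CLASSES = [
    ("giantswarm-critical", 1_000_000_000,
     "Components added by Giant Swarm that the cluster requires to run."),
    ("giantswarm-high", 900_000_000,
     "Components that should preempt customer workloads."),
]


def priority_class(name, value, description):
    """Build a scheduling.k8s.io/v1 PriorityClass manifest as a dict."""
    if value > MAX_USER_PRIORITY:
        raise ValueError(f"{name}: {value} is above the user-defined maximum")
    return {
        "apiVersion": "scheduling.k8s.io/v1",
        "kind": "PriorityClass",
        "metadata": {"name": name},
        "value": value,
        "globalDefault": False,
        "description": description,
    }


def manifest_list(classes):
    """Wrap the manifests in a v1 List so they can be handled as one document."""
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [priority_class(*c) for c in classes],
    }


if __name__ == "__main__":
    # JSON output; intended to be reviewed before being applied anywhere.
    print(json.dumps(manifest_list(CLASSES), indent=2))
```

Keeping the definitions in one place like this would make it easier to deliver the same classes to management and workload clusters, whichever delivery mechanism ends up being chosen.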

next steps:

define delivery for priority classes
move apps that use priority classes to use our new priority classes (and get rid of custom priority classes)
look over all apps and see if they should use a priority class (i.e. if they don't already)
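
As a rough aid for the second step, the custom classes currently in use could be sorted into "keep", "migrate" and "undecided". The sketch below is an assumption-laden illustration, not a decided plan: the flux and crossplane targets follow suggestions made earlier in this thread, prometheus has no agreed target yet, and the names `CURATED`, `REPLACEMENTS` and `plan` are hypothetical.

```python
# Classes sig architecture was generally fine with.
CURATED = {
    "system-cluster-critical",
    "system-node-critical",
    "giantswarm-critical",
    "giantswarm-high",
}

# Custom class -> curated replacement. The first two follow suggestions made
# earlier in this thread; None marks a class with no agreed target yet.
REPLACEMENTS = {
    "flux-giantswarm-flux-giantswarm": "giantswarm-critical",
    "crossplane-critical": "giantswarm-high",
    "prometheus": None,
}


def plan(in_use):
    """Split the priority class names in use into keep / migrate / undecided."""
    keep, migrate, undecided = [], {}, []
    for name in sorted(set(in_use)):
        if name in CURATED:
            keep.append(name)
        elif REPLACEMENTS.get(name):
            migrate[name] = REPLACEMENTS[name]
        else:
            # Unknown classes and classes without a target need a decision.
            undecided.append(name)
    return keep, migrate, undecided


if __name__ == "__main__":
    management = [
        "crossplane-critical",
        "flux-giantswarm-flux-giantswarm",
        "giantswarm-critical",
        "prometheus",
        "system-cluster-critical",
        "system-node-critical",
    ]
    keep, migrate, undecided = plan(management)
    print("keep:", keep)
    print("migrate:", migrate)
    print("undecided:", undecided)
```

Anything that lands in "undecided" (prometheus, for the management cluster listing in this issue) would need a decision before the custom classes can be removed.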

JosephSalisbury · 2024-09-25T13:56:53Z

hola @yulianedyalkova - we had a bit of a chat about this in sig architecture, i reckon it makes most sense for tenet

there's basically no urgency on it (atlas is unblocked from their original issue), it just would be good to clean up these priority classes eventually. we had some ideas from sig arch above, but those can be entirely changed depending

plz ping me here or on slack if you have any other questions <3

QuentinBisson added the feature-request label Jun 5, 2024

QuentinBisson self-assigned this Jun 5, 2024

architectbot added the team/atlas Team Atlas label Jun 5, 2024

JosephSalisbury unassigned QuentinBisson Sep 25, 2024

JosephSalisbury added team/tenet Team Tenet and removed team/atlas Team Atlas labels Sep 25, 2024
