Rollout operator with mimir-distributed helm chart not upgrading Pods #14
Comments
With rollout operator killed off, after another update of the config:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  annotations:
    checksum/config: eb54c06d95c2e592f6c00fef442070c26c355f3178d03cbaab32c149534b0b3a
    meta.helm.sh/release-name: krajo
    meta.helm.sh/release-namespace: dev
    rollout-max-unavailable: "10"
  creationTimestamp: "2022-04-28T16:16:58Z"
  generation: 4
  labels:
    app.kubernetes.io/component: store-gateway
    app.kubernetes.io/instance: krajo
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: mimir
    app.kubernetes.io/part-of: memberlist
    app.kubernetes.io/version: 2.0.0
    helm.sh/chart: mimir-distributed-2.0.9
    rollout-group: store-gateway
    zone: zone-a
  name: krajo-mimir-store-gateway-zone-a
  namespace: dev
  resourceVersion: "2905246"
  selfLink: /apis/apps/v1/namespaces/dev/statefulsets/krajo-mimir-store-gateway-zone-a
  uid: 88d76d77-f8e2-4023-8775-b460b078d4a2
spec:
  podManagementPolicy: OrderedReady
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/component: store-gateway
      app.kubernetes.io/instance: krajo
      app.kubernetes.io/name: mimir
      rollout-group: store-gateway
      zone: zone-a
  serviceName: krajo-mimir-store-gateway-headless
  template:
    metadata:
      annotations:
        checksum/config: eb54c06d95c2e592f6c00fef442070c26c355f3178d03cbaab32c149534b0b3a
      creationTimestamp: null
      labels:
        app.kubernetes.io/component: store-gateway
        app.kubernetes.io/instance: krajo
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: mimir
        app.kubernetes.io/part-of: memberlist
        app.kubernetes.io/version: 2.0.0
        helm.sh/chart: mimir-distributed-2.0.9
        rollout-group: store-gateway
        zone: zone-a
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: target
                operator: In
                values:
                - store-gateway
              - key: target
                operator: NotIn
                values:
                - store-gateway-zone-a
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -target=store-gateway
        - -config.file=/etc/mimir/mimir.yaml
        - -store-gateway.sharding-ring.instance-availability-zone=zone-a
        image: grafana/mimir:2.0.0
        imagePullPolicy: IfNotPresent
        name: store-gateway
        ports:
        - containerPort: 8080
          name: http-metrics
          protocol: TCP
        - containerPort: 9095
          name: grpc
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /ready
            port: http-metrics
            scheme: HTTP
          initialDelaySeconds: 60
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          requests:
            cpu: 100m
            memory: 512Mi
        securityContext:
          readOnlyRootFilesystem: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/mimir
          name: config
        - mountPath: /var/mimir
          name: runtime-config
        - mountPath: /data
          name: storage
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: krajo-mimir
      serviceAccountName: krajo-mimir
      terminationGracePeriodSeconds: 240
      volumes:
      - name: config
        secret:
          defaultMode: 420
          secretName: krajo-mimir-config
      - configMap:
          defaultMode: 420
          name: krajo-mimir-runtime
        name: runtime-config
  updateStrategy:
    type: OnDelete
  volumeClaimTemplates:
  - apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      creationTimestamp: null
      name: storage
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
      storageClassName: microk8s-hostpath
      volumeMode: Filesystem
    status:
      phase: Pending
status:
  availableReplicas: 1
  collisionCount: 0
  currentRevision: krajo-mimir-store-gateway-zone-a-6b64ccdc98
  observedGeneration: 4
  readyReplicas: 1
  replicas: 1
  updateRevision: krajo-mimir-store-gateway-zone-a-764d89475
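
For readers skimming the dump above: the status reports a `currentRevision` that differs from `updateRevision`, and the update strategy is `OnDelete`, so the StatefulSet controller does not replace Pods on its own. A minimal Python sketch of that check, working on a plain dict that mirrors the manifest (the function names are made up for illustration and are not part of any Kubernetes client library):

```python
def pending_rollout(statefulset: dict) -> bool:
    """True when the status names two different revisions."""
    status = statefulset.get("status", {})
    current = status.get("currentRevision")
    update = status.get("updateRevision")
    return current is not None and update is not None and current != update


def requires_pod_deletion(statefulset: dict) -> bool:
    """True when a new revision exists but only deleting Pods will apply it."""
    strategy = (
        statefulset.get("spec", {})
        .get("updateStrategy", {})
        .get("type", "RollingUpdate")  # apps/v1 default when unset
    )
    return strategy == "OnDelete" and pending_rollout(statefulset)


# Values copied from the StatefulSet shown above.
sts = {
    "spec": {"updateStrategy": {"type": "OnDelete"}},
    "status": {
        "currentRevision": "krajo-mimir-store-gateway-zone-a-6b64ccdc98",
        "updateRevision": "krajo-mimir-store-gateway-zone-a-764d89475",
    },
}

print(requires_pod_deletion(sts))
```

With `OnDelete`, something (a person or the rollout operator) has to delete the old Pods before the new revision is used.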
So it turns out to be caused by a missing "name" label in the StatefulSet pod template (not the object name, but an actual label), which the operator requires here: https://github.com/grafana/rollout-operator/blob/main/pkg/controller/controller.go#L402

I think we can remove the
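
To make the missing-label diagnosis concrete, here is a rough Python sketch that checks a StatefulSet's pod template for a `name` label. It only inspects a plain dict and does not reproduce the operator's code; the helper name, the `REQUIRED_TEMPLATE_LABELS` tuple, and the example label value are assumptions for illustration.

```python
# Label keys the pod template is expected to carry, per the comment above.
REQUIRED_TEMPLATE_LABELS = ("name",)


def missing_template_labels(statefulset: dict, required=REQUIRED_TEMPLATE_LABELS) -> list:
    """Return the required label keys absent from spec.template.metadata.labels."""
    labels = (
        statefulset.get("spec", {})
        .get("template", {})
        .get("metadata", {})
        .get("labels")
        or {}
    )
    return [key for key in required if key not in labels]


# Template labels copied from the StatefulSet in this thread (abridged).
sts = {
    "spec": {
        "template": {
            "metadata": {
                "labels": {
                    "app.kubernetes.io/component": "store-gateway",
                    "app.kubernetes.io/name": "mimir",
                    "rollout-group": "store-gateway",
                    "zone": "zone-a",
                }
            }
        }
    }
}

print(missing_template_labels(sts))

# Hypothetical fix: add a "name" label (the value here is only an example).
sts["spec"]["template"]["metadata"]["labels"]["name"] = "store-gateway-zone-a"
print(missing_template_labels(sts))
```

Note that `app.kubernetes.io/name` does not count here: the check looks for a label whose key is literally `name`.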
Reproduction steps:
Install Mimir from grafana/helm-charts#1205 and enable, for example, store-gateway zone-aware replication via a custom values.yaml:
After installation, add a letter to mimir.config, just to alter its checksum.
Expected (works without the rollout operator): store-gateway Pods are restarted to pick up the new configuration.
Actual: nothing happens, Pods are not restarted.
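
The "add a letter to alter the checksum" step works because the `checksum/config` annotation is a digest of the rendered configuration, so any edit changes the pod template. The 64-character hex value in this thread is consistent with SHA-256, which is what Helm's `sha256sum` template function produces; that the chart uses exactly that function is an assumption here. A small Python sketch of the idea, with made-up config contents:

```python
import hashlib


def config_checksum(rendered_config: str) -> str:
    """Hex SHA-256 digest of the rendered config text."""
    return hashlib.sha256(rendered_config.encode("utf-8")).hexdigest()


# Hypothetical config contents, only to show the effect of a one-character edit.
original = "multitenancy_enabled: false\n"
edited = original + "#x\n"

before = config_checksum(original)
after = config_checksum(edited)

print(len(before))      # length of a hex SHA-256 digest
print(before != after)  # the annotation value changes with the config
```

A changed annotation in the pod template produces a new StatefulSet revision, which is why a restart of the Pods is expected after the edit.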
Additional info:
The rollout operator prints "reconciled store-gateway statefulsets" messages.
Before the config change, the StatefulSet state is:
After the upgrade:
I also added the checksum as an annotation on the StatefulSet itself, but that didn't help.