
Change in Block Device Mappings does not create drift and nodes are not replaced #5874

Closed
ob-uk opened this issue Mar 15, 2024 · 3 comments
Labels: bug (Something isn't working), needs-triage (Issues that need to be triaged)

Comments

ob-uk commented Mar 15, 2024

Description

Observed Behavior:
When I change the underlying EC2NodeClass for the nodes by changing spec.blockDeviceMappings (the disk size, specifically), Karpenter does not detect (or ignores) the drift, and nodes with the previous disk size stick around.

Expected Behavior:
Karpenter should detect drift and start replacing the nodes that do not match the current EC2NodeClass, as is the case for other fields.
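
For reference, I would expect the drift to surface as a status condition on the affected NodeClaims before they are disrupted, along these lines (the condition follows the standard Kubernetes condition layout; the reason and message text are illustrative, not taken from the controller):

# Illustrative sketch of the expected NodeClaim status once drift is detected
status:
  conditions:
    - type: Drifted
      status: "True"
      reason: StaticDrift                                        # illustrative value
      message: EC2NodeClass blockDeviceMappings no longer match  # illustrative value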

Reproduction Steps (Please include YAML):

  • Apply the following EC2NodeClass (use the appropriate values)
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2
  role: REPLACEME
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: REPLACEME
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: REPLACEME
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 100Gi
        volumeType: gp3
        iops: 3000
  • Use a NodePool that references this EC2NodeClass.
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["2"]
      nodeClassRef:
        name: default
  limits:
    cpu: 1000
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 720h # 30 * 24h = 720h
  • Submit some pods to the cluster for Karpenter to provision nodes.
  • Edit the disk size in the EC2NodeClass (see the sketch after this list).
  • Karpenter will not detect drift or remove the old nodes.
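
As a concrete example of the edit in the last step, this is the kind of change I would expect to register as drift (bumping volumeSize to 200Gi is just an arbitrary example value):

  # Same EC2NodeClass as above; only volumeSize is changed (200Gi is an arbitrary example)
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 200Gi
        volumeType: gp3
        iops: 3000

After re-applying the EC2NodeClass with this change, the existing nodes keep running with their original 100Gi volumes and Karpenter reports no drift.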

Versions:

  • Karpenter Version: v0.33.1
  • Kubernetes Version: 1.28

If this is indeed a bug and not a design decision, I'd be happy to take it up. From my point of view, there are certain cases where a change to blockDeviceMappings should cause immediate replacement of nodes, so there should be a way to allow Karpenter to do that.

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments; they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
ob-uk added the bug and needs-triage labels Mar 15, 2024
@garvinp-stripe
Contributor

@jonathan-innis
Contributor

> Might fix our issue

Unfortunately, that change doesn't fix it. There's a PR out (#5454) to fix this, but we needed some of the Drift Hash Versioning work to go in first to enable us to change this in the API. That's done now, so I'd keep an eye on #5454 since it should be merged soon.
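
For context, the static drift check here works by hashing the EC2NodeClass spec and comparing it to a hash annotation stamped on each NodeClaim at launch, roughly as sketched below (the annotation key karpenter.k8s.aws/ec2nodeclass-hash and all values are written from memory and illustrative, so verify against the controller before relying on them):

# Rough sketch of hash-based static drift detection. The annotation key and the
# values below are illustrative assumptions, not copied from a live cluster.
apiVersion: karpenter.sh/v1beta1
kind: NodeClaim
metadata:
  name: default-abc12   # illustrative name
  annotations:
    # Hash of the EC2NodeClass spec at the time this NodeClaim was launched.
    # A mismatch with the hash currently on the EC2NodeClass marks the NodeClaim
    # as drifted; fields excluded from the hash (as blockDeviceMappings was here)
    # can never produce a mismatch, which is why no drift was reported.
    karpenter.k8s.aws/ec2nodeclass-hash: "2222222222"

Adding a field to that hash changes the computed value for every existing resource, which is what the Drift Hash Versioning work was needed to handle without falsely drifting the whole fleet.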

@jonathan-innis
Contributor

Also, this appears to be a duplicate of #5447, so closing this one out.
