Switch to AL2023 ami for nvidia and inferentia nodes #988

Open: wants to merge 2 commits into base: main

Conversation

chiragjn
Member

No description provided.

SOCI_KUBELET_CONFIG_PATH="$KUBELET_CONFIG_DIR/99-soci.conf"
CONTAINERD_CONFIG_FILEPATH="/etc/containerd/config.toml"
BACKUP_CONTAINERD_CONFIG_FILEPATH="$CONTAINERD_CONFIG_FILEPATH.bak"
SOCI_RELEASE_VERSION="0.7.0"
Collaborator

Should we try out version 0.8.0?

Contributor

> Should we try out version 0.8.0?

We should pick this up separately. Let's release this first.

@@ -23,7 +23,7 @@ karpenter:
## @param karpenter.defaultNodeTemplate.extraTags [object] Additional tags for the node template.
extraTags: {}
## @param karpenter.defaultNodeTemplate.amiFamily AMI family to use for node template
-  amiFamily: ""
+  amiFamily: "AL2023"
Collaborator

Please don't apply this. When Karpenter was migrated to v1, the webhook switched to amiSelectorTerms instead of amiFamily, so applying this will cause drift to be detected in the node template and disrupt all nodes.
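
For reference, a minimal sketch of the setup this comment describes. On Karpenter v1 the AMI is chosen through `amiSelectorTerms`, and an alias such as `al2023@latest` already implies the AMI family, so `amiFamily` can stay empty. The keys mirror the `gpuDefaultNodeTemplate` block touched elsewhere in this PR; whether `defaultNodeTemplate` exposes the same `amiSelectorTerms` key is an assumption to check against the chart's values file.

```yaml
karpenter:
  defaultNodeTemplate:
    # Left empty on purpose: setting amiFamily changes the rendered
    # node template, which Karpenter would report as drift.
    amiFamily: ""
    # Assumed key for this template; this PR only shows it under
    # gpuDefaultNodeTemplate.
    amiSelectorTerms:
      - alias: al2023@latest
```

Note that changing the alias itself still replaces nodes, so this fragment only avoids the extra drift caused by setting `amiFamily`.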

@@ -97,7 +97,7 @@ karpenter:
# Set this to true to enable EC2 detailed cloudwatch monitoring
detailedMonitoring: false
## @param karpenter.gpuDefaultNodeTemplate.amiFamily AMI family to use for node template
-  amiFamily: ""
+  amiFamily: "AL2023"
Collaborator

Please don't apply this

@@ -106,7 +106,7 @@ karpenter:
# - name: my-ami
# - id: ami-123
amiSelectorTerms:
-  - alias: al2@latest
+  - alias: al2023@latest
Collaborator

We need to release this carefully. It will disrupt all the GPU nodes.
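
One possible way to stage such a rollout, sketched under assumptions: Karpenter v1 NodePools accept disruption budgets that can be scoped to a reason, so drift-driven replacement of GPU nodes could be throttled while the new AMI rolls out. The NodePool name `gpu-nodepool` and the budget value are hypothetical and not part of this PR, and it is not confirmed here that this chart exposes budgets.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu-nodepool  # hypothetical name, not taken from this PR
spec:
  disruption:
    budgets:
      # Allow at most one node at a time to be disrupted because of drift.
      - nodes: "1"
        reasons:
          - Drifted
  # template, limits, and other required fields are omitted from this fragment
```

A budget like this slows the replacement rather than preventing it, so GPU workloads would still need to tolerate being rescheduled.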

@@ -183,7 +183,7 @@ karpenter:
## @param karpenter.controlPlaneNodeTemplate.extraTags [object] Additional tags for the node template.
extraTags: {}
## @param karpenter.controlPlaneNodeTemplate.amiFamily AMI family to use for node template
-  amiFamily: ""
+  amiFamily: "AL2023"
Collaborator

Please don't apply this

@@ -366,7 +366,7 @@ karpenter:
# Set this to true to enable EC2 detailed cloudwatch monitoring
detailedMonitoring: false
## @param karpenter.critical.nodeclass.amiFamily AMI family to use for node template
-  amiFamily: ""
+  amiFamily: "AL2023"
Collaborator

Please don't apply this

This PR will be automatically closed as it has been stale for 4 days. Please comment if you feel this is still relevant.

@github-actions github-actions bot added the stale label Jan 27, 2025

This PR has been closed as it has been stale for 7 days. Please reopen if you feel this is still relevant.

@github-actions github-actions bot closed this Jan 30, 2025
@chiragjn chiragjn reopened this Jan 30, 2025
@chiragjn chiragjn removed the stale label Jan 30, 2025

github-actions bot commented Feb 3, 2025

This PR will be automatically closed as it has been stale for 4 days. Please comment if you feel this is still relevant.

@github-actions github-actions bot added the stale label Feb 3, 2025
3 participants