Define structure of var.worker_pools (instead of type=any) - Unable to set image_id of worker node pools #913

Open
lodotek opened this issue Apr 4, 2024 · 2 comments

Comments

@lodotek

lodotek commented Apr 4, 2024

var.worker_pools is of type any:

description = "Tuple of OKE worker pools where each key maps to the OCID of an OCI resource, and value contains its definition."

It would be nice to see this defined more concretely. I can't figure out how to make my Terraform diff go away: my cluster's worker nodes are running Oracle-Linux-8.9-2024.01.26-0-OKE-1.26.7-679, but every time I run terraform it wants to downgrade my node pool image to Oracle-Linux-8.8-2023.12.13-0-OKE-1.26.7-668.
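To illustrate what "more concretely" could mean, here is a rough sketch of a typed definition. It assumes Terraform 1.3 or later (for optional() object attributes). The attribute list is only a subset taken from my own tfvars below, not the module's full set of supported keys, and the new description text is my own wording.

```hcl
# Sketch only: the attribute names come from the tfvars in this issue,
# not from the module's real schema, so the list is incomplete.
variable "worker_pools" {
  description = "Map of OKE worker pools keyed by pool name."
  type = map(object({
    shape            = string
    size             = number
    ocpus            = optional(number)
    boot_volume_size = optional(number)
    mode             = optional(string)
    image_type       = optional(string) # e.g. "oke", "platform", "custom"
    image_id         = optional(string) # only meaningful for custom images
    node_labels      = optional(map(string), {})
    placement_ads    = optional(list(number), [])
  }))
  default = {}
}
```

With a type like this, a misspelled key or a value of the wrong type would probably be caught or dropped at the variable boundary rather than surfacing as an error deep inside the module's locals.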

I've tried setting var.worker_pools.my_node_pool_name.image_type to CUSTOM and setting var.worker_pools.my_node_pool_name.image_id to my desired image id, but no matter what I try, I get the following error:

│ Error: Invalid function argument
│ 
│   on .terraform/modules/oke/modules/workers/locals.tf line 104, in locals:
│  104:       image_id = (pool.image_type == "custom" ? pool.image_id : element(tolist(setintersection([
│  105:         pool.image_type == "oke" ?
│  106:         setintersection(
│  107:           lookup(var.image_ids, "oke", null),
│  108:           lookup(var.image_ids, trimprefix(lower(pool.kubernetes_version), "v"), null)
│  109:         ) :
│  110:         lookup(var.image_ids, "platform", null),
│  111:         lookup(var.image_ids, pool.image_type, null),
│  112:         length(regexall("GPU", pool.shape)) > 0 ? var.image_ids.gpu : var.image_ids.nongpu,
│  113:         length(regexall("A1\\.", pool.shape)) > 0 ? var.image_ids.aarch64 : var.image_ids.x86_64,
│  114:         lookup(var.image_ids, format("%v %v", pool.os, split(".", pool.os_version)[0]), null),
│  115:       ]...)), 0))
│ 
│ Invalid value for "other_sets" parameter: argument must not be null.
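One plausible reading of this error, sketched under assumptions: the `pool.image_type == "custom"` comparison on line 104 is lowercase, so a value of CUSTOM would skip the shortcut and fall through to setintersection, where `lookup(var.image_ids, pool.image_type, null)` returns null for an unknown key. The snippet below is a hypothetical minimal reproduction with made-up image_ids data; it is not the module's code, and it assumes the module does not normalize the case of image_type beforehand.

```hcl
# Hypothetical reproduction; image_ids is invented sample data.
locals {
  image_ids = {
    oke    = ["ocid1.image.example.a", "ocid1.image.example.b"]
    x86_64 = ["ocid1.image.example.a"]
  }

  # Uppercase, so the == "custom" branch below is not taken.
  image_type = "CUSTOM"

  # "CUSTOM" is not a key of image_ids, so lookup() returns the null default.
  by_type = lookup(local.image_ids, local.image_type, null)

  # Expected to fail the same way as the module, because the second
  # argument to setintersection() is null.
  image_id = local.image_type == "custom" ? "ocid1.image.example.custom" : element(tolist(setintersection(
    local.image_ids.oke,
    local.by_type
  )), 0)
}
```

Any of the other lookup(...) calls in the module's expression could produce a null in the same way, so this shows only one candidate cause.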

Details about my config:
Using source "oracle-terraform-modules/oke/oci" / version "5.1.3"

module "oke" {
  source                       = "oracle-terraform-modules/oke/oci"
  version                      = "5.1.3"
  allow_bastion_cluster_access = true
  allow_worker_ssh_access      = true
  api_fingerprint              = var.api_fingerprint
  api_private_key_path         = var.api_private_key_path
  assign_dns                   = false
  bastion_allowed_cidrs        = ["0.0.0.0/0"]
  cluster_name                 = "${var.label_prefix}-oke"
  cluster_type                 = "enhanced"
  cni_type                     = "flannel"
  compartment_id               = var.compartment_id
  control_plane_allowed_cidrs  = var.vcn_cidrs
  create_bastion               = false
  create_operator              = false
  create_vcn                   = false
  home_region                  = var.oci_regions["iam"] # The tenancy's home region. Required to perform identity operations.
  image_signing_keys           = []
  kubernetes_version           = var.kubernetes_version
  lockdown_default_seclist     = false
  nat_gateway_route_rules      = var.nat_gateway_route_rules
  region                       = var.oci_regions["local"]
  services_cidr                = null
  ssh_private_key_path         = var.ssh_private_key_path
  ssh_public_key_path          = var.ssh_public_key_path
  state_id                     = var.label_prefix
  subnets                      = var.subnets
  tenancy_id                   = var.tenancy_id
  vcn_id                       = module.vcn.vcn_id
  worker_pool_mode             = "node-pool"
  worker_pools                 = var.worker_pools
  worker_image_id              = var.worker_image_id
  worker_image_os_version      = var.worker_image_os_version
}

Relevant tfvars (I've been trying all sorts of combinations to make the image_id diff go away):

worker_image_id         = "ocid1.image.oc1.us-chicago-1.aaaaaaaap5q6kp2lfc3acysrczhenm3v7jeavxu2tvas2jod3zkwymihrz7a"
worker_image_os_version = "8.9"
worker_pools = {
  apps = {
    autoscale                    = false
    boot_volume_size             = 200
    force_node_delete            = true
    image_id                     = "ocid1.image.oc1.us-chicago-1.aaaaaaaap5q6kp2lfc3acysrczhenm3v7jeavxu2tvas2jod3zkwymihrz7a"
    image_os_version             = "8.9"
    image_os                     = "Oracle Linux"
    image_type                   = "oke"
    mode                         = "node-pool"
    node_cycling_enabled         = true
    node_cycling_max_surge       = 2
    node_cycling_max_unavailable = 0
    node_labels = {
      app      = "banyan"
      name     = "apps"
      nodetype = "OracleLinux"
    }
    ocpus         = 4
    placement_ads = [1, 2, 3]
    shape         = "VM.Standard.E4.Flex"
    size          = 3
  }
}

Please advise

@robo-cap
Member

robo-cap commented Apr 5, 2024

The image_type attribute should be set to custom.
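As a sketch of what that might look like in the tfvars from the original post (untested against 5.1.3; it assumes the value must be lowercase because the quoted locals.tf line 104 compares against "custom", and that the other pool attributes can stay as they are):

```hcl
worker_pools = {
  apps = {
    # Lowercase: the quoted locals.tf compares pool.image_type == "custom".
    image_type = "custom"
    image_id   = "ocid1.image.oc1.us-chicago-1.aaaaaaaap5q6kp2lfc3acysrczhenm3v7jeavxu2tvas2jod3zkwymihrz7a"

    # Remaining attributes carried over from the original post.
    mode  = "node-pool"
    shape = "VM.Standard.E4.Flex"
    ocpus = 4
    size  = 3
  }
}
```

In the quoted expression, the custom branch returns pool.image_id directly, so the setintersection over var.image_ids should not be evaluated for this pool.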

@hyder
Contributor

hyder commented May 6, 2024

@lodotek did @robo-cap's suggestion work?
