Problem replacing node in cluster when node is first in list of nodes #3073

Closed
seanrmurphy opened this issue Oct 17, 2022 · 1 comment

@seanrmurphy

RKE version: 1.3.13

Docker version: (docker version, docker info preferred)

$ docker version
Client:
 Version:           20.10.12
 API version:       1.41
 Go version:        go1.17.3
 Git commit:        20.10.12-0ubuntu4
 Built:             Mon Mar  7 17:10:06 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.12
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.17.3
  Git commit:       20.10.12-0ubuntu4
  Built:            Mon Mar  7 15:57:50 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.5.9-0ubuntu3
  GitCommit:        
 runc:
  Version:          1.1.0-0ubuntu1
  GitCommit:        
 docker-init:
  Version:          0.19.0
  GitCommit:        
$ docker info
Client:
 Context:    default
 Debug Mode: false

Server:
 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 3
 Server Version: 20.10.12
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 
 runc version: 
 init version: 
 Security Options:
  apparmor
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.15.0-50-generic
 Operating System: Ubuntu 22.04.1 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 7.763GiB
 Name: master-personal-sean-murphy-0
 ID: KRTV:OJSX:NHGV:GIGV:7G3G:RLOM:HGOX:LW4U:W7DJ:VL37:AKHJ:BW65
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Operating system and kernel: (cat /etc/os-release, uname -r preferred)

$ cat /etc/os-release 
PRETTY_NAME="Ubuntu 22.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.1 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
$ uname -r
5.15.0-50-generic

Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO)

VMs provisioned on OpenStack.

cluster.yml file:

addon_job_timeout: 0
addons: ""
addons_include: []
authentication:
  sans:
  - <REDACTED>
  strategy: x509
  webhook: null
authorization:
  mode: ""
  options: {}
bastion_host:
  address: <REDACTED>
  ignore_proxy_env_vars: false
  port: "2222"
  ssh_agent_auth: true
  ssh_cert: ""
  ssh_cert_path: ""
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  user: ubuntu
cloud_provider:
  name: ""
cluster_name: ""
dns: null
enable_cri_dockerd: false
ignore_docker_version: false
ingress:
  default_backend: true
  default_http_backend_priority_class_name: ""
  default_ingress_class: null
  dns_policy: ""
  extra_args: {}
  extra_envs: []
  extra_volume_mounts: []
  extra_volumes: []
  http_port: 80
  https_port: 443
  network_mode: hostPort
  nginx_ingress_controller_priority_class_name: ""
  node_selector: {}
  options:
    allow-snippet-annotations: "true"
  provider: nginx
  tolerations: []
  update_strategy: null
kubernetes_version: v1.23.8-rancher1-1
monitoring:
  metrics_server_priority_class_name: ""
  node_selector: {}
  options: {}
  provider: ""
  replicas: null
  tolerations: []
  update_strategy: null
network:
  mtu: 0
  node_selector: {}
  options: {}
  plugin: ""
  tolerations: []
  update_strategy: null
nodes:
- address: 192.168.0.97
  docker_socket: ""
  hostname_override: master-personal-sean-murphy-1
  internal_address: ""
  labels:
    id: 38e91026-1417-451d-867e-37599d084006
  nodeName: master-personal-sean-murphy-1
  port: ""
  role:
  - controlplane
  - etcd
  ssh_agent_auth: true
  ssh_cert: ""
  ssh_cert_path: ""
  ssh_key: ""
  ssh_key_path: ""
  taints: []
  user: ubuntu
- address: 192.168.0.111
  docker_socket: ""
  hostname_override: master-personal-sean-murphy-0
  internal_address: ""
  labels:
    id: c20ad1ec-342f-45fd-878a-206a395479a0
  nodeName: master-personal-sean-murphy-0
  port: ""
  role:
  - controlplane
  - etcd
  ssh_agent_auth: true
  ssh_cert: ""
  ssh_cert_path: ""
  ssh_key: ""
  ssh_key_path: ""
  taints: []
  user: ubuntu
- address: 192.168.0.96
  docker_socket: ""
  hostname_override: master-personal-sean-murphy-2
  internal_address: ""
  labels:
    id: 2ed300a6-0b7b-4ccd-9f15-03b49b490a01
  nodeName: master-personal-sean-murphy-2
  port: ""
  role:
  - controlplane
  - etcd
  ssh_agent_auth: true
  ssh_cert: ""
  ssh_cert_path: ""
  ssh_key: ""
  ssh_key_path: ""
  taints: []
  user: ubuntu
- address: 192.168.0.182
  docker_socket: ""
  hostname_override: worker-personal-sean-murphy-default-0
  internal_address: ""
  labels: {}
  nodeName: worker-personal-sean-murphy-default-0
  port: ""
  role:
  - worker
  ssh_agent_auth: true
  ssh_cert: ""
  ssh_cert_path: ""
  ssh_key: ""
  ssh_key_path: ""
  taints: []
  user: ubuntu
- address: 192.168.0.141
  docker_socket: ""
  hostname_override: worker-personal-sean-murphy-default-1
  internal_address: ""
  labels: {}
  nodeName: worker-personal-sean-murphy-default-1
  port: ""
  role:
  - worker
  ssh_agent_auth: true
  ssh_cert: ""
  ssh_cert_path: ""
  ssh_key: ""
  ssh_key_path: ""
  taints: []
  user: ubuntu
- address: 192.168.0.127
  docker_socket: ""
  hostname_override: worker-personal-sean-murphy-default-2
  internal_address: ""
  labels: {}
  nodeName: worker-personal-sean-murphy-default-2
  port: ""
  role:
  - worker
  ssh_agent_auth: true
  ssh_cert: ""
  ssh_cert_path: ""
  ssh_key: ""
  ssh_key_path: ""
  taints: []
  user: ubuntu
prefix_path: ""
private_registries: []
restore:
  restore: false
  snapshot_name: ""
rotate_encryption_key: false
services:
  etcd:
    backup_config: null
    ca_cert: ""
    cert: ""
    creation: 12h
    external_urls: []
    extra_args:
      election-timeout: "5000"
      heartbeat-interval: "500"
    extra_args_array: {}
    extra_binds: []
    extra_env: []
    gid: 0
    image: rancher/mirrored-coreos-etcd:v3.5.3
    key: ""
    path: ""
    retention: 72h
    snapshot: true
    uid: 0
    win_extra_args: {}
    win_extra_args_array: {}
    win_extra_binds: []
    win_extra_env: []
  kube-api:
    admission_configuration: null
    always_pull_images: false
    audit_log:
      configuration:
        format: json
        max_age: 30
        max_backup: 10
        max_size: 100
        path: /var/log/kube-audit/audit-log.json
        policy:
          apiVersion: audit.k8s.io/v1
          kind: Policy
          metadata:
            creationTimestamp: null
          rules:
          - level: Metadata
      enabled: true
    event_rate_limit: null
    extra_args: {}
    extra_args_array: {}
    extra_binds: []
    extra_env:
    - RKE_AUDITLOG_CONFIG_CHECKSUM=856f426399fb14a50b78e721d15c168c
    - RKE_AUDITLOG_CONFIG_CHECKSUM=856f426399fb14a50b78e721d15c168c
    - RKE_AUDITLOG_CONFIG_CHECKSUM=856f426399fb14a50b78e721d15c168c
    - RKE_AUDITLOG_CONFIG_CHECKSUM=856f426399fb14a50b78e721d15c168c
    - RKE_AUDITLOG_CONFIG_CHECKSUM=856f426399fb14a50b78e721d15c168c
    - RKE_AUDITLOG_CONFIG_CHECKSUM=856f426399fb14a50b78e721d15c168c
    image: rancher/hyperkube:v1.23.8-rancher1
    pod_security_policy: false
    secrets_encryption_config: null
    service_cluster_ip_range: 10.43.0.0/16
    service_node_port_range: 30000-32767
    win_extra_args: {}
    win_extra_args_array: {}
    win_extra_binds: []
    win_extra_env: []
  kube-controller:
    cluster_cidr: 10.42.0.0/16
    extra_args: {}
    extra_args_array: {}
    extra_binds: []
    extra_env: []
    image: rancher/hyperkube:v1.23.8-rancher1
    service_cluster_ip_range: 10.43.0.0/16
    win_extra_args: {}
    win_extra_args_array: {}
    win_extra_binds: []
    win_extra_env: []
  kubelet:
    cluster_dns_server: 10.43.0.10
    cluster_domain: cluster.local
    extra_args:
      enforce-node-allocatable: pods
      kube-reserved: cpu=500m,memory=1Gi
    extra_args_array: {}
    extra_binds: []
    extra_env: []
    fail_swap_on: false
    generate_serving_certificate: false
    image: rancher/hyperkube:v1.23.8-rancher1
    infra_container_image: rancher/mirrored-pause:3.6
    win_extra_args: {}
    win_extra_args_array: {}
    win_extra_binds: []
    win_extra_env: []
  kubeproxy:
    extra_args: {}
    extra_args_array: {}
    extra_binds: []
    extra_env: []
    image: rancher/hyperkube:v1.23.8-rancher1
    win_extra_args: {}
    win_extra_args_array: {}
    win_extra_binds: []
    win_extra_env: []
  scheduler:
    extra_args: {}
    extra_args_array: {}
    extra_binds: []
    extra_env: []
    image: rancher/hyperkube:v1.23.8-rancher1
    win_extra_args: {}
    win_extra_args_array: {}
    win_extra_binds: []
    win_extra_env: []
ssh_agent_auth: true
ssh_cert_path: ""
ssh_key_path: ""
system_images:
  aci_cni_deploy_container: ""
  aci_controller_container: ""
  aci_gbp_server_container: ""
  aci_host_container: ""
  aci_mcast_container: ""
  aci_opflex_container: ""
  aci_opflex_server_container: ""
  aci_ovs_container: ""
  alpine: ""
  calico_cni: ""
  calico_controllers: ""
  calico_ctl: ""
  calico_flexvol: ""
  calico_node: ""
  canal_cni: ""
  canal_controllers: ""
  canal_flannel: ""
  canal_flexvol: ""
  canal_node: ""
  cert_downloader: ""
  coredns: ""
  coredns_autoscaler: ""
  dnsmasq: ""
  etcd: ""
  flannel: ""
  flannel_cni: ""
  ingress: ""
  ingress_backend: ""
  ingress_webhook: ""
  kubedns: ""
  kubedns_autoscaler: ""
  kubedns_sidecar: ""
  kubernetes: ""
  kubernetes_services_sidecar: ""
  metrics_server: ""
  nginx_proxy: ""
  nodelocal: ""
  pod_infra_container: ""
  weave_cni: ""
  weave_node: ""
  windows_pod_infra_container: ""
upgrade_strategy:
  drain: false
  max_unavailable_controlplane: "1"
  max_unavailable_worker: 10%
  node_drain_input:
    delete_local_data: false
    force: false
    grace_period: 0
    ignore_daemonsets: false
    timeout: 0
win_prefix_path: ""

Steps to Reproduce:

  • Bring up an HA cluster with 3 etcd nodes
  • Completely remove the first node in the list of nodes so that it is no longer accessible (we deleted the VM entirely), but do not modify the RKE state; the cluster remains operational but degraded, as one node has disappeared
  • Create a replacement VM (in our case with a new OS version); it obtained a new IP address
  • Modify cluster.yml so that the IP address of the new node replaces that of the old node
  • Run rke up with the modified cluster.yml

Results:

  • The rke up process gets stuck removing the node.

Analysis:

I performed some troubleshooting and found the following:

  • rke reads in the set of nodes and compares it with the state recorded in the rkestate file; note that in this case neither of these fully reflects the current operational state of the system
  • rke correctly identifies that the old node must be removed
  • rke assumes the etcd cluster to contact is the one defined in cluster.yml; this is not fully correct, as etcd has not yet been deployed on the first node in that list
  • during the etcd member removal process, rke attempts to connect to the first etcd node in that list and never times out

More specifically, the RemoveEtcdMember function is called with the member that should be removed and the desired set of etcd members in the cluster (which does not reflect the current state). It then iterates over this set; if the first node in the set is not running etcd, rke up blocks at that point and never completes.
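
The following is not RKE's actual code, just a minimal sketch of the kind of guard that would avoid the hang, assuming the etcd v3 Go client (go.etcd.io/etcd/client/v3) and omitting the TLS/client-certificate setup an RKE etcd normally requires: try each endpoint from the desired member list with bounded dial and request timeouts, so an unreachable first host makes the loop fall through to the next endpoint instead of blocking indefinitely. Function and variable names here are illustrative only.

package main

import (
    "context"
    "fmt"
    "time"

    clientv3 "go.etcd.io/etcd/client/v3"
)

// removeMemberViaAnyEndpoint is illustrative only (it is not RKE's RemoveEtcdMember):
// it walks the desired endpoint list and tries each endpoint with bounded timeouts,
// so an unreachable first endpoint (the replaced node) cannot block the loop forever.
func removeMemberViaAnyEndpoint(endpoints []string, memberID uint64) error {
    for _, ep := range endpoints {
        cli, err := clientv3.New(clientv3.Config{
            Endpoints:   []string{ep},
            DialTimeout: 5 * time.Second, // bound the initial connection attempt
        })
        if err != nil {
            continue // endpoint unusable, try the next one
        }
        // Bound the RPC itself; a dead endpoint then fails fast instead of hanging rke up.
        ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
        _, err = cli.MemberRemove(ctx, memberID)
        cancel()
        cli.Close()
        if err == nil {
            return nil
        }
        // request failed or timed out against this endpoint; fall through to the next one
    }
    return fmt.Errorf("could not remove member %x via any endpoint", memberID)
}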

We have observed the same issue when adding an etcd member to the cluster while the first node in the list is not accessible.

Further Comments:

  • Changing the order of the nodes fixes the issue, i.e. if the new node is anywhere but first in the list of nodes, the system determines the current state and reconciles correctly
  • Performing an rke up reconciliation after removing the OpenStack VM (and before adding the replacement node) would probably prevent the problem from manifesting; a manual cleanup alternative is sketched after this list
  • In our case we are provisioning with Terraform, which does not easily give us the option of removing the node from the cluster before adding the new one
  • This issue is the root cause of: upgrading control plane / etcd nodes fails (terraform-provider-rke#362)
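
As a possible manual workaround (our suggestion, not something documented by RKE), the stale member can be removed directly from the live etcd cluster before running rke up by pointing an etcd client at one of the surviving nodes; this is equivalent to running etcdctl member list / etcdctl member remove against a healthy endpoint. Below is a minimal sketch using the etcd v3 Go client; the endpoint and IP values are placeholders, and the TLS/client-certificate configuration that an RKE etcd normally requires is omitted for brevity.

package main

import (
    "context"
    "fmt"
    "log"
    "strings"
    "time"

    clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
    // Placeholder values: a surviving etcd node and the IP of the node that no longer exists.
    healthyEndpoint := "https://<surviving-node-ip>:2379"
    deadNodeIP := "<old-node-ip>"

    cli, err := clientv3.New(clientv3.Config{
        Endpoints:   []string{healthyEndpoint},
        DialTimeout: 5 * time.Second,
    })
    if err != nil {
        log.Fatal(err)
    }
    defer cli.Close()

    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    // List the members the live cluster actually knows about ...
    members, err := cli.MemberList(ctx)
    if err != nil {
        log.Fatal(err)
    }
    // ... and remove the one whose peer URL points at the vanished node.
    for _, m := range members.Members {
        for _, peer := range m.PeerURLs {
            if strings.Contains(peer, deadNodeIP) {
                if _, err := cli.MemberRemove(ctx, m.ID); err != nil {
                    log.Fatal(err)
                }
                fmt.Printf("removed stale member %x (%s)\n", m.ID, m.Name)
                return
            }
        }
    }
    fmt.Println("no member matching the dead node was found")
}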
@github-actions

This repository uses an automated workflow to automatically label issues which have not had any activity (commit/comment/label) for 60 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the workflow can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the workflow will automatically close the issue in 14 days. Thank you for your contributions.
