Playbook stuck while starting the RKE2 service on agents #135

Open
leon-andria opened this issue Feb 24, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@leon-andria

Summary

The troubleshooting section (https://github.com/lablabs/ansible-role-rke2#troubleshooting) suggests that this hang may be caused by a network limitation.

In my case the cause is different: the RKE2 install script is never executed on the agents. The task that runs it has a condition on the variable installed_rke2_version, and that variable is only set when "rke2-server.service" is in ansible_facts.services, which apparently is not the case on agents.
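
To check this on a given host, a small diagnostic along the following lines may help. It is only a sketch and not part of the role: the task names are made up, and it assumes that agent nodes register a unit named rke2-agent.service rather than rke2-server.service.

```yaml
# Hypothetical diagnostic tasks, not part of the lablabs.rke2 role.
- name: Gather service facts
  ansible.builtin.service_facts:

- name: Show which RKE2 units this host knows about
  ansible.builtin.debug:
    msg:
      # Expected to be false on agents if only rke2-agent.service is installed.
      server_unit_present: "{{ 'rke2-server.service' in ansible_facts.services }}"
      agent_unit_present: "{{ 'rke2-agent.service' in ansible_facts.services }}"
```

If server_unit_present is false on the workers, any task guarded only by the rke2-server.service condition is skipped there.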

Below are the changes I made to fix the issue.

Before the "Run AirGap RKE2 script" task,

- name: Run AirGap RKE2 script

I added the following tasks. They check that the rke2 binary exists instead of relying on this condition:

when: '"rke2-server.service" in ansible_facts.services'

- name: Check rke2 bin exists
  ansible.builtin.stat:
    path: "{{ rke2_bin_path }}"
  register: rke2_exists

- name: Check RKE2 version
  ansible.builtin.shell: |
    set -o pipefail
    {{ rke2_bin_path }} --version | grep -E "rke2 version" | awk '{print $3}'
  args:
    executable: /bin/bash
  changed_when: false
  register: installed_rke2_version
  when: rke2_exists.stat.exists
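
One detail to keep in mind with this approach: when the "Check RKE2 version" task is skipped, installed_rke2_version is still registered, but it has no stdout attribute. A later condition that reads it therefore needs a fallback. The following is a minimal sketch of such a condition; the task name and message are illustrative and not copied from the role.

```yaml
# Hypothetical follow-up task; name and condition are illustrative only.
- name: Report whether the install script still needs to run
  ansible.builtin.debug:
    msg: "RKE2 {{ rke2_version }} is not installed yet on {{ inventory_hostname }}"
  # default('') covers the case where the version check was skipped
  # and the registered result carries no stdout.
  when: >-
    not rke2_exists.stat.exists
    or installed_rke2_version.stdout | default('') != rke2_version
```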

Issue Type

Bug Report

Ansible Version

ansible [core 2.14.2]
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3/dist-packages/ansible
  ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible
  python version = 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] (/usr/bin/python3)
  jinja version = 3.0.3
  libyaml = True

Steps to Reproduce

- name: Deploy RKE2
  hosts: all
  become: yes
  vars:
    rke2_version: v1.26.0+rke2r2
    rke2_api_ip: 192.168.1.10
    rke2_download_kubeconf: true
    rke2_server_node_taints:
      - 'CriticalAddonsOnly=true:NoExecute'
    rke2_cni:
      - cilium
  roles:
    - role: lablabs.rke2

[masters]
master-01 ansible_host=192.168.1.10 rke2_type=server
master-02 ansible_host=192.168.1.11 rke2_type=server
master-03 ansible_host=192.168.1.12 rke2_type=server

[workers]
worker-01 ansible_host=192.168.1.20 rke2_type=agent
worker-02 ansible_host=192.168.1.21 rke2_type=agent

[k8s_cluster:children]
masters
workers

Expected Results

The worker nodes should be provisioned once the rke2.sh script has been executed by the following task:

- name: Run RKE2 script

Actual Results

The playbook hangs until it times out.
leon-andria added the bug label Feb 24, 2023
@janonym1

I tried your changes as described, but they didn't work for me (air-gapped, 3 workers, HA mode). I still had to run the RKE2 agent script by hand on the workers. The install script runs correctly and the binaries exist, but the agent service was not started (or possibly not created).
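
For anyone in the same situation, a stopgap task like the one below might replace the manual step. This is a sketch under assumptions, not the role's own logic: it assumes the install script has already created a unit named rke2-agent.service on the workers and that the node configuration is already in place, and it reuses the rke2_type variable from the inventory above.

```yaml
# Hypothetical workaround task, not part of the lablabs.rke2 role.
- name: Enable and start the RKE2 agent service
  ansible.builtin.systemd:
    name: rke2-agent.service
    state: started
    enabled: true
    daemon_reload: true   # pick up a unit file written by the install script
  when: rke2_type == 'agent'
```

If the unit file was never created, this task fails with an error instead of hanging, which at least narrows down where the problem is.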
