tests/ansible: Add fedora linux distros #318

Merged: 1 commit into confidential-containers:main on Feb 16, 2024

Conversation

tylerfanelli
Contributor

Fedora also can be used to run k8s clusters locally. Add the fedora requirements to allow for testing.

@bpradipt bpradipt requested review from wainersm and fidencio January 16, 2024 10:23
Member

@stevenhorsman stevenhorsman left a comment

LGTM. Thanks

@stevenhorsman
Member

/test

@ldoktor
Contributor

ldoktor commented Jan 16, 2024

Looks good although doesn't seem sufficient. I'm getting:

TASK [Check docker is installed] **************************************************************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "command -v docker >/dev/null 2>&1", "delta": "0:00:00.002397", "end": "2024-01-16 11:54:32.751388", "msg": "non-zero return code", "rc": 1, "start": "2024-01-16 11:54:32.748991", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

failure on Fedora 39. Let me take a look at what needs to be added...

Contributor

@ldoktor ldoktor left a comment

@tylerfanelli could you please additionally address the places where ansible_distribution == CentOS is used and add another branch to install docker via https://download.docker.com/linux/fedora/docker-ce.repo repo so this actually becomes usable?

@ldoktor
Contributor

ldoktor commented Jan 16, 2024

> @tylerfanelli could you please additionally address the places where ansible_distribution == CentOS is used and add another branch to install docker via https://download.docker.com/linux/fedora/docker-ce.repo repo so this actually becomes usable?

Note: there is also the karmab/kcli@16d424a issue that needs to be configured. I gave it a try (only added the cgroup fix), but it still wasn't really successful on F39. I guess it might require some kubeadm.conf changes as well:

[fedora@fedoralatest e2e]$ ./run-local.sh -r -u "kata-qemu"
INFO: Bootstrap the local machine
...
PLAY RECAP ************************************************************************************************************
localhost                  : ok=37   changed=10   unreachable=0    failed=0    skipped=21   rescued=0    ignored=0   

INFO: Bring up the test cluster
[init] Using Kubernetes version: v1.24.0
[preflight] Running pre-flight checks
        [WARNING Swap]: swap is enabled; production deployments should disable swap unless testing the NodeSwap feature gate of the kubelet
        [WARNING FileExisting-ebtables]: ebtables not found in system path
        [WARNING FileExisting-ethtool]: ethtool not found in system path
        [WARNING FileExisting-tc]: tc not found in system path
        [WARNING SystemVerification]: missing optional cgroups: blkio
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [fedoralatest kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.122.155]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [fedoralatest localhost] and IPs [192.168.122.155 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [fedoralatest localhost] and IPs [192.168.122.155 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
        - 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

Have you succeeded in your attempts?

diff --git a/tests/e2e/ansible/group_vars/all b/tests/e2e/ansible/group_vars/all
index 43f145d..a81975a 100644
--- a/tests/e2e/ansible/group_vars/all
+++ b/tests/e2e/ansible/group_vars/all
@@ -6,6 +6,9 @@ build_pkgs:
     - make
     - gcc
     - qemu-user-static
+  fedora:
+    - make
+    - gcc
   centos:
     - make
     - gcc
@@ -16,6 +19,9 @@ kubeadm_pkgs:
   ubuntu:
     - conntrack
     - socat
+  fedora:
+    - conntrack
+    - socat
   centos:
     - conntrack
     - socat
@@ -23,6 +29,8 @@ k8s_version: v1.24.0
 test_pkgs:
   ubuntu:
     - jq
+  fedora:
+    - jq
   centos:
     - jq
 target_arch: "{{ 'amd64' if ansible_architecture == 'x86_64' else ansible_architecture }}"
diff --git a/tests/e2e/ansible/install_build_deps.yml b/tests/e2e/ansible/install_build_deps.yml
index e1621db..691ae8b 100644
--- a/tests/e2e/ansible/install_build_deps.yml
+++ b/tests/e2e/ansible/install_build_deps.yml
@@ -38,7 +38,7 @@
         - name: Install qemu-user-static
           shell: docker run --rm --privileged multiarch/qemu-user-static:7.2.0-1 --reset -p yes
           when: qemu_user_static_exist.rc != 0
-      when: ansible_distribution == "CentOS"
+      when: ansible_distribution in ["CentOS", "Fedora"]
     # Undo the installation.
     #
     - name: Uninstall build dependencies
diff --git a/tests/e2e/ansible/install_containerd.yml b/tests/e2e/ansible/install_containerd.yml
index 4f6d102..cb6cab4 100644
--- a/tests/e2e/ansible/install_containerd.yml
+++ b/tests/e2e/ansible/install_containerd.yml
@@ -31,6 +31,10 @@
         containerd config default > /etc/containerd/config.toml
       args:
         executable: /bin/bash
+    - name: Enable SystemdCgroups
+      shell: |
+        sed -i 's/SystemdCgroup = .*/SystemdCgroup = true/' /etc/containerd/config.toml
+      when: ansible_distribution == "Fedora"
     - name: Restart containerd service
       service:
         name: containerd
diff --git a/tests/e2e/ansible/install_docker.yml b/tests/e2e/ansible/install_docker.yml
index 9e3572d..7204e95 100644
--- a/tests/e2e/ansible/install_docker.yml
+++ b/tests/e2e/ansible/install_docker.yml
@@ -59,6 +59,24 @@
         name: docker
         state: present
   when: docker_exist.rc != 0 and ansible_distribution == "Ubuntu" and ansible_distribution_version == "22.04"
+- name: Handle docker installation on Fedora.
+  block:
+    - name: Install yum-utils
+      dnf:
+        name: yum-utils
+        state: present
+    - name: Add docker yum repo
+      shell: yum-config-manager --add-repo https://download.docker.com/linux/fedora/docker-ce.repo
+      args:
+        creates: /etc/yum.repos.d/docker-ce.repo
+    - name: Install docker packages
+      dnf:
+        name:
+          - containerd.io
+          - docker-ce
+          - docker-ce-cli
+        state: present
+  when: docker_exist.rc != 0 and ansible_distribution == "Fedora"
 - name: Handle docker installation on CentOS.
   block:
     - name: Install yum-utils

@bpradipt
Member

@ldoktor hope you are not hitting the following issue on F39 kubeadm cluster - #319

@ldoktor
Contributor

ldoktor commented Jan 16, 2024

> @ldoktor hope you are not hitting the following issue on F39 kubeadm cluster - #319

I noticed that one (after my initial attempts), but the zram removal did not really help. So the question stands: @tylerfanelli, have you succeeded in running run-local.sh on any Fedora? Which version, and what other tricks were required?

@tylerfanelli
Contributor Author

tylerfanelli commented Jan 16, 2024

@ldoktor zram removal is what worked for me. Besides the changes I made in the PR and following the Vagrantfile CentOS additions, I'm not too sure what else I did differently. This is for F39.

@bpradipt
Member

I added the following changes on top of this PR (including removal of zram-generator-defaults) and could progress on a F39 VM.
I used ./run-local.sh -r kata-qemu -u
However operator deployment failed with insufficient cpus. I will retry on a bigger F39 VM and update.

Changes

diff --git a/tests/e2e/ansible/install_docker.yml b/tests/e2e/ansible/install_docker.yml
index 9e3572d..dc9a9ae 100644
--- a/tests/e2e/ansible/install_docker.yml
+++ b/tests/e2e/ansible/install_docker.yml
@@ -77,6 +77,25 @@
           - docker-ce-cli
         state: present
   when: docker_exist.rc != 0 and ansible_distribution == "CentOS"
+
+- name: Handle docker installation on Fedora 39
+  block:
+    - name: Install yum-utils
+      dnf:
+        name: yum-utils
+        state: present
+    - name: Add docker yum repo
+      shell: yum-config-manager --add-repo https://download.docker.com/linux/fedora/docker-ce.repo
+      args:
+        creates: /etc/yum.repos.d/docker-ce.repo
+    - name: Install docker packages
+      dnf:
+        name:
+          - containerd.io
+          - docker-ce
+          - docker-ce-cli
+        state: present
+  when: docker_exist.rc != 0 and ansible_distribution == "Fedora"  and ansible_distribution_version == "39"
 - name: Start docker service
   service:
     name: docker

Error

Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  2m2s  default-scheduler  0/1 nodes are available: 1 Insufficient cpu. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.

@bpradipt
Member

Now I hit this error after running on a F39 VM with 4vCPUs

+ for arch in ${supported_arches[@]}
+ setup_env_for_arch linux/s390x
+ case "$1" in
+ kernel_arch=s390x
+ golang_arch=s390x
+ echo 'Building containerd payload image for linux/s390x'
Building containerd payload image for linux/s390x
+ docker buildx build --build-arg ARCH=s390x --build-arg COCO_CONTAINERD_VERSION=1.6.8.2 --build-arg COCO_CONTAINERD_REPO=https://github.com/confidential-containers/containerd --build-arg OFFICIAL_CONTAINERD_VERSION=1.7.7 --build-arg OFFICIAL_CONTAINERD_REPO=https://github.com/containerd/containerd --build-arg VFIO_GPU_CONTAINERD_VERSION=1.7.0.0 --build-arg VFIO_GPU_CONTAINERD_REPO=https://github.com/confidential-containers/containerd --build-arg NYDUS_SNAPSHOTTER_VERSION=v0.13.3-multiarch --build-arg NYDUS_SNAPSHOTTER_REPO=https://github.com/fidencio/nydus-snapshotter -t localhost:5000/reqs-payload:s390x-e290f0f42ae63f67ab92d694cf380dc249d89c6d --platform=linux/s390x --load .
[+] Building 1.9s (12/22)                                                                                                                                                     docker:default
 => [internal] load .dockerignore                                                                                                                                                       0.1s
 => => transferring context: 2B                                                                                                                                                         0.0s
 => [internal] load build definition from Dockerfile                                                                                                                                    0.1s
 => => transferring dockerfile: 5.05kB                                                                                                                                                  0.0s
 => [internal] load metadata for docker.io/library/golang:1.19-alpine                                                                                                                   0.4s
 => [internal] load metadata for docker.io/library/alpine:3.18                                                                                                                          0.2s
 => [internal] load build context                                                                                                                                                       0.1s
 => => transferring context: 1.10kB                                                                                                                                                     0.0s
 => CACHED [base 1/1] FROM docker.io/library/alpine:3.18@sha256:34871e7290500828b39e22294660bee86d966bc0017544e848dd9a255cdf59e0                                                        0.0s
 => CANCELED [nydus-binary-downloader 1/2] FROM docker.io/library/golang:1.19-alpine@sha256:0ec0646e208ea58e5d29e558e39f2e59fccf39b7bda306cb53bbaff91919eca5                            1.0s
 => => resolve docker.io/library/golang:1.19-alpine@sha256:0ec0646e208ea58e5d29e558e39f2e59fccf39b7bda306cb53bbaff91919eca5                                                             0.2s
 => => sha256:ec90be18226e5c99d10161aed1a143f4134093c55b4d6979bbdbbe4b0683eb11 285.09kB / 285.09kB                                                                                      0.3s
 => => sha256:8a62238188658a20afe263e4174c269ab91e68e86bf02db75dac3bbdacfe253c 12.58MB / 120.93MB                                                                                       1.1s
 => => sha256:0ec0646e208ea58e5d29e558e39f2e59fccf39b7bda306cb53bbaff91919eca5 1.65kB / 1.65kB                                                                                          0.0s
 => => sha256:7b47cff2c98995690cd81fc6d18c0f7b726bf92d375268791e8bcd42553f7862 1.16kB / 1.16kB                                                                                          0.0s
 => => sha256:8ece8eb6ca448b9da6d6ac6a8ba791f0a5ad3ffb4feb9bf0ef0932da14896346 5.18kB / 5.18kB                                                                                          0.0s
 => => sha256:8bed2eae372fe236061920d89ae1ce89695a12df84989113bcc7ce4bd9774456 3.21MB / 3.21MB                                                                                          0.5s
 => => sha256:3ac2bd91a23c4e119cd9dc376dcfc3d3d5d128d7f9723eb0d4b575e3d810f15b 0B / 156B                                                                                                1.1s
 => => extracting sha256:8bed2eae372fe236061920d89ae1ce89695a12df84989113bcc7ce4bd9774456                                                                                               0.5s
 => ERROR [kubectl-binary-downloader 1/1] RUN  apk --no-cache add curl &&  curl -fL --progress-bar -o /usr/bin/kubectl https://storage.googleapis.com/kubernetes-release/release/v1.26  0.9s
 => ERROR [coco-containerd-binary-downloader 1/1] RUN  mkdir -p /opt/confidential-containers-pre-install-artifacts/opt/confidential-containers &&  apk --no-cache add curl &&  curl -f  1.0s
 => ERROR [official-containerd-binary-downloader 1/1] RUN  mkdir -p /opt/confidential-containers-pre-install-artifacts/opt/confidential-containers &&  apk --no-cache add curl &&  cur  1.0s
 => CANCELED [vfio-gpu-containerd-binary-downloader 1/1] RUN  mkdir -p /opt/confidential-containers-pre-install-artifacts/opt/confidential-containers &&  apk --no-cache add curl &&    1.3s
 => CANCELED [stage-6  1/10] RUN apk --no-cache add bash gcompat                                                                                                                        1.2s
------
 > [kubectl-binary-downloader 1/1] RUN  apk --no-cache add curl &&      curl -fL --progress-bar -o /usr/bin/kubectl https://storage.googleapis.com/kubernetes-release/release/v1.26.0/bin/linux/s390x/kubectl &&     chmod +x /usr/bin/kubectl:
0.633 exec /bin/sh: exec format error
------
------
 > [coco-containerd-binary-downloader 1/1] RUN  mkdir -p /opt/confidential-containers-pre-install-artifacts/opt/confidential-containers &&      apk --no-cache add curl &&      curl -fOL --progress-bar https://github.com/confidential-containers/containerd/releases/download/v1.6.8.2/containerd-1.6.8.2-linux-s390x.tar.gz &&   tar xvzpf containerd-1.6.8.2-linux-s390x.tar.gz -C /opt/confidential-containers-pre-install-artifacts/opt/confidential-containers &&         rm containerd-1.6.8.2-linux-s390x.tar.gz:
0.848 exec /bin/sh: exec format error
------
------
 > [official-containerd-binary-downloader 1/1] RUN      mkdir -p /opt/confidential-containers-pre-install-artifacts/opt/confidential-containers &&      apk --no-cache add curl &&      curl -fOL --progress-bar https://github.com/containerd/containerd/releases/download/v1.7.7/containerd-static-1.7.7-linux-s390x.tar.gz &&     tar xvzpf containerd-static-1.7.7-linux-s390x.tar.gz -C /opt/confidential-containers-pre-install-artifacts/opt/confidential-containers &&    rm containerd-static-1.7.7-linux-s390x.tar.gz:
0.727 exec /bin/sh: exec format error
------
Dockerfile:86
--------------------
  85 |
  86 | >>> RUN \
  87 | >>>      apk --no-cache add curl && \
  88 | >>>      curl -fL --progress-bar -o /usr/bin/kubectl https://storage.googleapis.com/kubernetes-release/release/${KUBECTL_VERSION}/bin/linux/${ARCH}/kubectl && \
  89 | >>>      chmod +x /usr/bin/kubectl
  90 |
--------------------
ERROR: failed to solve: process "/bin/sh -c apk --no-cache add curl && \tcurl -fL --progress-bar -o /usr/bin/kubectl https://storage.googleapis.com/kubernetes-release/release/${KUBECTL_VERSION}/bin/linux/${ARCH}/kubectl && \tchmod +x /usr/bin/kubectl" did not complete successfully: exit code: 1
make: *** [Makefile:9: reqs-image] Error 1

@BbolroC @stevenhorsman any thoughts ?

@BbolroC
Member

BbolroC commented Jan 17, 2024

> @BbolroC @stevenhorsman any thoughts ?

Hi, @bpradipt Thanks for the notice. I've reproduced the error on my F39. Due to the clue exec /bin/sh: exec format error, I ran the following command and got:

$ sudo docker buildx ls
NAME/NODE DRIVER/ENDPOINT STATUS  BUILDKIT             PLATFORMS
default * docker
  default default         running v0.11.7+d3e6c1360f6e linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/386

It seems that qemu-user-static is not packaged on Fedora, just as on CentOS.

I got the build successfully completed again when I ran the following:

sudo docker run --rm --privileged multiarch/qemu-user-static:7.2.0-1 --reset -p yes

I think there should be a configuration step for qemu-user-static for Fedora along with CentOS at

    - name: Handle qemu-user-static installation on CentOS.
      block:
        - name: Check qemu-user-static is installed
          shell: docker run --rm -t s390x/ubuntu:22.04 uname -m
          register: qemu_user_static_exist
          ignore_errors: yes
        - name: Install qemu-user-static
          shell: docker run --rm --privileged multiarch/qemu-user-static:7.2.0-1 --reset -p yes
          when: qemu_user_static_exist.rc != 0
      when: ansible_distribution == "CentOS"
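
A minimal sketch of that change, assuming the existing block is simply extended to Fedora. It mirrors the condition proposed in the diff earlier in this thread; the task name wording is illustrative only.

```yaml
# Sketch only: the existing CentOS block with its condition widened to Fedora.
- name: Handle qemu-user-static installation on CentOS and Fedora.
  block:
    - name: Check qemu-user-static is installed
      shell: docker run --rm -t s390x/ubuntu:22.04 uname -m
      register: qemu_user_static_exist
      ignore_errors: yes
    - name: Install qemu-user-static
      shell: docker run --rm --privileged multiarch/qemu-user-static:7.2.0-1 --reset -p yes
      # Only register the binfmt handlers when the s390x check above failed.
      when: qemu_user_static_exist.rc != 0
  when: ansible_distribution in ["CentOS", "Fedora"]
```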

@BbolroC
Member

BbolroC commented Jan 17, 2024

Oh, sorry. I had not gone over the comments for the PR when I wrote this. You guys have already mentioned an issue around zram. 😉 Just ignore the comment below. Thanks!


FYI: what made me struggle to run a cluster on my F39 while debugging the buildx issue above is that kubelet was not able to start due to the following swap partition:

# swapon --show
NAME       TYPE      SIZE USED PRIO
/dev/zram0 partition 7.7G   0B  100

It seems that SwapOnZRAM is enabled by default on F39. I was able to start kubelet again after running:

# yum remove zram-generator-defaults
# swapoff -a

This would not be the case for others, but I just wanted to share this. Thanks!

@BbolroC
Member

BbolroC commented Jan 17, 2024

I have managed to finish the test successfully after configuring the following manually:

  • docker installation
  • removal of zram-generator-defaults
  • qemu-user-static installation

@bpradipt
Member

Thanks a lot @BbolroC . It might make sense to add the zram-generator-defaults removal and qemu-user-static installation as part of the ansible playbook itself. I'll do a rerun on my setup with your instructions and update

@bpradipt
Member

I could run successfully as well with the following changes on top of this PR

diff --git a/tests/e2e/ansible/group_vars/all b/tests/e2e/ansible/group_vars/all
index 43f145d..26835b6 100644
--- a/tests/e2e/ansible/group_vars/all
+++ b/tests/e2e/ansible/group_vars/all
@@ -6,6 +6,10 @@ build_pkgs:
     - make
     - gcc
     - qemu-user-static
+  fedora:
+    - make
+    - gcc
+    - qemu-user-static
   centos:
     - make
     - gcc
@@ -16,6 +20,9 @@ kubeadm_pkgs:
   ubuntu:
     - conntrack
     - socat
+  fedora:
+    - conntrack
+    - socat
   centos:
     - conntrack
     - socat
@@ -23,6 +30,8 @@ k8s_version: v1.24.0
 test_pkgs:
   ubuntu:
     - jq
+  fedora:
+    - jq
   centos:
     - jq
 target_arch: "{{ 'amd64' if ansible_architecture == 'x86_64' else ansible_architecture }}"
diff --git a/tests/e2e/ansible/install_docker.yml b/tests/e2e/ansible/install_docker.yml
index 9e3572d..dc9a9ae 100644
--- a/tests/e2e/ansible/install_docker.yml
+++ b/tests/e2e/ansible/install_docker.yml
@@ -77,6 +77,25 @@
           - docker-ce-cli
         state: present
   when: docker_exist.rc != 0 and ansible_distribution == "CentOS"
+
+- name: Handle docker installation on Fedora 39
+  block:
+    - name: Install yum-utils
+      dnf:
+        name: yum-utils
+        state: present
+    - name: Add docker yum repo
+      shell: yum-config-manager --add-repo https://download.docker.com/linux/fedora/docker-ce.repo
+      args:
+        creates: /etc/yum.repos.d/docker-ce.repo
+    - name: Install docker packages
+      dnf:
+        name:
+          - containerd.io
+          - docker-ce
+          - docker-ce-cli
+        state: present
+  when: docker_exist.rc != 0 and ansible_distribution == "Fedora"  and ansible_distribution_version == "39"
 - name: Start docker service
   service:
     name: docker
diff --git a/tests/e2e/ansible/install_kubeadm.yml b/tests/e2e/ansible/install_kubeadm.yml
index f57c61a..0e2eed3 100644
--- a/tests/e2e/ansible/install_kubeadm.yml
+++ b/tests/e2e/ansible/install_kubeadm.yml
@@ -50,6 +50,11 @@
         - kubeadm
         - kubelet
         - kubectl
+    - name: Remove zram-generator-defaults in Fedora
+      ansible.builtin.yum:
+        name: zram-generator-defaults
+        state: absent
+      when: ansible_distribution == "Fedora"  and ansible_distribution_version == "39"
     - name: Disable swap
       shell: |
         [ -z "$(swapon --show)" ] && exit 0

@bpradipt
Member

@tylerfanelli fyi

@ldoktor
Contributor

ldoktor commented Jan 17, 2024

> @tylerfanelli fyi

Yep, @bpradipt version works well for me as well, although I wouldn't recommend restricting the versions to 39.

@bpradipt
Member

> > @tylerfanelli fyi
>
> Yep, @bpradipt version works well for me as well, although I wouldn't recommend restricting the versions to 39.

I'll preferably stick to a specific version for repeatability.

@ldoktor
Contributor

ldoktor commented Jan 17, 2024

> > > @tylerfanelli fyi
> >
> > Yep, @bpradipt version works well for me as well, although I wouldn't recommend restricting the versions to 39.
>
> I'll preferably stick to a specific version for repeatability.

I just verified it works with F38 as well and I don't see any reason for it not to work on 40+

@tylerfanelli
Contributor Author

tylerfanelli commented Jan 17, 2024

@bpradipt I've added your changes, and added a Co-authored-by: Pradipta Banerjee <[email protected]> to the commit message.

@tylerfanelli
Contributor Author

@ldoktor Can you re-test this PR?

@tylerfanelli tylerfanelli force-pushed the run-local-fedora branch 2 times, most recently from 0165b17 to 3cf3b65 Compare January 17, 2024 18:05
@wainersm
Member

@tylerfanelli indentation error:

[fedora@myvm e2e]$ ./run-local.sh -u
INFO: Bootstrap the local machine
ERROR! We were unable to read either as JSON nor YAML, these are the errors we got from each:
JSON: Expecting value: line 1 column 1 (char 0)

Syntax Error while loading YAML.
  did not find expected '-' indicator

The error appears to be in '/home/fedora/operator/tests/e2e/ansible/install_docker.yml': line 97, column 11, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

          - docker-ce-cli
          state: present
          ^ here
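
For reference, a sketch of how that task is presumably meant to be indented, based on the diff posted earlier in this thread. `state` is a parameter of the `dnf` module, so it sits at the same level as `name`, not at the level of the package list items.

```yaml
- name: Install docker packages
  dnf:
    name:
      - containerd.io
      - docker-ce
      - docker-ce-cli
    # `state` is a sibling of `name` under `dnf`, not part of the package list.
    state: present
```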

@tylerfanelli tylerfanelli force-pushed the run-local-fedora branch 2 times, most recently from 04f7d60 to 03b44cb Compare January 17, 2024 20:02
@tylerfanelli
Contributor Author

@wainersm Fixed, thanks.

@bpradipt bpradipt requested a review from ldoktor January 18, 2024 03:30
Contributor

@ldoktor ldoktor left a comment

Works well for me although I'd really like to remove the distro-version check. It works on F38 and F40 will be released soon. We don't have such checks for ubuntu versions and everything works on several versions while with the checks it simply only works on F39.

@wainersm
Member

I tested it in a Fedora 39 VM, it works.

How I set up the VM:

  • On host, created the VM with kcli:
$ sudo kcli download image -u https://download.fedoraproject.org/pub/fedora/linux/releases/39/Cloud/x86_64/images/Fedora-Cloud-Base-39-1.5.x86_64.qcow2 fedora39
$ kcli create vm -i fedora39 -P numcpus=4 -P memory=$((1024*8)) -P disks=[30] myvm
  • On VM:
$ sudo dnf update -y && sudo dnf -y install ansible-core && ansible-galaxy collection install community.docker

@wainersm
Member

> Works well for me although I'd really like to remove the distro-version check. It works on F38 and F40 will be released soon. We don't have such checks for ubuntu versions and everything works on several versions while with the checks it simply only works on F39.

I agree with @ldoktor

@BbolroC
Member

BbolroC commented Jan 18, 2024

FYI: Rebasing onto main is necessary because the structure of the workflow for e2e tests has changed.

@bpradipt
Member

bpradipt commented Jan 19, 2024

> > Works well for me although I'd really like to remove the distro-version check. It works on F38 and F40 will be released soon. We don't have such checks for ubuntu versions and everything works on several versions while with the checks it simply only works on F39.
>
> I agree with @ldoktor

@ldoktor @wainersm Even for Ubuntu there are explicit version checks - https://github.com/confidential-containers/operator/blob/main/tests/e2e/ansible/install_docker.yml#L61

We rely on external packages for setting up the environment. Docker might decide not to ship packages for the latest Fedora version. Likewise, kubeadm packages might need different handling for different distro versions or might not be available for a specific distro version, resulting in a bad first-time experience. We have seen this cycle repeat numerous times. Pinned versions (to the extent practical and feasible) help with repeatability.

Additionally, we might not be able to keep testing and updating for every distro version. In the future, if it makes sense to drop older distro versions and stick to new ones, we can make the change and suggest that every dev use the recommended distro version.

My understanding is that run-local.sh and associated ansible playbooks are not meant to be solutions for deploying K8s for every distro and version. run-local.sh is a tool with a fixed scope - for CI and for devs to quickly bring up an environment for test.

Anyway, if you still prefer supporting generic Fedora, I would request @tylerfanelli to make the necessary changes and let's merge this PR. Maybe additionally update the README with a note on which Fedora version it's tested on, to reduce the probability of a bad initial dev experience.

@ldoktor
Contributor

ldoktor commented Jan 19, 2024

> Anyway, if you still prefer supporting generic Fedora, I would request @tylerfanelli to make the necessary changes and let's merge this PR. Maybe additionally update the README with a note on which Fedora version it's tested on, to reduce the probability of a bad initial dev experience.

I understand your point, but if I am to choose between "I will only work on F39" and "I will try to work on any Fedora, add your fixes if needed", I'd prefer the second, even if it sometimes fails. Mentioning "Tested on F39" could be an option; I'm not sure how important it is.
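
One way to combine both positions, sketched here as an assumption rather than something proposed in this PR: leave the Fedora tasks unpinned and print a non-fatal notice when the release is not the tested one. The `tested_fedora_version` variable and the task name are hypothetical.

```yaml
# Hypothetical sketch: warn, but do not fail, on untested Fedora releases.
- name: Note untested Fedora version
  ansible.builtin.debug:
    msg: >-
      Fedora {{ ansible_distribution_major_version }} is untested;
      this playbook was tested on Fedora {{ tested_fedora_version }}.
  vars:
    # Assumed value, taken from the "Tested on F39" remarks in this thread.
    tested_fedora_version: "39"
  when:
    - ansible_distribution == "Fedora"
    - ansible_distribution_major_version != tested_fedora_version
```

The playbook would then still attempt to run on F38 or F40, while the output makes it clear which release is known to work.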

@bpradipt
Member

> I understand your point, but if I am to choose between "I will only work on F39" and "I will try to work on any Fedora, add your fixes if needed", I'd prefer the second, even if it sometimes fails. Mentioning "Tested on F39" could be an option; I'm not sure how important it is.

For dev and CI environments my preference is to use tools with a fixed scope with repeatability since the primary intent is to not fix environment issues. Separation of concerns :-)

Anyways, since you prefer a generic approach I'm ok with it.

@tylerfanelli would you be able to work on the requested changes by @ldoktor and remove the distro version checks ?

@ldoktor
Contributor

ldoktor commented Jan 24, 2024

> > I understand your point, but if I am to choose between "I will only work on F39" and "I will try to work on any Fedora, add your fixes if needed", I'd prefer the second, even if it sometimes fails. Mentioning "Tested on F39" could be an option; I'm not sure how important it is.
>
> For dev and CI environments my preference is to use tools with a fixed scope with repeatability since the primary intent is to not fix environment issues. Separation of concerns :-)
>
> Anyways, since you prefer a generic approach I'm ok with it.
>
> @tylerfanelli would you be able to work on the requested changes by @ldoktor and remove the distro version checks ?

I understand, Pradipta. My preference is a best-effort approach: pin the CI versions, which should serve as a recommendation to developers on which environment is supported, while allowing them some flexibility to use whatever they want as long as they can adjust the scripts, which is actually the case here. We do not use Fedora in the CI at all; it's just a nice-to-have addition.

Fedora also can be used to run k8s clusters locally. Add the fedora
requirements to allow for testing.

Tested on Fedora 39.

Co-authored-by: Pradipta Banerjee <[email protected]>
Signed-off-by: Tyler Fanelli <[email protected]>
Contributor

@ldoktor ldoktor left a comment

Thank you, looks good and works well with F39

@ldoktor ldoktor merged commit be9724a into confidential-containers:main Feb 16, 2024
11 checks passed