
Not able to create containers when using 4.13.4 microshift bundle #3772

Closed
jsliacan opened this issue Jul 27, 2023 · 7 comments

@jsliacan
Contributor

jsliacan commented Jul 27, 2023

General information

  • OS: Linux, macOS
  • Hypervisor: KVM, vfkit

CRC version

2.24.0

CRC status

CRC VM:          Running
MicroShift:      Starting (v4.13.4)
RAM Usage:       762.4MB of 11.98GB
Disk Usage:      4.46GB of 16.1GB (Inside the CRC VM)
Cache Usage:     78.75GB

CRC config

- consent-telemetry                     : no
- preset                                : microshift

Steps to reproduce

  1. crc setup
  2. crc start

Logs

Linux

NAMESPACE              NAME                                 READY   STATUS    RESTARTS   AGE
openshift-dns          node-resolver-v9xnl                  0/1     Pending   0          4m3s
openshift-ingress      router-default-7f7d59f8f9-nhfjq      0/1     Pending   0          4m21s
openshift-service-ca   service-ca-75946c5b6d-r4hhw          0/1     Pending   0          4m21s
openshift-storage      topolvm-controller-f58fcd7cb-4jzzm   0/4     Pending   0          4m21s

macOS

NAMESPACE              NAME                                 READY   STATUS              RESTARTS   AGE
openshift-dns          node-resolver-ltts6                  0/1     ContainerCreating   0          11m
openshift-ingress      router-default-7f7d59f8f9-bg5zj      0/1     Pending             0          12m
openshift-ingress      routes-controller                    0/1     Pending             0          11m
openshift-service-ca   service-ca-75946c5b6d-g2dgv          0/1     Pending             0          12m
openshift-storage      topolvm-controller-f58fcd7cb-vj7ps   0/4     Pending             0          12m

$ crc start --log-level debug

DEBU Running SSH command: timeout 5s oc get nodes --context microshift --cluster microshift --kubeconfig /opt/kubeconfig
DEBU SSH command results: err: Process exited with status 1, output:
DEBU E0727 02:58:38.657559    2077 memcache.go:238] couldn't get current server API group list: Get "https://api.crc.testing:6443/api?timeout=32s": dial tcp 192.168.127.2:6443: connect: connection refused
E0727 02:58:38.658645    2077 memcache.go:238] couldn't get current server API group list: Get "https://api.crc.testing:6443/api?timeout=32s": dial tcp 192.168.127.2:6443: connect: connection refused
E0727 02:58:38.659512    2077 memcache.go:238] couldn't get current server API group list: Get "https://api.crc.testing:6443/api?timeout=32s": dial tcp 192.168.127.2:6443: connect: connection refused
E0727 02:58:38.661345    2077 memcache.go:238] couldn't get current server API group list: Get "https://api.crc.testing:6443/api?timeout=32s": dial tcp 192.168.127.2:6443: connect: connection refused
E0727 02:58:38.663270    2077 memcache.go:238] couldn't get current server API group list: Get "https://api.crc.testing:6443/api?timeout=32s": dial tcp 192.168.127.2:6443: connect: connection refused
The connection to the server api.crc.testing:6443 was refused - did you specify the right host or port?
DEBU error: Temporary error: ssh command error:
command : timeout 5s oc get nodes --context microshift --cluster microshift --kubeconfig /opt/kubeconfig
err     : Process exited with status 1
 - sleeping 1s
DEBU retry loop: attempt 4
DEBU Running SSH command: timeout 5s oc get nodes --context microshift --cluster microshift --kubeconfig /opt/kubeconfig
DEBU SSH command results: err: Process exited with status 124, output:
DEBU
DEBU error: Temporary error: ssh command error:
command : timeout 5s oc get nodes --context microshift --cluster microshift --kubeconfig /opt/kubeconfig
err     : Process exited with status 124
 - sleeping 1s
DEBU retry loop: attempt 5
DEBU Running SSH command: timeout 5s oc get nodes --context microshift --cluster microshift --kubeconfig /opt/kubeconfig
DEBU SSH command results: err: <nil>, output: NAME              STATUS     ROLES                         AGE   VERSION
api.crc.testing   NotReady   control-plane,master,worker   19s   v1.26.4
DEBU NAME              STATUS     ROLES                         AGE   VERSION
api.crc.testing   NotReady   control-plane,master,worker   19s   v1.26.4
DEBU Creating /tmp/routes-controller.json with permissions 0644 in the CRC VM
DEBU Running SSH command: <hidden>
DEBU SSH command succeeded
DEBU Running SSH command: timeout 30s oc apply -f /tmp/routes-controller.json --context microshift --cluster microshift --kubeconfig /opt/kubeconfig
DEBU SSH command results: err: <nil>, output: pod/routes-controller created
INFO Adding microshift context to kubeconfig...
DEBU Making call to close driver server
DEBU (crc) Calling .Close
DEBU Successfully made call to close driver server
DEBU Making call to close connection to plugin binary
DEBU (crc) DBG | time="2023-07-27T12:28:46+05:30" level=debug msg="Closing plugin on server side"
Started the MicroShift cluster.
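The `status 124` seen in the retry loop above is `timeout(1)`'s exit code for a command it had to kill: unlike the earlier `status 1` (connection refused), the 5s `oc get nodes` call hung rather than failing fast. A minimal illustration of the two cases:

```shell
# timeout(1) exits 124 when it has to kill the command...
rc=0
timeout 1s sleep 2 || rc=$?
echo "killed: rc=$rc"        # rc=124

# ...and passes through the command's own exit status otherwise
rc=0
timeout 5s true || rc=$?
echo "completed: rc=$rc"     # rc=0
```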
@jsliacan jsliacan changed the title Not able to create containers when using 4.13.4 microshfit bundle Not able to create containers when using 4.13.4 microshift bundle Jul 27, 2023
@praveenkumar
Member

Looking at the node description, it looks like there is no CNI plugin. Will check with the MicroShift engineers.

Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Thu, 27 Jul 2023 13:58:56 +0530   Thu, 27 Jul 2023 13:48:56 +0530   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Thu, 27 Jul 2023 13:58:56 +0530   Thu, 27 Jul 2023 13:48:56 +0530   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Thu, 27 Jul 2023 13:58:56 +0530   Thu, 27 Jul 2023 13:48:56 +0530   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Thu, 27 Jul 2023 13:58:56 +0530   Thu, 27 Jul 2023 13:48:56 +0530   KubeletNotReady              container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: No CNI configuration file in /etc/cni/net.d/. Has your network provider started?
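The key line in that block is the `Ready` condition, where the CNI error surfaces. A sketch of filtering just that line from a saved `oc describe node` dump (the sample text and file path below are illustrative, abridged from the output above):

```shell
# Abridged Conditions block from `oc describe node`, saved for filtering
cat > /tmp/node-conditions.txt <<'EOF'
  MemoryPressure   False   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   KubeletHasNoDiskPressure     kubelet has no disk pressure
  Ready            False   KubeletNotReady              container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady
EOF

# Show only the Ready condition line
grep -E '^[[:space:]]*Ready[[:space:]]' /tmp/node-conditions.txt
```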

@praveenkumar
Member

The following service is failing.

$ systemctl status ovsdb-server.service 
× ovsdb-server.service - Open vSwitch Database Unit
     Loaded: loaded (/usr/lib/systemd/system/ovsdb-server.service; static)
    Drop-In: /etc/systemd/system/ovsdb-server.service.d
             └─microshift-cpuaffinity.conf
     Active: failed (Result: exit-code) since Thu 2023-07-27 05:34:17 EDT; 28s ago
    Process: 3436 ExecStartPre=/usr/bin/rm -f /run/openvswitch.useropts (code=exited, status=0/SUCCESS)
    Process: 3437 ExecStartPre=/usr/bin/chown ${OVS_USER_ID} /run/openvswitch /var/log/openvswitch (code=exited, status=0/SUCCESS)
    Process: 3438 ExecStartPre=/bin/sh -c /usr/bin/echo "OVS_USER_ID=${OVS_USER_ID}" > /run/openvswitch.useropts (code=exited, status=0/SUCCESS)
    Process: 3440 ExecStartPre=/bin/sh -c if [ "$${OVS_USER_ID/:*/}" != "root" ]; then /usr/bin/echo "OVS_USER_OPT=--ovs-user=${OVS_USER_ID}" >> /run/openvswitch.useropts; fi (code=exited, status=0/SUCCESS)
    Process: 3442 ExecStart=/usr/share/openvswitch/scripts/ovs-ctl --no-ovs-vswitchd --no-monitor --system-id=random ${OVS_USER_OPT} start $OPTIONS (code=exited, status=1/FAILURE)
        CPU: 64ms

Jul 27 05:34:17 api.crc.testing systemd[1]: ovsdb-server.service: Control process exited, code=exited, status=1/FAILURE
Jul 27 05:34:17 api.crc.testing systemd[1]: ovsdb-server.service: Failed with result 'exit-code'.
Jul 27 05:34:17 api.crc.testing systemd[1]: Failed to start Open vSwitch Database Unit.
Jul 27 05:34:17 api.crc.testing systemd[1]: ovsdb-server.service: Scheduled restart job, restart counter is at 4.
Jul 27 05:34:17 api.crc.testing systemd[1]: Stopped Open vSwitch Database Unit.
Jul 27 05:34:17 api.crc.testing systemd[1]: ovsdb-server.service: Start request repeated too quickly.
Jul 27 05:34:17 api.crc.testing systemd[1]: ovsdb-server.service: Failed with result 'exit-code'.
Jul 27 05:34:17 api.crc.testing systemd[1]: Failed to start Open vSwitch Database Unit.

Journalctl logs

Jul 27 05:36:28 api.crc.testing systemd[1]: ovsdb-server.service: Scheduled restart job, restart counter is at 4.
Jul 27 05:36:28 api.crc.testing systemd[1]: Stopped Open vSwitch Database Unit.
Jul 27 05:36:28 api.crc.testing systemd[1]: Starting Open vSwitch Database Unit...
Jul 27 05:36:29 api.crc.testing ovsdb-server[4771]: ovs|00001|daemon_unix|EMER|(null): Invalid --user option openvswitch:hugetlbfs (user openvswitch is not in group hugetlbfs), aborting.
Jul 27 05:36:29 api.crc.testing ovs-ctl[4771]: ovsdb-server: (null): Invalid --user option openvswitch:hugetlbfs (user openvswitch is not in group hugetlbfs), aborting.
Jul 27 05:36:29 api.crc.testing ovs-ctl[4724]: Starting ovsdb-server ... failed!
Jul 27 05:36:29 api.crc.testing systemd[1]: ovsdb-server.service: Control process exited, code=exited, status=1/FAILURE
Jul 27 05:36:29 api.crc.testing systemd[1]: ovsdb-server.service: Failed with result 'exit-code'.
Jul 27 05:36:29 api.crc.testing systemd[1]: Failed to start Open vSwitch Database Unit.
Jul 27 05:36:29 api.crc.testing systemd[1]: ovsdb-server.service: Scheduled restart job, restart counter is at 5.
Jul 27 05:36:29 api.crc.testing systemd[1]: Stopped Open vSwitch Database Unit.
Jul 27 05:36:29 api.crc.testing systemd[1]: ovsdb-server.service: Start request repeated too quickly.
Jul 27 05:36:29 api.crc.testing systemd[1]: ovsdb-server.service: Failed with result 'exit-code'.
Jul 27 05:36:29 api.crc.testing systemd[1]: Failed to start Open vSwitch Database Unit.
Jul 27 05:36:29 api.crc.testing systemd[1]: ovsdb-server.service: Start request repeated too quickly.
Jul 27 05:36:29 api.crc.testing systemd[1]: ovsdb-server.service: Failed with result 'exit-code'.
Jul 27 05:36:29 api.crc.testing systemd[1]: Failed to start Open vSwitch Database Unit.

@praveenkumar
Member

For 4.13.3, where it all worked, we have

$ ssh crc -- cat /usr/lib/group | grep open
Warning: Permanently added '192.168.130.11' (ED25519) to the list of known hosts.
openvswitch:x:992:
hugetlbfs:x:1000:openvswitch

but for 4.13.4 we have

$ ssh crc -- cat /usr/lib/group | grep open
Warning: Permanently added '192.168.130.11' (ED25519) to the list of known hosts.
openvswitch:x:992:
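So the difference between the two bundles is that `openvswitch` dropped out of the `hugetlbfs` member list, which is exactly what the `Invalid --user option openvswitch:hugetlbfs` check in the journal rejects. A small sketch of checking membership in a group(5)-style file (the helper name and file paths are made up for illustration; the entries mirror the outputs above):

```shell
# user_in_group USER GROUP FILE: exit 0 iff USER appears in GROUP's member list
user_in_group() {
  awk -F: -v u="$1" -v g="$2" '
    $1 == g { n = split($4, m, ","); for (i = 1; i <= n; i++) if (m[i] == u) ok = 1 }
    END { exit !ok }
  ' "$3"
}

# Entries mirroring the 4.13.3 (working) and 4.13.4 (broken) bundles
printf 'openvswitch:x:992:\nhugetlbfs:x:1000:openvswitch\n' > /tmp/group-4.13.3
printf 'openvswitch:x:992:\n' > /tmp/group-4.13.4

user_in_group openvswitch hugetlbfs /tmp/group-4.13.3 && echo "4.13.3: member"
user_in_group openvswitch hugetlbfs /tmp/group-4.13.4 || echo "4.13.4: missing"
```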

@praveenkumar
Member

So https://issues.redhat.com/browse/OCPBUGS-15948 is the actual issue. It looks like it is on the openvswitch side, and for MicroShift they recently pinned the version to a working one: openshift/microshift@8ba6e57

@cfergeau
Contributor

The workaround is openshift/os#1318

@praveenkumar
Member

We don't need a workaround on the snc or crc side. MicroShift has already taken that into account, and I created 4.13.6 bundles which work as expected now. So I am closing this issue.

@gbraad
Contributor

gbraad commented Jul 28, 2023

@praveenkumar so this was fixed before, but not in the release we used?
