
Running kind on rancher desktop fails with: failed to create cluster: failed to init node with kubeadm #3505

Closed
marcindulak opened this issue Feb 4, 2024 · 5 comments
Labels
kind/support Categorizes issue or PR as a support question.

Comments

@marcindulak
Contributor

Posting this as a question, since I don't know if this scenario is supported.

Dockerfile

FROM docker.io/docker:25 AS docker

FROM docker.io/debian:stable

RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends ca-certificates curl

WORKDIR /kind/bin

COPY --from=docker /usr/local/bin/docker .
RUN chmod u+x docker

RUN curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.21.0/kind-$(uname)-amd64 && chmod u+x kind

RUN curl -LO https://dl.k8s.io/release/v1.29.1/bin/linux/amd64/kubectl && chmod u+x kubectl
RUN curl -LO https://dl.k8s.io/release/v1.29.1/bin/linux/amd64/kubectl.sha256
RUN echo "$(cat kubectl.sha256)  kubectl" | sha256sum --check

ENV PATH=/kind/bin:$PATH
ENV DOCKER_DEFAULT_PLATFORM=linux/amd64
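The checksum verification pattern used for kubectl above (store a known digest next to a file, then feed a "digest  filename" line to sha256sum --check) can be sketched without network access; /tmp/demo-kubectl below is a hypothetical stand-in for the real binary:

```shell
# Hypothetical stand-in for the downloaded kubectl binary
printf 'example\n' > /tmp/demo-kubectl

# Record its digest, as the kubectl.sha256 file does in the Dockerfile
sha256sum /tmp/demo-kubectl | awk '{print $1}' > /tmp/demo-kubectl.sha256

# Verify: two spaces between digest and filename is the conventional
# checksum-line format that sha256sum --check parses
echo "$(cat /tmp/demo-kubectl.sha256)  /tmp/demo-kubectl" | sha256sum --check
```

On success sha256sum prints "/tmp/demo-kubectl: OK" and exits 0, so the RUN step fails the image build if the downloaded binary does not match its published digest.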

compose.yaml

services:
  kind:
    build:
      context: .
      dockerfile: ./Dockerfile
    command: sh -c "sleep infinity"
    privileged: true
    volumes:
       - /var/run/docker.sock:/var/run/docker.sock

First, hello-world seems to work inside DooD (https://blog.teracy.com/2017/09/11/how-to-use-docker-in-docker-dind-and-docker-outside-of-docker-dood-for-local-ci-testing/)

export DOCKER_DEFAULT_PLATFORM=linux/amd64
docker compose up -d
docker compose exec kind bash -c "docker run hello-world"
docker compose exec kind bash -c "kind version"
kind v0.21.0 go1.20.13 linux/amd64

but creation of a kind cluster fails

docker compose exec kind bash -c "kind delete cluster || true"
docker compose exec kind bash -c "kind create cluster -v 3"

Output

Creating cluster "kind" ...
DEBUG: docker/images.go:58] Image: kindest/node:v1.29.1@sha256:a0cc28af37cf39b019e2b448c54d1a3f789de32536cb5a5db61a49623e527144 present locally
 ✓ Ensuring node image (kindest/node:v1.29.1) 🖼 
 ✓ Preparing nodes 📦  
DEBUG: config/config.go:96] Using the following kubeadm config for node kind-control-plane:
apiServer:
  certSANs:
  - localhost
  - 127.0.0.1
  extraArgs:
    runtime-config: ""
apiVersion: kubeadm.k8s.io/v1beta3
clusterName: kind
controlPlaneEndpoint: kind-control-plane:6443
controllerManager:
  extraArgs:
    enable-hostpath-provisioner: "true"
kind: ClusterConfiguration
kubernetesVersion: v1.29.1
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/16
scheduler:
  extraArgs: null
---
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- token: abcdef.0123456789abcdef
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.20.0.2
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  kubeletExtraArgs:
    node-ip: 172.20.0.2
    node-labels: ""
    provider-id: kind://docker/kind/kind-control-plane
---
apiVersion: kubeadm.k8s.io/v1beta3
controlPlane:
  localAPIEndpoint:
    advertiseAddress: 172.20.0.2
    bindPort: 6443
discovery:
  bootstrapToken:
    apiServerEndpoint: kind-control-plane:6443
    token: abcdef.0123456789abcdef
    unsafeSkipCAVerification: true
kind: JoinConfiguration
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  kubeletExtraArgs:
    node-ip: 172.20.0.2
    node-labels: ""
    provider-id: kind://docker/kind/kind-control-plane
---
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: systemd
cgroupRoot: /kubelet
evictionHard:
  imagefs.available: 0%
  nodefs.available: 0%
  nodefs.inodesFree: 0%
failSwapOn: false
imageGCHighThresholdPercent: 100
kind: KubeletConfiguration
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
conntrack:
  maxPerCore: 0
iptables:
  minSyncPeriod: 1s
kind: KubeProxyConfiguration
mode: iptables
 ✓ Writing configuration 📜 
DEBUG: kubeadminit/init.go:82] I0204 20:16:37.384713     245 initconfiguration.go:260] loading configuration from "/kind/kubeadm.conf"
W0204 20:16:37.404783     245 initconfiguration.go:341] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta3, Kind=JoinConfiguration
 ✗ Starting control-plane 🕹️
Deleted nodes: ["kind-control-plane"]
ERROR: failed to create cluster: failed to init node with kubeadm: command "docker exec --privileged kind-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 137
Command Output: I0204 20:16:37.384713     245 initconfiguration.go:260] loading configuration from "/kind/kubeadm.conf"
W0204 20:16:37.404783     245 initconfiguration.go:341] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta3, Kind=JoinConfiguration
Stack Trace: 
sigs.k8s.io/kind/pkg/errors.WithStack
	sigs.k8s.io/kind/pkg/errors/errors.go:59
sigs.k8s.io/kind/pkg/exec.(*LocalCmd).Run
	sigs.k8s.io/kind/pkg/exec/local.go:124
sigs.k8s.io/kind/pkg/cluster/internal/providers/docker.(*nodeCmd).Run
	sigs.k8s.io/kind/pkg/cluster/internal/providers/docker/node.go:146
sigs.k8s.io/kind/pkg/exec.CombinedOutputLines
	sigs.k8s.io/kind/pkg/exec/helpers.go:67
sigs.k8s.io/kind/pkg/cluster/internal/create/actions/kubeadminit.(*action).Execute
	sigs.k8s.io/kind/pkg/cluster/internal/create/actions/kubeadminit/init.go:81
sigs.k8s.io/kind/pkg/cluster/internal/create.Cluster
	sigs.k8s.io/kind/pkg/cluster/internal/create/create.go:135
sigs.k8s.io/kind/pkg/cluster.(*Provider).Create
	sigs.k8s.io/kind/pkg/cluster/provider.go:181
sigs.k8s.io/kind/pkg/cmd/kind/create/cluster.runE
	sigs.k8s.io/kind/pkg/cmd/kind/create/cluster/createcluster.go:110
sigs.k8s.io/kind/pkg/cmd/kind/create/cluster.NewCommand.func1
	sigs.k8s.io/kind/pkg/cmd/kind/create/cluster/createcluster.go:54
github.com/spf13/cobra.(*Command).execute
	github.com/spf13/[email protected]/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
	github.com/spf13/[email protected]/command.go:974
github.com/spf13/cobra.(*Command).Execute
	github.com/spf13/[email protected]/command.go:902
sigs.k8s.io/kind/cmd/kind/app.Run
	sigs.k8s.io/kind/cmd/kind/app/main.go:53
sigs.k8s.io/kind/cmd/kind/app.Main
	sigs.k8s.io/kind/cmd/kind/app/main.go:35
main.main
	sigs.k8s.io/kind/main.go:25
runtime.main
	runtime/proc.go:250
runtime.goexit
	runtime/asm_amd64.s:1598
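One detail worth noting in the error above: exit status 137 is 128 + 9, i.e. the kubeadm init process received SIGKILL — typically the kernel OOM killer or a cgroup limit, rather than a kubeadm configuration error. A minimal demonstration of that exit-status convention:

```shell
# A subshell that kills itself with SIGKILL; the parent observes 128 + 9 = 137,
# the same status reported for the kubeadm init command in the kind output.
sh -c 'kill -KILL $$'
echo "exit status: $?"
# prints: exit status: 137
```

To inspect such a failure, kind can keep the failed node container around with "kind create cluster --retain", after which "kind export logs" collects the node's logs (including the kubelet journal) for inspection.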

I'm running this on an arm64 Mac M1, with rancher desktop https://github.com/rancher-sandbox/rancher-desktop/releases/tag/v1.12.2. Could the problem be related to #3277?

This is the information on the host, outside of DooD

docker info
Client:
 Version:    24.0.7-rd
 Context:    rancher-desktop
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.12.0
    Path:     /Users/test/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.23.3
    Path:     /Users/test/.docker/cli-plugins/docker-compose

Server:
 Containers: 3
  Running: 1
  Paused: 0
  Stopped: 2
 Images: 8
 Server Version: 23.0.6
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 0cae528dd6cb557f7201036e9f43420650207b58
 runc version: 860f061b76bb4fc671f0f9e900f7d80ff93d4eb7
 init version: 
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 6.1.64-0-virt
 Operating System: Alpine Linux v3.18
 OSType: linux
 Architecture: aarch64
 CPUs: 4
 Total Memory: 7.749GiB
 Name: lima-rancher-desktop
 ID: XXX
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

This is inside of DooD

docker compose exec kind bash -c "docker info"
Client:
 Version:    25.0.2
 Context:    default
 Debug Mode: false

Server:
 Containers: 3
  Running: 1
  Paused: 0
  Stopped: 2
 Images: 8
 Server Version: 23.0.6
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 0cae528dd6cb557f7201036e9f43420650207b58
 runc version: 860f061b76bb4fc671f0f9e900f7d80ff93d4eb7
 init version: 
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 6.1.64-0-virt
 Operating System: Alpine Linux v3.18
 OSType: linux
 Architecture: aarch64
 CPUs: 4
 Total Memory: 7.749GiB
 Name: lima-rancher-desktop
 ID: XXX
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

I also tried a custom kind config, as in #1281 (comment), without improvement.

@marcindulak marcindulak added the kind/support Categorizes issue or PR as a support question. label Feb 4, 2024
@marcindulak
Contributor Author

marcindulak commented Feb 5, 2024

I'm getting this error even when running kind create cluster directly on the macOS M1 (arm64) host, with rancher desktop https://github.com/rancher-sandbox/rancher-desktop/releases/tag/v1.12.2 and https://github.com/kubernetes-sigs/kind/releases/download/v0.21.0/kind-darwin-arm64

Here are the rancher desktop settings

(Two screenshots of the Rancher Desktop settings, taken 2024-02-05 at 19:11:10 and 19:10:57)

@marcindulak marcindulak changed the title Running kind in DooD fails with: failed to create cluster: failed to init node with kubeadm Running kind on rancher desktop fails with: failed to create cluster: failed to init node with kubeadm Feb 5, 2024
@BenTheElder
Member

AFAIK rancher desktop was recently fixed (see #3277, which is pinned in the issue tracker)

We really don't recommend adding another Docker-in-Docker layer; it causes a lot of headaches and is generally unnecessary, since kind is available as a static Go binary that otherwise just needs access to Docker.

@marcindulak
Contributor Author

I don't see any mention of cgroups in my error output.
Assuming it's the same issue, #3277 (comment) reads

wait until a version of Rancher Desktop with Alpine 3.19 is out for verification. That is probably not going to happen until early March though

I would say it's a bit misleading that the above issue is closed, if the problem persists.

I switched to https://github.com/kubernetes-sigs/kind/releases/tag/v0.19.0, and kind runs without errors for now.
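The pin to v0.19.0 above could be reflected in the Dockerfile by parameterizing the download URL; KIND_VERSION here is a hypothetical variable (the URL scheme matches the kind.sigs.k8s.io download link used earlier in the Dockerfile):

```shell
# Pin the kind version that worked and construct the matching download URL;
# $(uname) yields e.g. "Linux" inside the Debian-based image.
KIND_VERSION=v0.19.0
URL="https://kind.sigs.k8s.io/dl/${KIND_VERSION}/kind-$(uname)-amd64"
echo "$URL"
```

Keeping the version in one variable makes it easy to roll forward again once a fixed Rancher Desktop release is out.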

@BenTheElder
Member

I would say it's a bit misleading that the above issue is closed, if the problem persists.

It's still pinned, and the broken behavior is in the rancher desktop distro.
