K8s unable to start due to issue with update server #7412

Closed
sbmw opened this issue Sep 4, 2024 · 11 comments · Fixed by #7461
Assignees: mook-as
Labels: area/kubernetes, kind/bug
Milestone: 1.16

Comments

@sbmw

sbmw commented Sep 4, 2024

Actual Behavior

Rancher Desktop displays an error during startup about an invalid response from https://update.k3s.io/v1-release/channels

K8s fails to launch and RD > Preferences > Kubernetes > Kubernetes Version dropdown is empty

Steps to Reproduce

Start Rancher Desktop with K8s enabled

Result

Error Window Messaging

Error Starting Kubernetes

FetchError: invalid json response body at https://update.k3s.io/v1-release/channels reason: Unexpected token '<', "
<!DOCTYPE "... is not valid JSON

Context:
Starting Backend

Some recent logfile lines:

2024-09-04T10:27:33.360Z: Error starting lima: FetchError: invalid json response body at https://update.k3s.io/v1-release/channels reason: Unexpected token '<', "
<!DOCTYPE "... is not valid JSON
    at /tmp/.mount_RancheDoGFx8/opt/rancher-desktop/resources/app.asar/node_modules/node-fetch/lib/index.js:273:32
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Y.updateCache (/tmp/.mount_RancheDoGFx8/opt/rancher-desktop/resources/app.asar/dist/app/background.js:2:137378)
    at async /tmp/.mount_RancheDoGFx8/opt/rancher-desktop/resources/app.asar/dist/app/background.js:2:139096 {
  type: 'invalid-json'
}

Expected Behavior

A failed update check should not result in a failure to launch.

Additional Information

My RD setup was fully operational before restarting it this morning.

I tried disabling wifi to bypass the check, but then get:

Kubernetes Error
Rancher Desktop 1.15.1 - linux (x64)

Error Starting Kubernetes
Error: No kubernetes version available.

Context:
Starting Backend

Some recent logfile lines:
time="2024-09-04T11:39:52+01:00" level=info msg="[hostagent] Not forwarding TCP [::]:22"
time="2024-09-04T11:39:52+01:00" level=info msg="[hostagent] The final requirement 1 of 1 is satisfied"
time="2024-09-04T11:39:52+01:00" level=info msg="READY. Run `limactl shell 0` to open the shell."

2024-09-04T10:39:52.897Z: Error starting lima: Error: No kubernetes version available.
    at q.getDesiredVersion (/tmp/.mount_RancheFsai2L/opt/rancher-desktop/resources/app.asar/dist/app/background.js:29:69414)
    at /tmp/.mount_RancheFsai2L/opt/rancher-desktop/resources/app.asar/dist/app/background.js:29:165783
    at async Xt.download (/tmp/.mount_RancheFsai2L/opt/rancher-desktop/resources/app.asar/dist/app/background.js:29:160698)
    at async /tmp/.mount_RancheFsai2L/opt/rancher-desktop/resources/app.asar/dist/app/background.js:29:139072

Rancher Desktop Version

1.15.1

Rancher Desktop K8s Version

1.30.4

Which container engine are you using?

moby (docker cli)

What operating system are you using?

Other Linux

Operating System / Build Version

Fedora 40 KDE

What CPU architecture are you using?

x64

Linux only: what package format did you use to install Rancher Desktop?

AppImage

Windows User Only

No response

@sbmw sbmw added the kind/bug Something isn't working label Sep 4, 2024
@sbmw
Author

sbmw commented Sep 4, 2024

FYI this is running as expected again, as https://update.k3s.io/v1-release/channels is now returning JSON (this morning it was returning an HTML page).

Feel free to close, but I've left it open for now in case it's worth allowing RD/K8s startup to continue when this check fails.

@jandubois jandubois added the area/kubernetes k8s and related, like traefik label Sep 5, 2024
@jandubois
Member

This is still happening (from a report on Slack):

2024-09-05T19:55:09.296Z: Progress: errored Resetting Kubernetes state...: FetchError: invalid json response body at https://update.k3s.io/v1-release/channels reason: Unexpected token '<', "
<!DOCTYPE "... is not valid JSON
2024-09-05T19:55:09.446Z: Capturing output: wsl.exe --list --verbose
2024-09-05T19:55:09.643Z: Registered distributions: docker-desktop-data,docker-desktop,rancher-desktop-data,rancher-desktop,Ubuntu
2024-09-05T19:55:09.643Z: Capturing output: wsl.exe --list --quiet
2024-09-05T19:55:09.829Z: Capturing output: wsl.exe --list --verbose
2024-09-05T19:55:10.032Z: Registered distributions: docker-desktop-data,docker-desktop,rancher-desktop-data,rancher-desktop,Ubuntu
2024-09-05T19:55:10.033Z: Capturing output: wsl.exe --list --quiet
2024-09-05T19:55:10.218Z: Capturing output: wsl.exe --list --verbose

@jandubois jandubois added this to the 1.16 milestone Sep 5, 2024
@VueAddict

I have the same issue as of today.

@snewell92

Time for an S3 or flat-file-in-a-GH-repo mirror that can only contain JSON? (In addition to handling that error and continuing without the info?)

@mook-as mook-as self-assigned this Sep 10, 2024
@mook-as
Contributor

mook-as commented Sep 10, 2024

Odd; that list is cached on disk and should only block startup on first run. Nevertheless, I think we can do some things to make it better:

  • If the versions list is unavailable, gracefully start anyway (with Kubernetes disabled); this means not allowing Kubernetes to be enabled in first-run.
  • Double check our versions fetching logic and make sure we don't update the list if it's not correctly downloaded.
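
The two points above could be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not Rancher Desktop's actual code (which is TypeScript): `parse_versions`, `refresh_cache`, and `kubernetes_enabled` are hypothetical names, and the cache is assumed to hold a plain JSON list of version strings purely for illustration.

```python
import json
import os
import tempfile
from typing import Callable, List, Optional


def parse_versions(body: str) -> Optional[List[str]]:
    """Return the parsed version list, or None if the body is unusable."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        # For example, an HTML error page that starts with "<!DOCTYPE".
        return None
    # Hypothetical format: a non-empty JSON list of version strings.
    if isinstance(data, list) and data:
        return data
    return None


def refresh_cache(cache_path: str, download: Callable[[], str]) -> Optional[List[str]]:
    """Refresh the on-disk list, keeping the old file if the download is bad."""
    try:
        fresh = parse_versions(download())
    except OSError:
        fresh = None  # network failure: treat like an invalid download

    if fresh is not None:
        # Write to a temporary file and rename it into place, so a failed
        # write never leaves a partial cache behind.
        directory = os.path.dirname(cache_path) or "."
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as handle:
            json.dump(fresh, handle)
        os.replace(tmp_path, cache_path)
        return fresh

    # The download was unusable: fall back to whatever is already cached.
    try:
        with open(cache_path) as handle:
            return parse_versions(handle.read())
    except OSError:
        return None  # no cache either (first run, or the cache was purged)


def kubernetes_enabled(requested: bool, versions: Optional[List[str]]) -> bool:
    """Start anyway, but with Kubernetes disabled, when no versions are known."""
    return requested and bool(versions)
```

With this shape, a bad response never overwrites a good cache, and a missing list turns into "start without Kubernetes" instead of a startup error.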

@jandubois
Member

@sbmw Did you do a factory reset just before this error occurred? We've been trying to reproduce this failure mode, and the only code path where this seems possible is when the settings don't yet contain a Kubernetes version, and that setting is filled in during the first start, even if you disable Kubernetes.

@sbmw
Author

sbmw commented Sep 12, 2024

@jandubois Sorry, no factory reset, just a laptop restart. Where is the list cached? Perhaps I or some other cleanup process cleared it?

@jandubois
Member

@sbmw It should be in ~/.cache/rancher-desktop/k3s-versions.json.

We have merged a "fix" that should trap this error going forward, but I'm still concerned that I've not been able to reproduce it myself.

The only way I can see this happening is when your settings don't yet include a value for kubernetes.version. But once the version list has been initialized once, that field should be set.

Furthermore, even when update.k3s.io is not working correctly, we still fall back to fetching the list of all versions from the GitHub release (we just won't have any stable or latest labels without the channel file). So we should always have a "default" Kubernetes version, and in my tests with 1.15.1 I could never get it to show an error dialog, even when the mocked-up channel server was returning an error page.
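
The fallback described here can be illustrated with a short Python sketch. It is only an approximation for discussion, not the project's real implementation: the function names are hypothetical, the channel payload is assumed to be a flat `{"label": "version"}` mapping (the real channel response may be shaped differently), and the release list is assumed to be ordered newest first.

```python
import json
from typing import Callable, Dict, List, Optional


def load_channel_labels(fetch_channels: Callable[[], str]) -> Dict[str, str]:
    """Map version -> label; return {} when the channel server misbehaves."""
    try:
        data = json.loads(fetch_channels())
    except (OSError, json.JSONDecodeError):
        return {}  # e.g. an HTML error page instead of JSON
    if not isinstance(data, dict):
        return {}
    # Assumed payload shape (hypothetical): {"stable": "v1.30.4", ...}
    return {version: label for label, version in data.items()
            if isinstance(version, str)}


def build_version_list(release_versions: List[str],
                       fetch_channels: Callable[[], str]) -> List[dict]:
    """Combine the release list with channel labels, if any are available."""
    labels = load_channel_labels(fetch_channels)
    return [{"version": v, "label": labels.get(v)} for v in release_versions]


def default_version(entries: List[dict]) -> Optional[str]:
    """Prefer the version labelled stable, else the first (newest) release."""
    for entry in entries:
        if entry["label"] == "stable":
            return entry["version"]
    return entries[0]["version"] if entries else None
```

As long as the release list itself is available, `default_version` still returns something when the channel file is broken; only the labels are lost.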

Anyways, hoping this is fixed, but somewhat unsatisfying that I cannot confirm it because I cannot repro it.

@sbmw
Author

sbmw commented Sep 13, 2024

@jandubois

Oh no... I didn't do a factory reset, but BleachBit ran when I restarted my laptop. (Use of a system clean-up app is an old habit that I need to drop.)

I just confirmed that this does purge ~/.cache/rancher-desktop/k3s-versions.json (but not ~/.config/rancher-desktop/settings.json nor the value for kubernetes.version). I expect this explains the unexpected outcome when the update server wasn't returning json for a brief time.

Please accept my apologies for any wasted time and I hope that at least some good has come from it.

Thank you all for an excellent product and support!

@jandubois
Member

> I just confirmed that this does purge ~/.cache/rancher-desktop/k3s-versions.json (but not ~/.config/rancher-desktop/settings.json nor the value for kubernetes.version). I expect this explains the unexpected outcome when the update server wasn't returning json for a brief time.

Thank you, it does indeed explain it, and now I can reproduce the issue. Unfortunately it seems our current "fix" for this just trades this issue for a different failure. I will create a new GitHub issue for it, but it will probably not be fixed for the next release, as it is quite an edge case.
