[k8s] Add debug message for timeout #3821

Merged on Aug 9, 2024 (1 commit)

romilbhardwaj
Collaborator

Adds a debugging hint to run kubectl when get_nodes or get_pods fails. Such failures are usually caused by network issues, and running kubectl lets the user confirm that the problem is Kubernetes reachability rather than SkyPilot.

Tested (run the relevant ones):

  • Code formatting: bash format.sh
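
For readers who want to see the pattern, here is a minimal sketch of attaching a kubectl hint to a timeout error. It is not the code from this PR: `with_kubectl_hint`, `ClusterUnreachableError`, and the hint wording are hypothetical, and the built-in `TimeoutError` stands in for whatever exception the real Kubernetes client raises.

```python
from typing import Callable, List

# Hypothetical hint text; the wording added by the PR may differ.
KUBECTL_HINT = ('Check whether the cluster is reachable by running '
                '`kubectl get {resource}`.')


class ClusterUnreachableError(Exception):
    """Hypothetical error for an unreachable Kubernetes API server."""


def with_kubectl_hint(fetch: Callable[[], List[str]],
                      resource: str) -> List[str]:
    """Run `fetch`; on timeout, re-raise with a kubectl debugging hint."""
    try:
        return fetch()
    except TimeoutError as e:
        raise ClusterUnreachableError(
            f'Timed out when trying to get {resource} info: {e}. '
            + KUBECTL_HINT.format(resource=resource)) from e


def _timing_out_get_nodes() -> List[str]:
    # Stand-in for a get_nodes call that hits a network timeout.
    raise TimeoutError('connection timed out')


try:
    with_kubectl_hint(_timing_out_get_nodes, 'nodes')
except ClusterUnreachableError as err:
    message = str(err)
    print(message)
```

The hint is only appended to the error message; nothing extra is executed on the failure path, so the error is reported without further delay.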

Collaborator

@Michaelvll Michaelvll left a comment

Adding the command looks good to me. Just wondering if we can somehow print the error message directly by calling the kubernetes API in our code when the timeout happens?

@romilbhardwaj
Collaborator Author

Thanks! I think it might be better to let the user run it themselves since it may take a long time (e.g., in the event of a network call timeout, default timeout is 60s, retried 3x) and may delay our response.
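
As a rough illustration of the delay in question, the sketch below computes an upper bound on how long an in-code diagnostic call could block. It assumes "retried 3x" means three attempts in total and ignores any backoff between attempts; `worst_case_wait_seconds` is a hypothetical helper, not part of SkyPilot.

```python
def worst_case_wait_seconds(timeout_s: int, attempts: int) -> int:
    """Upper bound on blocking time if every attempt hits its full timeout."""
    return timeout_s * attempts


# Assumption: 60s default timeout, 3 attempts in total.
print(worst_case_wait_seconds(60, 3))  # 180
```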

@romilbhardwaj romilbhardwaj added this pull request to the merge queue Aug 9, 2024
Merged via the queue into master with commit 1c32aa4 Aug 9, 2024
20 checks passed
@romilbhardwaj romilbhardwaj deleted the k8s_error_msg_debug branch August 9, 2024 21:28