-
Notifications
You must be signed in to change notification settings - Fork 531
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[k8s] Remove SSH jump pod for port-forward mode #3657
Conversation
…o k8s_sshjump_remove
…o k8s_sshjump_remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome @romilbhardwaj! This should improve the robustness of our Kubernetes support. The code looks mostly good to me.
Can we test the backward compatibility for existing clusters launched in master?
Thanks @Michaelvll! Ran manual backward compatibility tests by launching from master -> switching to this branch -> try ssh on master cluster -> launch new cluster -> verify ssh, Running smoke tests. |
…o k8s_sshjump_remove_v2
Ran into an issue with custom images which use a different default username than |
Fixed the custom image support by dynamically updating ProxyCommand once the ssh_user is determined and updated in the cluster handle. Running smoke tests.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating the PR @romilbhardwaj! LGTM with a minor comment.
auth_config = backend_utils.ssh_credential_from_yaml( | ||
handle.cluster_yaml, | ||
ssh_user=handle.ssh_user, | ||
docker_user=handle.docker_user) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we test this for other clouds with image_id
specified with docker:xxx
, just to make sure changing this will not affect ssh
for those?
Or, if we have passed in the ssh_user
and docker_user
here, should we remove the argument of handle.docker_user
and handle.ssh_user
in the add_cluster
function below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point - tested with pytest tests/test_smoke.py::test_job_queue_with_docker --gcp
.
Thanks! Tested:
|
* working prototype of direct-to-pod port-forwarding * lint * switch to using head as jump * removed ssh jump pod * remove sleep * update note * comments * remove vestiges * updates * remove slash * add ssh_user placeholder * fix private key * lint
Closes #3566. SSH jump pod is not required when using port-forward mode. This PR directly
kubectl port-forwards
to the head pod.Also removes the sleep in our proxycommand. This was previously required for thread-safe concurrent SSH connections when SkyPilot was using SSHCommandRunner for Kubernetes (#2628), but with #3157, SSH is no longer used. This improves SSH connection latency significantly (~2 seconds). Up to 5 concurrent SSH connection requests work fine without the sleep, which should be enough for most usage of SSH outside of SkyPilot.
Also lays the groundwork for easy switching between kubecontexts/kubeconfigs while retaining SSH functionality (requested by user).
Benchmarks
======= This branch - After removing SSH Jump pod and sleep =======
======= Master - with SSH Jump pod =======
Tested (run the relevant ones):
bash format.sh
sky launch -c test --num-nodes 2 --cloud kubernetes
, followed byssh test ls
andssh test-worker1 ls