[K8s] Zero config networking for Kubernetes #2500
Merged
Commits
211 commits
0431f96
Working Ray K8s node provider based on SSH
romilbhardwaj 5f715e8
Merge branch 'master' into k8s_cloud
romilbhardwaj 197acea
wip
romilbhardwaj f06b22d
working provisioning with SkyPilot and ssh config
romilbhardwaj cf1ddec
working provisioning with SkyPilot and ssh config
romilbhardwaj 0937cc3
Merge branch 'master' into k8s_cloud
romilbhardwaj 40aad6d
Updates to master
romilbhardwaj 47d0953
ray2.3
romilbhardwaj 9f59467
Clean up docs
romilbhardwaj 07f9bcb
multiarch build
romilbhardwaj bd12014
hacking around ray start
romilbhardwaj 4baf0b6
more port fixes
romilbhardwaj b08eb1b
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_cloud
romilbhardwaj 7ed02eb
fix up default instance selection
romilbhardwaj 898a851
fix resource selection
romilbhardwaj fcb51d1
Add provisioning timeout by checking if pods are ready
romilbhardwaj 13eb198
Working mounting
romilbhardwaj 428f143
Remove catalog
romilbhardwaj ebf9d83
fixes
romilbhardwaj da570fc
fixes
romilbhardwaj 1bea866
Fix ssh-key auth to create unique secrets
romilbhardwaj 9def756
Fix for ContainerCreating timeout
romilbhardwaj 8f9cafe
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_cloud
romilbhardwaj 65366eb
Fix head node ssh port caching
romilbhardwaj b984ead
mypy
romilbhardwaj 3bca8a9
lint
romilbhardwaj 61df297
fix ports
romilbhardwaj 036eaf9
typo
romilbhardwaj 95e160c
cleanup
romilbhardwaj 301a914
cleanup
romilbhardwaj 2c88daf
wip
romilbhardwaj 7ece7f7
Update setup
romilbhardwaj cc85f94
readme updates
romilbhardwaj 0450cee
lint
romilbhardwaj f3f0578
Fix failover
romilbhardwaj 574a9c6
Fix failover
romilbhardwaj 0632b48
optimize setup
romilbhardwaj 05508d3
Fix sync down logs for k8s
romilbhardwaj fb36a40
test wip
romilbhardwaj 7db4027
instance name parsing wip
romilbhardwaj 632ed30
Fix instance name parsing
romilbhardwaj d7bd766
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_cloud
romilbhardwaj 1a444d1
Merge fixes for query_status
romilbhardwaj da9cba2
[k8s_cloud] Delete k8s service resources. (#2105)
aviweit 81871ac
Status refresh WIP
romilbhardwaj 0d1c4ac
refactor to kubernetes adaptor
romilbhardwaj 8017020
tests wip
romilbhardwaj 5d7f8e8
clean up auth
romilbhardwaj aa787f8
wip tests
romilbhardwaj c026559
cli
romilbhardwaj 3dc80d2
cli
romilbhardwaj 63ce29b
sky local up/down cli
romilbhardwaj f9d5b73
cli
romilbhardwaj b81647a
lint
romilbhardwaj 050cfc2
lint
romilbhardwaj d64c394
lint
romilbhardwaj 7367b4a
Speed up kind cluster creation
romilbhardwaj 756c56c
tests
romilbhardwaj d4c0990
lint
romilbhardwaj b64dd19
tests
romilbhardwaj 10333d7
handling for non-reachable clusters
romilbhardwaj b07fc58
Invalid kubeconfig handling
romilbhardwaj 5af58aa
Timeout for sky check
romilbhardwaj 4d6710f
code cleanup
romilbhardwaj c057c88
lint
romilbhardwaj b8e414e
Do not raise error if GPUs requested, return empty list
romilbhardwaj c2ebfe7
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_cloud
romilbhardwaj 1fc857b
Address comments
romilbhardwaj 0ae92eb
comments
romilbhardwaj 10f302f
lint
romilbhardwaj 2a4caac
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_cloud
romilbhardwaj 54b2b28
Remove public key upload
romilbhardwaj fc362b7
GPU support init
romilbhardwaj 36f9ebc
wip
romilbhardwaj 5ee821d
add shebang
romilbhardwaj d6ca85a
comments
romilbhardwaj fbae4bf
change permissions
romilbhardwaj 6e9e6ba
remove chmod
romilbhardwaj 7fa9d7e
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_cloud
romilbhardwaj a3f827e
merge 2241
romilbhardwaj 9687ea8
add todo
romilbhardwaj 4b54555
Handle kube config management for sky local commands (#2253)
hemildesai f73f1b2
Switch context in create_cluster if cluster already exists.
romilbhardwaj 0c45b9a
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_cloud
romilbhardwaj a69df01
fix typo
romilbhardwaj ff1d832
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_cloud
romilbhardwaj 6a931e2
update sky check error msg after sky local down
romilbhardwaj 662e4b9
lint
romilbhardwaj 4046749
update timeout check
romilbhardwaj 92d588d
fix import error
romilbhardwaj 9ff1662
Fix kube API access from within cluster (load_incluster_auth)
romilbhardwaj 364b03f
lint
romilbhardwaj 691f6b7
lint
romilbhardwaj ed0741f
working autodown and sky status -r
romilbhardwaj 3fe9bfb
lint
romilbhardwaj b98ced3
add test_kubernetes_autodown
romilbhardwaj 07ea97d
lint
romilbhardwaj 73ee737
address comments
romilbhardwaj 7726850
address comments
romilbhardwaj 2ee4833
lint
romilbhardwaj 9e0f5b6
deletion timeouts wip
romilbhardwaj b36fba4
[k8s_cloud] Ray pod not created under current context namespace. (#2302)
aviweit c137360
Merge branch 'k8s_cloud' of github.com:skypilot-org/skypilot into k8s…
romilbhardwaj a806b39
head ssh port namespace fix
romilbhardwaj a9b9636
[k8s-cloud] Typo in sky local --help. (#2308)
aviweit 7903339
[k8s-cloud] Set build_image.sh to be executable. (#2307)
aviweit 4ab5329
remove ingress
romilbhardwaj 4b49241
remove debug statements
romilbhardwaj 83aecd3
UX and readme updates
romilbhardwaj bdeb7d5
lint
romilbhardwaj 993f736
Merge branch 'k8s_cloud' of github.com:skypilot-org/skypilot into k8s…
romilbhardwaj 4fb1d94
fix logging for 409 retry
romilbhardwaj 02e3415
lint
romilbhardwaj c1b7438
lint
romilbhardwaj b9701ca
Merge branch 'k8s_cloud' of github.com:skypilot-org/skypilot into k8s…
romilbhardwaj 4289462
Debug dockerfile
romilbhardwaj 3d770bd
wip
romilbhardwaj 25f84b1
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_cl…
romilbhardwaj 2875ff9
Fix GPU image
romilbhardwaj 1202c34
Query cloud specific env vars in task setup (#2347)
hemildesai d8e5bd2
Merge branch 'k8s_cloud_beta1' of github.com:skypilot-org/skypilot in…
romilbhardwaj d1a6ef4
working GPU type selection for GKE and EKS. GFD needs work.
romilbhardwaj b3fcadc
TODO for auto-detection
romilbhardwaj 4a7d5d7
Add image toggling for CPU/GPU
romilbhardwaj 85ee1e1
Add image toggling for CPU/GPU
romilbhardwaj d95438b
Fix none acce_type
romilbhardwaj 607ad85
remove memory from j2
romilbhardwaj 6f702da
Make resnet examples run again
romilbhardwaj 738ae19
lint
romilbhardwaj 9cdbf86
Merge branch 'example_resnet_cudnn' of github.com:skypilot-org/skypil…
romilbhardwaj c3420a8
v100 readme
romilbhardwaj c87c64d
dockerfile and smoketest
romilbhardwaj 85f2b9e
fractional cpu and mem
romilbhardwaj 509fd96
nits
romilbhardwaj 22b1d17
refactor utils
romilbhardwaj 552481c
lint and cleanup
romilbhardwaj 33b29b8
lint and cleanup
romilbhardwaj e65d3c1
lint and cleanup
romilbhardwaj 82327fb
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_cl…
romilbhardwaj 22fc6ad
lint and cleanup
romilbhardwaj 3e9656a
lint and cleanup
romilbhardwaj be3d905
lint and cleanup
romilbhardwaj 277295a
lint
romilbhardwaj 69168dd
lint
romilbhardwaj cfe2502
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_cl…
romilbhardwaj 3004951
manual lint
romilbhardwaj b76b3a6
manual isort
romilbhardwaj 7207c34
test readme update
romilbhardwaj 56ac60f
Remove EKS
romilbhardwaj d988307
lint
romilbhardwaj a208d91
add gpu labeler
romilbhardwaj c857a9d
updates
romilbhardwaj ee89f65
lint
romilbhardwaj 8934b22
update script
romilbhardwaj 9b5019b
ux
romilbhardwaj 53e5d80
fix formatter
romilbhardwaj f806aed
test update
romilbhardwaj 4bf43ee
test update
romilbhardwaj 429eed4
fix test_optimizer_dryruns
romilbhardwaj 8dd1a76
docs
romilbhardwaj df10bc6
cleanup
romilbhardwaj 512d9fb
test readme update
romilbhardwaj 858eb51
lint
romilbhardwaj 96647bf
lint
romilbhardwaj fdff1a6
[k8s_cloud_beta1] Add sshjump host support. (#2369)
aviweit 1b8385c
Merge branch 'k8s_cloud_beta1' of github.com:skypilot-org/skypilot in…
romilbhardwaj 3ef135a
Update build image
romilbhardwaj 7b638cc
fix image path
romilbhardwaj 7da33e4
fix merge
romilbhardwaj 5d4d27c
cleanup
romilbhardwaj e9d0ed1
lint
romilbhardwaj f21b50a
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_cl…
romilbhardwaj f736236
fix utils ref
romilbhardwaj 7b5d0b5
typo
romilbhardwaj 8a3d5a7
refactor pod creation
romilbhardwaj 58b8126
lint
romilbhardwaj f9b401e
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_cl…
romilbhardwaj 950de00
merge fixes
romilbhardwaj 292a350
portfix
romilbhardwaj c17f854
merge fixes
romilbhardwaj 330c3b4
[k8s_cloud_beta1] Sky down for a cluster deployed in Kubernetes to po…
aviweit d760676
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_cl…
romilbhardwaj f2ea761
cleanup
romilbhardwaj 9d34ff7
Add networking benchmarks
romilbhardwaj 5b5aacd
comment
romilbhardwaj aae4676
comment
romilbhardwaj 2eedca6
lint
romilbhardwaj b07748d
autodown fixes
romilbhardwaj e379291
lint
romilbhardwaj 482a69d
fix label
romilbhardwaj fb09398
[k8s_cloud_beta1] Adding support for ssh using kubectl port-forward t…
landscapepainter a721f83
refactor
romilbhardwaj 9a1cdbe
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_ze…
romilbhardwaj c620f94
fix
romilbhardwaj 94bf1a9
updates
romilbhardwaj 48d53a5
lint
romilbhardwaj 08fd88d
Update sky/skylet/providers/kubernetes/node_provider.py
landscapepainter 693af6d
fix test
romilbhardwaj 582b484
Merge remote-tracking branch 'origin/k8s_zeroconf_networking' into k8…
romilbhardwaj 33439e3
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_ze…
romilbhardwaj d214495
[k8s] Showing reasons for provisioning failure in K8s (#2422)
landscapepainter 4e8b678
cleanup
romilbhardwaj 21cee8b
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_ze…
romilbhardwaj d8302f0
lint
romilbhardwaj c7e8429
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_ze…
romilbhardwaj fd2976a
fix for ssh jump image_id
romilbhardwaj 9827bbb
comments
romilbhardwaj f74c9df
ssh jump refactor
romilbhardwaj 657cd6f
lint
romilbhardwaj 9c4e338
image build fixes
romilbhardwaj add29dd
Merge branch 'master' of github.com:skypilot-org/skypilot into k8s_ze…
romilbhardwaj
Does removing this mean the `NodePort` mode will not work?
No, NodePort would still work. Everything now goes through an SSH jump pod, so the SSH port stays fixed at 22 and we don't need to get the port here. Note that the jump port is dynamic and is fetched in `kubernetes_utils.get_ssh_proxy_command` at provisioning time.
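
To illustrate the reply above, here is a minimal sketch of how an SSH connection can be routed through a jump host so that the target's SSH port stays at 22 while only the jump port varies. This is not the code in `kubernetes_utils.get_ssh_proxy_command`; the function names, parameters, and addresses below are hypothetical, and the only external behavior relied on is OpenSSH's `ProxyCommand` option and its `-W` flag.

```python
import shlex
from typing import List


def build_ssh_proxy_command(jump_host: str, jump_port: int, jump_user: str,
                            private_key_path: str) -> str:
    """Build an OpenSSH ProxyCommand string that tunnels via a jump host.

    Hypothetical helper: jump_port stands in for the dynamic port
    (for example, a NodePort) that is looked up at provisioning time.
    """
    # -W %h:%p asks the jump host to forward stdio to the final target;
    # OpenSSH substitutes %h and %p with the target host and port.
    return (f'ssh -i {shlex.quote(private_key_path)} '
            f'-p {jump_port} -W %h:%p '
            f'{shlex.quote(jump_user)}@{shlex.quote(jump_host)}')


def build_ssh_command(target_ip: str, user: str,
                      proxy_command: str) -> List[str]:
    """Build the ssh argv for the target pod.

    The target port is always 22 because traffic reaches the pod
    through the jump host rather than through a per-pod exposed port.
    """
    return [
        'ssh',
        '-o', f'ProxyCommand={proxy_command}',
        '-p', '22',
        f'{user}@{target_ip}',
    ]
```

With this shape, only the `jump_port` argument changes from cluster to cluster, which is why the caller no longer needs to look up a per-pod SSH port.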