[k8s] Realtime GPU availability of kubernetes cluster in `sky show-gpus` #3499
Will calling this add overhead to `list_accelerators`? We rely on `list_accelerators` to generate the candidate resources for optimization, and it is called multiple times during the failover process. It would be nice to make sure this does not add overhead. : )
That's a good point. The overhead is not much different from the previous implementation, since the previous implementation was also invoking the Kubernetes API.
That said, we should add an LRU cache with a time-to-live (TTL) so that entries expire based on time. Added a TODO.
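For reference, one way the TTL-based LRU cache mentioned in the TODO could look is sketched below. This is only an illustration, not the code in this PR: the decorator name `ttl_lru_cache`, its parameters, and the `list_gpu_availability` function are all hypothetical, and the sketch trades precision for brevity by expiring entries at fixed time-bucket boundaries, so a cached value lives at most `ttl_seconds` and possibly less.

```python
import functools
import time
from typing import Callable, Optional


def ttl_lru_cache(ttl_seconds: float,
                  maxsize: Optional[int] = 128,
                  timer: Callable[[], float] = time.monotonic):
    """Hypothetical decorator: an LRU cache whose entries expire over time.

    Time is divided into buckets of length ttl_seconds. The current bucket
    index is part of the cache key, so once the clock moves into a new
    bucket, earlier entries are no longer hit and the wrapped function is
    called again. Stale entries are evicted by the normal LRU policy.
    """

    def decorator(func):

        @functools.lru_cache(maxsize=maxsize)
        def _cached(_bucket, *args, **kwargs):
            # _bucket is unused here; it only serves as part of the key.
            return func(*args, **kwargs)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bucket = int(timer() // ttl_seconds)
            return _cached(bucket, *args, **kwargs)

        # Expose a manual reset, mirroring functools.lru_cache.
        wrapper.cache_clear = _cached.cache_clear
        return wrapper

    return decorator


# Hypothetical usage: stands in for a function that queries the
# Kubernetes API; the returned data here is made up for illustration.
@ttl_lru_cache(ttl_seconds=30)
def list_gpu_availability(context: str):
    return {'context': context, 'fetched_at': time.monotonic()}
```

With something like this wrapped around the API call, repeated `list_accelerators` invocations during failover would reuse the cached result within the same time bucket instead of querying the cluster each time. All arguments must be hashable, as with `functools.lru_cache`.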