[GCP] Fix GCP labels for TPU #3652

Merged
merged 32 commits on Jun 11, 2024
32 commits
9f4d62a
[GCP] initial take for dws support with migs
gurcangercek May 21, 2024
689c4a6
fix lint errors
gurcangercek May 21, 2024
3c8a236
dependency and format fix
Jun 4, 2024
034a2af
refactor mig instance creation
Michaelvll Jun 5, 2024
6bafabf
fix
Michaelvll Jun 5, 2024
af0cde5
remove unecessary instance creation code for mig
Michaelvll Jun 5, 2024
5c7850b
Fix deletion
Michaelvll Jun 5, 2024
91bba39
Fix instance template logic
Michaelvll Jun 5, 2024
7524186
Restart
Michaelvll Jun 5, 2024
4d29c5b
format
Michaelvll Jun 5, 2024
d839357
format
Michaelvll Jun 5, 2024
b4b8266
move to REST APIs instead of python APIs
Michaelvll Jun 5, 2024
ea6aefb
add multi-node back
Michaelvll Jun 5, 2024
504f0c6
Fix multi-node
Michaelvll Jun 6, 2024
b5484c6
Avoid spot
Michaelvll Jun 6, 2024
4ec8869
format
Michaelvll Jun 6, 2024
e300898
format
Michaelvll Jun 6, 2024
30792a2
fix scheduling
Michaelvll Jun 6, 2024
58768b2
fix cancel
Michaelvll Jun 6, 2024
78b3d2f
Add smoke test
Michaelvll Jun 6, 2024
bade730
revert some changes
Michaelvll Jun 6, 2024
00afacf
fix smoke
Michaelvll Jun 6, 2024
439fa2a
Fix
Michaelvll Jun 6, 2024
2ff8d27
fix
Michaelvll Jun 6, 2024
4c21abc
Fix smoke
Michaelvll Jun 6, 2024
0fb1809
Merge pull request #4 from skypilot-org/dws-gce-support
gurcangercek Jun 6, 2024
1d48fa6
[GCP] Changing the config name for DWS support and fix for resize req…
Michaelvll Jun 7, 2024
121ae2c
Merge branch 'master' into dws-gce-support
gurcangercek Jun 10, 2024
3b8b040
Fix labels for GCP TPU
Michaelvll Jun 11, 2024
e6d1396
format
Michaelvll Jun 11, 2024
050bd99
Merge branch 'master' of https://github.com/skypilot-org/skypilot int…
Michaelvll Jun 11, 2024
ed9073e
fix key
Michaelvll Jun 11, 2024
4 changes: 4 additions & 0 deletions sky/clouds/gcp.py
@@ -509,6 +509,10 @@ def make_deploy_resources_variables(
             ('gcp', 'managed_instance_group'), None)
         use_mig = managed_instance_group_config is not None
         resources_vars['gcp_use_managed_instance_group'] = use_mig
+        # Convert boolean to 0 or 1 in string, as GCP does not support boolean
+        # value in labels for TPU VM APIs.
+        resources_vars['gcp_use_managed_instance_group_value'] = str(
+            int(use_mig))
         if use_mig:
             resources_vars.update(managed_instance_group_config)
         return resources_vars
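
The conversion added in this hunk can be sketched in isolation. This is a minimal illustration, not the project's code: the helper name mig_label_value and the sample config dict are hypothetical, and only the str(int(...)) step mirrors the diff. It shows that the presence of a managed instance group config is turned into the string '1' or '0', which is the form the code comment says the TPU VM label APIs need.

```python
from typing import Any, Dict, Optional


def mig_label_value(mig_config: Optional[Dict[str, Any]]) -> str:
    """Return '1' if a managed instance group config is present, else '0'.

    Hypothetical helper; it mirrors the str(int(use_mig)) step in the diff.
    """
    use_mig = mig_config is not None
    # bool -> int -> str: True -> 1 -> '1', False -> 0 -> '0'.
    return str(int(use_mig))


print(mig_label_value({'run_duration': 3600}))  # prints 1
print(mig_label_value(None))                    # prints 0
```

Note that an empty dict still counts as "configured" here, because the check is `is not None` rather than truthiness, matching the `use_mig` expression in the hunk.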
2 changes: 1 addition & 1 deletion sky/templates/gcp-ray.yml.j2
@@ -80,7 +80,7 @@ available_node_types:
         {%- for label_key, label_value in labels.items() %}
         {{ label_key }}: {{ label_value|tojson }}
         {%- endfor %}
-        managed-instance-group: {{ gcp_use_managed_instance_group }}
+        use-managed-instance-group: {{ gcp_use_managed_instance_group_value|tojson }}
       {%- if gcp_use_managed_instance_group %}
       managed-instance-group:
         run_duration: {{ run_duration }}
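
To see what the label line renders to before and after this change, here is a rough stand-in that avoids a Jinja dependency. It assumes that a plain `{{ value }}` renders a Python bool through str() and that `|tojson` on a str produces a double-quoted JSON string; the function names render_old and render_new are made up for the illustration.

```python
import json


def render_old(use_mig: bool) -> str:
    # Stand-in for `{{ gcp_use_managed_instance_group }}`:
    # a Python bool is rendered via str(), giving a bare True/False.
    return f"managed-instance-group: {use_mig}"


def render_new(use_mig_value: str) -> str:
    # Stand-in for `{{ gcp_use_managed_instance_group_value|tojson }}`:
    # a str is emitted as a quoted JSON string.
    return f"use-managed-instance-group: {json.dumps(use_mig_value)}"


print(render_old(True))             # managed-instance-group: True
print(render_new(str(int(True))))   # use-managed-instance-group: "1"
```

The old line yields a bare True, which a YAML loader would read as a boolean, while the new line yields a quoted "1" that stays a string. The label key is also renamed, so it no longer shares a name with the managed-instance-group block that follows it in the template.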