[Provisioner] Patch when head node not found for launching on existing cluster #4087

Open
cblmemo opened this issue Oct 15, 2024 · 0 comments

cblmemo commented Oct 15, 2024

In RunPod and Lambda (#3865), we directly raise an error when we cannot find the head node of a cluster:

    if head_instance_id is None:
        raise RuntimeError(
            f'Cluster {cluster_name_on_cloud} has no head node.')

This can happen when the user manually changes an instance's name in the cloud console.

Instead of erroring out, we can patch one of the nodes to be the head node. However, because the head node has a different runtime installed than the other nodes, we need to identify the original head node rather than pick an arbitrary one.
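A rough sketch of what the lookup could look like, to make the idea concrete. Everything here is hypothetical and not taken from the existing provisioner code: `find_original_head`, the `instances` mapping (instance ID to current name on the cloud), and the `has_head_runtime` probe (for example, an SSH check for something only the head node has installed) are all assumptions for illustration.

```python
from typing import Callable, Dict, Optional


def find_original_head(
    instances: Dict[str, str],
    expected_head_name: str,
    has_head_runtime: Callable[[str], bool],
) -> Optional[str]:
    """Return the ID of the original head node, or None if undetermined.

    instances: hypothetical mapping of instance ID -> current name.
    has_head_runtime: hypothetical probe that returns True if the
        head-only runtime is installed on the given instance.
    """
    # Fast path: an instance still carries the expected head name.
    for instance_id, name in instances.items():
        if name == expected_head_name:
            return instance_id

    # Fallback: the name was changed, so look for the instance that
    # has the head runtime installed.
    candidates = [i for i in instances if has_head_runtime(i)]

    # Only patch when exactly one instance qualifies; otherwise the
    # caller should keep raising the "no head node" error.
    if len(candidates) == 1:
        return candidates[0]
    return None
```

If this returns an ID, the caller could rename that instance back to the head name and continue. If it returns `None`, falling back to the current `RuntimeError` seems safer than guessing.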

@Michaelvll Michaelvll added the OSS label Dec 19, 2024 — with Linear
@Michaelvll Michaelvll removed the OSS label Dec 19, 2024