
Unique node selector and toleration per replica #223

Open
2 of 3 tasks
Tracked by #221
ahg-g opened this issue Sep 14, 2024 · 3 comments
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@ahg-g
Contributor

ahg-g commented Sep 14, 2024

What would you like to be added:

Allow injecting a unique nodeSelector and toleration for each LWS replica to trigger cluster autoscaler to create a dedicated placement group for each replica.

In the API, the user sets the key they would like to use, and the value will be the name of the replica (the leader pod name):

ReplicaSpecificNodeSelector:
   - compact-placement-group

The result is a nodeSelector injected as follows:
compact-placement-group: <lws-leader-name>

Similarly for tolerations:

ReplicaSpecificToleration
      - key: compact-placement-group
        effect: NoSchedule

The result is a toleration injected on the pods of a group as follows:

      - key: compact-placement-group
        operator: Equal
        value: <lws-leader-name>
        effect: NoSchedule
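Taken together, the proposal above would inject both fields onto every pod of a group. A sketch of the resulting pod spec fragment, assuming a hypothetical leader pod named leaderworkerset-sample-0 (this is the proposed behavior, not a shipped API):

```yaml
# Sketch: injected fields for one replica whose leader pod is
# "leaderworkerset-sample-0". Leader and worker pods of that group
# would all carry the same selector/toleration pair.
spec:
  nodeSelector:
    compact-placement-group: leaderworkerset-sample-0
  tolerations:
    - key: compact-placement-group
      operator: Equal
      value: leaderworkerset-sample-0
      effect: NoSchedule
```

Because the value is unique per group, a cluster autoscaler that provisions nodes matching this selector/taint pair would end up creating a dedicated set of nodes for each replica.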

Why is this needed:
To force the cluster autoscaler to create a node group per replica. This can be necessary to create compactly placed nodes (for example, on the same rack) for better network performance, and can improve multi-host GPU inference.

Completion requirements:

This enhancement requires the following artifacts:

  • Design doc
  • API change
  • Docs update

The artifacts should be linked in subsequent comments.

@ahg-g ahg-g added the kind/feature Categorizes issue or PR as related to a new feature. label Sep 14, 2024
@googs1025
Member

I'm willing to give it a try. :)
/assign

Would it be better to provide a Google Doc first?

@googs1025
Member

googs1025 commented Sep 17, 2024

Sorry, I don't quite understand how compact-placement-group is defined.
Does compact-placement-group mean the name of a leader pod or a user-defined field name?

apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  name: leaderworkerset-multi-template
spec:
  replicas: 3
  leaderWorkerTemplate:
    ReplicaSpecificNodeSelector: compact-placement-group
    ReplicaSpecificToleration:
      - key: compact-placement-group
    leaderTemplate:
      spec:
        containers:
          - name: nginx2
...

@ahg-g
Contributor Author

ahg-g commented Sep 18, 2024

compact-placement-group is a string that the user sets, and we use it as the key of a nodeSelector whose value equals the leader pod name.

So the snippet you have for the API is correct; the outcome is that we inject a nodeSelector for each group as follows:

nodeSelector:
  compact-placement-group: <leader-pod-name>
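Concretely, with the replicas: 3 manifest from the earlier snippet, each group would get a distinct value derived from its leader pod name. A sketch, assuming leader pods are named <lws-name>-<group-index> (not confirmed output):

```yaml
# Sketch: one injected nodeSelector per group, differing only in value.
# group 0:
nodeSelector:
  compact-placement-group: leaderworkerset-multi-template-0
# group 1:
nodeSelector:
  compact-placement-group: leaderworkerset-multi-template-1
# group 2:
nodeSelector:
  compact-placement-group: leaderworkerset-multi-template-2
```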
