[YUNIKORN-2504] Support canonical labels for queue/applicationId in scheduler #860

Closed

@chenyulin0719 chenyulin0719 commented Jun 21, 2024

What is this PR for?

Support canonical queue and application ID labels on pods, and allow them to coexist with the existing metadata.

  • yunikorn.apache.org/app-id (New, Canonical Label)
  • yunikorn.apache.org/queue (New, Canonical Label)

Check metadata consistency before moving the task state from 'New' to 'Pending'. The pod metadata check runs in task.sanityCheckBeforeScheduling().

  • If the sanity check fails due to inconsistent metadata, move the task from 'New' to the 'Rejected' state and fail the pod with a reason. (See the screenshots below.)

What type of PR is it?

  • Feature

Todos

The admission controller should also fail the pod request if the metadata is inconsistent. I will create another Jira for that once this PR is approved.

What is the Jira issue?

https://issues.apache.org/jira/browse/YUNIKORN-2504

How should this be tested?

Run the e2e tests:

  • (basic_scheduling) Verify_Pod_With_Conflicting_AppId
  • (basic_scheduling) Verify_Pod_With_Conflicting_QueueName
  • (recovery_and_restart) Verify_Pod_Restart_After_Add_Conflict_Metadata

Then apply the simple sleep pods below:


apiVersion: v1
kind: Pod
metadata:
  labels:
    app: sleep
    yunikorn.apache.org/app-id: "application-sleep-0001"
    yunikorn.apache.org/queue: "root.sandbox"
  annotations:
    yunikorn.apache.org/queue: "root.sandbox-another"
  name: pod-with-inconsistent-queue
spec:
  schedulerName: yunikorn
  restartPolicy: Never
  containers:
    - name: sleep-6000s
      image: "alpine:latest"
      command: ["sleep", "6000"]
      resources:
        requests:
          cpu: "100m"
          memory: "500M"
---
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: sleep
    yunikorn.apache.org/app-id: "application-sleep-0002"
  annotations:
    yunikorn.apache.org/app-id: "application-sleep-0002-another"
  name: pod-with-inconsistent-app-id
spec:
  schedulerName: yunikorn
  restartPolicy: Never
  containers:
    - name: sleep-6000s
      image: "alpine:latest"
      command: ["sleep", "6000"]
      resources:
        requests:
          cpu: "100m"
          memory: "500M"
---
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: sleep
    yunikorn.apache.org/app-id: "application-sleep-0003"
    yunikorn.apache.org/queue: "root.sandbox"
  name: pod-with-correct-labels
spec:
  schedulerName: yunikorn
  restartPolicy: Never
  containers:
    - name: sleep-6000s
      image: "alpine:latest"
      command: ["sleep", "6000"]
      resources:
        requests:
          cpu: "100m"
          memory: "500M"
---
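
The conflict in the first two manifests can be sketched as a small check. This is only an illustration, not the shim's actual implementation: checkConsistency is a hypothetical helper, and it only compares the canonical label against the annotation with the same key, which is the case the sample pods exercise.

```go
package main

import "fmt"

// Metadata keys taken from the PR description. In the sample pods the same
// key appears both as a label and as an annotation.
const (
	appIDKey = "yunikorn.apache.org/app-id"
	queueKey = "yunikorn.apache.org/queue"
)

// checkConsistency is a hypothetical helper (not the shim's real function).
// It returns an error when a key is set in both maps with different values.
func checkConsistency(labels, annotations map[string]string) error {
	for _, key := range []string{appIDKey, queueKey} {
		labelValue, inLabels := labels[key]
		annotationValue, inAnnotations := annotations[key]
		if inLabels && inAnnotations && labelValue != annotationValue {
			return fmt.Errorf("inconsistent %s: label %q, annotation %q",
				key, labelValue, annotationValue)
		}
	}
	return nil
}

func main() {
	// pod-with-inconsistent-queue: the queue label and annotation differ.
	labels := map[string]string{appIDKey: "application-sleep-0001", queueKey: "root.sandbox"}
	annotations := map[string]string{queueKey: "root.sandbox-another"}
	fmt.Println(checkConsistency(labels, annotations))

	// pod-with-correct-labels: no annotations, so nothing can conflict.
	labels = map[string]string{appIDKey: "application-sleep-0003", queueKey: "root.sandbox"}
	fmt.Println(checkConsistency(labels, nil))
}
```

Reading from a nil map is safe in Go, so a pod without annotations needs no special handling.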

Screenshots (if appropriate)

Screenshots without the admission controller: [three images]

Questions:

NA

@chenyulin0719 chenyulin0719 left a comment

Some notes to make this easier to review:

  1. Start from 'task.sanityCheckBeforeScheduling()' in applications.go.
  2. Check the e2e tests in basic_scheduling_test and recovery_and_restart first.

-	if err := task.sanityCheckBeforeScheduling(); err == nil {
+	// if the task is not ready for scheduling, we keep it in New state
+	// if the task pod is bounded and have conflicting metadata, we move the task to Rejected state
+	err, rejectTask := task.sanityCheckBeforeScheduling()

Perform a sanity check before moving this task to the Pending state.

Before this PR, the sanity check only checked PVC readiness:

  • If the sanity check passed, the task state moved from 'New' -> 'Pending'.
  • If the sanity check failed, the task stayed in 'New' (it is checked again in the next scheduling cycle).

After this PR, the sanity check covers both the PVC and the pod metadata:

  • If the sanity check passes, 'New' -> 'Pending'.
  • If the sanity check fails due to the PVC, the task stays in 'New' (no change).
  • If the sanity check fails due to an unbound pod with inconsistent metadata (AppID/Label), the task state moves from 'New' to 'Rejected'.

Design decision: only reject unbound pods, because we don't want to fail existing running pods after YuniKorn (YK) restarts.
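
The rules above can be summarized as a small decision function. This is a rough sketch only: nextState, taskState, and the three boolean inputs are hypothetical names, and checking metadata before the PVC is an assumption, since the comment does not say which failure wins when both apply.

```go
package main

import "fmt"

type taskState string

const (
	stateNew      taskState = "New"
	statePending  taskState = "Pending"
	stateRejected taskState = "Rejected"
)

// nextState is a hypothetical summary of the transition rules; it is not the
// shim's state machine. The order of the two checks is an assumption.
func nextState(pvcReady, metadataConsistent, podBound bool) taskState {
	// Only unbound pods are rejected, so pods that were already running
	// before a scheduler restart are left alone.
	if !metadataConsistent && !podBound {
		return stateRejected
	}
	// PVC not ready: stay in New and retry in the next scheduling cycle.
	if !pvcReady {
		return stateNew
	}
	return statePending
}

func main() {
	fmt.Println(nextState(true, true, false))  // Pending
	fmt.Println(nextState(false, true, false)) // New
	fmt.Println(nextState(true, false, false)) // Rejected
	fmt.Println(nextState(true, false, true))  // Pending
}
```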

Comment on lines -571 to -573
func failTaskPodWithReasonAndMsg(task *Task, reason string, msg string) {
podCopy := task.GetTaskPod().DeepCopy()
podCopy.Status = v1.PodStatus{

Moved failTaskPodWithReasonAndMsg() to task.go.

Changed

  • podCopy := task.GetTaskPod().DeepCopy()
    to
  • podCopy := task.pod.DeepCopy()

to prevent a deadlock while the task state machine is handling the TaskRejected event (WLock).

Comment on lines +93 to +94
constants.CanonicalLabelApplicationID: app.GetApplicationID(),
constants.CanonicalLabelQueueName: app.GetQueue(),

Note:

  • We can use the canonical representation directly for placeholders here. Newer versions of the shim allow legacy and canonical metadata to coexist.

Comment on lines +477 to +480
func (task *Task) postTaskRejected(reason string) {
// if task is rejected because of conflicting metadata, we should fail the pod with reason
if strings.Contains(reason, constants.TaskPodInconsistMetadataFailure) {
task.failTaskPodWithReasonAndMsg(constants.TaskRejectedFailure, reason)

Design decision:

  • Fail the pod if the task's reject reason is inconsistent metadata (PodInconsistentMetadata).
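
The reason check in the snippet above boils down to a substring match. A minimal sketch, assuming the constant's value is "PodInconsistentMetadata" (the real value lives in the shim's constants package and is not shown here); shouldFailPod is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// Assumed value; the real constants.TaskPodInconsistMetadataFailure may differ.
const taskPodInconsistMetadataFailure = "PodInconsistentMetadata"

// shouldFailPod reports whether a rejection reason indicates conflicting
// metadata, in which case the pod itself should be failed.
func shouldFailPod(reason string) bool {
	return strings.Contains(reason, taskPodInconsistMetadataFailure)
}

func main() {
	fmt.Println(shouldFailPod("PodInconsistentMetadata: queue label and annotation differ"))
	fmt.Println(shouldFailPod("application is not found"))
}
```

Matching on a substring keeps the handler simple, but it ties the behavior to the wording of the reason string, so any change to that prefix has to be made in both places.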

@chenyulin0719 chenyulin0719 self-assigned this Jun 21, 2024
@chenyulin0719 chenyulin0719 marked this pull request as draft June 21, 2024 17:19
@chenyulin0719 chenyulin0719 marked this pull request as ready for review June 22, 2024 02:37
@chenyulin0719

Marking this PR as draft following the discussion at the community sync-up.

To be changed:

  1. Change the ordering to (Canonical Label -> Annotation -> Queue(Existing))
  2. Version 1.6.0 only prints warnings; task rejection should start in version 1.7.0.
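
One way to read the new ordering in item 1, sketched for the application ID: the canonical label wins, then the annotation, then the existing legacy label. This is an interpretation rather than the agreed design; resolveAppID is a hypothetical helper and "applicationId" is assumed to be the legacy label key.

```go
package main

import "fmt"

const (
	canonicalAppIDKey = "yunikorn.apache.org/app-id" // canonical label and annotation key
	legacyAppIDLabel  = "applicationId"              // assumed legacy label key
)

// resolveAppID is a hypothetical helper that returns the first non-empty
// value in the order: canonical label, annotation, legacy label.
func resolveAppID(labels, annotations map[string]string) string {
	if value := labels[canonicalAppIDKey]; value != "" {
		return value
	}
	if value := annotations[canonicalAppIDKey]; value != "" {
		return value
	}
	return labels[legacyAppIDLabel]
}

func main() {
	labels := map[string]string{
		canonicalAppIDKey: "application-sleep-0002",
		legacyAppIDLabel:  "application-sleep-legacy",
	}
	annotations := map[string]string{canonicalAppIDKey: "application-sleep-0002-another"}
	fmt.Println(resolveAppID(labels, annotations)) // application-sleep-0002
}
```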

@chenyulin0719

Closing this draft PR; another PR was created for formal review:
#871
