Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YUNIKORN-1452] document admission controller use case #369

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

youxuan0714
Copy link
Contributor

What is this PR for?

Add and integrate documents about admission controller

What type of PR is it?

  • - Bug Fix
  • - Improvement
  • - Feature
  • [*] - Documentation
  • - Hot Fix
  • - Refactoring

Todos

  • - Task

What is the Jira issue?

open an issue on Jira https://issues.apache.org/jira/browse/YUNIKORN-1452

How should this be tested?

Screenshots (if appropriate)

image

Questions:

  • - The licenses files need update.
  • - There is breaking changes for older versions.
  • - It needs documentation.

Copy link
Contributor

@wilfred-s wilfred-s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not mix changes in one PR (remove queue.md)
Do not commit/update the package.json file unless we change the docusaurus release.

@youxuan0714
Copy link
Contributor Author

@wilfred-s, thanks for reminding, I have corrected it.

Copy link
Contributor

@wilfred-s wilfred-s left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will probably become much clearer if we point to a use case to make clear when what and how:
use case 1: standard deployment, everything scheduled by YuniKorn
use case 2: standard deployment, user details from
use case 3: plugin deployment everything through YuniKorn
use case 4: plugin deployment, only workloads in some namespaces scheduled by YuniKorn
We can then hook the different settings into those and explain how things work

Comment on lines 69 to 70
| Variable | Default value | Description |
|---------------------------|-----------------------------------------|----------------------------------------------------------------------------|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are already part of the Service Configuration. We should not have the same information in two locations as that becomes a sync issue. Better is to point back to the service config.

Comment on lines 98 to 100
Access control configuration, all entries start with the prefix `admissionController.accessControl.`.

| Variable | Default value | Description |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are already part of the Service Configuration. We should not have the same information in two locations as that becomes a sync issue. Better is to point back to the service config.

The admission controller runs in a separate pod, it runs a [mutation webhook](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#validatingadmissionwebhook)
and a [validation webhook](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#validatingadmissionwebhook), where:

1. The `mutation webhook` mutates pod spec by:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make separate heading i.e. ###

also missing the fact that pod updates are checked for changes to the user annotations: no changes allowed for already created pods.

- Otherwise, adds `queue: root.default`
- Adding `disableStateAware` label
- If pod was assigned a generated applicationId by the admission controller, also set `disableStateAware: true`. This causes the generated application to immediately transition from the `Starting` to `Running` state so that it will not block other applications.
2. The `validation webhook` validates the configuration set in the configmap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make a separate heading i.e. ###

- Adding `disableStateAware` label
- If pod was assigned a generated applicationId by the admission controller, also set `disableStateAware: true`. This causes the generated application to immediately transition from the `Starting` to `Running` state so that it will not block other applications.
2. The `validation webhook` validates the configuration set in the configmap
- This is used to prevent writing malformed configuration into the configmap.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explain in text instead of bullet points

and a [validation webhook](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#validatingadmissionwebhook), where:

1. The `mutation webhook` mutates pod spec by:
- Adding `schedulerName: yunikorn`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use text to explain and link back to the configuration


On startup, the admission controller performs a series of tasks to ensure that it is properly registered with Kubernetes:

1. Loads a Kubernetes secret called `admission-controller-secrets`. This secret stores a pair of CA certificates which are used to sign the TLS server certificate used by the admission controller.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explain in text instead of bullet points.

@youxuan0714
Copy link
Contributor Author

Hi @wilfred-s, sorry it took so long. I've modified it as suggested, but there are still some things I'm not sure about, for example, the 'user detail from' in use case 2. Please help confirm what else needs to be modified or added, thank you.

Copy link
Contributor

@pbacsko pbacsko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found some typo & grammar issues spotted by Google docs.


In addition, The `mutation webhook` will also add the following labels:
- `applicationId` label. If applicationId or spark-app-selector already exists, it will be used. Otherwise, an applicationId will be generated for the pod. The default is `yunikorn-<namespace>-autogen`. Pods in the same namespace will generate the same applicationId. For other generation methods, please refer to [Service configuration](service_config.md).
- `queue` label. If there is a queue label, it will be used. Note that if placement rule is enabled, the value set in label will be ignored. Otherwise, `root.default` will be assigned by default. If you want to change the default queue name, please refer to [Service configuration](service_config.md).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "if placement rules are enabled, the value set in the label will be ignored"

On startup, the admission controller performs a series of tasks to ensure that it is properly registered with Kubernetes:

Loads a Kubernetes secret called `admission-controller-secrets`. This secret stores a pair of CA certificates which are used to sign the TLS server certificate used by the admission controller.
If the secret cannot be found or either CA certificate is within 90 days of expiration, generates new certificate(s). If a certificate is expiring, a new one is generated with an expiration of 12 months in the future. If both certificates are missing or expiring, the second certificate is generated with an expiration of 6 months in the future. This ensures that both certificates do not expire at the same time, and that there is an overlap of trusted certificates.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "it generates new certificate(s)"

If the CA certificates were created or updated, writes the secrets back to Kubernetes.
Generates an ephemeral TLS server certificate signed by the CA certificate with the latest expiration date.
Validates, and if necessary, creates or updates the Kubernetes webhook configurations named `yunikorn-admission-controller-validations` and `yunikorn-admission-controller-mutations`. If the CA certificates have changed, the webhooks will also be updated. These webhooks allow the Kubernetes API server to connect to the admission controller service to perform configmap validations and pod mutations.
Starts up the admission controller HTTPS server.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: missing subject (what starts up the https server?).


Loads a Kubernetes secret called `admission-controller-secrets`. This secret stores a pair of CA certificates which are used to sign the TLS server certificate used by the admission controller.
If the secret cannot be found or either CA certificate is within 90 days of expiration, generates new certificate(s). If a certificate is expiring, a new one is generated with an expiration of 12 months in the future. If both certificates are missing or expiring, the second certificate is generated with an expiration of 6 months in the future. This ensures that both certificates do not expire at the same time, and that there is an overlap of trusted certificates.
If the CA certificates were created or updated, writes the secrets back to Kubernetes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "it writes"


For more details on admission controller configuration, please refer to [Service configuration](service_config.md).

For more information on user and group access management, please view [User&Group Resolutiona](usergroup_resolution.md).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: "resolutiona"

```

### Use case
The following are the usage scenarios of using admission controller to set up in different deployment modes of Yunikorn.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "of using the"

#### Standard deployment, everything scheduled by Yunikorn
In standard deployment mode, can set which namespaces will be scheduled by YuniKorn through the `admissionController.filtering.processNamespaces` tag in `yunikorn-config`.

If want all namespaces to be scheduled by YuniKorn, just set `admissionController.filtering.processNamespaces` to the empty string (default).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"If you want"

#### Plugin deployment, everything scheduled by YuniKorn
Same as the standard mode, can set which namespaces will be scheduled by YuniKorn through the `admissionController.filtering.processNamespaces` tag in `yunikorn-config`.

If want all namespaces to be scheduled by YuniKorn, just set `admissionController.filtering.processNamespaces` to the empty string (default).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"If you want"

```

#### Plugin deployment, everything scheduled by YuniKorn
Same as the standard mode, can set which namespaces will be scheduled by YuniKorn through the `admissionController.filtering.processNamespaces` tag in `yunikorn-config`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Just like in standard mode, the admissionController.filtering.processNamespaces tag sets which namespaces will be scheduled by Yunikorn..." or sth like that.

#All namespaces will be schduler by YuniKorn
admissionController.filtering.processNamespaces: ""
```
#### Plugin deployment, only workloads in some namespaces scheduled byYuniKorn
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: "byYunikorn"

I think the "only" is not needed here (sounds weird to me).

Copy link
Contributor

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@youxuan0714 thanks for nice patch. a couple of comments left.

spec:
container:
- name: nginx
image: "nginx:stable-alpine
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are some invalid format, please fix it.

### Mutation webhook
The `mutation webhook` will add `schedulerName:yunikorn` to the pod spec so that the pod can be scheduled by the Yunikorn scheduler.

In addition, The `mutation webhook` will also add the following labels:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

- `queue` label. If there is a queue label, it will be used. Note that if placement rule is enabled, the value set in label will be ignored. Otherwise, `root.default` will be assigned by default. If you want to change the default queue name, please refer to [Service configuration](service_config.md).
- `disableStateAware` label, if the admission controller generates an applicationId for the pod, `disableStateAware: true` must also be set. This will cause the application to immediately transition from "Starting" to "Running" state so that it will not block other applications

When updating a created pod, the admission controller will check whether the userinfo annotation before and after the update is the same. If it is different, the change will not be allowed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pardon me, could you share reference to me? did not noticed this functionality before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked the current AM implementation, it seems like the "check" behavior wasn't implemented.
And I don't think we need it too because AM won't generate new user info if user annotation has already been set.

https://github.com/apache/yunikorn-k8shim/blob/master/pkg/admission/admission_controller.go#L176

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chenyulin0719 thanks for confirmation.

It seems the related code was committed by apache/yunikorn-k8shim#469. @pbacsko Could you take a look? We can just delete the statement from the docs, or we can have a ticket to implement that behavior for AM

The `validation webhook` calls scheduler [validation REST API](https://yunikorn.apache.org/docs/api/scheduler#configuration-validation) to validate configmap updates.

## Admission controller deployment
By default, the admission controller is deployed as part of the YuniKorn Helm chart installation. This can be disabled if necessary (though not recommended) by setting the Helm parameter `embedAdmissionController` to `false`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"necessary (though not recommended)" could we have a section to discuss the reasons why we discourage users from doing this?

@pbacsko
Copy link
Contributor

pbacsko commented Apr 19, 2024

ping @youxuan0714 - do you have time to do the requested changes?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants