Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add priority to dex jobs #1439

Merged
merged 5 commits into from
Jun 27, 2023

Conversation

gracedo
Copy link
Contributor

@gracedo gracedo commented Jun 21, 2023

What type of PR is this?

feat

What this PR does/ why we need it:

Set priority class on dex jobs. However, for this upgrade, we will still hit this immutable error since the job already exists. So, I've added a pre-upgrade hook here to clean up the Jobs prior to upgrading. This will only target jobs deployed in chart version 1.11.1 (released in DKP 2.5), so will only recreate jobs during upgrade from DKP 2.5 to 2.6 (and not within 2.6). We can remove this hook in following releases.

The certs jobs will re-run on this upgrade but from my testing I don't see any effects, but would like some confirmation from our sso stack sme's.

Which issue(s) this PR fixes:

https://d2iq.atlassian.net/browse/D2IQ-97813

Special notes for your reviewer:

Does this PR introduce a user-facing change?:


Checklist

  • If a chart is changed, the chart version is correctly incremented.
  • The commit message explains the changes and why are needed.
  • The code builds and passes lint/style checks locally.
  • The relevant subset of integration tests pass locally.
  • The core changes are covered by tests.
  • The documentation is updated where needed.

@gracedo gracedo added ready ready ok-to-test Signals mergebot that CI checks are ready to be kicked off labels Jun 21, 2023
@gracedo gracedo self-assigned this Jun 21, 2023
@gracedo gracedo marked this pull request as ready for review June 21, 2023 21:52
@gracedo gracedo requested review from a team as code owners June 21, 2023 21:52
@gracedo gracedo force-pushed the gracedo/dex_job_D2IQ-97813 branch 3 times, most recently from e2660b5 to 3506e46 Compare June 21, 2023 23:24
@gracedo
Copy link
Contributor Author

gracedo commented Jun 21, 2023

Error: UPGRADE FAILED: pre-upgrade hooks failed: warning: Hook pre-upgrade dex/templates/pre-upgrade-delete-jobs.yaml failed: Job.batch "dex-1l8c8myp4b-delete-jobs" is invalid: spec.template.spec.containers[0].image: Required value

upgrade test keeps failing :( I wonder if it's because the new values are not getting used in the upgrade? (but I feel like this would have failed in more tests whenever we added new value fields 🤔 ) When I do helm template the image is populating just fine

# Source: dex/templates/pre-upgrade-delete-jobs.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: release-name-dex-delete-jobs
  namespace: default
  annotations:
    "helm.sh/hook": pre-upgrade
    "helm.sh/hook-weight": "4"
    "helm.sh/hook-delete-policy": hook-succeeded,before-hook-creation
spec:
  template:
    metadata:
      name: release-name-dex-delete-jobs
    spec:
      serviceAccountName: release-name-dex-pre-upgrade
      restartPolicy: OnFailure
      containers:
        - name: kubectl
          image: "bitnami/kubectl:1.26.4"
          command:
            - sh
            - -c
            - kubectl delete jobs.batch -l 'app.kubernetes.io/component in (job-grpc-certs, job-web-certs),app.kubernetes.io/name=dex' --cascade=orphan -n default
            - ```

@gracedo gracedo force-pushed the gracedo/dex_job_D2IQ-97813 branch from fc7abe8 to a45797d Compare June 22, 2023 02:11
@gracedo
Copy link
Contributor Author

gracedo commented Jun 22, 2023

      --reset-values                               when upgrading, reset the values to the ones built into the chart
      --reuse-values                               when upgrading, reuse the last release's values and merge in any overrides from the command line via --set and -f. If '--reset-values' is specified, this is ignored

It looks like ct is using --reuse-values flag for helm upgrade, but here we probably want --reset-values to pull in the new set of values. I did attempt to pass in --reset-values into the helm args but it was causing the install command to fail (and would be ignored anyways since --reuse-values is already being passed in). Searching the ct repo yielded this PR to enable this, by one of our very own alum 😄 helm/chart-testing#531 since this is not currently supported, we'll need to ignore the failing test and admin merge to get this in.

I tested this with a chart hosted from my GH.

dex-delete-jobs-vr6jv                                                     dkp-critical-priority
dex-grpc-certs-vvc8m                                                      dkp-critical-priority

@gracedo gracedo changed the title feat: Add priority/ttl to dex jobs with pre-upgrade cleanup feat: Add priority to dex jobs Jun 23, 2023
Copy link
Contributor

@cbuto cbuto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! though still unsure of the consequences of rerunning this job

@gracedo
Copy link
Contributor Author

gracedo commented Jun 26, 2023

@palexster can you verify that it would be fine to rerun the dex job to regenerate certs? When I tested on the daily cluster, I didn't notice any issues (but maybe I wasn't looking at the right thing :) )

@palexster
Copy link
Contributor

@palexster can you verify that it would be fine to rerun the dex job to regenerate certs? When I tested on the daily cluster, I didn't notice any issues (but maybe I wasn't looking at the right thing :) )

I think we are safe. Those jobs are only re-bootstrapping a CA and new certificates, so every new execution would just bootstrap them again.

Btw, I don't see the purpose of re-bootstrapping a new CA at every update (the lowest default value is 10y of lifespan). We should probably consider migrating to cert-manager, managing them properly, and getting rid of those jobs, as suggested by @mhrabovcin.

@gracedo
Copy link
Contributor Author

gracedo commented Jun 27, 2023

We should probably consider migrating to cert-manager, managing them properly, and getting rid of those jobs

@palexster Can you (or @mhrabovcin) create a ticket in the backlog so we don't lose track of this improvement please? Thank you!

@gracedo gracedo merged commit 73f2404 into mesosphere:master Jun 27, 2023
d2iq-mergebot pushed a commit that referenced this pull request Jun 27, 2023
* feat: Add priorityclass and ttl to dex jobs

* feat: Add pre-upgrade hook to delete jobs

* fix: Add trailing line

* fix: Remove ttl, select dex 2.11.1 jobs to delete
@gracedo gracedo deleted the gracedo/dex_job_D2IQ-97813 branch June 27, 2023 17:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ok-to-test Signals mergebot that CI checks are ready to be kicked off ready ready
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants