-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUILD-260: auto provision tekton via tekton operator if necessary #18
BUILD-260: auto provision tekton via tekton operator if necessary #18
Conversation
6c1f94e
to
8594f4a
Compare
8594f4a
to
6c8f79a
Compare
OK just got clean PR runs. I am retrying after pushing a brief README update. I've removed WIP @otaviof @adambkaplan to indicate this should be "ready for review" that said, I'll /hold for now as I still have not gone through the local dev flow to actually installing tekton operator (but not having it install tekton yet), then running this PRs changes I'll unhold once that is done. But in the meantime, I think it is OK to start gather comments from you guys, to start pulling in those changes as well. thanks |
OK the rerun of default_test.go (the ginkgo only test that launches an etc server) flaked this time (it passed last time before the readme update). I seen weirdness locally as well. I'd like to cover in office hours (I"m not sure how to debug these at first blush) when we next can all talk. thanks |
talked to @otaviof today
|
yeah removing |
ok pushed commit for last flake found .... e2e's green |
ran on kind, need to update rbac so these new changes can avoid perm problems like
will release hold once I've sorted this out and pushed the updates |
And the end to end flow has been verified on kind Projects before flow:
Then From the shipwright operator log: Projects after and pods:
/hold cancel @adambkaplan @otaviof PTAL .... let's review and do final iterations so we can merge :-) |
controllers/default_test.go
Outdated
@@ -68,7 +68,10 @@ var _ = g.Describe("Reconcile default ShipwrightBuild installation", func() { | |||
o.Expect(err).NotTo(o.HaveOccurred()) | |||
|
|||
err = k8sClient.Delete(ctx, build, &client.DeleteOptions{}) | |||
o.Expect(err).NotTo(o.HaveOccurred()) | |||
// the delete e2e's can delete this object before this AfterEach runs | |||
if !errors.IsNotFound(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking gomega examples, we can assert against multiple conditions, which makes this if
statement become:
o.Expect(err).To(o.BeNil(), o.MatchError(metav1.StatusReasonNotFound))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ooh neat ... will update in a bit, thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated @otaviof
case tektonAPIErr == nil && tektonOPErr != nil: | ||
logger.Info("Tekton has been installed without use of its associated operator.") | ||
case errors.IsNotFound(tektonAPIErr) && errors.IsNotFound(tektonOPErr): | ||
//TODO should we error out here or proceed and give the user the opportunity of install Tekton after Shipwright? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few use-cases that could be impacted here:
- Tekton is already present in the cluster: In this case, this operator should only inspect if the version installed is compatible with Shipwright Controller;
- Tekton is already present in the cluster, but outdated: How would we handle Tekton upgrades?
- Tekton is ignored: This operator would not try to inspect or install Tekton;
I think we will need an extra flag to allow users during those use cases. Should we tackle this in a upcoming pull-request?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the first 2 bullets fall under the open questions and subsequent card items already called out in BUILD-260 @otaviof
IMO It falls under installing (or upgrading) the operator itself. I initially prototyped an idea around that using manifestival but @adambkaplan and I discussed and agreed to not pursue that or other approaches with this PR at this time.
For the third bullet, yeah, that IMO most directly falls under my TODO comment/question. I'm on the fence. How strongly do you feel about it?
@adambkaplan thoughts on all ^^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Tekton is already present in the cluster: In this case, this operator should only inspect if the version installed is compatible with Shipwright Controller;
- Tekton is already present in the cluster, but outdated: How would we handle Tekton upgrades?
I think both of these should be addressed in a follow-up issue. For an MVP we just want to detect if Tekton was installed by the Tekton operator.
- Tekton is ignored: This operator would not try to inspect or install Tekton;
Not sure what is meant by "Tekton is ignored" - does this mean the cluster installed the Tekton operator, but didn't install Tekton pipelines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To answer the root TODO - I think we should error here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Tekton is already present in the cluster: In this case, this operator should only inspect if the version installed is compatible with Shipwright Controller;
- Tekton is already present in the cluster, but outdated: How would we handle Tekton upgrades?
I think both of these should be addressed in a follow-up issue. For an MVP we just want to detect if Tekton was installed by the Tekton operator.
- Tekton is ignored: This operator would not try to inspect or install Tekton;
Not sure what is meant by "Tekton is ignored" - does this mean the cluster installed the Tekton operator, but didn't install Tekton pipelines?
My immediate interpretation was "Tekton is ignored" == the current statue of the operator. No inspection for Tekton is made. The shipwright operator moves forward assuming it is there, or the shipwright controller will function once it is put there.
But @otaviof please clarify if ^^ is not what you meant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will update @adambkaplan to error out in next push.
if list == nil || len(list.Items) == 0 { | ||
tektonOperatorCfg := &tektonoperatorv1alpha1.TektonConfig{} | ||
tektonOperatorCfg.Name = "config" | ||
tektonOperatorCfg.Spec.TargetNamespace = "tekton-pipelines" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make this namespace configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could go either way. How strongly do you feel about it?
@adambkaplan thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did do a survey of both our operator and tekton's upstream....as we have no specific CRD api and associated clients created to capture such config in a operator like fashion with this current operator ... just a few command line options that map to the k8s controller-runtime options
Unless we consider a command line option for specifying the target namesapce for tekton resources sufficient for configuring with this PR / Jira, my feeling is that introducing all ^^ should be a separate PR / Jira
thoughts @adambkaplan @otaviof ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree we should take this up as a separate issue. This feels like it is more of a concern for a downstream distribution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #20
4923208
to
a8b0d7c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more minor changes requested. Otherwise looks good in its current form - most of the open questions here should be addressed in follow-up issues.
- kind: ServiceAccount | ||
name: default | ||
namespace: system |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Binding to the default
system service account? Don't we want to bind to the operator's SA?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there was no operator SA previously created in this repo nor in the tekton operator repo
See the tekton deployment (no SA specified): https://github.com/tektoncd/operator/blob/937dea56d9ed6011b1b3c8de4ed43c1cc4cba38b/config/kubernetes/operator.yaml
and
https://github.com/tektoncd/operator/blob/937dea56d9ed6011b1b3c8de4ed43c1cc4cba38b/config/openshift/operator.yaml
https://github.com/shipwright-io/operator/blob/main/config/manager/manager.yaml
I took adding an SA as out of scope for this particular Jira
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok @adambkaplan and I talked in office hours ... when I looked at the file, saw I merged tekton operator vs. ours
now, we are currently following the tekton operator's conversion of using the default
SA in the system
namespace
@adambkaplan and I agreed we won't change things with this PR. But he will be opening upstream issues / Jira's for the future work cited in this PR, and created a non-default SA for our operator will be one of those items.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #19
case tektonAPIErr == nil && tektonOPErr != nil: | ||
logger.Info("Tekton has been installed without use of its associated operator.") | ||
case errors.IsNotFound(tektonAPIErr) && errors.IsNotFound(tektonOPErr): | ||
//TODO should we error out here or proceed and give the user the opportunity of install Tekton after Shipwright? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Tekton is already present in the cluster: In this case, this operator should only inspect if the version installed is compatible with Shipwright Controller;
- Tekton is already present in the cluster, but outdated: How would we handle Tekton upgrades?
I think both of these should be addressed in a follow-up issue. For an MVP we just want to detect if Tekton was installed by the Tekton operator.
- Tekton is ignored: This operator would not try to inspect or install Tekton;
Not sure what is meant by "Tekton is ignored" - does this mean the cluster installed the Tekton operator, but didn't install Tekton pipelines?
case tektonAPIErr == nil && tektonOPErr != nil: | ||
logger.Info("Tekton has been installed without use of its associated operator.") | ||
case errors.IsNotFound(tektonAPIErr) && errors.IsNotFound(tektonOPErr): | ||
//TODO should we error out here or proceed and give the user the opportunity of install Tekton after Shipwright? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To answer the root TODO - I think we should error here.
if tektonOpCRD.Labels == nil { | ||
retErr := fmt.Errorf("the CRD TektonConfig does not have labels set, inclding its version") | ||
logger.Error(retErr, "Problem confirming Tekton Operator version") | ||
return RequeueWithError(retErr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can requeue here - this is a permanent failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will change to perm fail / error out in next push
if !exists { | ||
retErr := fmt.Errorf("the CRD TektonConfig does not have labels set, inclding its version") | ||
logger.Error(retErr, "Problem confirming Tekton Operator version") | ||
return RequeueWithError(retErr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar - this is a permanent failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will change to perm fail / error out in next push
// replace v0 with v1 to make k8s version check happy | ||
if strings.HasPrefix(value, "v0") { | ||
value = strings.ReplaceAll(value, "v0", "v1") | ||
} | ||
version, err := version.ParseSemantic(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious why this is the case - v0.6.1
is a perfectly valid semver.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it has been that way since this utility landed in apimachinery: see kubernetes/apimachinery@0e7924a#diff-7f2dfde88efef53aa3c8aba302b41cb5b2b9037095700fa5a3ecb461a9123374 and the godoc for ParseGeneric
. There at least is explicit intent there.
I think the code was in k8s/k8s before that, though I did not find it immediately, and claimed due diligence 10 minutes into that.
If your curiosity prompts you to continue the search and find out @adambkaplan do let us know :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First digit 0 is acceptable. Leading zero (such as 01
) is not.
See https://github.com/kubernetes/apimachinery/blob/master/pkg/util/version/version_test.go#L76 and https://github.com/kubernetes/apimachinery/blob/master/pkg/util/version/version_test.go#L148
if list == nil || len(list.Items) == 0 { | ||
tektonOperatorCfg := &tektonoperatorv1alpha1.TektonConfig{} | ||
tektonOperatorCfg.Name = "config" | ||
tektonOperatorCfg.Spec.TargetNamespace = "tekton-pipelines" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree we should take this up as a separate issue. This feels like it is more of a concern for a downstream distribution.
3b2b99b
to
f72a901
Compare
ok @adambkaplan thanks for the confirmation on the "outside scope of this jira" items I've pushed as separate commits a squashed commit of your log update, and then a separate commit of the changes for when to error out vs. requeue, and the unit/integration test changes to account for that I'll reorg/squash once we are good on those changes the only comment which did not result in a code change is the one around no deployment specific SA's existing for tekton/shipwright operators |
@adambkaplan @otaviof - another item I did not add to the PR, since it was not explicitly called out in the BUILD-260, is the creation of the target namesapce for shipwright. If you look at operator/controllers/shipwrightbuild_controller.go Lines 95 to 103 in a9c5e27
And I confirmed that manifestival call there only changes the namespace ref in the other objects being created by manifestival. Also, if you look at the top level usage example at https://github.com/shipwright-io/operator/blob/a9c5e270d1452163f0a162f7e414bb1940ad03a0/README.md#usage the namespace creation is there. So with that background, my question: do we want to take on creating the namespace in the operator ? Pluses: it is arguably an ease of use improvement WDYT? |
@gabemontero, @adambkaplan In my opinion namespaces should exist before anything is installed, so maybe we could simply pass the namespace name as a configuration entry, instead of creating the namespace. |
Either I'm missing something you are getting at, or we are not on the same page @otaviof ..... let me try to at least clarify what I am saying. To do so, let me post the current yaml contents at https://github.com/shipwright-io/operator/blob/a9c5e270d1452163f0a162f7e414bb1940ad03a0/README.md#usage 👍
So the namespace in question which we currently have the user create prior to poking the operator is Then, in shipwrightbuild.operator.shipwright.io.spec.targetNamespace, we again specify So we you say "pass the namespace name as a configuration entry", do you mean we should capture it? If so, we are already doing that, right? Or did you mean something else. Lastly, what I'm saying is making sure that namespace exists and creating it if need be before calling manifestival. Given my clarification, are you saying we should just not do that ? |
f72a901
to
bf319b6
Compare
ok target namespace creation added (e2e's work for me locally, simply removing the creation of the target namespace in the e2e's tests the path) commits squashed PTAL @adambkaplan |
curious that unit test failure is @otaviof a flake? we need to update the perms in this repo so PR author's can retest ... I cannot currently @adambkaplan can you do it ? |
Might be a flake, indeed. And second the need to be able to trigger CI again. /lgtm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One nit with the use of the semver library, otherwise looks good.
apiVersion: rbac.authorization.k8s.io/v1 | ||
kind: ClusterRole | ||
metadata: | ||
name: namespace-role |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@otaviof I think our CI didn't catch this because we aren't testing on a "real" cluster yet.
- kind: ServiceAccount | ||
name: default | ||
namespace: system |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #19
// replace v0 with v1 to make k8s version check happy | ||
if strings.HasPrefix(value, "v0") { | ||
value = strings.ReplaceAll(value, "v0", "v1") | ||
} | ||
version, err := version.ParseSemantic(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First digit 0 is acceptable. Leading zero (such as 01
) is not.
See https://github.com/kubernetes/apimachinery/blob/master/pkg/util/version/version_test.go#L76 and https://github.com/kubernetes/apimachinery/blob/master/pkg/util/version/version_test.go#L148
if list == nil || len(list.Items) == 0 { | ||
tektonOperatorCfg := &tektonoperatorv1alpha1.TektonConfig{} | ||
tektonOperatorCfg.Name = "config" | ||
tektonOperatorCfg.Spec.TargetNamespace = "tekton-pipelines" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #20
Interesting test panic here:
|
yeah that is what I noted in #18 (comment) @adambkaplan we'll see what happens when I push the semver change in a few minutes and trigger new tests |
test for TaskRun CRD presence, create/configure tekton operator as needed README update fix race on delete in integration default_test.go add tektonconfig rbac add namespace rbac; create target namespace if not present make bundle
bf319b6
to
e5f838b
Compare
semver fix pushed @adambkaplan PTAL thanks |
yeah it ran clean this time @adambkaplan @otaviof see https://github.com/shipwright-io/operator/pull/18/checks?check_run_id=3509444530 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: adambkaplan The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Changes
Fixes #8
/kind feature
Submitter Checklist
Release Notes
/assign @adambkaplan
/assign @otaviof
I've marked this WIP guys, as I just made coding changes so far, so if you want to hold off on any review, I understand. But if you have enough cycles to take a quick glance and see any glaring red flags, do feel free to comment here as you see fit.