v1alpha2 API #703
Comments
I think the one immediate downside to this approach is the loss of extensibility: the idea that others could package and ship their own provisioner, separate from the rukpak core. However, from a practical perspective, I'm not sure how big of a use case that idea actually is. And I wonder if we could provide extension points via some other mechanism (e.g. could the eventual |
If I'm correct in assuming that nobody currently has a custom provisioner, I'd argue that designing up front for extensibility was not necessary. If it comes up in the future, we can revisit (assuming we proceed w/this approach). |
The motivation for a separate API was to simplify pivots between old and new versions of a bundle. The state could be handled internally by a controller, but it's more difficult to introspect in the case of failure, and limits what can be installed to "stateless" bundles only (or at least - I don't see anything in the BundleDeployment example that would allow installation logic to depend on the previous state). In the context of OLM, the "stateful" bundle type that was interesting was the previous output of dependency resolution.
The "class" model was intended to enable an ecosystem (a la persistent volumes or Gateway/Ingress) so that new bundle types don't require PRs to add types in-tree in Without these two things (stateful pivots and a stable api to plug into) I don't think RukPak looks very different from Flux v2? |
I would think that dependency resolution would happen with an API at a higher level and that BundleDeployments are already the outcome of it. Are you seeing that differently? |
Referencing secrets in a different namespace could be gated by an admission controller checking that the author has read rights on them. Wouldn't that help with the security concern? |
There was a request to deactivate manifest reconciliation, for instance to install a patch that has not been productized yet. Do you think it would be premature to add a field for that purpose? |
Another aspect that may not be reflected in the BundleDeployment API but in the behavior of the controller is the notion of optional manifests. Taking ServiceMonitor as an example, an operator provider may want the resource to get created on clusters where Prometheus is installed but not to fail the installation on other clusters. I wanted to check whether the discussion is fully orthogonal to your v1alpha2 API proposal. |
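For concreteness, one possible shape (purely illustrative, not part of this proposal) would be a per-manifest marker that the controller treats as "apply this if the API is available, but don't fail the install otherwise"; the annotation key below is invented:

# Hypothetical sketch: an invented annotation marking a bundle manifest as optional,
# so a cluster without the Prometheus operator does not fail the whole install.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-operator-metrics
  annotations:
    rukpak.io/optional: "true"   # hypothetical key; skip this object if its API group is absent
spec:
  selector:
    matchLabels:
      app: my-operator
  endpoints:
    - port: metrics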
I think what Evan is saying (correct me if I'm wrong, @ecordell) is that, yes, something else previously came up with a successfully applied and now-healthy set of BundleDeployments. If the next resolution updates a BD spec to something that fails to unpack or apply for some reason, it seems ideal for the previous healthy state to remain in place rather than to put the BD in some intermediate failing state, if possible. So I think the question is, "where/how is that previous bundle content stored so that rukpak can refer back to it in the event that the current spec cannot progress?"

I think the current implementation of the core provisioners doesn't quite work this way anyway. IIRC, the previous bundle is deleted as soon as the new bundle unpacks successfully, and the BD provisioners don't actually start using the new Bundle until it is fully unpacked. For the purposes of relying on previous state, the provisioners depend on Helm storing releases in release secrets. So we clearly don't need the Bundle API with the current implementation. But, regardless, we're also talking about removing the helm underpinnings of the provisioners, so we still need an answer for "where do we store previous state?" |
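For reference, the "previous state" the current provisioners lean on is whatever Helm records in its release Secrets; each applied revision ends up as a Secret roughly like the following (the name and namespace here are illustrative):

# Approximate shape of a Helm v3 release Secret (values illustrative).
# The provisioner reads the newest revision back to learn what was previously applied.
apiVersion: v1
kind: Secret
type: helm.sh/release.v1
metadata:
  name: sh.helm.release.v1.my-bundle-deployment.v2   # one Secret per release revision
  namespace: rukpak-system                           # namespace is an assumption
  labels:
    owner: helm
    name: my-bundle-deployment
    version: "2"
data:
  release: PLACEHOLDER   # gzip+base64-encoded release record, elided here

If we drop the helm underpinnings, something has to play this role, whether that's a status field, a Secret/ConfigMap we own, or a retained previous revision of the applied manifests.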
Consistency:

apiVersion: core.rukpak.io/v1alpha2
kind: BundleDeployment
metadata:
  name: foo
spec:
  # sources is a list of places from which we fetch content. We support multiple in case you need to fetch multiple.
  sources:
    - kind: image
      uri: foo/bar@sha256:123
      auth:
        secret:
          namespace: ns
          name: n
      destination: foo/bar # where to unpack the image's contents in the local fs; useful when you have multiple sources
    - kind: git
      uri: https://github.com/foo/bar
      ref: a3b1dff2 # could this somehow be part of the uri?
      directory: '' # what part of the git repo to copy to .destination; leave blank to copy the entire repo
      auth:
        secret:
          namespace: foo
          name: my-git-creds
      destination: foo/moo
    - kind: http
      uri: https://path/to/some/tar.gz
      auth:
        secret:
          namespace: foo
          name: my-https-creds
      destination: foo/bar
  # format indicates what's in the bundle and how we should process it. In a future alpha, I'd like to convert this to an array of `processors` so we can add support for templating and other mutations. In the spirit of iteration, though, we can consider keeping this a single field in alpha2.
  format: registry-v1 | helm | plain |
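To make the future processors idea from the comment above concrete, a hypothetical shape (not part of this v1alpha2 proposal; the kinds and fields are invented) might be:

# Hypothetical future evolution of .spec.format into an ordered list of processors,
# so templating and other mutations can be chained. Names are invented for illustration.
spec:
  processors:
    - kind: registry-v1        # convert a registry+v1 bundle into plain manifests
    - kind: helm               # render a chart with the given values
      values:
        replicaCount: 2
    - kind: kustomize          # hypothetical extra step applying an overlay
      overlay: overlays/prod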
@fgiloux I have added a |
@fgiloux let's make sure we don't lose this idea, but I'd prefer to do it in a future iteration |
@fgiloux we have to be careful with the idea of "author" or "owner", as those concepts don't really exist or work particularly well in k8s. Also, I want to avoid admission webhooks unless they are absolutely critical and there are no alternatives. |
Another thing to consider: With the Bundle/BundleDeployment split, we get separate reconciles for "pull down the data" and "apply the data". This is beneficial because the Bundle spec is immutable, which means that as soon as the Bundle is unpacked and the underlying data is made available, the Bundle pretty much stops reconciling and there's no network communication outside the cluster. Without a separate resource to reconcile and source data from, the BD reconciliation has to figure out if and how to avoid frequent network calls outside the cluster. (e.g. does this image tag still resolve to the same digest? does this git branch reference still resolve to the same commit, does this http URL still have the same last-modified/etag response header?) Do others see this as a real concern? If so, any ideas to mitigate this? |
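One possible mitigation (a sketch only; these status fields are invented) would be for the BD controller to record what each source last resolved to and only re-fetch when the spec changes or a re-resolve interval elapses:

# Hypothetical status shape (field names invented) recording what each floating
# reference last resolved to, so the controller can avoid repeated network calls.
status:
  resolvedSources:
    - kind: image
      uri: foo/bar:latest
      resolvedDigest: sha256:123abc
    - kind: git
      uri: https://github.com/foo/bar
      ref: main
      resolvedCommit: a3b1dff2
    - kind: http
      uri: https://path/to/some/tar.gz
      etag: "686897696a7c876b7e"
      lastModified: "2023-05-01T00:00:00Z"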
Two options: |
As a user I would want 2. in a staging or production environment and as such would not use an image tag or a git branch. In a development environment on the other hand it can be handy to use a tag or a branch reference. I would however expect that the reconciliation happens against what is referenced rather than a stale state copied onto the cluster. |
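In terms of the proposed sources schema, the distinction is between floating references (handy in development) and pinned ones (what I'd want in staging or production); for example:

# Floating references: convenient in development, but they can resolve differently over time.
sources:
  - kind: image
    uri: foo/bar:latest
  - kind: git
    uri: https://github.com/foo/bar
    ref: main
---
# Pinned references: fully deterministic, suitable for staging/production.
sources:
  - kind: image
    uri: foo/bar@sha256:123
  - kind: git
    uri: https://github.com/foo/bar
    ref: a3b1dff2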
Option 2 would also be a necessary precondition to all of the stories around review of content before install - can't be certain of RBAC or cloud provider permissions that the bundle needs if it's sitting on a floating reference. |
Doesn't the Bundle reconciler have the same "problem"? Re @stevekuznetsov's suggestions, I'm curious how we'd play out the Operator CR flow in the event of an upgrade: |
My expectation was 3.ii. - same as the way Deployment handles ReplicaSet, and would allow a policy field for history as well, to handle the "we failed to upgrade" case if needed. |
If you do 3.ii, then you have to be very careful about handing over control of existing objects from one BundleDeployment to the next. That's not an issue with Deployment/ReplicaSet, because all of the underlying pods are new. In BundleDeployment, it isn't acceptable to delete CRDs/deployments and re-create them. |
Would you prefer a different alternative? How would you sufficiently fulfill the "precondition on install or upgrade" user stories? |
@joelanford re handing over control, I think it would be sufficient to use a label that is common to all BundleDeployments for the same underlying bundle? As long as at most one of them is ever un-paused at a time? @stevekuznetsov what are those user stories? Not ringing a bell. |
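Something along these lines is what I have in mind (the label key and the paused field below are hypothetical, just to make the idea concrete):

# Hypothetical sketch: all BundleDeployments for the same underlying bundle share a label,
# and at most one of them is un-paused at a time, so only one revision applies objects.
apiVersion: core.rukpak.io/v1alpha2
kind: BundleDeployment
metadata:
  name: foo-v2
  labels:
    bundle.rukpak.io/name: foo   # invented label shared by every revision of "foo"
spec:
  paused: false                  # invented field; the previous revision would set paused: true
  sources:
    - kind: image
      uri: foo/bar@sha256:456
  format: plain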
@ncdc I'm failing to find the link right now, but there are a couple of stories in the spec about users being able to review the RBAC or cloud account permissions needed by an operator before installing it. If we do not make the process fully deterministic, then a user who reviews these things in a console before creating an Operator CR has no guarantee the review is still valid when the install actually happens; it's racy. |
I do think any and all "preview" type of user stories should be imperative and client-side only (CLI or console) if at all possible. |
Following the discussion on #653, what is the use case where |
As we add support for multiple layers of manifest/template processing, we may need to support multiple destination directories. I don't have a great example at hand. It's mostly about flexibility for the future. If we don't have a specific need for it now, we can wait until later. |
@ncdc not sure I follow the implication here. What I'm concerned about is that without a careful design for this API, we don't get a system whereby clients can do their client-side work and be certain that their inputs are honored. The pre-flight checks here are certainly intended to be client-side, but we need to ensure that a client can condition the install of an operator on their intent. |
@stevekuznetsov agreed; what is reviewed and approved must be exactly what is executed. I'm thinking along the lines of:

$ k operator install mongodb
Previewing install of mongodb-1.2.3
The following cloud permissions are required: ...
The following cluster permissions will be created when this operator is installed: ...
Proceed? [Y/N]

As long as saying Y doesn't result in a different version, I think we're good? |
This issue has become stale because it has been open 60 days with no activity. The maintainers of this repo will remove this label during issue triage or it will be removed automatically after an update. Adding the |
Closing. We are pursuing direct use of Carvel instead. |
I'd like to iterate on our API as we think about moving to a more stable beta/GA version. Here are my thoughts for v1alpha2:
Remove the Bundle CRD entirely. It adds unnecessary complexity that instead can be handled internally by BundleDeployment.
Here's what I'm thinking for BundleDeployment:
The BundleDeployment controller architecture would change such that there is no longer a provisioner class. That is now controlled by .spec.format. Also, we only need a single instance of the controller; the reconcile function can handle fetching, dealing with the format, and applying to the cluster.
We will probably also want to include .spec.healthChecks in some shape, but I haven't thought about what that looks like just yet. It might make sense to model it after pod probes.
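As a starting point for that discussion, a probe-inspired shape might look something like this (all field names below are hypothetical):

# Hypothetical .spec.healthChecks sketch, loosely modeled on pod probes; every field here is invented.
spec:
  healthChecks:
    - kind: deploymentAvailable        # wait for a Deployment shipped in the bundle to report Available
      name: my-operator-controller
      periodSeconds: 10
      failureThreshold: 3
    - kind: httpGet                    # probe an endpoint exposed by the installed workload
      url: https://my-operator.my-ns.svc:8443/healthz
      timeoutSeconds: 5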