Skip to content
This repository has been archived by the owner on Aug 12, 2024. It is now read-only.

Conversation

varshaprasad96
Copy link
Member

@varshaprasad96 varshaprasad96 commented Oct 16, 2023

This PR does considerable refactoring to the existing code to introduce a new reconciler to handle v1alpha2 bundle deployment API. This does not remove or make any changes to the existing alpha1 code base. Once we have everything related to alpha2, we can cleanup and remove other controllers and manifests that are not required.

Major changes made:

  1. A new reconciler introduced in: internal/v1alpha2/controllers/bundledeployment/bundledeployment.go
  2. Store interface for creating a local fs for each bundle deployment to store unpacked contents: internal/v1alpha2/store
  3. Unpacker interface and related code for unpacking multiple sources: internal/v1alpha2/source
  4. Validator: internal/v1alpha2/validator
  5. Deployer: internal/v1alpha2/deployer

The reconciler is not plugged into core/main.go. This PR splits #727 to make review process easier. All the features, not implemented here are tracked in there.

@varshaprasad96 varshaprasad96 requested a review from a team as a code owner October 16, 2023 22:11
@varshaprasad96
Copy link
Member Author

cc: @perdasilva @ncdc

@@ -86,6 +86,9 @@ type BundleDeploymentStatus struct {
//+kubebuilder:printcolumn:name=Provisioner,type=string,JSONPath=`.spec.provisionerClassName`,priority=1

// BundleDeployment is the Schema for the bundledeployments API
// TODO(varsha): Marking this as unserved version for now. Remove this definition during cleanup, when the all the
// work on alpha2 is done.
// +kubebuilder:unservedversion
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR does not wire in the reconciler to the controller yet. Can make this as a storage version until that is done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you mark v1alpha1 as unserved is the expectation that conversion from v1alpha1 to v1alpha2 and back exists?

@varshaprasad96 varshaprasad96 changed the title Add a bundle deployment reconciler to handle v1alpha1 API Add a bundle deployment reconciler to handle v1alpha2 API Oct 16, 2023
@codecov
Copy link

codecov bot commented Oct 16, 2023

Codecov Report

Attention: 606 lines in your changes are missing coverage. Please review.

Comparison is base (61432e2) 24.67% compared to head (fb0fa79) 23.71%.

❗ Current head fb0fa79 differs from pull request most recent head 1237a3c. Consider uploading reports for the commit 1237a3c to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #743      +/-   ##
==========================================
- Coverage   24.67%   23.71%   -0.96%     
==========================================
  Files          14       22       +8     
  Lines        1143     1923     +780     
==========================================
+ Hits          282      456     +174     
- Misses        807     1373     +566     
- Partials       54       94      +40     
Files Coverage Δ
internal/v1alpha2/validator/validator.go 81.25% <81.25%> (ø)
internal/v1alpha2/source/mock_store.go 23.52% <23.52%> (ø)
internal/v1alpha2/source/interface.go 0.00% <0.00%> (ø)
internal/v1alpha2/store/store.go 40.00% <40.00%> (ø)
...ollers/bundledeployment/bundledeployment_status.go 0.00% <0.00%> (ø)
internal/v1alpha2/source/image.go 52.53% <52.53%> (ø)
...2/controllers/bundledeployment/bundledeployment.go 0.00% <0.00%> (ø)
internal/v1alpha2/convert/registryV1.go 14.01% <14.01%> (ø)

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@varshaprasad96
Copy link
Member Author

Please review from the second commit. The first one would be rebased on #713


controller crcontroller.Controller
dynamicWatchMutex sync.RWMutex
dynamicWatchGVKs map[schema.GroupVersionKind]struct{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sets.Set[schema.GroupVersionKind]

//+kubebuilder:rbac:groups=core.rukpak.io,resources=bundledeployments,verbs=list;watch
//+kubebuilder:rbac:groups=core.rukpak.io,resources=bundledeployments/status,verbs=update;patch
//+kubebuilder:rbac:groups=core.rukpak.io,resources=bundledeployments/finalizers,verbs=update
//+kubebuilder:rbac:groups=*,resources=*,verbs=*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uh oh :(

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know! But I have no idea on the intention behind this. Copied from

//+kubebuilder:rbac:groups=*,resources=*,verbs=*

func (b *bundleDeploymentReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
log := log.FromContext(ctx)

existingBD := &v1alpha2.BundleDeployment{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: rukpackv1alpha2


// The controller is not updating spec, we only update the status. Hence sending
// a status update should be enough.
if !equality.Semantic.DeepEqual(existingBD.Status, reconciledBD.Status) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use applyconfigurations? I imagine the code generators create them already; new code should be written to use them if possible.

// by the bundle deployment store. These methods facilitate the unpacking and storage
// of contents from all the sources.
type Store interface {
afero.Fs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised you want to inline all the low-level bits from this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, the sources use limited number of helpers just for creating/removing/verifying contents inside dir. Instead of specifying them separately, inlining of afero.Fs would make it easier and save us from adding more helpers to this interface. I assume we would be using this only in rukpak, that too to access localdir, so expecting an afero.Fs implementation to be passed seemed reasonable.

return nil, fmt.Errorf("validation unsuccessful during unpacking %v", err)
}
// storage path to store contents in local directory.
storagePath := filepath.Join(store.GetBundleDeploymentName(), filepath.Clean(source.Destination))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You pass store and source into i.unpack() so you can likely inline this logic there instead of passing it in as a parameter

op, err := i.ensureUnpackPod(ctx, store.GetBundleDeploymentName(), *source, pod, opts)
if err != nil {
return nil, err
} else if op == controllerutil.OperationResultCreated || op == controllerutil.OperationResultUpdated || pod.DeletionTimestamp != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If i.ensureUnpackPod() did not return Created or Updated, why is a deleting pod a pending unpack?

if !apierrors.IsInvalid(err) {
return controllerutil.OperationResultNone, err
}
if err := i.Client.Delete(ctx, existingPod); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the Pod spec is immutable. What's the meaning of creating a broad apply-config & doing this dance? In what cases is updating the existing pod what you want? Why not delete + create always?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or is this factored to start with an apply for the happy case where there's nothing to change? I imagine there's a way to compute whether or not a change has occurred on the Bundle object and determine if it's needed to update the pod without a call to the API server

Copy link
Member Author

@varshaprasad96 varshaprasad96 Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or is this factored to start with an apply for the happy case where there's nothing to change?

This has been taken from the current alpha1 implementation. But this is the only reason I see.

I imagine there's a way to compute whether or not a change has occurred on the Bundle object and determine if it's needed to update the pod without a call to the API server

This in general needs to done on a higher level, regardless of the image unpacking using a pod. Right now, even though there is no change, the design unpacks during every reconcile (whether image, git or http) which is unnecessary. Probably a caching mechanism, where we compute a hash of the bdName and bundleSource (which contain image digest, git commit etc) to figure out if there has been a change in the bundle content would be a better way to reduce time and resource during reconcile. I had implemented something similar in #727, but wanted to keep it separate for now. If this is implemented, then we need not patch, instead as you mentioned, delete+create as we would be sure of the change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whichever gets you iterating faster is fine. I think a more explicit input check makes the rest of those code simpler & easier to test and reason about, but since you're copying it's moot :)


// make sure the passed in pod value is updated with the latest
// version of the pod
*pod = *updatedPod
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably best to return the pod instead of mutating in place - is there motivation for the side-effect approach?

// compare existingPod to newPod and return an appropriate
// OperatorResult value.
newPod := updatedPod.DeepCopy()
unsetNonComparedPodFields(existingPod, newPod)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is slightly hard to follow. If what you want is to compare pod.Spec, why not just do that?

if err != nil {
return fmt.Errorf("unpack failed: failed to retrieve failed pod logs: %v", err)
}
_ = i.Client.Delete(ctx, pod)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely want to ignore not found and return the error?

// Options to configure bundleDeploymentReconciler
type Option func(bd *bundleDeploymentReconciler)

func WithUnpacker(u *source.Unpacker) Option {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is it valid not to have an unpacker?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we just have this as a required field in the constructor instead of an option?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we just have this as a required field in the constructor instead of an option?

+1. An alternative would be to have a default unpacker that would be changed if this option is specified.

return ctrl.Result{}, err
}

switch res.State {
case source.StateUnpackPending:
setUnpackStatusPending(&bundleDeployment.Status.Conditions, fmt.Sprintf("pending unpack"), bundleDeployment.Generation)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we be watching pods with some unpacking label and triggering off of them changing state?

@stevekuznetsov
Copy link
Member

How much of this is straight copies from v1alpha1? Notably I think @everettraven has direct client unpacking merged, are you expecting to use that too?

@varshaprasad96
Copy link
Member Author

varshaprasad96 commented Oct 17, 2023

@stevekuznetsov The unpacking and deploy logic is copied from v1alpha1 bits. Implementing direct client unpacking is not a now a part of this right now. The design for this was to allow different implementations of store, unpacker and deployer. Some of this are to be passed per bundle deployment (store), whereas unpacker, validator and deployer are passed while creating the controller. I'll look at your comments and address those.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 18, 2023
This commit intends to introduce the new
v1alpha2 APIs

Signed-off-by: Varsha Prasad Narsing <[email protected]>
(cherry picked from commit a0238b9)
This commit adds alpha2 as storage version, and marks
alpha1 to be unserved.

Signed-off-by: Varsha Prasad Narsing <[email protected]>
This commit:
1. Adds skeleton for controller logic.
2. Adds "Store" interface, implementation for bundledeploymentstore and
   relevant tests.

Signed-off-by: Varsha Prasad Narsing <[email protected]>
This commit adds the default unpackers, with the specific
implementation and tests for unpacking from image source.

Signed-off-by: Varsha Prasad Narsing <[email protected]>
This commit adds:
1. Logic for validation of helm, registryV1 and plain bundles.
2. Tests for image source and a mock store implementation for use.

Signed-off-by: Varsha Prasad Narsing <[email protected]>
This commit introduces:
1. Deployer interface and Helm deployer - to deploy contents using helm
2. Wire deployer logic into reconciler.
3. Add SetupWithManager, so that this can be called from core/main.go in a
  follow up
.
Signed-off-by: Varsha Prasad Narsing <[email protected]>
@varshaprasad96 varshaprasad96 force-pushed the refactor/v1alpha2-controller-image-source branch from aa3dfa5 to 1237a3c Compare October 18, 2023 21:26
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 18, 2023
// +optional
Paused bool `json:"paused,omitempty"`

// Config is helm spcific configuration to load
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Config is helm spcific configuration to load
// Config is helm specific configuration to load

// Options to configure bundleDeploymentReconciler
type Option func(bd *bundleDeploymentReconciler)

func WithUnpacker(u *source.Unpacker) Option {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we just have this as a required field in the constructor instead of an option?

+1. An alternative would be to have a default unpacker that would be changed if this option is specified.

Comment on lines +153 to +158
// result can be nil, when there is an error during unpacking. This indicates
// that unpacking was failed. Update the status accordingly.
if res == nil || err != nil {
setUnpackStatusFailing(&bundleDeployment.Status.Conditions, fmt.Sprintf("unpack unsuccessful %v", err), bundleDeployment.Generation)
return ctrl.Result{}, err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for res == nil and err == nil at the same time? If so, I'd be cautious of lumping them into the same conditional here. From a ux perspective, if we hit the case where res == nil && err == nil the message in the status is going to be something like unpack unsuccessful <nil> which is a pretty non-descriptive message.

Comment on lines +163 to +166
// Requeing after 5 sec for now since the average time to unpack an registry bundle locally
// was around ~4sec.
// Warning: This could end up requeing indefinitely, till an error has occurred.
return ctrl.Result{RequeueAfter: 5 * time.Second}, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to manually requeue? IIUC we have a watch on Pods because the Unpacker uses them to unpack the bundle images. It looks like the v1alpha2 unpacker only has an image source so it seems like changes to the unpack Pod should trigger a re-reconciliation of the BundleDeployment that owns it making this forced requeue unnecessary

Comment on lines +167 to +168
case source.StateUnpacking:
setUnpackStatusPending(&bundleDeployment.Status.Conditions, "unpacking in progress", bundleDeployment.Generation)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC having a returned state of Unpacking from the unpacker used to correlate to updating the status condition Unpacked's reason to Unpacking. Is there a particular reason to remove it and instead use Pending with the message unpacking in progress?

Comment on lines +91 to +93
// setUnpgradeStatusFailed sets the installed success to failing as there is an error while patching
// objects on cluster.
func setUnpgradeStatusFailed(conditions *[]metav1.Condition, message string, generation int64) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// setUnpgradeStatusFailed sets the installed success to failing as there is an error while patching
// objects on cluster.
func setUnpgradeStatusFailed(conditions *[]metav1.Condition, message string, generation int64) {
// setUpgradeStatusFailed sets the installed success to failing as there is an error while patching
// objects on cluster.
func setUpgradeStatusFailed(conditions *[]metav1.Condition, message string, generation int64) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this net new or has this code been moved+modified?

}

// Result conveys the progress information about deploying content.
// TODO: Refactor to use the same result struct for unpacking and deployment.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What prevents us from doing this now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of ResolvedSource field. This api was intended to accept multiple resolved sources for a single BD. So that would need more discussion.

Comment on lines +280 to +283
// deployContents calls the registered deployer to apply the bundle contents onto the cluster.
func (b *bundleDeploymentReconciler) deployContents(ctx context.Context, store store.Store, bd *v1alpha2.BundleDeployment) (*deployer.Result, error) {
return b.deployer.Deploy(ctx, store, bd)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function seems unneccessary. We could just use the b.deployer.Deploy() function directly

Comment on lines +188 to +205
switch deployResult.State {
case deployer.StateIntallFailed:
setInstallStatusFailed(&bundleDeployment.Status.Conditions, err.Error(), bundleDeployment.Generation)
return ctrl.Result{}, err
case deployer.StateUnpgradeFailed:
setUnpgradeStatusFailed(&bundleDeployment.Status.Conditions, err.Error(), bundleDeployment.Generation)
return ctrl.Result{}, err
case deployer.StateReconcileFailed:
setReconcileStatusFailed(&bundleDeployment.Status.Conditions, err.Error(), bundleDeployment.Generation)
return ctrl.Result{}, err
case deployer.StateObjectFetchFailed:
setDynamicWatchFailed(&bundleDeployment.Status.Conditions, err.Error(), bundleDeployment.Generation)
return ctrl.Result{}, err
case deployer.StateDeploySuccessful:
setInstallStatusSuccess(&bundleDeployment.Status.Conditions, fmt.Sprintf("installed %s", bundleDeployment.GetName()), bundleDeployment.Generation)
default:
return ctrl.Result{}, fmt.Errorf("unknown deploy state %q for bundle deployment %s: %v", deployResult.State, bundleDeployment.GetName(), bundleDeployment.Generation)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way we could make deployContents return the condition that should be set?

@ncdc
Copy link
Member

ncdc commented Oct 19, 2023

Have discussed with @varshaprasad96 @gallettilance @joelanford - we are going to pause on this for the moment.

During the pause, we are going to investigate using https://carvel.dev/kapp-controller/ an alternative to continuing to restructure the rukpak APIs. kapp-controller does almost everything rukpak currently does (it doesn't yet support registry+v1 bundles) and it will save us a lot of time if we can switch to using it.

Closing this for now. We can reopen depending on the results of our research.

@ncdc ncdc closed this Oct 19, 2023
@grokspawn
Copy link
Contributor

Have discussed with @varshaprasad96 @gallettilance @joelanford - we are going to pause on this for the moment.

During the pause, we are going to investigate using https://carvel.dev/kapp-controller/ an alternative to continuing to restructure the rukpak APIs. kapp-controller does almost everything rukpak currently does (it doesn't yet support registry+v1 bundles) and it will save us a lot of time if we can switch to using it.

Closing this for now. We can reopen depending on the results of our research.

Haven't seen this in WG discussion. Have I missed a thread?

@ncdc
Copy link
Member

ncdc commented Oct 20, 2023

No missed threads. This is something I've been talking about with @joelanford and a few other folks.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants