Fix: Missing Kind in preview mode (yaml/v2) #2904

Merged 4 commits from eronwright/fix-missing-kind into master on Mar 27, 2024

Conversation

@EronWright (Contributor) commented Mar 22, 2024

Proposed changes

The yaml/v2 package eagerly resolves kinds in order to apply a default namespace. This PR addresses a problem with the use case where a kind is installed by one ConfigGroup and then used by another: in preview mode, the kind cannot be found by the latter group because the expected side effect (installation of a CRD) doesn't occur.

The solution is to maintain, in the provider, a cache of new CRDs. When the provider executes Check on a CustomResourceDefinition resource, it adds the CRD to the cache. Later, when a component resource must resolve a given kind, it consults the cache. In this way, the cache compensates for the lack of side effects (i.e. API server calls) that would normally allow the kind to be resolved. Overall ordering is established using DependsOn.

It is reasonable to populate the cache during Check because a) Check produces the "planned" state that is needed later without actuating it, and b) Update is called conditionally and thus isn't a reliable alternative.
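As a rough illustration, here is a minimal sketch of such a cache, assuming hypothetical names (crdCache, RecordCheck, Lookup) rather than the provider's actual implementation: Check on a CustomResourceDefinition records the planned CRD, and a later kind lookup falls back to the cache when the API server cannot resolve the kind.

```go
package clients

import (
	"sync"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

// crdCache remembers CRDs that Check has seen but that may not exist on the
// cluster yet (e.g. during preview). Sketch only; names are illustrative.
type crdCache struct {
	mu   sync.RWMutex
	crds map[schema.GroupKind]*unstructured.Unstructured
}

// RecordCheck is called from Check on a CustomResourceDefinition: it stores
// the planned CRD keyed by the group/kind it will serve.
func (c *crdCache) RecordCheck(crd *unstructured.Unstructured) {
	group, _, _ := unstructured.NestedString(crd.Object, "spec", "group")
	kind, _, _ := unstructured.NestedString(crd.Object, "spec", "names", "kind")
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.crds == nil {
		c.crds = map[schema.GroupKind]*unstructured.Unstructured{}
	}
	c.crds[schema.GroupKind{Group: group, Kind: kind}] = crd
}

// Lookup is consulted when kind resolution against the API server fails,
// compensating for the missing side effect in preview mode.
func (c *crdCache) Lookup(gk schema.GroupKind) (*unstructured.Unstructured, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	crd, ok := c.crds[gk]
	return crd, ok
}
```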

Implementation Details

The provider has three overlapping modes of operation (preview mode, clusterUnreachable, and yamlRenderMode) that affect the initialization and use of clientSet. Previously, when clusterUnreachable was true, the clientSet was left nil. Now we construct the clientSet itself but leave its various fields nil. This change makes it possible to use the CRD cache in all modes, which was necessary to support yamlRenderMode.
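Concretely, the unreachable-cluster path might look roughly like the following (a sketch assuming the crdCache type above and the DynamicClientSet fields shown in the review diff below; not the exact provider code):

```go
// When clusterUnreachable is true, still construct the client set so the CRD
// cache is available in every mode; the per-cluster clients stay nil.
cs := &DynamicClientSet{
	GenericClient:         nil,
	DiscoveryClientCached: nil,
	RESTMapper:            nil,
	CRDCache:              &crdCache{},
}

// Callers must guard the nil fields, e.g. before using discovery:
if cs.DiscoveryClientCached != nil {
	// only talk to the API server when a cluster connection exists
}
```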

Testing

A new integration test, yamlv2, is added to exercise this scenario.

"Cluster unreachable" mode was tested manually by misconfiguring my local environment to use an invalid kube context. Tested in combination with "yaml rendering mode", which works as expected even when the cluster is unreachable.

Related issues (optional)


Does the PR have any schema changes?

Looking good! No breaking changes found.
No new resources/functions.


codecov bot commented Mar 23, 2024

Codecov Report

Attention: Patch coverage is 41.79104%, with 39 lines in your changes missing coverage. Please review.

Project coverage is 27.63%. Comparing base (a15641e) to head (28c6912).

Files Patch % Lines
provider/pkg/clients/cache.go 50.00% 13 Missing and 5 partials ⚠️
provider/pkg/clients/clients.go 36.84% 11 Missing and 1 partial ⚠️
provider/pkg/provider/provider.go 22.22% 6 Missing and 1 partial ⚠️
provider/pkg/provider/invoke_decode_yaml.go 0.00% 1 Missing ⚠️
provider/pkg/provider/invoke_helm_template.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2904      +/-   ##
==========================================
+ Coverage   27.55%   27.63%   +0.08%     
==========================================
  Files          53       54       +1     
  Lines        7818     7862      +44     
==========================================
+ Hits         2154     2173      +19     
- Misses       5485     5504      +19     
- Partials      179      185       +6     


@EronWright EronWright marked this pull request as ready for review March 25, 2024 17:12
@EronWright EronWright requested review from rquitales and blampe March 25, 2024 17:12
@EronWright EronWright added the impact/no-changelog-required This issue doesn't require a CHANGELOG update label Mar 25, 2024
@blampe (Contributor) left a comment


Mostly just some small questions and clarifications.

Higher level -- I definitely agree this makes for a better user experience, and how you've implemented things matches my understanding of the provider lifecycle (a single provider per operation etc.). I can't think of a better way to accomplish the desired UX given the current engine behavior.

For completeness, the strongest argument I can think of against this is that it's an experience the user will only see once when a CRD is first installed. Does one preview over the lifetime of a stack warrant complexity that bumps into things like the engine/provider contract?

I don't have enough context to feel strongly one way or the other. I do prefer the UX, but my default position is to usually err on the side of simpler-is-better when brushing up against system boundaries like this. If folks from the Platform side are on board with this then it seems like a no brainer.

GenericClient: nil,
DiscoveryClientCached: nil,
RESTMapper: nil,
CRDCache: &CRDCache{},

I would make CRDCache private if there's only meant to be one instance accessed via this client set.

Resolved review thread: provider/pkg/clients/clients_test.go
@@ -257,7 +257,7 @@ func (c *chart) template(clientSet *clients.DynamicClientSet) (string, error) {
installAction.APIVersions = c.opts.APIVersions
}

if clientSet != nil {
if clientSet != nil && clientSet.DiscoveryClientCached != nil {
Contributor:

Is the tl;dr behind clientSet and DiscoveryClientCached that I would use DiscoveryClientCached if I can tolerate stale data and clientSet directly if I want to hit the live API?

@EronWright (Contributor, Author) replied on Mar 26, 2024:

No, the clientSet is a set of clients including a discovery client (that happens to have a built-in cache), a resource client, and a REST Mapper for translating Kinds to API Resources. AFAIK the provider doesn't cache any k8s objects but does cache the discovered type information.
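For context, a minimal client-go sketch (generic usage, not the provider's code) of how a cached discovery client and a REST mapper cooperate to translate a Kind into an API resource:

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/discovery"
	"k8s.io/client-go/discovery/cached/memory"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/restmapper"
	"k8s.io/client-go/tools/clientcmd"
)

// resolveKind maps a Kind (e.g. "Deployment") to its API resource using cached
// discovery data, roughly the job of the REST mapper inside the client set.
func resolveKind(disco discovery.CachedDiscoveryInterface, gk schema.GroupKind) (*meta.RESTMapping, error) {
	mapper := restmapper.NewDeferredDiscoveryRESTMapper(disco)
	return mapper.RESTMapping(gk) // picks the preferred version when none is given
}

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	dc, err := discovery.NewDiscoveryClientForConfig(config)
	if err != nil {
		panic(err)
	}
	cached := memory.NewMemCacheClient(dc) // caches discovered type info, not k8s objects
	_ = dynamic.NewForConfigOrDie(config)  // the resource client lives alongside, separately

	mapping, err := resolveKind(cached, schema.GroupKind{Group: "apps", Kind: "Deployment"})
	if err != nil {
		panic(err)
	}
	fmt.Println(mapping.Resource) // e.g. apps/v1, Resource=deployments
}
```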

Resolved review threads: provider/pkg/provider/provider.go, tests/provider/yamlv2_test.go, provider/pkg/clients/clients_test.go
@EronWright (Contributor, Author) commented:

I had a discussion with @Frassle where he expressed support but advised me to submit some tests into pu/pu to harden the contract.

@Frassle (Member) commented Mar 26, 2024

> I don't have enough context to feel strongly one way or the other. I do prefer the UX, but my default position is to usually err on the side of simpler-is-better when brushing up against system boundaries like this. If folks from the Platform side are on board with this then it seems like a no brainer.

There are a couple of arguments against this (or against adding state anywhere in providers, for that matter).

  1. It makes debug attach trickier, because you probably now need a clear "Reset" signal to tell the long-running attached provider process that it's time to clear its state and start again because a new deployment cycle is happening.
  2. It's less performant, because we can't share the provider process across the preview and up phases of an update (well, I suppose we could if we had a supported Reset method, as in 1).

But I suspect providers can do much smarter previews if they are stateful, and that's probably worth the downsides of the above.

Resolved review threads on provider/pkg/clients/cache.go (one outdated)
@EronWright (Contributor, Author) commented:

Just want to mention that the Cancel RPC method could perhaps serve as a reset signal. Today, Cancel is not called unless the user cancels the deployment, but it would probably be harmless to call it at the end of every run, and it could reset the provider state for another run. To the point about attaching to a long-running provider, cancellation is a further complication to that story unless the provider takes pains to 'reset' (p-k doesn't; it simply closes a channel).

@rquitales (Member) left a comment


LGTM

Resolved review thread: provider/pkg/clients/cache.go
@EronWright EronWright enabled auto-merge (squash) March 27, 2024 21:58
@EronWright EronWright merged commit 76baec9 into master Mar 27, 2024
18 checks passed
@EronWright EronWright deleted the eronwright/fix-missing-kind branch March 27, 2024 22:42
Labels: area/yaml, impact/no-changelog-required (This issue doesn't require a CHANGELOG update)
4 participants