TestReconcile flakiness #2560

Closed
acpana opened this issue Feb 1, 2023 · 6 comments


acpana commented Feb 1, 2023

(parking this here since I don't have time to investigate further)

I've seen a couple of failed TestReconcile runs on some PRs.

Here's a (probably not very helpful) error message from one of the runs: https://github.com/open-policy-agent/gatekeeper/actions/runs/4069256375/jobs/7008780533#step:4:56

--- FAIL: TestReconcile (32.07s)
    manager.go:29: running Manager: failed waiting for all runnables to end within grace period of 30s: context deadline exceeded
2023-02-01T22:27:00Z	INFO	controller	Running test: Cancel the expectations when sync only resource gets deleted	{"kind": "Config"}

Further down in the same log I can see:

E0201 22:27:11.689901   14958 reflector.go:140] pkg/dynamiccache/internal/informers_map.go:159: Failed to watch constraints.gatekeeper.sh/v1beta1, Kind=DenyAllCRDRecreated: the server could not find the requested resource

That looks like a red herring, though.

stale bot commented Apr 3, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 3, 2023

acpana commented Apr 14, 2023

Not stale, this is still happening. @maxsmythe @ritazh @sozercan, could one of you triage this issue with the right labels, please?


acpana commented Jun 15, 2023

Also, here's another indication of how this fails:

2023-06-13T22:22:17Z	ERROR	Reconciler error	{"controller": "config-controller", "object": {"name":"config","namespace":"gatekeeper-system"}, "namespace": "gatekeeper-system", "name": "config", "reconcileID": "3ea0538d-9833-418e-af1f-d93b08fec397", "error": "replaying data: replaying data for /v1, Kind=ConfigMap: synthetic failure"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/home/runner/work/gatekeeper/gatekeeper/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/home/runner/work/gatekeeper/gatekeeper/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/home/runner/work/gatekeeper/gatekeeper/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:235


acpana commented Aug 9, 2023

Having spent some time writing tests that use the mgr and other controllers, I'm somewhat confident that the issue comes from test runnables not being set up or cleaned up correctly. If I have cycles I can look at our _controller tests.


acpana commented Aug 30, 2023

This may have been fixed by 6ca3fa5, but I'll keep this issue open for now and take another pass over all _controller_tests before closing it.


acpana commented Oct 20, 2023

I'm fairly certain that this has been solved as per the above. I haven't seen any occurrences of this flake since.

Will close, but we can re-open if I am wrong.

acpana closed this as completed Oct 20, 2023