Feat/large validation #10417

Merged
merged 68 commits into from
Dec 2, 2024

@nfuden nfuden commented Nov 27, 2024

Note this is entirely ripped from #10296

Description

To address an issue where large configs passed as arguments caused an "argument list too long" error during validation, this PR adjusts how the config is provided to Envoy. The config is now fed to the process via STDIN, and --config-yaml is set to the file descriptor for STDIN.

Code changes

  • Switch to STDIN for injecting the config during validation
  • Pass CLUSTER_NAME to the kube2e test run
  • Adjusted instructions for running kube2e tests locally
  • Added make target for kind setup
  • Increased timeout of validation webhook in tests and added note to docs

Context

A customer with a large config reported the error.

Interesting decisions

Initially there were discussions about saving a file inside the container, but concerns about read-only root filesystems were raised. A volume would address that issue, but it is a more complex solution. To avoid requiring a volume, I opted to provide the config to the program over STDIN and point the config argument at the STDIN file descriptor.

Testing steps

I've applied the large config YAML from the test with and without the fix, confirming that it was broken and is now fixed.

% make kind-setup
% helm upgrade --install -n gloo-system --create-namespace gloo ./_test/gloo-1.0.0-ci1.tgz --values ./test/kubernetes/e2e/tests/manifests/common-recommendations.yaml
% make -B kind-reload-gloo
% kubectl create namespace full-envoy-validation-test
% kubectl apply -f test/kubernetes/e2e/features/validation/testdata/valid-resources/large-configuration.yaml

Notes for reviewers

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works

ryanrolds and others added 30 commits November 11, 2024 13:00
@nfuden nfuden requested a review from a team as a code owner November 27, 2024 15:36
@solo-changelog-bot

Issues linked to changelog:
https://github.com/solo-io/solo-projects/issues/7089

github-actions bot commented Nov 27, 2024

Visit the preview URL for this PR (updated for commit 807a1d1):

https://gloo-edge--pr10417-feat-large-validatio-lmx29blq.web.app

(expires Mon, 09 Dec 2024 20:20:36 GMT)

🔥 via Firebase Hosting GitHub Action 🌎

Sign: 77c2b86e287749579b7ff9cadb81e099042ef677

@sam-heilbron sam-heilbron left a comment

looking good!

projects/envoyinit/pkg/runner/run.go (outdated)

start := time.Now()
err := validateCmd.Run()
logger.Debugf("full envoy validation of %d size completed in %s", len(bootstrapConfig), time.Since(start))

I like this! I wonder if we could make this information consumable to create insights on, so users could see if their configuration was growing over time.

@nfuden (Collaborator, Author):

I think this might not be the right place but I like the idea. Perhaps higher up in translation we take our final output there to emit a metric?
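As a rough sketch of the idea in this thread, recording the config size and validation duration somewhere they can be read back might look like the following. This is only an assumption-laden illustration: it uses Go's standard `expvar` package rather than whatever metrics library the project actually uses, and the metric names and the `recordValidation` helper are hypothetical.

```go
package main

import (
	"expvar"
	"fmt"
	"time"
)

// Hypothetical metric names; published by expvar under /debug/vars when an
// HTTP server using the default mux is running.
var (
	lastConfigBytes = expvar.NewInt("validation_last_config_bytes")
	lastDurationMs  = expvar.NewInt("validation_last_duration_ms")
)

// recordValidation (hypothetical helper) runs validate, then records the size
// of the validated config and how long validation took.
func recordValidation(configSize int, validate func() error) error {
	start := time.Now()
	err := validate()
	lastConfigBytes.Set(int64(configSize))
	lastDurationMs.Set(time.Since(start).Milliseconds())
	return err
}

func main() {
	config := "static_resources: {}" // stand-in for a generated bootstrap config

	err := recordValidation(len(config), func() error {
		time.Sleep(10 * time.Millisecond) // stand-in for the real validation call
		return nil
	})

	fmt.Println("error:", err)
	fmt.Println("config bytes:", lastConfigBytes.Value())
	fmt.Println("duration ms:", lastDurationMs.Value())
}
```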

@@ -1,7 +1,9 @@
 gateway:
   validation:
     failurePolicy: Fail # For "strict" validation mode, fail the validation if webhook server is not available
-    allowWarnings: false # For "strict" validation mode, webhook will also reject warnings
+    allowWarnings: false

Any reason this is false? I imagined that we could test the fullEnvoyValidation even with allowWarnings=true

@nfuden (Collaborator, Author):

We could; this doesn't change the option, it just removes the comment.

@sam-heilbron sam-heilbron left a comment

LGTM!

Can you update the PR body to just add some context to it?

@nfuden nfuden merged commit 5598417 into main Dec 2, 2024
7 of 8 checks passed
@nfuden nfuden deleted the feat/large_validation branch December 2, 2024 20:44