Skip to content

DRAFT: Fix SnippetsFilter Functional test #3751

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

bjee19
Copy link
Contributor

@bjee19 bjee19 commented Aug 15, 2025

Running the pipeline, do not review yet.

Issue is that after applying the SnippetsFilter manifest, which contains the HTTPRoute and GRPCRoute, there was no check to see if the GRPCRoute is ready. We only had a check to make sure traffic flowed to the HTTPRoute. I believe this caused a race condition where the test could occasionally fail if NGF was still processing the GRPCRoute while it had already processed the HTTPRoute and moved forward in the test suite.

Closes: #3753

@bjee19 bjee19 requested a review from a team as a code owner August 15, 2025 20:59
@bjee19 bjee19 changed the title Fix SnippetsFilter Functional test DRAFT: Fix SnippetsFilter Functional test Aug 15, 2025
@github-actions github-actions bot added bug Something isn't working tests Pull requests that update tests labels Aug 15, 2025
Copy link

codecov bot commented Aug 15, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.90%. Comparing base (235a4b2) to head (145aa73).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3751      +/-   ##
==========================================
- Coverage   86.90%   86.90%   -0.01%     
==========================================
  Files         128      128              
  Lines       16187    16201      +14     
  Branches       62       62              
==========================================
+ Hits        14067    14079      +12     
- Misses       1948     1949       +1     
- Partials      172      173       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

sarthyparty
sarthyparty previously approved these changes Aug 15, 2025
@sarthyparty sarthyparty dismissed their stale review August 18, 2025 14:28

it was draft

@bjee19 bjee19 force-pushed the bug/functional-tests-sf-fail branch from af17ad4 to 4469ac3 Compare August 18, 2025 23:30
@@ -64,6 +64,7 @@ var _ = Describe("SnippetsFilter", Ordered, Label("functional", "snippets-filter

BeforeAll(func() {
Expect(resourceManager.ApplyFromFiles(snippetsFilter, namespace)).To(Succeed())
Expect(resourceManager.WaitForAppsToBeReady(namespace)).To(Succeed())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

omg hoping this is it, great find

@bjee19
Copy link
Contributor Author

bjee19 commented Aug 19, 2025

Adding flake attempts, not quite sure whats causing these 502 errors and traffic not flowing to the httproutes as seen here: https://github.com/nginx/nginx-gateway-fabric/actions/runs/17055138233/job/48355054192#step:17:729.

The "no live upstreams while connecting to upstream" is a common error we've seen in other tests so it could be related to that. This error also seems to be quite inconsistent, in the past 8 local runs and 12 (4 times per pipeline run, and i ran it 3 times) runs of a functional test in the pipeline it only occurred once.

@sjberman
Copy link
Collaborator

no live upstreams is indicative of a problem with the backend apps themselves, not nginx. It would be nice if we could diagnose why those might be having issues (like logs, for example).

@bjee19
Copy link
Contributor Author

bjee19 commented Aug 19, 2025

I guess in our case the backend apps are just nginxdemos/nginx-hello:plain-text which shouldn't really be facing issues, do you think its worth diagnosing them? Especially when its very infrequent? @sjberman

Actually, it looks like all the non-plus SF tests fail on the GRPC test case which should be solved by the waiting for apps to be ready check. But the Plus function test runs fail on the httproute not being able to send traffic. See https://github.com/nginx/nginx-gateway-fabric/actions/runs/17075769277/job/48417202770 for an example. I'll keep taking a look.

@sjberman
Copy link
Collaborator

@bjee19 Well there is clearly something going wrong here. If the apps themselves are having problems (or k8s is restarting the pods, or something else), it's causing our tests to be flaky, which is annoying. We do still have to ensure our infrastructure works. Can't ignore the potential for that issue just because it isn't a product issue. Or the other potential issue is that nginx has the wrong IP addresses for the endpoints. In that case, it could be our problem directly.

There are definitely cases where rerunning flaky tests is valid, but I also don't want us to hide issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working tests Pull requests that update tests
Projects
Status: 🆕 New
Development

Successfully merging this pull request may close these issues.

Fix failing SnippetsFilter functional test
4 participants