test: expand LRP test to include lifecycle events #4086
base: master
Pull Request Overview
Adds negative Local Redirect Policy (LRP) test scenarios and parameterizes Prometheus endpoint usage in existing LRP tests.
- Extends testLRPCase to accept a Prometheus address for flexible metric validation.
- Introduces comprehensive negative and resilience LRP tests (resource recreation, pod restarts, Cilium state validation).
- Adds helper functions for validating Cilium LRP state and recreating test resources.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| test/integration/lrp/lrp_test.go | Adds negative LRP test flow, Prometheus address parameter, and Cilium/LRP validation helpers. |
| test/integration/lrp/lrp_fqdn_test.go | Updates calls to testLRPCase to include the new Prometheus address parameter. |
| require.Greater(t, afterValue, beforeValue, "dns metric count did not increase after command - before: %.0f, after: %.0f", beforeValue, afterValue) | ||
| } else { | ||
| require.Equal(t, afterMetric.GetCounter().GetValue(), beforeMetric.GetCounter().GetValue(), "dns metric count increased after command") | ||
| require.Equal(t, afterValue, beforeValue, "dns metric count increased after command - before: %.0f, after: %.0f", beforeValue, afterValue) |
Copilot
AI
Oct 17, 2025
The testify assertions pass a format string with placeholders followed by arguments. require.Greater and require.Equal accept this form (testify formats msgAndArgs with fmt.Sprintf when the first element is a string and more arguments follow), so the values do appear in failure output; still, prefer require.Greaterf / require.Equalf, which make the formatting intent explicit.
| require.Equal(t, afterValue, beforeValue, "dns metric count increased after command - before: %.0f, after: %.0f", beforeValue, afterValue) | |
| require.Equalf(t, afterValue, beforeValue, "dns metric count increased after command - before: %.0f, after: %.0f", beforeValue, afterValue) |
test/integration/lrp/lrp_test.go
Outdated
| // Wait for deletion to complete | ||
| time.Sleep(10 * time.Second) |
Copilot
AI
Oct 17, 2025
Fixed sleep introduces unnecessary delay and potential flakiness; replace with a poll-based wait (e.g., repeatedly checking for resource absence/recreation with a timeout) to reduce test duration and improve reliability.
| // Wait for deletion to complete | |
| time.Sleep(10 * time.Second) | |
| // Wait for deletion to complete (poll for absence of client DaemonSet and LRP) | |
| retry.DoWithTimeout(ctx, "wait for client DaemonSet deletion", 30*time.Second, func(ctx context.Context) (bool, error) { | |
| _, err := dsClient.Get(ctx, clientDS.Name, metav1.GetOptions{}) | |
| if err != nil { | |
| // DaemonSet not found | |
| return true, nil | |
| } | |
| // Still exists | |
| return false, nil | |
| }) | |
| retry.DoWithTimeout(ctx, "wait for LRP deletion", 30*time.Second, func(ctx context.Context) (bool, error) { | |
| _, err := lrpClient.Get(ctx, lrp.Name, metav1.GetOptions{}) | |
| if err != nil { | |
| // LRP not found | |
| return true, nil | |
| } | |
| // Still exists | |
| return false, nil | |
| }) |
321a3ea to ba2b602
ba2b602 to b947a40
dfb53d7 to 4e6b260
Pull Request Overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
test/integration/lrp/lrp_test.go:517
- The rand.Seed() call is made every time TakeOne() is invoked, which can cause issues in concurrent tests or when called multiple times in quick succession (as the seed would be very similar). Additionally, rand.Seed() has been deprecated since Go 1.20 in favor of using the global generator which is automatically seeded.
Recommendation: Remove the rand.Seed() call and use the global random generator directly, which is automatically seeded in Go 1.20+:
    func TakeOne[T any](slice []T) T {
        if len(slice) == 0 {
            var zero T
            return zero
        }
        return slice[rand.Intn(len(slice))]
    }
Alternatively, if you need better randomness, use crypto/rand or math/rand/v2 for Go 1.22+.
// TakeOne takes one item from the slice randomly; if empty, it returns the empty value for the type
// Use in testing only
func TakeOne[T any](slice []T) T {
if len(slice) == 0 {
var zero T
return zero
}
rand.Seed(uint64(time.Now().UnixNano()))
test/integration/manifests/cilium/v1.13/cilium-config/cilium-config.yaml (2 resolved threads)
5523454 to b7e8be8
| retrier := retry.Retrier{Attempts: RetryAttempts, Delay: RetryDelay} | ||
| return errors.Wrap(retrier.Do(ctx, checkLRPDeleted), "failed to wait for LRP to delete") |
Is it reasonable to retry for 15 minutes (900 seconds)? How long does this process normally take?
If it takes what I expect (less than 10 seconds), then we can reuse the Delete* consts. Anything more than that is excessive.
Reason for Change:
Test Steps:
Purpose: This comprehensive lifecycle test ensures that Local Redirect Policy functionality is robust and survives various operational scenarios including pod restarts, resource deletion/recreation, and service restarts - validating both functional behavior through DNS metrics and dataplane configuration through Cilium CLI inspection.
Issue Fixed:
Requirements:
Notes: