-
Notifications
You must be signed in to change notification settings - Fork 153
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(backup): enqueue volumes with correct names #3413
Conversation
WalkthroughThe pull request introduces changes to the Changes
Possibly related PRs
Suggested reviewers
Tip CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
@Mergifyio backport v1.8.x |
🟠 Waiting for conditions to match
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
controller/system_backup_controller.go (2)
Line range hint
837-850
: Consider re-queue or wait mechanism for unsynced backups.When
c.isVolumeLastBackupSynced(backup)
returnsfalse
, the code simply continues, leaving these backups in a non-processed state. Consider re-queuing them after a delay or adding a wait mechanism, so that eventually theisVolumeLastBackupSynced()
condition can be satisfied.
870-883
: Validate objects before checking the volume’s LastBackup.If the snapshot or volume is missing, this method logs a warning and returns
false
. Consider adding a check for concurrency or repeating the lookup after some delay to handle transience or race conditions in object creation and updating.controller/volume_controller.go (1)
4697-4718
: Improve logging or error handling for mismatched backup targets.The code enqueues volumes only when their
Spec.BackupTargetName
matchesbv.Spec.BackupTargetName
. Consider adding debug logs in theif backupTargetName != volume.Spec.BackupTargetName
branch to clarify when volumes are skipped, aiding in future troubleshooting.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
controller/system_backup_controller.go
(3 hunks)controller/volume_controller.go
(1 hunks)
5e145d6
to
76ae0d4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
controller/system_backup_controller_test.go (2)
262-268
: Consider verifying snapshot existence before assigning SnapshotName
Currently, the test setsbackup.Status.SnapshotName = backup.Spec.SnapshotName
and retrieves the snapshot object based on that assignment. It might be safer to validatebackup.Spec.SnapshotName
before assigning it tobackup.Status.SnapshotName
in case it’s empty or invalid, to make debugging test failures easier.
380-411
: Improve readability by extracting snapshot deletion logic
WithinfakeSystemRolloutSnapshot()
, the logic to remove any existing snapshot with the same name could be moved into a smaller helper method. This would help maintain clarity of the primary flow and simplify future test maintenance.controller/controller_test.go (1)
467-475
: Check for conflicting snapshot names
The newnewSnapshot
function sets the snapshot name but doesn’t validate whether the name is already used by an existing snapshot in the cluster. For robust testing in larger suites, consider adding randomization or uniqueness checks to avoid conflicts with other snapshots.controller/system_backup_controller.go (1)
870-884
: Validate potential concurrency during snapshot-volume retrieval
isVolumeLastBackupSynced
fetches both the snapshot and volume in read-only mode. If the volume’sStatus.LastBackup
is updated in parallel, the verification could be stale. Consider adding concurrency or version checks if the system backup workflow can overlap with volume updates.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
controller/controller_test.go
(1 hunks)controller/system_backup_controller.go
(3 hunks)controller/system_backup_controller_test.go
(2 hunks)controller/volume_controller.go
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- controller/volume_controller.go
🔇 Additional comments (2)
controller/system_backup_controller.go (2)
837-837
: Validate usage of GetBackupRO
Replacing the standard GetBackup
call with GetBackupRO
clarifies read-only usage. Ensure that no other code modifications in this routine require a mutable backup object.
848-850
: Confirm correctness of the volume-backup sync check
Here, the backup is skipped if isVolumeLastBackupSynced
returns false. Confirm that calling continue
is the right approach, and verify whether any additional cleanup or logging is needed for backups still in progress or partially synced.
In `enqueueVolumesForBackupVolume` of volume controller, it needs to use `BackupVolume.Spec.VolumeName` to get volume CR instead of `BackupVolume.Name`. ref: longhorn/longhorn 5411, 10057 Signed-off-by: James Lu <[email protected]>
76ae0d4
to
1ce9ed7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
controller/system_backup_controller.go (1)
Line range hint
837-850
: Check partial backups with unsynchronized last backupWhen a backup’s state is completed (line 848), the logic checks
isVolumeLastBackupSynced(backup)
. If it's not synced, the code skips the backup by continuing. This behavior effectively waits until the volume’s best-known backup matches the backup resource. Ensure that higher-level logic either re-queues or handles any leftover unsynchronized backups to prevent them from being ignored indefinitely.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
controller/controller_test.go
(1 hunks)controller/system_backup_controller.go
(3 hunks)controller/system_backup_controller_test.go
(1 hunks)controller/system_restore_controller_test.go
(1 hunks)controller/volume_controller.go
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- controller/controller_test.go
🔇 Additional comments (4)
controller/system_restore_controller_test.go (1)
894-924
: Double-check snapshot creation logic for test correctness
The function effectively cleans up any existing snapshot with the same name before creating a new one, ensuring a fresh environment for testing. This approach looks thorough. However, consider edge cases where multiple snapshots share the same name or concurrency scenarios where multiple goroutines might interact with snapshots simultaneously. Although these situations may not arise in typical unit tests, adding documentation or safeguards for concurrent usage could further strengthen the code.
controller/system_backup_controller_test.go (1)
262-268
: Validate SnapshotName
resolution and ensure coverage of error handling
Lines 262–268 retrieve the snapshot from the cluster, then immediately assume its presence. The usage seems correct for a controlled environment. However, consider verifying that backup.Status.SnapshotName
is set correctly in all scenarios and that the snapshot indeed belongs to the volume under test. This helps avoid confusion if multiple snapshots share a similar naming scheme.
controller/system_backup_controller.go (1)
870-883
: Ensure robust error handling for missing snapshot references
The isVolumeLastBackupSynced
function logs a warning if the snapshot or volume is missing and then returns false, indicating the backup is not synced. This is straightforward but consider whether you want any fallback behavior or error propagation if the snapshot is unexpectedly absent. In some cases, automatically re-trying or flagging an error condition might be more appropriate than silently returning false.
controller/volume_controller.go (1)
4697-4718
: Handle DR volumes enqueuing for sync
This segment updates the queue with volumes that have matching backup volume names, but only if the backup target name matches (lines 4696–4718), thus avoiding enqueuing irrelevant volumes. The logic is clear and helps limit extraneous queue additions. If there is a possibility of name collision or delayed synchronization from the backup target, ensure the needed re-queues happen so volumes receive updates after the backup volume is created or updated.
@mergify backport v1.8.x |
✅ Backports have been created
|
Which issue(s) this PR fixes:
Issue # longhorn/longhorn#5411, longhorn/longhorn#10057
What this PR does / why we need it:
In
enqueueVolumesForBackupVolume
of volume controller, it needs to use BackupVolume.Spec.VolumeNameto get volume CR instead of
BackupVolume.Name`.Special notes for your reviewer:
Additional documentation or context
Run test cases:
https://ci.longhorn.io/job/private/job/longhorn-tests-regression/7981
https://ci.longhorn.io/job/private/job/longhorn-tests-regression/7986
https://ci.longhorn.io/job/private/job/longhorn-tests-regression/7995
https://ci.longhorn.io/job/private/job/longhorn-tests-regression/8001
https://ci.longhorn.io/job/private/job/longhorn-tests-regression/8012
https://ci.longhorn.io/job/private/job/longhorn-tests-regression/8013
https://ci.longhorn.io/job/private/job/longhorn-tests-regression/8014
https://ci.longhorn.io/job/private/job/longhorn-tests-regression/8015