fix(backup): enqueue volumes with correct names #3413

Merged
merged 1 commit into from
Dec 26, 2024

Conversation

mantissahz
Contributor

@mantissahz mantissahz commented Dec 25, 2024

@mantissahz mantissahz requested review from derekbit, c3y1huang and a team December 25, 2024 14:22
@mantissahz mantissahz self-assigned this Dec 25, 2024

coderabbitai bot commented Dec 25, 2024

Walkthrough

The pull request introduces changes to the SystemBackupController and VolumeController to enhance backup volume management and synchronization. A new method isVolumeLastBackupSynced is added to verify if the last backup of a volume is synchronized with its current state. The WaitForVolumeBackupToComplete method is updated to incorporate this synchronization check. Additionally, the enqueueVolumesForBackupVolume method in the VolumeController is modified to accurately retrieve and process backup volume names and disaster recovery (DR) volumes. New helper functions are also introduced in the test files to facilitate better snapshot management in tests.

Changes

File: change summary

controller/system_backup_controller.go
  • Added isVolumeLastBackupSynced method to check backup synchronization
  • Updated WaitForVolumeBackupToComplete to validate backup synchronization
  • Changed GetBackup to GetBackupRO for read-only backup retrieval
controller/volume_controller.go
  • Modified enqueueVolumesForBackupVolume to use canonical backup volume name
  • Refined logic for enqueuing DR volumes based on backup target name
controller/controller_test.go
  • Added newSnapshot function to create longhorn.Snapshot objects for testing
controller/system_backup_controller_test.go
  • Updated TestReconcileSystemBackup to handle snapshot management
  • Introduced fakeSystemRolloutSnapshot for lifecycle management of snapshots in tests
controller/system_restore_controller_test.go
  • Added fakeSystemRolloutSnapshot for managing fake snapshots in tests

Suggested reviewers

  • derekbit
  • ChanYiLin
  • innobead

@mantissahz
Contributor Author

@Mergifyio backport v1.8.x

mergify bot commented Dec 25, 2024

backport v1.8.x

🟠 Waiting for conditions to match

  • merged [📌 backport requirement]

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
controller/system_backup_controller.go (2)

Line range hint 837-850: Consider re-queue or wait mechanism for unsynced backups.

When c.isVolumeLastBackupSynced(backup) returns false, the code simply continues, leaving these backups in a non-processed state. Consider re-queuing them after a delay or adding a wait mechanism, so that eventually the isVolumeLastBackupSynced() condition can be satisfied.


870-883: Validate objects before checking the volume’s LastBackup.

If the snapshot or volume is missing, this method logs a warning and returns false. Consider adding a check for concurrency or repeating the lookup after some delay to handle transience or race conditions in object creation and updating.

controller/volume_controller.go (1)

4697-4718: Improve logging or error handling for mismatched backup targets.

The code enqueues volumes only when their Spec.BackupTargetName matches bv.Spec.BackupTargetName. Consider adding debug logs in the if backupTargetName != volume.Spec.BackupTargetName branch to clarify when volumes are skipped, aiding in future troubleshooting.
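As a hedged sketch of that suggestion, the skip branch could report which volumes were passed over. The `volume` struct, `matchingVolumes`, and the `debugf` callback are placeholders invented here, not the controller's real types or logger.

```go
package main

import "fmt"

// volume is a simplified, hypothetical stand-in for a Longhorn Volume.
type volume struct {
	Name             string
	BackupTargetName string // stand-in for Spec.BackupTargetName
}

// matchingVolumes returns the names of volumes whose backup target matches
// backupTargetName and reports every skipped volume through debugf.
func matchingVolumes(backupTargetName string, volumes []volume, debugf func(format string, args ...interface{})) []string {
	var matched []string
	for _, v := range volumes {
		if backupTargetName != v.BackupTargetName {
			debugf("skipping volume %s: backup target %q does not match %q",
				v.Name, v.BackupTargetName, backupTargetName)
			continue
		}
		matched = append(matched, v.Name)
	}
	return matched
}

func main() {
	volumes := []volume{
		{Name: "vol-1", BackupTargetName: "default"},
		{Name: "vol-2", BackupTargetName: "other"},
	}
	logf := func(format string, args ...interface{}) {
		fmt.Printf(format+"\n", args...)
	}
	fmt.Println(matchingVolumes("default", volumes, logf))
}
```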

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1b30ba2 and 5e145d6.

📒 Files selected for processing (2)
  • controller/system_backup_controller.go (3 hunks)
  • controller/volume_controller.go (1 hunks)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
controller/system_backup_controller_test.go (2)

262-268: Consider verifying snapshot existence before assigning SnapshotName
Currently, the test sets backup.Status.SnapshotName = backup.Spec.SnapshotName and retrieves the snapshot object based on that assignment. It might be safer to validate backup.Spec.SnapshotName before assigning it to backup.Status.SnapshotName in case it’s empty or invalid, to make debugging test failures easier.


380-411: Improve readability by extracting snapshot deletion logic
Within fakeSystemRolloutSnapshot(), the logic to remove any existing snapshot with the same name could be moved into a smaller helper method. This would help maintain clarity of the primary flow and simplify future test maintenance.

controller/controller_test.go (1)

467-475: Check for conflicting snapshot names
The new newSnapshot function sets the snapshot name but doesn’t validate whether the name is already used by an existing snapshot in the cluster. For robust testing in larger suites, consider adding randomization or uniqueness checks to avoid conflicts with other snapshots.
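A small illustration of the uniqueness idea, assuming a process-wide counter is acceptable for tests. `uniqueSnapshotName` and `snapshotSeq` are hypothetical helpers and are not part of the PR.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// snapshotSeq is a process-wide counter used to make test names unique.
var snapshotSeq uint64

// uniqueSnapshotName appends an increasing sequence number to prefix, so
// repeated calls never return the same name within one test process.
func uniqueSnapshotName(prefix string) string {
	n := atomic.AddUint64(&snapshotSeq, 1)
	return fmt.Sprintf("%s-%d", prefix, n)
}

func main() {
	fmt.Println(uniqueSnapshotName("snap"), uniqueSnapshotName("snap"))
}
```

The atomic increment keeps the helper safe if tests run in parallel goroutines.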

controller/system_backup_controller.go (1)

870-884: Validate potential concurrency during snapshot-volume retrieval
isVolumeLastBackupSynced fetches both the snapshot and volume in read-only mode. If the volume’s Status.LastBackup is updated in parallel, the verification could be stale. Consider adding concurrency or version checks if the system backup workflow can overlap with volume updates.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5e145d6 and 76ae0d4.

📒 Files selected for processing (4)
  • controller/controller_test.go (1 hunks)
  • controller/system_backup_controller.go (3 hunks)
  • controller/system_backup_controller_test.go (2 hunks)
  • controller/volume_controller.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • controller/volume_controller.go
🔇 Additional comments (2)
controller/system_backup_controller.go (2)

837-837: Validate usage of GetBackupRO
Replacing the standard GetBackup call with GetBackupRO clarifies read-only usage. Ensure that no other code modifications in this routine require a mutable backup object.


848-850: Confirm correctness of the volume-backup sync check
Here, the backup is skipped if isVolumeLastBackupSynced returns false. Confirm that calling continue is the right approach, and verify whether any additional cleanup or logging is needed for backups still in progress or partially synced.

`enqueueVolumesForBackupVolume` in the volume controller must use
`BackupVolume.Spec.VolumeName` to look up the volume CR instead of
`BackupVolume.Name`.

ref: longhorn/longhorn 5411, 10057

Signed-off-by: James Lu <[email protected]>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
controller/system_backup_controller.go (1)

Line range hint 837-850: Check partial backups with unsynchronized last backup

When a backup’s state is completed (line 848), the logic checks isVolumeLastBackupSynced(backup). If it's not synced, the code skips the backup by continuing. This behavior effectively waits until the volume’s best-known backup matches the backup resource. Ensure that higher-level logic either re-queues or handles any leftover unsynchronized backups to prevent them from being ignored indefinitely.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76ae0d4 and 1ce9ed7.

📒 Files selected for processing (5)
  • controller/controller_test.go (1 hunks)
  • controller/system_backup_controller.go (3 hunks)
  • controller/system_backup_controller_test.go (1 hunks)
  • controller/system_restore_controller_test.go (1 hunks)
  • controller/volume_controller.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • controller/controller_test.go
🔇 Additional comments (4)
controller/system_restore_controller_test.go (1)

894-924: Double-check snapshot creation logic for test correctness

The function effectively cleans up any existing snapshot with the same name before creating a new one, ensuring a fresh environment for testing. This approach looks thorough. However, consider edge cases where multiple snapshots share the same name or concurrency scenarios where multiple goroutines might interact with snapshots simultaneously. Although these situations may not arise in typical unit tests, adding documentation or safeguards for concurrent usage could further strengthen the code.

controller/system_backup_controller_test.go (1)

262-268: Validate SnapshotName resolution and ensure coverage of error handling

Lines 262–268 retrieve the snapshot from the cluster, then immediately assume its presence. The usage seems correct for a controlled environment. However, consider verifying that backup.Status.SnapshotName is set correctly in all scenarios and that the snapshot indeed belongs to the volume under test. This helps avoid confusion if multiple snapshots share a similar naming scheme.

controller/system_backup_controller.go (1)

870-883: Ensure robust error handling for missing snapshot references

The isVolumeLastBackupSynced function logs a warning if the snapshot or volume is missing and then returns false, indicating the backup is not synced. This is straightforward but consider whether you want any fallback behavior or error propagation if the snapshot is unexpectedly absent. In some cases, automatically re-trying or flagging an error condition might be more appropriate than silently returning false.

controller/volume_controller.go (1)

4697-4718: Handle DR volumes enqueuing for sync

This segment updates the queue with volumes that have matching backup volume names, but only if the backup target name matches (lines 4696–4718), thus avoiding enqueuing irrelevant volumes. The logic is clear and helps limit extraneous queue additions. If there is a possibility of name collision or delayed synchronization from the backup target, ensure the needed re-queues happen so volumes receive updates after the backup volume is created or updated.
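The selection described above might be sketched as follows. The types are simplified stand-ins rather than the Longhorn API: `SourceVolumeName` is a hypothetical field, and the example assumes the BackupVolume CR name can differ from the volume name, which is why the lookup goes through the stand-in for `Spec.VolumeName`.

```go
package main

import "fmt"

// backupVolume is a simplified, hypothetical stand-in for a BackupVolume CR.
type backupVolume struct {
	Name             string // CR name; assumed here to differ from the volume name
	VolumeName       string // stand-in for Spec.VolumeName
	BackupTargetName string // stand-in for Spec.BackupTargetName
}

// volume is a simplified, hypothetical stand-in for a Volume CR.
type volume struct {
	Name             string
	BackupTargetName string
	Standby          bool   // true for a DR volume
	SourceVolumeName string // hypothetical: volume whose backups a DR volume restores
}

// volumesToEnqueue picks the source volume by the backup volume's VolumeName
// (not its CR name) plus any DR volume that restores from that volume and
// uses the same backup target.
func volumesToEnqueue(bv backupVolume, volumes []volume) []string {
	var keys []string
	for _, v := range volumes {
		switch {
		case v.Name == bv.VolumeName:
			keys = append(keys, v.Name)
		case v.Standby && v.SourceVolumeName == bv.VolumeName &&
			v.BackupTargetName == bv.BackupTargetName:
			keys = append(keys, v.Name)
		}
	}
	return keys
}

func main() {
	bv := backupVolume{Name: "bv-x7k2", VolumeName: "vol-1", BackupTargetName: "default"}
	volumes := []volume{
		{Name: "vol-1", BackupTargetName: "default"},
		{Name: "dr-1", BackupTargetName: "default", Standby: true, SourceVolumeName: "vol-1"},
		{Name: "dr-2", BackupTargetName: "other", Standby: true, SourceVolumeName: "vol-1"},
	}
	fmt.Println(volumesToEnqueue(bv, volumes))
}
```

Matching on `bv.Name` instead would find nothing in this example, which is the failure mode the commit message describes.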

@derekbit derekbit merged commit b4418eb into longhorn:master Dec 26, 2024
9 checks passed
@derekbit
Member

@mergify backport v1.8.x

mergify bot commented Dec 26, 2024

backport v1.8.x

✅ Backports have been created

3 participants