Skip to content

Conversation

@AndrewChubatiuk
Copy link
Contributor

fixes #1548

@AndrewChubatiuk AndrewChubatiuk force-pushed the pdb-per-shard branch 4 times, most recently from 0596368 to b6f202e Compare October 19, 2025 20:00
Copy link
Collaborator

@f41gh7 f41gh7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this change for the following reasons:

  1. it's intrusive change with a lot of refactoring and not necessary changes. It's hard to read.
  2. commit message doesn't have any motivation of this change.
  3. It moves shard template feature out of scope for the current implementation. If templating is supported by PBD, why it's not supported by ServiceSpec, ServiceScrapeSpec, HPA? What is the boundaries of it? I'd like to keep it as simple as possible and isolated.

@f41gh7
Copy link
Collaborator

f41gh7 commented Oct 20, 2025

cc @vrutkovs

@AndrewChubatiuk
Copy link
Contributor Author

AndrewChubatiuk commented Oct 20, 2025

this PR doesn't add templating support for PDB, it creates PDB per shard with additional shard-num label.

why it's not supported by ServiceSpec, ServiceScrapeSpec, HPA

PDB applies policy for pods, that match provided selector, and in case of multiple deployments there's no guarantee that pods will be rotated properly during upgrade. HPA is not supported by VMAgent, one ServiceScrapeSpec is enough to scrape metrics properly from all pods. For Service maybe it makes sense to have one per shard for scenario, when vmagent shards are configured as remote write destination, but in our case sharding is needed for scraping first of all and Service is mostly used for scraping as well

@AndrewChubatiuk AndrewChubatiuk force-pushed the pdb-per-shard branch 2 times, most recently from f887281 to 8de82d2 Compare October 20, 2025 12:22
@AndrewChubatiuk
Copy link
Contributor Author

AndrewChubatiuk commented Oct 20, 2025

  1. it's intrusive change with a lot of refactoring and not necessary changes. It's hard to read.
  2. commit message doesn't have any motivation of this change.
  1. i can split it into several PR's, maybe can apply some changes, that could improve readability for you, but overall idea of "not necessary changes" is to have similar logic of places, where shards are configured.
  2. updated

Copy link
Contributor

@vrutkovs vrutkovs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets move ginkgo and golangci bumps in a separate commits - this would make it easier to revert if needed

@AndrewChubatiuk AndrewChubatiuk force-pushed the pdb-per-shard branch 2 times, most recently from c5a666d to f8b8cf7 Compare October 20, 2025 16:27
@AndrewChubatiuk
Copy link
Contributor Author

updated PR

@AndrewChubatiuk AndrewChubatiuk force-pushed the pdb-per-shard branch 9 times, most recently from 86a833d to 18219fd Compare October 22, 2025 06:42
@AndrewChubatiuk AndrewChubatiuk force-pushed the pdb-per-shard branch 2 times, most recently from 8caa9cc to 91448eb Compare October 24, 2025 18:43
Copy link
Contributor

@vrutkovs vrutkovs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, few suggestions

@AndrewChubatiuk AndrewChubatiuk force-pushed the pdb-per-shard branch 2 times, most recently from a38d2f9 to dc0c5b9 Compare October 31, 2025 13:27
@f41gh7 f41gh7 requested a review from Copilot October 31, 2025 15:08
@AndrewChubatiuk AndrewChubatiuk requested review from Copilot and removed request for Copilot October 31, 2025 15:11
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the sharding implementation for VMAgent and VMAnomaly to create per-shard PodDisruptionBudgets (PDBs), ensuring proper application protection in sharded deployments. The key changes include:

  • Consolidation of shard-related logic into a new build/shard.go package
  • Refactoring PDB creation to support per-shard PDBs instead of a single global PDB
  • Introduction of unified orphaned resource removal that supports Deployments, StatefulSets, and PDBs
  • Removal of placeholder-based string templating in favor of structured shard handling

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
internal/controller/operator/factory/build/shard.go New package consolidating shard iteration, naming, labeling, and rendering logic
internal/controller/operator/factory/build/shard_test.go Tests for the new shard iterator function
internal/controller/operator/factory/build/pdb.go Extended to support per-shard PDB creation
internal/controller/operator/factory/vmagent/vmagent.go Refactored to use new shard abstractions and create per-shard PDBs
internal/controller/operator/factory/vmagent/vmagent_test.go Added test coverage for per-shard PDB creation
internal/controller/operator/factory/vmanomaly/statefulset.go Refactored to use new shard abstractions and create per-shard PDBs
internal/controller/operator/factory/finalize/orphaned.go Unified orphaned resource removal using unstructured API
internal/controller/operator/factory/finalize/orphaned_test.go Refactored tests to use a helper function pattern
api/operator/v1beta1/vmagent_types.go Added GetShardCount() helper method
docs/CHANGELOG.md Documented the bugfix

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…cation protection, before there was a single PDB, which could lead to unpredicted application disruptions. fixes #1548
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PDB issue with vmAgent

3 participants