tests: stabilize test_storage_controller_node_deletion #10420

Open: wants to merge 2 commits into main

@jcsp (Collaborator) commented Jan 16, 2025

Problem

test_storage_controller_node_deletion failed intermittently because shards were moving around during timeline creation, which neon_local does not tolerate. The movements were unexpected because the shards had only just been created.

This was a regression from #9916.

Closes: #10383

Summary of changes

  • Make this test use multiple AZs -- this makes the storage controller's scheduling reliably stable

Why this works: in #9916, I made a simplifying assumption that we would have multiple AZs to get nice stable scheduling. That case is much easier, because each tenant has a well-defined primary+secondary location when it has an AZ preference and nodes have different AZs. Everything still works if you don't have multiple AZs, but there is a quirk: sometimes the optimizer can disagree with initial scheduling, so once in a while a shard moves after being created. That is annoying for tests, harmless IRL.
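To illustrate the reasoning above, here is a toy model, not the storage controller's actual scheduler. It assumes (hypothetically) that both initial scheduling and the optimizer rank nodes by AZ match first, and then fall back to different secondary criteria; the names `Node`, `initial_schedule`, `optimizer_choice`, and `shard_moves` are made up for this sketch. With distinct AZs the AZ match alone picks a unique node, so the two passes agree. With a single AZ every node matches, and the differing tie-breaks can disagree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: int
    az: str


def initial_schedule(nodes, preferred_az):
    # Hypothetical initial placement: AZ match first, then lowest node id.
    return min(nodes, key=lambda n: (n.az != preferred_az, n.node_id))


def optimizer_choice(nodes, preferred_az):
    # Hypothetical optimizer: AZ match first, then a *different* tie-break
    # (highest node id), standing in for whatever secondary criteria differ
    # between the two passes in the real system.
    return min(nodes, key=lambda n: (n.az != preferred_az, -n.node_id))


def shard_moves(nodes, preferred_az):
    """Return True if the optimizer would relocate a freshly created shard."""
    return initial_schedule(nodes, preferred_az) != optimizer_choice(nodes, preferred_az)


# All nodes share one AZ: the AZ preference cannot discriminate between them.
single_az = [Node(1, "az-a"), Node(2, "az-a"), Node(3, "az-a")]
# One node per AZ: the AZ preference selects exactly one node.
multi_az = [Node(1, "az-a"), Node(2, "az-b"), Node(3, "az-c")]

print(shard_moves(single_az, "az-a"))  # True: the two passes pick different nodes
print(shard_moves(multi_az, "az-a"))   # False: both pick the node in az-a
```

In this model the instability comes only from the tie-break mismatch, which is why giving each node its own AZ removes it; the real cause of the disagreement in the storage controller may be different.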

@jcsp added the a/test (Area: related to testing), a/tech_debt (Area: related to tech debt), and c/storage/controller (Component: Storage Controller) labels Jan 16, 2025

7326 tests run: 6949 passed, 0 failed, 377 skipped (full report)


Flaky tests (2)

Postgres 17

Postgres 16

Code coverage* (full report)

  • functions: 33.7% (8427 of 25017 functions)
  • lines: 49.2% (70469 of 143315 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
08fc7cb at 2025-01-16T10:45:29.948Z

@jcsp jcsp marked this pull request as ready for review January 16, 2025 12:06
Development

Successfully merging this pull request may close these issues.

storcon: created tenants can be optimised immediately (test_storage_controller_node_deletion failure)