Compute release 2024-09-25 #9151

lubennikovaav · 2024-09-25T16:10:21Z

No description provided.

Punted from #9077

## Problem

All the other patches were moved to the compute directory, and only one was left in the patches subdirectory in the root directory.

## Summary of changes

The patch was moved to the compute directory, like the others.

…ation (#9113)

Don't start waiting for the index to appear in the secondary until it has been created in the primary. Before, if the "pgbench -i" step took more than 60 s, we would give up. There was a flaky test failure along those lines at:

https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9105/10997477941/index.html#suites/950eff205b552e248417890b8b8f189e/73cf4b5648fa6f74/

Hopefully, this avoids such failures in the future.

After initdb, we configure it with "fsync=off" anyway.

Even with the 100 ms interval, on my laptop the pageserver always becomes available on the second attempt, so this saves about 900 ms at every test startup.
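As a rough illustration of the tighter polling described above, here is a minimal sketch. It is not the test suite's actual helper: `wait_until_available`, its parameters, the 10 s timeout, and the fake probe are all hypothetical names and values chosen for this example.

```python
import time
from typing import Callable, List


def wait_until_available(
    probe: Callable[[], bool],
    interval_s: float = 0.1,   # the 100 ms interval mentioned above
    timeout_s: float = 10.0,   # assumed overall budget, not from the original
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call probe() until it returns True; return the number of attempts."""
    deadline = time.monotonic() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        if probe():
            return attempts
        if time.monotonic() >= deadline:
            raise TimeoutError(f"not available after {attempts} attempts")
        sleep(interval_s)


# Hypothetical stand-in for a pageserver that is ready on the second probe.
_answers = iter([False, True])
naps: List[float] = []  # records each requested sleep instead of sleeping
attempts = wait_until_available(lambda: next(_answers), sleep=naps.append)
print(attempts, naps)
```

In this sketch a service that answers on the second probe costs exactly one `interval_s` of waiting, so shortening the interval directly shortens startup.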

Part of #8130.

## Problem

Currently, decompression is performed within the `read_blobs` implementation, and the decompressed blob is appended to the end of the `BytesMut` buffer. We will lose this flexibility of extending the buffer when we switch to using our own dio-aligned buffer (WIP in #8730). To facilitate the adoption of the aligned buffer, we need to refactor the code to perform decompression outside `read_blobs`.

## Summary of changes

- `VectoredBlobReader::read_blobs` will return `VectoredBlob` without performing decompression or appending the decompressed blob. It becomes the caller's responsibility to decompress the buffer.
- Added a new `BufView` type that functions as `Cow<Bytes, &[u8]>`.
- Perform decompression within `VectoredBlob::read` so that people don't have to explicitly think about compression when using the reader interface.

Signed-off-by: Yuchen Liang <[email protected]>

In the `imitate_synthetic_size_calculation_worker` function, we might obtain the `Cancelled` error variant instead of hitting the cancellation token based path. Therefore, catch `Cancelled` and handle it analogously to the cancellation case. Fixes #8886.

## Problem

Compilation of the neon extension on macOS produces a warning

```
pgxn/neon/neon_perf_counters.c:50:1: error: non-void function does not return a value [-Werror,-Wreturn-type]
```

## Summary of changes

- Change the return type of `NeonPerfCountersShmemInit` to void

1. Increase statement_timeout. It defaults to 120 s, which is not quite enough on slow or busy systems with a debug build. On my laptop, the index creation takes about 100 s. On the buildfarm, we've seen failures, e.g.: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9084/10997888708/index.html#suites/821f97908a487f1d7d3a2a4dd1571e99/db1834bddfe8c5b9/
2. Keep twiddling the LFC size through the whole test. Before, we would do it for the first 10 seconds, but that only covers a small part of the pgbench initialization phase. Change the loop so that the pgbench run time determines how long the test runs, and we keep changing the LFC size for the whole time.

In passing, also fix the bogus test description, copy-pasted from a completely unrelated test.
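The loop change in item 2 can be sketched roughly as follows. This is not the actual test code: `twiddle_while_running` is a hypothetical helper, a short-lived child process stands in for pgbench, and appending to a list stands in for resizing the LFC.

```python
import subprocess
import sys
import time
from typing import Callable, List, Tuple


def twiddle_while_running(
    cmd: List[str],
    twiddle: Callable[[int], None],
    interval_s: float = 0.05,
) -> Tuple[int, int]:
    """Run cmd and call twiddle(round) repeatedly until cmd exits."""
    proc = subprocess.Popen(cmd)
    rounds = 0
    # The workload's lifetime, not a fixed timer, bounds the loop.
    while proc.poll() is None:
        twiddle(rounds)
        rounds += 1
        time.sleep(interval_s)
    return proc.returncode, rounds


# Stand-in workload: a child process that sleeps briefly (instead of pgbench).
workload = [sys.executable, "-c", "import time; time.sleep(0.3)"]
sizes_mb: List[int] = []
# Alternate between two arbitrary sizes, standing in for an LFC resize.
returncode, rounds = twiddle_while_running(
    workload, lambda i: sizes_mb.append(1 if i % 2 == 0 else 8)
)
print(returncode, rounds)
```

The point of the structure is that a slower workload automatically gets more resize rounds, instead of the resizing stopping after a fixed 10 seconds.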

The last real use was removed in commit de90bf4. It was still used in a few unit tests, but they can use Attaching too.

Typo in the readme inside the docker-compose folder

## Summary of changes

- Update the readme

@koivunej

As @koivunej mentioned in the storage channel, for regress tests, we don't need to create a log file for the scrubber, and we should reduce noisy logs.

## Summary of changes

* Disable log file creation for storage scrubber
* Only log at info level

---------

Signed-off-by: Alex Chi Z <[email protected]>

We log something at info level each time we schedule a shard to a non-secondary location. Might as well have context for it.

## Problem

Storage controller didn't previously consider AZ locality between compute and pageservers when scheduling nodes. Control plane has this feature, and, since we are migrating tenants away from it, we need feature parity to avoid perf degradations.

## Summary of changes

The change itself is fairly simple:

1. Thread az info into the scheduler
2. Add an extra member to the scheduling scores

Step (2) deserves some more discussion. Let's break it down by the shard type being scheduled:

**Attached Shards**

We wish for attached shards of a tenant to end up in the preferred AZ of the tenant since that is where the compute is likely to be.

The AZ member for `NodeAttachmentSchedulingScore` has been placed below the affinity score (so it's got the second biggest weight for picking the node). The rationale for going below the affinity score is to avoid having all shards of a single tenant placed on the same node in 2 node regions, since that would mean that one tenant can drive the general workload of an entire pageserver. I'm not 100% sure this is the right decision, so open to discussing hoisting the AZ up to first place.

**Secondary Shards**

We wish for secondary shards of a tenant to be scheduled in a different AZ from the preferred one for HA purposes.

The AZ member for `NodeSecondarySchedulingScore` has been placed first, so nodes in different AZs from the preferred one will always be considered first. On small clusters, this can mean that all the secondaries of a tenant are scheduled to the same pageserver, but secondaries don't use up as many resources as the attached location, so IMO the argument made for attached shards doesn't hold.

Related: #8848
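To make the effect of member ordering concrete, here is a small sketch of lexicographic score comparison. It is a simplified model, not the storage controller's code: the two score classes below are hypothetical stand-ins for `NodeAttachmentSchedulingScore` and `NodeSecondarySchedulingScore`, they keep only two members each, and they assume that a lower score wins.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(order=True, frozen=True)
class AttachedScore:
    # Fields compare in declaration order; lower is better (assumed).
    affinity: int      # shards of this tenant already on the node
    az_mismatch: int   # 0 if the node is in the preferred AZ, else 1


@dataclass(order=True, frozen=True)
class SecondaryScore:
    in_preferred_az: int  # 0 if the node is outside the preferred AZ, else 1
    affinity: int


# node name -> (availability zone, number of this tenant's shards already there)
Nodes = Dict[str, Tuple[str, int]]


def pick_attached(nodes: Nodes, preferred_az: str) -> str:
    return min(
        nodes,
        key=lambda n: AttachedScore(nodes[n][1], int(nodes[n][0] != preferred_az)),
    )


def pick_secondary(nodes: Nodes, preferred_az: str) -> str:
    return min(
        nodes,
        key=lambda n: SecondaryScore(int(nodes[n][0] == preferred_az), nodes[n][1]),
    )


three_nodes: Nodes = {"ps-a": ("az-1", 1), "ps-b": ("az-1", 0), "ps-c": ("az-2", 0)}
two_nodes: Nodes = {"ps-a": ("az-1", 1), "ps-c": ("az-2", 0)}

print(pick_attached(three_nodes, "az-1"))   # ps-b: no affinity penalty, preferred AZ
print(pick_secondary(three_nodes, "az-1"))  # ps-c: outside the preferred AZ
print(pick_attached(two_nodes, "az-1"))     # ps-c: affinity outranks AZ
```

The last call mirrors the 2 node region case from the description: because affinity is compared before AZ, the second shard of a tenant goes to the other node even though that node is outside the preferred AZ.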

…9065)

## Problem

These commits are split off from https://github.com/neondatabase/neon/pull/8971/commits where I was fixing this to make a better scale test pass -- Vlad also independently recognized these issues with cloudbench in #9062.

1. The storage controller proxies GET requests to pageservers based on their intent, not the ground truth of where they're really attached.
2. Proxied requests can race with scheduling to tenants, resulting in 404 responses if the request hits the wrong pageserver.

Closes: #9062

## Summary of changes

1. If a shard has a running reconciler, then use the database generation_pageserver to decide who to proxy the request to
2. If such a request gets a 404 response and its scheduled node has changed since the request was dispatched.

Related to neondatabase/cloud#11698

This increases the support window of the OS used for PG17 by 2 years compared to the previous usage of Debian 11 "Bullseye".

## Problem

Hard to tell which timeline is holding the migration.

## Summary of Changes

Add timeline id to log.

Fixes #9098.

## Problem

See #9098 (comment).

### Related

A similar problem happened with branch creation, which was discussed [here](#2143 (comment)) and fixed by #2529.

## Summary of changes

- Normalize the LSN on the pageserver side upon an LSN lease request, and store the normalized LSN.

Signed-off-by: Yuchen Liang <[email protected]>
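The description does not spell out what normalization does. Assuming it follows the usual PostgreSQL WAL convention, where an LSN that falls exactly on a WAL page boundary is moved past the page header to the first byte that can hold record data, a minimal sketch looks like this. The function name and constants are written out for illustration (default 8 KiB WAL pages, 16 MiB segments, 24-byte short and 40-byte long page headers) and are not taken from this PR.

```python
XLOG_BLCKSZ = 8192                    # WAL page size (PostgreSQL default)
WAL_SEGMENT_SIZE = 16 * 1024 * 1024   # WAL segment size (PostgreSQL default)
SHORT_PAGE_HEADER = 24                # bytes, header of an ordinary WAL page
LONG_PAGE_HEADER = 40                 # bytes, header of a segment's first page


def normalize_lsn(lsn: int, seg_size: int = WAL_SEGMENT_SIZE) -> int:
    """Skip the WAL page header if lsn points exactly at a page boundary."""
    if lsn % XLOG_BLCKSZ != 0:
        return lsn  # already inside a page, nothing to do
    header = LONG_PAGE_HEADER if lsn % seg_size == 0 else SHORT_PAGE_HEADER
    return lsn + header


print(hex(normalize_lsn(0x2000000)))  # segment boundary
print(hex(normalize_lsn(0x2002000)))  # ordinary page boundary
print(hex(normalize_lsn(0x2002018)))  # already normalized
```

Under this assumption, storing the normalized value means that two lease requests for the same logical position, one given at the page boundary and one given just past the header, end up recorded under the same LSN.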

github-actions · 2024-09-25T17:11:09Z

5013 tests run: 4849 passed, 0 failed, 164 skipped (full report)

Flaky tests (2)

Postgres 17

test_pageserver_compaction_smoke: release-arm64

Postgres 14

test_subscriber_restart: release-x86-64

Code coverage* (full report)

functions: 32.0% (7490 of 23395 functions)
lines: 50.0% (60467 of 120849 lines)

* collected from Rust tests only

The comment gets automatically updated with the latest test results: d447f49 at 2024-09-25T17:11:08.487Z

arpad-m · 2024-09-25T17:13:03Z

#9146 is only the main branch backport, the commit should already be present on the release branch if you merge this PR (please merge with merge commit).

problame and others added 20 commits September 24, 2024 13:05

chore(#9077): cleanups & code dedup (#9082)

a65d437


Move the patch to compute (#9120)

b224a5a


test: Skip fsync when initdb'ing the storage controller db

589594c


test: Poll pageserver availability more aggressively at test startup

2f7ceca


Remove TenantState::Loading (#9118)

5cbf5b4


chore(docker-compose): fix typo in readme (#9133)

938b163


storcon: add tags to scheduler logs (#9127)

a26cc29


Re-export git-version from utils and remove as direct dep (#9138)

7dcfccc

docs(rfc): Independent compute release flow (#8881)

518f598


Build images for PG17 using Debian 12 "Bookworm" (#9132)

c4f5736


storcon: include timeline ID in LSN waiting logs (#9141)

c597238


lubennikovaav requested review from a team as code owners September 25, 2024 16:10

lubennikovaav requested review from problame, conradludgate and tristan957 and removed request for a team September 25, 2024 16:10

lubennikovaav mentioned this pull request Sep 25, 2024

Compute release 2024-09-25 #9149

Closed

lubennikovaav requested a review from MMeent September 25, 2024 16:11

bayandin approved these changes Sep 25, 2024


lubennikovaav requested a review from arpad-m September 25, 2024 16:13

MMeent approved these changes Sep 25, 2024


arpad-m approved these changes Sep 25, 2024


lubennikovaav merged commit 2b9fb47 into release Sep 25, 2024
196 of 231 checks passed

lubennikovaav deleted the releases/2024-09-25-compute-only-2 branch September 25, 2024 22:37
