Releases/2024 10 17 compute kq only #9451

lubennikovaav · 2024-10-18T12:58:46Z

This release won't be deployed for everyone.
The only purpose of this release is to generate a compute image that will contain new versions of kq extensions.

I will pin this image for one customer, so that they could test that new version of their extension works as expected.
If everything is ok it will be deployed normally with the next compute release, if not, we will have time to fix/revert it.

We cannot use compute image just from main or from PR, because the extension uses build-custom-extension infrastructure and prod projects only have access to prod s3 buckets.

removes the ConsoleRedirect backend from the main auth::Backends enum, copy-paste the existing crate::proxy::task_main structure to use the ConsoleRedirectBackend exclusively. This makes the logic a bit simpler at the cost of some fairly trivial code duplication.

## Problem At MacOS `pg_dynshmem` file is create in PGDATADIR which cause mismatch in directories comparison ## Summary of changes Add this files to the ignore list. ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist Co-authored-by: Konstantin Knizhnik <[email protected]>

## Problem When `Dockerfile.build-tools` gets changed, several PRs catch up with it and some might get unexpectedly cancelled workflows because of GitHub's concurrency model for workflows. See the comment in the code for more details. It should be possible to revert it after https://github.com/orgs/community/discussions/41518 (I don't expect it anytime soon, but I subscribed) ## Summary of changes - Do not queue `build-build-tools-image` workflows in the concurrency group

Adds a test for the (now fixed) storage broker limit issue, see #9268 for the description and #9299 for the fix. Also fix a race condition with endpoint creation/starts running in parallel, leading to file not found errors.

The empirically determined threshold doesn't hold for PG 17. Bump the limit to stabilise ci.

Also consider offloaded timelines for obtaining `retain_lsn`. This is required for correctness for all timelines that have not been flattened yet: otherwise we GC data that might still be required for reading. This somewhat counteracts the original purpose of timeline offloading of not having to iterate over offloaded timelines, but sadly it's required. In the future, we can improve the way the offloaded timelines are stored. We also make the `retain_lsn` optional so that in the future, when we implement flattening, we can make it None. This also applies to full timeline objects by the way, where it would probably make most sense to add a bool flag whether the timeline is successfully flattened, and if it is, one can exclude it from `retain_lsn` as well. Also, track whether a timeline was offloaded or not in `retain_lsn` so that the `retain_lsn` can be excluded from visibility and size calculation. Part of #8088

) We now also track: - Number of PS IOs in-flight - Number of pages cached by smgr prefetch implementation - IO timing histograms for LFC reads and writes, per IO issued ## Problem There's little insight into the timing metrics of LFC, and what the prefetch state of each backend is. This changes that, by measuring (and subsequently exposing) these data points. ## Summary of changes - Extract IOHistogram as separate type, rather than a collection of fields on NeonMetrics - others, see items above. Part of #8926

## Problem This PR switches CI and Storage to Debain 12 (Bookworm) based images. ## Summary of changes - Add Debian codename (`bookworm`/`bullseye`) to most of docker tags, create un-codenamed images to be used by default - `vm-compute-node-image`: create a separate spec for `bookworm` (we don't need to build cgroups in the future) - `neon-image`: Switch to `bookworm`-based `build-tools` image - Storage components and Proxy use it - CI: run lints and tests on `bookworm`-based `build-tools` image

## Problem We were seeing timeouts on migrations in this test. The test unfortunately tends to saturate local storage, which is shared between the pageservers and the control plane database, which makes the test kind of unrealistic. We will also want to increase the scale of this test, so it's worth fixing that. ## Summary of changes - Instead of randomly creating timelines at the same time as the other background operations, explicitly identify a subset of tenant which will have timelines, and create them at the start. This avoids pageservers putting a lot of load on the test node during the main body of the test. - Adjust the tenants created to create some number of 8 shard tenants and the rest 1 shard tenants, instead of just creating a lot of 2 shard tenants. - Use archival_config to exercise tenant-mutating operations, instead of using timeline creation for this. - Adjust reconcile_until_idle calls to avoid waiting 5 seconds between calls, which causes timelines with large shard count tenants. - Fix a pageserver bug where calls to archival_config during activation get 404

Add a test for timeline offloading, and subsequent unoffloading. Also adds a manual endpoint, and issues a proper timeline shutdown during offloading which prevents a pageserver hang at shutdown. Part of #8088.

pg_session_jwt now: 1. Sets the JWK in a PGU_BACKEND session guc, no longer in the init() function. 2. JWK no longer needs the kid.

## Problem There is double update of resize cache in `put_rel_truncation` Also `page_server_request` contains check that fork is MAIN_FORKNUM which 1. is incorrect (because Vm/FSM pages are shreded in the same way as MAIN fork pages and 2. is redundant because `page_server_request` is never called for `get page` request so first part to OR condition is always true. ## Summary of changes Remove redundant code ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <[email protected]>

There are quite a few benefits to this approach: - Reduce config duplication - The two sql_exporter configs were super similar with just a few differences - Pull SQL queries into standalone files - That means we could run a SQL formatter on the file in the future - It also means access to syntax highlighting - In the future, run different queries for different PG versions - This is relevant because right now, we have queries that are failing on PG 17 due to catalog updates Signed-off-by: Tristan Partin <[email protected]>

…ics log (#9401) Our replication bench project is stuck because it is too slow to generate basebackup and it caused compute to disconnect. https://neondb.slack.com/archives/C03438W3FLZ/p1728330685012419 The compute timeout for waiting for basebackup is 10m (is it true?). Generating basebackup directly on pageserver takes ~3min. Therefore, I suspect it's because there are too many wasted round-trip time for writing the 10000+ snapshot aux files. Also, it is possible that the basebackup process takes too long time retrieving all aux files that it did not write anything over the wire protocol, causing a read timeout. Basebackup size is 800KB gzipped for that project and was 55MB tar before compression. ## Summary of changes * Potentially fix the issue by placing a write buffer for basebackup. * Log how many aux files did we read + the time spent on it. Signed-off-by: Alex Chi Z <[email protected]>

They weren't added in that PR, but should be available immediately on rollout as the neon extension already defaults to 1.5.

Just a typo in a path. Signed-off-by: Tristan Partin <[email protected]>

This should make it a little bit easier for people wanting to check if their files are formated correctly. Has the added bonus of making the CI check simpler as well. Signed-off-by: Tristan Partin <[email protected]>

neondatabase/cloud#16723

… shutdown (#9415) ## Problem In test `test_timeline_offloading`, we see failures like: ``` PageserverApiException: queue is in state Stopped ``` Example failure: https://neon-github-public-dev.s3.amazonaws.com/reports/main/11356917668/index.html#testresult/ff0e348a78a974ee/retries ## Summary of changes - Amend code paths that handle errors from RemoteTimelineClient to check for cancellation and emit the Cancelled error variant in these cases (will give clients a 503 to retry) - Remove the implicit `#[from]` for the Other error case, to make it harder to add code that accidentally squashes errors into this (500-equivalent) error variant. This would be neater if we made RemoteTimelineClient return a structured error instead of anyhow::Error, but that's a bigger refactor. I'm not sure if the test really intends to hit this path, but the error handling fix makes sense either way.

```shell cargo +nightly fmt -p proxy -- -l --config imports_granularity=Module,group_imports=StdExternalCrate,reorder_imports=true ``` These rust-analyzer settings for VSCode should help retain this style: ```json "rust-analyzer.imports.group.enable": true, "rust-analyzer.imports.prefix": "crate", "rust-analyzer.imports.merge.glob": false, "rust-analyzer.imports.granularity.group": "module", "rust-analyzer.imports.granularity.enforce": true, ```

## Problem The reconciler use `seq`, but processing of results uses `sequence`. Order is different too. It makes it annoying to read logs. ## Summary of Changes Use the same tracing fields in both

## Problem This test would get failures like `command failed: Found no timeline id for branch name 'branch_8'` It's because neon_local is being invoked concurrently for branch creation, which is unsafe (they'll step on each others' JSON writes) Example failure: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-9410/11363051979/index.html#testresult/5ddc56c640f5422b/retries ## Summary of changes - Don't do branch creation concurrently with endpoint creation via neon_local

the lint stabilized in 1.80.

- plv8 3.2.3 - HypoPG 1.4.1 - pgtap 1.3.3 - timescaledb 2.17.0 - pg_hint_plan 17_1_7_0 - rdkit Release_2024_09_1 - pg_uuidv7 1.6.0 - wal2json 2.6 - pg_ivm 1.9 - pg_partman 5.1.0 update support of extensions for v14-v16: - HypoPG 1.4.0 -> 1.4.1 - pgtap 1.2.0 -> 1.3.3 - plpgsql_check 2.5.3 -> 2.7.11 - pg_uuidv7 1.0.1 -> 1.6.0 - wal2json 2.5 -> 2.6 - pg_ivm 1.7 -> 1.9 - pg_partman 5.0.1 -> 5.1.0

The current code has forgotten to activate timelines during unoffload, leading to inability to receive the basebackup, due to the timeline still being in loading state. ``` stderr: command failed: compute startup failed: failed to get basebackup@0/0 from pageserver postgresql://no_user@localhost:15014 Caused by: 0: db error: ERROR: Not found: Timeline 508546c79b2b16a84ab609fdf966e0d3/bfc18c24c4b837ecae5dbb5216c80fce is not active, state: Loading 1: ERROR: Not found: Timeline 508546c79b2b16a84ab609fdf966e0d3/bfc18c24c4b837ecae5dbb5216c80fce is not active, state: Loading ``` Therefore, also activate the timeline during unoffloading. Part of #8088

part of #9255 ## Summary of changes Upgrade remote_storage crate to use hyper1. Hyper0 is used when providing the streaming HTTP body to the s3 SDK, and it is refactored to use hyper1. Signed-off-by: Alex Chi Z <[email protected]>

Since GetAuthInfoError now boxes the ControlPlaneError message the variant is not big anymore and AuthError is 32 bytes.

``` warning: first doc comment paragraph is too long --> compute_tools/src/installed_extensions.rs:35:1 | 35 | / /// Connect to every database (see list_dbs above) and get the list of installed extensions. 36 | | /// Same extension can be installed in multiple databases with different versions, 37 | | /// we only keep the highest and lowest version across all databases. | |_ | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#too_long_first_doc_paragraph = note: `#[warn(clippy::too_long_first_doc_paragraph)]` on by default help: add an empty line | 35 ~ /// Connect to every database (see list_dbs above) and get the list of installed extensions. 36 + /// | ```

Bad copy-paste seemingly. This manifested itself as a failure to start for the sql_exporter, and was just dying on loop in staging. A future PR will have E2E testing of sql_exporter. Signed-off-by: Tristan Partin <[email protected]>

Checkpointer related statistics moved from pg_stat_bgwriter to pg_stat_checkpointer, so we need to adjust our queries accordingly. Signed-off-by: Tristan Partin <[email protected]>

We need libprotobuf-dev for some of the `/usr/include/google/protobuf/...*.proto` referenced by our protobuf decls.

## Problem See https://neondb.slack.com/archives/C033A2WE6BZ/p1729007738526309?thread_ts=1722942856.987979&cid=C033A2WE6BZ When replica receives WAL record which target page is not present in shared buffer, we evict this page from LFC. If all pages from the LFC chunk are evicted, then chunk is moved to the beginning of LRU least to force it reuse. Unfortunately access_count is not checked and if the entry is access at this moment then this operation can cause LRU list corruption. ## Summary of changes Check `access_count` in `lfc_evict` ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist Co-authored-by: Konstantin Knizhnik <[email protected]>

## Problem While running `find-garbage` and `purge-garbage`, I encountered two things that needed updating: - Console API may omit `user_id` since org accounts were added - When we cut over to using GenericRemoteStorage, the object listings we do during purge did not get proper retry handling, so could easily fail on usual S3 errors, and make the whole process drop out. ...and one bug: - We had a `.unwrap` which expects that after finding an object in a tenant path, a listing in that path will always return objects. This is not true, because a pageserver might be deleting the path at the same time as we scan it. ## Summary of changes - When listing objects during purge, use backoff::retry - Make `user_id` an `Option` - Handle the case where a tenant's objects go away during find-garbage.

First PR for #9284 Start unification of the client and connection pool interfaces: - Exclude the 'global_connections_count' out from the get_conn_entry() - Move remote connection pools to the conn_pool_lib as a reference - Unify clients among all the conn pools

Adds a configuration variable for timeline offloading support. The added pageserver-global config option controls whether the pageserver automatically offloads timelines during compaction. Therefore, already offloaded timelines are not affected by this, nor is the manual testing endpoint. This allows the rollout of timeline offloading to be driven by the storage team. Part of #8088

- pgvector 7.4 update support of extensions for v14-v16: - pgvector 7.2 -> 7.4

Without having the '--features testing' in the cargo build the proxy won't start causing tests to fail.

github-actions · 2024-10-18T14:06:01Z

5202 tests run: 4995 passed, 0 failed, 207 skipped (full report)

Flaky tests (2)

Postgres 17

test_storage_controller_compute_hook: release-x86-64

Postgres 16

test_subscriber_restart: release-x86-64

Code coverage* (full report)

functions: 31.3% (7555 of 24134 functions)
lines: 49.1% (60432 of 122990 lines)

* collected from Rust tests only

_{The comment gets automatically updated with the latest test results
a7c0568 at 2024-10-18T14:06:01.185Z :recycle:}

MMeent

Compute: LGTM

conradludgate and others added 30 commits October 14, 2024 12:25

Add test for fixed storage broker issue (#9311)

d92ff57

Adds a test for the (now fixed) storage broker limit issue, see #9268 for the description and #9299 for the fix. Also fix a race condition with endpoint creation/starts running in parallel, leading to file not found errors.

tests: make size comparisons more lenient (#9388)

f4f7ea2

The empirically determined threshold doesn't hold for PG 17. Bump the limit to stabilise ci.

Shut down timelines during offload and add offload tests (#9289)

ec4cc30

Add a test for timeline offloading, and subsequent unoffloading. Also adds a manual endpoint, and issues a proper timeline shutdown during offloading which prevents a pageserver hang at shutdown. Part of #8088.

[local_proxy] update api for pg_session_jwt (#9359)

d92d36a

pg_session_jwt now: 1. Sets the JWK in a PGU_BACKEND session guc, no longer in the init() function. 2. JWK no longer needs the kid.

proxy: Migrate jwt module away from anyhow (#9361)

fb74c21

Add newly added metrics from #9116 to exports (#9402)

18f4e5f

They weren't added in that PR, but should be available immediately on rollout as the neon extension already defaults to 1.5.

Fix jsonnet_files wildcard

be5d6a6

Just a typo in a path. Signed-off-by: Tristan Partin <[email protected]>

Add jsonnetfmt targets

061ea0d

This should make it a little bit easier for people wanting to check if their files are formated correctly. Has the added bonus of making the CI check simpler as well. Signed-off-by: Tristan Partin <[email protected]>

don't trigger workflows in two repos (#9340)

bc6b8ce

neondatabase/cloud#16723

storcon: use the same trace fields for reconciler and results (#9410)

d490ad2

## Problem The reconciler use `seq`, but processing of results uses `sequence`. Order is different too. It makes it annoying to read logs. ## Summary of Changes Use the same tracing fields in both

Remove allow(clippy::unknown_lints) (#9416)

3140c14

the lint stabilized in 1.80.

proxy: merge AuthError and AuthErrorImpl (#9418)

ed69473

Since GetAuthInfoError now boxes the ControlPlaneError message the variant is not big anymore and AuthError is 32 bytes.

Fix typo in sql_exporter generator

409a286

Bad copy-paste seemingly. This manifested itself as a failure to start for the sql_exporter, and was just dying on loop in staging. A future PR will have E2E testing of sql_exporter. Signed-off-by: Tristan Partin <[email protected]>

tristan957 and others added 8 commits October 16, 2024 14:46

Fix some sql_exporter metrics for PG 17

e0fa6bc

Checkpointer related statistics moved from pg_stat_bgwriter to pg_stat_checkpointer, so we need to adjust our queries accordingly. Signed-off-by: Tristan Partin <[email protected]>

readme: fix build instructions for debian 12 (#9371)

67d5d98

We need libprotobuf-dev for some of the `/usr/include/google/protobuf/...*.proto` referenced by our protobuf decls.

Add support of extensions for v17 (part 3) (#9430)

8b47938

- pgvector 7.4 update support of extensions for v14-v16: - pgvector 7.2 -> 7.4

test_runner: Update the README.md to build neon with 'testing' (#9437)

a7c0568

Without having the '--features testing' in the cargo build the proxy won't start causing tests to fail.

lubennikovaav requested review from a team as code owners October 18, 2024 12:58

lubennikovaav requested review from conradludgate, tristan957, problame, mtyazici and bayandin and removed request for a team October 18, 2024 12:58

MMeent approved these changes Oct 18, 2024

View reviewed changes

lubennikovaav merged commit 7f080da into release Oct 18, 2024
172 checks passed

lubennikovaav deleted the releases/2024-10-17-compute-kq-only branch October 18, 2024 15:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Releases/2024 10 17 compute kq only #9451

Releases/2024 10 17 compute kq only #9451

lubennikovaav commented Oct 18, 2024

github-actions bot commented Oct 18, 2024

Postgres 17

Postgres 16

MMeent left a comment

Releases/2024 10 17 compute kq only #9451

Releases/2024 10 17 compute kq only #9451

Conversation

lubennikovaav commented Oct 18, 2024

github-actions bot commented Oct 18, 2024

5202 tests run: 4995 passed, 0 failed, 207 skipped (full report)

Postgres 17

Postgres 16

Code coverage* (full report)

MMeent left a comment

Choose a reason for hiding this comment