pageserver: use direct IO for delta and image layer reads #9326

Merged
yliang412 merged 32 commits into main from yuchen/direct-io-for-read
Oct 21, 2024

@yliang412 yliang412 commented Oct 8, 2024

Part of #8130

Problem

Pageserver currently goes through the kernel page cache for all IO. The kernel page cache makes a lightly loaded pageserver look deceptively fast. Using direct IO would give our virtual file IO operations predictable latencies.

For reads in particular, data pages have extremely low temporal locality, because the most frequently accessed pages are already cached on the compute side.

Summary of changes

This PR enables pageserver to use direct IO for delta layer and image layer reads. We can ship them separately because these layers are write-once, read-many, so we will not be mixing buffered IO with direct IO.

  • implement IoBufferMut, a buffer type with aligned allocation (alignment currently set to 512).
  • use IoBufferMut everywhere we read image and delta layers.
  • leverage the Rust type system: the IoBufAlignedMut marker trait guarantees that input buffers for the IO operations are aligned.
  • make the page cache allocation aligned as well.

* in-memory layer reads and the write path will be shipped separately.
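
To illustrate the aligned-buffer and marker-trait idea from the list above, here is a minimal sketch using only the standard library. It is not the code from this PR: `AlignedBuf`, `AlignedMut`, and `fill_from` are hypothetical names invented for this example, and only the alignment value of 512 is taken from the description.

```rust
use std::alloc::{alloc_zeroed, dealloc, handle_alloc_error, Layout};
use std::ptr::NonNull;

/// Alignment of the buffer start and size; 512 mirrors the PR description.
pub const ALIGN: usize = 512;

/// Hypothetical stand-in for an aligned, zero-initialized heap buffer.
pub struct AlignedBuf {
    ptr: NonNull<u8>,
    layout: Layout,
}

impl AlignedBuf {
    /// Allocates at least `len` zeroed bytes, rounded up to a non-zero
    /// multiple of `ALIGN`.
    pub fn new(len: usize) -> Self {
        let size = (len.max(1) + ALIGN - 1) / ALIGN * ALIGN;
        let layout = Layout::from_size_align(size, ALIGN).expect("valid layout");
        // SAFETY: `layout` has a non-zero size.
        let raw = unsafe { alloc_zeroed(layout) };
        let ptr = NonNull::new(raw).unwrap_or_else(|| handle_alloc_error(layout));
        Self { ptr, layout }
    }

    pub fn as_mut_slice(&mut self) -> &mut [u8] {
        // SAFETY: `ptr` points to `layout.size()` zero-initialized bytes
        // that are exclusively borrowed through `&mut self`.
        unsafe { std::slice::from_raw_parts_mut(self.ptr.as_ptr(), self.layout.size()) }
    }
}

impl Drop for AlignedBuf {
    fn drop(&mut self) {
        // SAFETY: `ptr` was allocated with exactly this layout.
        unsafe { dealloc(self.ptr.as_ptr(), self.layout) };
    }
}

/// Hypothetical marker trait: implementors promise that their memory is
/// `ALIGN`-aligned. It is `unsafe` because the compiler cannot verify this.
pub unsafe trait AlignedMut {
    fn bytes_mut(&mut self) -> &mut [u8];
}

// SAFETY: `AlignedBuf::new` always allocates with `ALIGN` alignment.
unsafe impl AlignedMut for AlignedBuf {
    fn bytes_mut(&mut self) -> &mut [u8] {
        self.as_mut_slice()
    }
}

/// Stand-in for a read entry point: the trait bound rejects unaligned
/// buffers such as `Vec<u8>` at compile time. Returns the bytes copied.
pub fn fill_from<B: AlignedMut>(buf: &mut B, src: &[u8]) -> usize {
    let dst = buf.bytes_mut();
    let n = dst.len().min(src.len());
    dst[..n].copy_from_slice(&src[..n]);
    n
}
```

With this sketch, `fill_from(&mut AlignedBuf::new(4096), data)` compiles, while passing a `Vec<u8>` does not, because `Vec<u8>` does not implement the marker trait. The alignment requirement is enforced once, at the type level, instead of being checked at every call site.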

Testing

Integration test suite run with O_DIRECT enabled: #9350

Performance

We evaluated performance with the get-page-at-latest-lsn benchmark. The results show a decrease in the number of IOPS, no significant change in mean latency, and a slight improvement in the p99.9 and p99.99 latencies.

Benchmark

Rollout

We will set virtual_file_io_mode=direct region by region to enable direct IO for image and delta layer reads.
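
As a rough illustration of how such a setting might be interpreted, the sketch below parses a `virtual_file_io_mode=...` line into an enum. This is an assumption-laden example, not the pageserver's actual config code: the `IoMode` type, the `buffered` value, and `parse_io_mode_setting` are hypothetical; only the key name and the `direct` value come from the text above.

```rust
use std::str::FromStr;

/// Hypothetical IO mode selector. Only `direct` appears in the PR text;
/// `buffered` is assumed here as the name of the page-cache-backed mode.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IoMode {
    Buffered,
    Direct,
}

impl FromStr for IoMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "buffered" => Ok(IoMode::Buffered),
            "direct" => Ok(IoMode::Direct),
            other => Err(format!("unknown IO mode: {other:?}")),
        }
    }
}

/// Parses a single `virtual_file_io_mode=<mode>` line (hypothetical helper).
/// Surrounding whitespace and double quotes around the value are ignored.
pub fn parse_io_mode_setting(line: &str) -> Result<IoMode, String> {
    let (key, value) = line
        .split_once('=')
        .ok_or_else(|| format!("expected key=value, got {line:?}"))?;
    if key.trim() != "virtual_file_io_mode" {
        return Err(format!("unexpected key: {:?}", key.trim()));
    }
    value.trim().trim_matches('"').parse()
}
```

Rejecting unknown values with an error, rather than silently falling back to buffered IO, makes a mistyped setting visible during a region-by-region rollout.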

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

yliang412 and others added 16 commits September 30, 2024 23:54
Signed-off-by: Yuchen Liang <[email protected]>
Signed-off-by: Yuchen Liang <[email protected]>
@yliang412 yliang412 changed the base branch from main to yuchen/virtual-file-config October 8, 2024 21:26
github-actions bot commented Oct 8, 2024

5229 tests run: 5015 passed, 0 failed, 214 skipped (full report)


Flaky tests (1)

Postgres 17

Code coverage* (full report)

  • functions: 31.5% (7665 of 24363 functions)
  • lines: 48.9% (60249 of 123149 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
656ddbc at 2024-10-21T14:33:19.259Z :recycle:

Base automatically changed from yuchen/virtual-file-config to main October 9, 2024 12:33
@yliang412 yliang412 self-assigned this Oct 9, 2024
@yliang412 yliang412 added the c/storage/pageserver Component: storage: pageserver label Oct 9, 2024
@yliang412 yliang412 changed the title from "pageserver: use direct IO for disk io read path" to "pageserver: use direct IO for delta and image layer reads" October 11, 2024
Signed-off-by: Yuchen Liang <[email protected]>
@yliang412 yliang412 marked this pull request as ready for review October 14, 2024 04:42
@yliang412 yliang412 requested a review from a team as a code owner October 14, 2024 04:42
@skyzh skyzh left a comment

finish first part of my review for io buf implementation, will continue tomorrow with the blob readers and page cache :)

@skyzh skyzh left a comment

LGTM apart from a few nits, good work!

pageserver/src/tenant/storage_layer/inmemory_layer.rs (outdated, resolved)
pageserver/src/tenant/storage_layer/inmemory_layer.rs (outdated, resolved)
@yliang412 yliang412 merged commit 49d5e56 into main Oct 21, 2024
80 checks passed
@yliang412 yliang412 deleted the yuchen/direct-io-for-read branch October 21, 2024 15:01