Epic: productionize gc-compaction #9114
skyzh added the c/storage/pageserver (Component: storage: pageserver) and t/Epic (Issue type: Epic) labels on Sep 23, 2024
This was referenced Sep 23, 2024
skyzh added a commit that referenced this issue on Oct 17, 2024

part of #9114

## Summary of changes

gc-compaction may take a lot of disk space; if it does, the caller should run a partial gc-compaction instead. This patch adds a space check for the compaction job.

---------

Signed-off-by: Alex Chi Z <[email protected]>
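
The space check described in this commit could look roughly like the following. This is a minimal sketch, not the pageserver's code: the function names, the `headroom` factor, and the idea of estimating output size from input size are all assumptions made for illustration.

```python
import shutil


def choose_compaction_mode(free_bytes: int, input_bytes: int, headroom: float = 1.2) -> str:
    """Pick "full" when the estimated output fits on disk, else "partial".

    Assumption: the output is estimated as the input size times a headroom factor.
    """
    estimated_output = int(input_bytes * headroom)
    return "full" if free_bytes >= estimated_output else "partial"


def choose_mode_for_path(path: str, input_bytes: int) -> str:
    """Same decision, using the free space of the filesystem that holds `path`."""
    return choose_compaction_mode(shutil.disk_usage(path).free, input_bytes)
```

Keeping the decision in a pure function makes it testable without touching the disk; only the thin wrapper reads the real free space.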
skyzh added a commit that referenced this issue on Oct 24, 2024

…#9493)

part of #9114, #8836, #8362

The split layer writer code can be used in a more general way: the caller puts unfinished writers into the batch layer writer and lets the batch layer writer ensure that the layers are produced atomically.

## Summary of changes

* Add batch layer writer, which atomically finishes the layers. `BatchLayerWriter::finish` is simply a copy-paste from the previous split layer writers.
* Refactor split writers to use the batch layer writer.
* The current split writer tests cover all code paths of the batch layer writer.

---------

Signed-off-by: Alex Chi Z <[email protected]>
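
To illustrate the "all layers or none" idea behind the batch layer writer, here is a small sketch. It assumes a hypothetical `PendingLayer` with `finish` and `discard` methods; the real `BatchLayerWriter::finish` lives in the Rust pageserver and may handle failures differently.

```python
class PendingLayer:
    """Hypothetical stand-in for an unfinished layer writer."""

    def __init__(self, name: str, fail: bool = False):
        self.name = name
        self.fail = fail
        self.state = "pending"

    def finish(self) -> None:
        if self.fail:
            raise OSError(f"cannot finish {self.name}")
        self.state = "finished"

    def discard(self) -> None:
        self.state = "discarded"


class BatchLayerWriterSketch:
    """Collects unfinished writers and finishes them as one unit."""

    def __init__(self):
        self._pending = []

    def add_unfinished(self, writer: PendingLayer) -> None:
        self._pending.append(writer)

    def finish(self) -> list:
        try:
            for writer in self._pending:
                writer.finish()
        except Exception:
            # One failure discards every layer, including those already finished.
            for writer in self._pending:
                writer.discard()
            raise
        return [writer.name for writer in self._pending]
```

The caller only sees either the full list of produced layers or an error, never a partial set.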
skyzh added a commit that referenced this issue on Oct 29, 2024

…#9134)

part of #8921, #9114

## Summary of changes

We start the partial compaction implementation with the image layer partial generation. The partial compaction API now takes a key range. We will only generate images for that key range for now, and remove layers fully included in the key range after compaction.

---------

Signed-off-by: Alex Chi Z <[email protected]>
Co-authored-by: Christian Schwarz <[email protected]>
skyzh added a commit that referenced this issue on Nov 11, 2024

The final patch for partial compaction, part of #9114, close #8921

Note that we didn't implement parallel compaction or a compaction scheduler for partial compaction. Currently it has to be scheduled with a Python script that splits the keyspace; in the future, the keyspace will be split automatically based on the key partitioning when the pageserver wants to trigger a gc-compaction.

## Summary of changes

* Update the layer selection algorithm to use the same selection as full compaction (everything that intersects or lies below the gc horizon).
* Update the layer selection algorithm to also generate a list of delta layers that need to be rewritten.
* Add the logic to rewrite delta layers and add them back to the layer map.
* Update the test case to do partial compaction on deltas.

---------

Signed-off-by: Alex Chi Z <[email protected]>
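
The layer selection for partial compaction can be pictured with the sketch below. It is an illustration under stated assumptions, not the actual algorithm: the `Layer` fields, the integer keys and LSNs, and the rule "a layer is selected when its start LSN is at or below the horizon" are simplifications chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    key_start: int
    key_end: int    # exclusive
    lsn_start: int
    is_delta: bool


def select_layers(layers, job_start, job_end, gc_horizon):
    """Return (fully_covered, deltas_to_rewrite) for the key range [job_start, job_end)."""
    fully_covered, deltas_to_rewrite = [], []
    for layer in layers:
        overlaps = layer.key_start < job_end and job_start < layer.key_end
        if not overlaps or layer.lsn_start > gc_horizon:
            continue  # outside the job's key range, or starting above the horizon
        if job_start <= layer.key_start and layer.key_end <= job_end:
            fully_covered.append(layer.name)      # can be removed after compaction
        elif layer.is_delta:
            deltas_to_rewrite.append(layer.name)  # sticks out of the range: rewrite
        # Partially overlapping image layers are left untouched in this sketch.
    return fully_covered, deltas_to_rewrite
```

Layers fully inside the key range are candidates for removal after compaction, while delta layers that extend past the range are the ones that need rewriting.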
This was referenced Nov 11, 2024
skyzh added a commit that referenced this issue on Nov 12, 2024

I had the impression that gc-compaction didn't test the case where the first record of the key history is `will_init`, because some code paths would panic in this case. Luckily this got fixed in #9026, so we can now implement such tests. Part of #9114

## Summary of changes

* Randomly changed some images into `will_init` Neon WAL records.
* Split `test_simple_bottom_most_compaction_deltas` into two test cases: one has a delta layer with `will_init` flags as the bottom layer, while the other is the original one with image layers.

---------

Signed-off-by: Alex Chi Z <[email protected]>
skyzh added a commit that referenced this issue on Nov 18, 2024

close #9552, close #8920, part of #9114

## Summary of changes

* Drop keys not belonging to this shard during gc-compaction to avoid constructing history that might have been truncated during shard compaction.
* Run gc-compaction at the end of shard compaction test.

---------

Signed-off-by: Alex Chi Z <[email protected]>
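
Dropping keys that do not belong to the current shard amounts to a filter over the records fed into gc-compaction. The sketch below uses a deliberately simplistic, hypothetical key-to-shard mapping (plain modulo); the real mapping used by the pageserver is not described in this issue.

```python
def belongs_to_shard(key: int, shard_number: int, shard_count: int) -> bool:
    """Hypothetical key-to-shard mapping: plain modulo, for illustration only."""
    return key % shard_count == shard_number


def keys_for_this_shard(records, shard_number: int, shard_count: int):
    """Keep only (key, lsn, value) records whose key this shard owns."""
    return [r for r in records if belongs_to_shard(r[0], shard_number, shard_count)]
```

Filtering before the key history is assembled means a foreign key's possibly truncated history is never reconstructed in the first place.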
skyzh added a commit that referenced this issue on Nov 19, 2024

)

part of #9114

We want to be able to run partial gc-compaction in tests. In the future, we can also extend this functionality to legacy compaction, so that we can trigger compaction for a specific key range.

## Summary of changes

* Support passing the compaction key range through pageserver routes.
* Refactor the input parameters of the compaction-related functions to take the new `CompactOptions`.
* Add tests for partial compaction. Note that the test may or may not trigger compaction, depending on the GC horizon. We need to improve the test case to ensure things always get below the gc_horizon so that gc-compaction is triggered.

---------

Signed-off-by: Alex Chi Z <[email protected]>
This was referenced Nov 19, 2024
github-merge-queue bot pushed a commit that referenced this issue on Dec 5, 2024

## Problem

part of #9114

gc-compaction can take a long time. This patch adds support for scheduling a gc-compaction job. The compaction loop will first handle L0->L1 compaction, and then gc compaction. The scheduled jobs are stored in a non-persistent queue within the tenant structure.

This will be the building block for the partial compaction trigger: if the system determines that we need to do a gc compaction, it will partition the keyspace and schedule several jobs. Each of these jobs will run for a short amount of time (i.e., 1 min). L0 compaction will be prioritized over gc compaction.

## Summary of changes

* Add compaction scheduler in tenant.
* Run scheduled compaction in integration tests.
* Change the manual compaction API to allow scheduling a compaction instead of running it immediately.
* Add an LSN upper bound as a gc-compaction parameter. If we schedule partial compactions, gc_cutoff might move between runs, so we need to pass in a pre-determined gc_cutoff. (TODO: support an LSN lower bound so that we can compact an arbitrary "rectangle" in the layer map.)
* Refactor the gc_compaction internal interface.

---------

Signed-off-by: Alex Chi Z <[email protected]>
Co-authored-by: Christian Schwarz <[email protected]>
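
The scheduling order described here (L0 compaction first, then queued gc-compaction jobs from an in-memory queue) can be sketched as follows. The class and method names are hypothetical, and the real scheduler in the tenant structure carries far more state.

```python
from collections import deque


class CompactionSchedulerSketch:
    """Non-persistent queue of gc-compaction jobs; L0 compaction always wins."""

    def __init__(self):
        self._gc_queue = deque()

    def schedule(self, job) -> None:
        self._gc_queue.append(job)

    def queue_len(self) -> int:
        return len(self._gc_queue)

    def next_action(self, l0_pending: bool):
        """Decide what one iteration of the compaction loop should do."""
        if l0_pending:
            return ("l0", None)  # gc jobs stay queued until L0 work is done
        if self._gc_queue:
            return ("gc", self._gc_queue.popleft())
        return ("idle", None)
```

Because the queue lives only in memory, scheduled jobs in this sketch are lost on restart, which matches the "non-persistent" wording above.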
github-merge-queue bot pushed a commit that referenced this issue on Dec 6, 2024

## Problem

part of #9114, stacked PR over #9809

The compaction scheduler now schedules partial compaction jobs.

## Summary of changes

* Add the compaction job splitter based on size.
* Schedule subcompactions using the compaction scheduler.
* Test subcompaction scheduler in the smoke regress test.
* Temporarily disable layer map checks.

---------

Signed-off-by: Alex Chi Z <[email protected]>
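
A size-based job splitter can be sketched as a greedy pass over key partitions. The input shape `(key_start, key_end, size_bytes)` and the `max_job_bytes` threshold are assumptions for this example; the commit does not spell out how sizes are estimated.

```python
def split_jobs(partitions, max_job_bytes: int):
    """Greedily merge adjacent key partitions into jobs of at most max_job_bytes.

    `partitions` is a sorted list of (key_start, key_end, size_bytes) tuples.
    A single partition larger than the limit still becomes its own job.
    """
    jobs = []
    start = end = None
    acc = 0
    for key_start, key_end, size in partitions:
        if start is not None and acc + size > max_job_bytes:
            jobs.append((start, end))  # close the current job before it overflows
            start, acc = None, 0
        if start is None:
            start = key_start
        end = key_end
        acc += size
    if start is not None:
        jobs.append((start, end))
    return jobs
```

Each resulting key range could then be handed to the scheduler as one subcompaction.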
This was referenced Dec 6, 2024
github-merge-queue bot pushed a commit that referenced this issue on Dec 9, 2024

## Problem

close #10049, close #10030, close #8861, part of #9114

The legacy gc process calls `get_latest_gc_cutoff`, which uses an Rcu different from the gc_info struct. In the gc_compaction_smoke test case, the "latest" cutoff could be lower than the one in the gc_info struct, causing gc-compaction to collect data that could still be accessed through `latest_gc_cutoff`.

Technically speaking, there's nothing wrong with gc-compaction using gc_info without considering latest_gc_cutoff, because gc_info is the source of truth. But let's fix it anyway.

## Summary of changes

* gc-compaction uses `latest_gc_cutoff` instead of gc_info to determine the gc horizon.
* If a gc-compaction is scheduled via the tenant compaction iteration, it takes the gc_block lock to avoid racing with functionality such as detach ancestor. (If it is triggered via the manual compaction API without scheduling, it won't take the lock.)

---------

Signed-off-by: Alex Chi Z <[email protected]>
Co-authored-by: Arpad Müller <[email protected]>
This was referenced Dec 10, 2024
github-merge-queue bot pushed a commit that referenced this issue on Dec 12, 2024

## Problem

part of #9114, stacked PR over #9897, partially refactored to help with #10031

## Summary of changes

* gc-compaction takes an `above_lsn` parameter. We only compact the layers above this LSN, and all data below the LSN is treated as if it were on the ancestor branch.
* Refactored gc-compaction to take a `GcCompactJob` that describes the rectangular range to be compacted.
* Added a unit test for this case.

---------

Signed-off-by: Alex Chi Z <[email protected]>
Co-authored-by: Christian Schwarz <[email protected]>
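
The "rectangle" a compaction job covers is a key range crossed with an LSN range. The sketch below is only a mental model: the field names and the choice of which bounds are inclusive are assumptions, and the real `GcCompactJob` struct in the pageserver may be shaped differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GcCompactJobSketch:
    key_start: int
    key_end: int     # exclusive (assumption)
    above_lsn: int   # lower LSN bound, exclusive (assumption)
    gc_cutoff: int   # upper LSN bound, inclusive (assumption)

    def covers(self, key: int, lsn: int) -> bool:
        """True when the (key, lsn) point falls inside the job's rectangle."""
        in_keys = self.key_start <= key < self.key_end
        in_lsns = self.above_lsn < lsn <= self.gc_cutoff
        return in_keys and in_lsns

    def treated_as_ancestor(self, lsn: int) -> bool:
        """Data at or below above_lsn is handled as if it were on the ancestor branch."""
        return lsn <= self.above_lsn
```

Fixing the upper bound in the job itself keeps a set of partial jobs consistent even if the live gc cutoff moves between runs.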
This was referenced Dec 16, 2024
github-merge-queue bot pushed a commit that referenced this issue on Dec 18, 2024

…tests (#10164)

## Problem

part of #9114

In #10127 we fixed the race, but we didn't add the errors to the allowlist.

## Summary of changes

* Allow repartition errors in the gc-compaction smoke test.

I think it might be worth refactoring the code in the future to allow multiple threads to get a copy of the repartition status (i.e., using Rcu).

Signed-off-by: Alex Chi Z <[email protected]>
github-merge-queue bot pushed a commit that referenced this issue on Dec 18, 2024

## Problem

We cannot get the size of the compaction queue and access the info. Part of #9114

## Summary of changes

* Add an API endpoint to get the compaction queue.
* gc_compaction test case now waits until the compaction finishes.

---------

Signed-off-by: Alex Chi Z <[email protected]>
github-merge-queue bot pushed a commit that referenced this issue on Dec 18, 2024

## Problem

In #8103 we changed the test case to have more test coverage of gc_compaction. Now that we have `test_gc_compaction_smoke`, we can revert this test case to serve its original purpose and revert the parameter changes.

part of #9114

## Summary of changes

* Revert pitr_interval from 60s to 10s.
* Assert the physical/logical size ratio in the benchmark.

---------

Signed-off-by: Alex Chi Z <[email protected]>
Co-authored-by: Arpad Müller <[email protected]>
github-merge-queue bot pushed a commit that referenced this issue on Dec 19, 2024

…#10044)

## Problem

In #9897 we temporarily disabled the layer valid check because the current one only considers the end result of all compaction algorithms, but partial gc-compaction would temporarily produce an "invalid" layer map.

part of #9114

## Summary of changes

Allow LSN splits to overlap in the slow path check. Currently, the valid check is only used in the storage scrubber (a background job) and during gc-compaction (without taking the layer lock). Therefore, it's fine for such checks to be a little less efficient in exchange for being more accurate.

---------

Signed-off-by: Alex Chi Z <[email protected]>
Co-authored-by: Arpad Müller <[email protected]>
github-merge-queue bot pushed a commit that referenced this issue on Dec 19, 2024

…10209)

## Problem

close #10208, part of #9114

## Summary of changes

* Ensure remote `latest_gc_cutoff` is up-to-date before removing any files for gc-compaction.

Signed-off-by: Alex Chi Z <[email protected]>
This week: did a staging run across all root timelines >= 1GB. I spent some time fixing bugs, so I didn't get to all pageservers and only ran it on pageserver-27.

After the holiday: finish the automatic trigger work, and find a way to run experiments in the background without keeping an HTTP connection open (may need to add new APIs).
github-merge-queue bot pushed a commit that referenced this issue on Jan 10, 2025

## Problem

close #10031, part of #9114

## Summary of changes

Move the compaction job generation to `compaction.rs`, making the code more readable and debuggable. The get compaction job API now also returns the running job; previously it only returned scheduled jobs.

---------

Signed-off-by: Alex Chi Z <[email protected]>
github-merge-queue bot pushed a commit that referenced this issue on Jan 15, 2025

## Problem

part of #9114, part of investigation of #10049

## Summary of changes

* If `cfg!(test) or cfg!(feature = testing)`, we always try generating an image to ensure the history is replayable, but we do not put the image layer into the final layer results. This lets us discover a wrong key history before we hit a read error.
* I suspect it's easier to trigger some races if gc-compaction runs continuously on a timeline, so I increased the frequency to twice per 10 churns.
* Also, create branches in the gc-compaction smoke tests to get more test coverage.

---------

Signed-off-by: Alex Chi Z <[email protected]>
Co-authored-by: Arpad Müller <[email protected]>
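
The "generate an image only to validate the history" idea can be sketched as a replay whose result is thrown away. The record shape `(kind, will_init, payload)` and the append-style delta are assumptions for this example; actual WAL redo is far more involved.

```python
def replay(history):
    """Reconstruct a value from records ordered by LSN.

    Each record is (kind, will_init, payload) with kind "image" or "delta".
    An image or a will_init delta resets the value; other deltas append to it.
    """
    value = None
    for kind, will_init, payload in history:
        if kind == "image" or will_init:
            value = payload
        elif value is None:
            raise ValueError("delta without a base image or will_init record")
        else:
            value = value + payload
    if value is None:
        raise ValueError("empty history")
    return value


def compact_key(history, testing: bool):
    """Return the records to keep; in testing mode, verify replayability first."""
    if testing:
        replay(history)  # result is discarded, only errors matter
    return list(history)
```

In testing mode a broken history fails during compaction instead of surfacing later as a read error.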
In #8002 we finished the gc-compaction functionality and ran it successfully on some small tenants in staging. The next step is to run it on larger tenants and on production tenants. This involves work to improve the gc-compaction process and add new features.
Functionality
This ensures that the compaction process can clean up data that couldn't be cleaned up by legacy compaction, and that it yields to more important compaction jobs if it would take a long time.
Misc
Testing