-
-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace Append-only format with Sediment #57
Draft
ecton
wants to merge
8
commits into
main
Choose a base branch
from
sediment
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This introduces a third ActiveState that is managed on a per-committing-transaction basis. Rather than wait to release the locks on trees until after the transaction is fully synchronized, these changes release the locks after the file is fsynced but before the transaction log is synced. Before the transaction log is synced and the locks are released, the write state is cloned. After the transaction is confirmed, the state is published (taking care to ensure an older state doesn't overwrite a newer state since the OS controls waking threads). This works, but doesn't really offer parallelism due to the inability to batch the fsync operations. To complete this work, I need to figure out a good way to let the transaction manager issue the fsync commands, but be smart enough to only do it once per tree per batched commit.
The limiting factor of the last commit was that each write operation on each tree still had its own fsync operation. This change moves the responsibility of fsyncing to the transaction manager, which supports batching multiple transactions together already. Now the transaction manager keeps a unique list of paths that need to be synchronized as part of the batch. This allows multiple pending transactions for a single tree to pile up while waiting on the current batch to be fsynced. This is safe because when opening a file with a transaction manager, the file is automatically rolled back to the last confirmed transaction id. The transaction manager ensures each tree is fsynced before writing the transaction log entry and fsyncing the transaction. All new data written by pending transactions will be rolled back if a power loss occurs while the next batch is being fsynced unless the transaction log is fully synced. These changes allowed a significant speedup to BonsaiDb's commerce benchmark, and the new bottleneck is the reliance on two fsyncs.
Now returns the current read sequence rather than write.
This commit replaces the append-only format with Sediment to the point that the benchmarks function. The benchmarks do not use the Roots type, and most of that functionality is currently commented out. There's still significant work left before BonsaiDb can be used atop this implementation.
This is still a little ugly, but all unit tests are passing. Compaction is currently not implemented, and currently Sediment acts as an append-only file because Roots doesn't ever checkpoint it. This commit replaces the append-only transaction format with a commit-log powered transaction log using Sediment. When committing multiple trees, the transaction log thread will send the trees to the thread pool to be committed, while it commits the transaction log itself. Once the transaction log is fully synced, the log manager thread waits for all other trees to finish syncing, then continues to work like the previous implementation. The benefit of using Sediment is that this format now supports checkpointing the transaction log, and it also implements several other todos in the process. Closes #42. Closes #35. Closes #40.
Sediment's file manager now supports removing paths.
While testing BonsaiDb, I encountered several problems with the implementation: - Automatic checkpointing had to be disabled. When running a multithreaded workflow, we may have readers that require a BatchId to remain present until their read state has been dropped. Sediment will need some mechanism to track and prevent checkpointing beyond a certain point until in-memory handles have been dropped. Disabling automatic checkpointing puts Sediment at a performance disadvantage, as one of the benefits is the abilty to reuse space instead of writing to the end. - GrainIds shouldn't ever be 0. I replaced all GrainIds with Option<GrainId> to prevent asking Sediment for an invalid grain. - After getting the Commerce Benchmark running, I didn't really see an improvement -- given the disabling of automatic checkpointing, I shouldn't be surprised, but I felt like I should have seen *some* improvement. I noticed in profiling that sometimes the transaction batches were a single commit long, and so I theorized that by not allowing a little delay when trying to batch transactions that sometimes a single-transaction batch would get through, causing the remaining threads to block waiting on that one transaction to be written. This recv_deadline didn't move the needle, but I left it in because I still think this is one part of a better batching approach. My new theory this morning is that the transaction batcher isn't batching as much as it could. For example, if the transaction channel has ids [1, 3, 2, 4, 6, 5] in it, right now it would be committed as [1], [2, 3, 4], [5. 6]. This is because the first time a discontiguous ID is detected, the currently batched list is committed. I want to refactor the transaction batcher to always exhaust the channel before determining the list of safe-to-commit transactions. That change would also have benefited the old format. # Please enter the commit message for your changes. Lines starting
Sediment now has a CheckpointGuard which prevents checkpointing beyond a protected BatchId. The ActiveState now holds a reference to the CheckpointGuard pointed to by the most recent comment.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Closes #58.
This TODO list is for the previous attempt at integrating Sediment before it was rewritten in December