Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add LogSegmentBuilder to construct LogSegments #457

Closed
Closed
Changes from 1 commit
Commits
Show all changes
75 commits
Select commit Hold shift + click to select a range
c7913dc
Move log segment into separate module
OussamaSaoudi-db Oct 28, 2024
6b331ac
Fix tests, make fields pub
OussamaSaoudi-db Oct 28, 2024
f1f9886
Improve comments
OussamaSaoudi-db Oct 28, 2024
8122113
Remove table changes
OussamaSaoudi-db Oct 28, 2024
46185ae
Merge branch 'main' into snapshot_cleanup
OussamaSaoudi-db Oct 28, 2024
471a858
change visibility
OussamaSaoudi-db Oct 28, 2024
4ac35da
Merge branch 'main' into snapshot_cleanup
OussamaSaoudi-db Nov 6, 2024
6297805
Merge branch 'main' into snapshot_cleanup
OussamaSaoudi-db Nov 6, 2024
e569719
Merge remote-tracking branch 'refs/remotes/origin/snapshot_cleanup' i…
OussamaSaoudi-db Nov 6, 2024
5edf4db
Merge branch 'main' into snapshot_cleanup
OussamaSaoudi-db Nov 6, 2024
0b8463a
Remove old log segment
OussamaSaoudi-db Nov 6, 2024
5300a7b
fix failing tests
OussamaSaoudi-db Nov 6, 2024
81d0de0
Get rid of warnings
OussamaSaoudi-db Nov 6, 2024
6b85932
Fix failing tests
OussamaSaoudi-db Nov 6, 2024
1384ea3
Apply suggestions from code review
OussamaSaoudi-db Nov 6, 2024
0182326
Address more pr comments
OussamaSaoudi-db Nov 6, 2024
d053a77
fix imports
OussamaSaoudi-db Nov 6, 2024
52f57e5
rebase onto git changes
OussamaSaoudi-db Nov 6, 2024
aa6c9f4
address nits
OussamaSaoudi-db Nov 6, 2024
dca491c
fix visibility issue
OussamaSaoudi-db Nov 6, 2024
bf5cdd4
Use LogSegmentBuilder
OussamaSaoudi-db Oct 30, 2024
748bab9
Introduce start and end versions
OussamaSaoudi-db Oct 30, 2024
2a6eb3e
Remove old code
OussamaSaoudi-db Nov 1, 2024
841f17f
Fix failing tests
OussamaSaoudi-db Nov 1, 2024
1ce29d8
Most up to date logsegment
OussamaSaoudi-db Nov 6, 2024
a2f9810
Fix failing test and remove unnecessary code
OussamaSaoudi-db Nov 6, 2024
5f7a680
remove table changes from this commit
OussamaSaoudi-db Nov 6, 2024
6d8e35f
remove table_changes
OussamaSaoudi-db Nov 6, 2024
6b78d56
Merge code
OussamaSaoudi-db Nov 8, 2024
c959b1d
Update log segment to latest version
OussamaSaoudi-db Nov 8, 2024
f121f67
Add doc comments
OussamaSaoudi-db Nov 8, 2024
6fbecb7
Fix tests and refactor
OussamaSaoudi-db Nov 9, 2024
7338834
small changes
OussamaSaoudi-db Nov 9, 2024
362900b
Move out checkpoint retain files
OussamaSaoudi-db Nov 9, 2024
de941f7
Merge branch 'main' into list_log_cleanup
OussamaSaoudi-db Nov 9, 2024
8c5a218
Address nits, upgrade to prasedlogpath
OussamaSaoudi-db Nov 11, 2024
a9c529d
merge main
OussamaSaoudi-db Nov 12, 2024
5dcff78
Address ommit
OussamaSaoudi-db Nov 12, 2024
ba079e6
Fix omit
OussamaSaoudi-db Nov 12, 2024
ccaed09
Address nit
OussamaSaoudi-db Nov 12, 2024
62be3d7
Apply suggestions from code review
OussamaSaoudi-db Nov 12, 2024
d22d661
address nit
OussamaSaoudi-db Nov 12, 2024
c7042ce
merge
OussamaSaoudi-db Nov 12, 2024
4b79a33
fix comment
OussamaSaoudi-db Nov 12, 2024
1a2fcb9
Merge branch 'main' into list_log_cleanup
OussamaSaoudi-db Nov 12, 2024
8aec522
Add docs and bring back omit
OussamaSaoudi-db Nov 12, 2024
d88e31e
Apply suggestions from code review
OussamaSaoudi-db Nov 13, 2024
4370737
Address pr comments
OussamaSaoudi-db Nov 13, 2024
6fccf58
Change checkpoint parts
OussamaSaoudi-db Nov 13, 2024
b83c74b
Fix test
OussamaSaoudi-db Nov 13, 2024
87b3247
address more comments
OussamaSaoudi-db Nov 13, 2024
83436d2
Added tests for log segment builder
OussamaSaoudi-db Nov 13, 2024
a7c2461
Add tests
OussamaSaoudi-db Nov 13, 2024
f7f05f2
Address comments
OussamaSaoudi-db Nov 13, 2024
11032a5
small nits
OussamaSaoudi-db Nov 13, 2024
84027cb
change checkpoint parts to checkpoint files
OussamaSaoudi-db Nov 13, 2024
4b7ff17
Make visibility the same between builder and logsegment
OussamaSaoudi-db Nov 13, 2024
4c9c96b
fix comment
OussamaSaoudi-db Nov 13, 2024
a53f38a
Update comments
OussamaSaoudi-db Nov 13, 2024
b0b6514
error message
OussamaSaoudi-db Nov 13, 2024
5654a6d
Merge branch 'main' into list_log_cleanup
OussamaSaoudi-db Nov 13, 2024
2630a9e
Use delta_path_for_version instead of get_path
OussamaSaoudi-db Nov 13, 2024
db5dddb
Add reference import
OussamaSaoudi-db Nov 13, 2024
6f9b459
Merge branch 'main' into list_log_cleanup
OussamaSaoudi-db Nov 13, 2024
a1531ee
Apply suggestions from code review
OussamaSaoudi-db Nov 14, 2024
33c4039
Update kernel/src/log_segment.rs
OussamaSaoudi-db Nov 14, 2024
1f077b0
More nits
OussamaSaoudi-db Nov 14, 2024
bee5659
Merge branch 'main' into list_log_cleanup
OussamaSaoudi-db Nov 14, 2024
9d0a24f
Return dev visibility
OussamaSaoudi-db Nov 14, 2024
f8457d9
make builder methods match builder visibility
OussamaSaoudi-db Nov 14, 2024
9c44a69
Fix doc issue
OussamaSaoudi-db Nov 14, 2024
3a214de
Address nits
OussamaSaoudi-db Nov 14, 2024
d5adae1
Update kernel/src/log_segment.rs
OussamaSaoudi-db Nov 14, 2024
6a5484c
add test, fix bug
OussamaSaoudi-db Nov 14, 2024
1f5b7f1
Merge branch 'list_log_cleanup' of github.com:OussamaSaoudi-db/delta-…
OussamaSaoudi-db Nov 14, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Move out checkpoint retain files
OussamaSaoudi-db committed Nov 9, 2024
commit 362900b39dd32008d0efa0c0228cc90620628263
18 changes: 10 additions & 8 deletions kernel/src/log_segment.rs
Original file line number Diff line number Diff line change
@@ -298,7 +298,7 @@ impl<'a> LogSegmentBuilder<'a> {
fs_client: &dyn FileSystemClient,
log_root: &Url,
) -> DeltaResult<(Vec<ParsedLogPath>, Vec<ParsedLogPath>)> {
let (mut commit_files, checkpoint_files, max_checkpoint_version) =
let (commit_files, checkpoint_files, max_checkpoint_version) =
Self::list_log_files_from_version(
fs_client,
log_root,
@@ -314,13 +314,10 @@ impl<'a> LogSegmentBuilder<'a> {

if max_checkpoint_version != checkpoint_metadata.version as i64 {
OussamaSaoudi-db marked this conversation as resolved.
Show resolved Hide resolved
warn!(
"_last_checkpoint hint is out of date. _last_checkpoint version: {}. Using actual most recent: {}",
checkpoint_metadata.version,
max_checkpoint_version
);
// we (may) need to drop commits that are before the _actual_ last checkpoint (that
// is, commits between a stale _last_checkpoint and the _actual_ last checkpoint)
commit_files.retain(|parsed_path| parsed_path.version as i64 > max_checkpoint_version);
"_last_checkpoint hint is out of date. _last_checkpoint version: {}. Using actual most recent: {}",
checkpoint_metadata.version,
max_checkpoint_version
);
} else if checkpoint_files.len() != checkpoint_metadata.parts.unwrap_or(1) {
return Err(Error::Generic(format!(
"_last_checkpoint indicated that checkpoint should have {} parts, but it has {}",
@@ -455,8 +452,13 @@ mod tests {
let (mut commit_files, checkpoint_files) =
LogSegmentBuilder::list_log_files_with_checkpoint(&checkpoint_metadata, &client, &url)
.unwrap();

// Make the most recent commit the first in iterator
commit_files.reverse();
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to flag this section. We are effectively doing the same work as the build function would do. However, build maps the ParsedLogPath to file.location. We need more information in this test than location, so we can't build and make the same assertions.

I considered factoring out part of build into a process_files function, but it didn't seem worth adding an entire extra function when this is the only test that needs it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and i suppose if we have LogSegment hold ParsedLogPaths then this is moot?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Just updated the code to use ParsedLogPath and update the test.

let max_checkpoint_version = checkpoint_files.last().unwrap().version;
// we (may) need to drop commits that are before the _actual_ last checkpoint (that
// is, commits between a stale _last_checkpoint and the _actual_ last checkpoint)
commit_files.retain(|parsed_path| parsed_path.version > max_checkpoint_version);

assert_eq!(checkpoint_files.len(), 1);
println!("checkpoint: {:?}", checkpoint_files);