[server] Recover log and index file for unclean shutdown #1749
base: main
Conversation
Force-pushed from b9b5f2d to ceaf704
Ready for CR @wuchong @swuferhong
@LiebingYu Thanks for your work. I left some comments.
new WriterStateManager(
        logSegments.getTableBucket(),
        logTabletDir,
        this.writerStateManager.writerExpirationMs());
Why do we need to create a new WriterStateManager? Maybe we can add a clear() method instead?
This logic follows Kafka, and creating a new WriterStateManager is a lightweight operation. I think it's ok.
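For context, here is a minimal sketch of the rebuild-instead-of-clear pattern being discussed. The types below are simplified stand-ins, not the actual Fluss classes; only the constructor call shape mirrors the snippet above.

import java.io.File;

public class WriterStateResetSketch {

    // Simplified stand-in for the per-bucket writer state tracker.
    static class WriterStateManager {
        private final String tableBucket;
        private final File logTabletDir;
        private final long writerExpirationMs;

        WriterStateManager(String tableBucket, File logTabletDir, long writerExpirationMs) {
            this.tableBucket = tableBucket;
            this.logTabletDir = logTabletDir;
            this.writerExpirationMs = writerExpirationMs;
        }

        long writerExpirationMs() {
            return writerExpirationMs;
        }
    }

    private WriterStateManager writerStateManager;

    // During recovery, stale in-memory writer state is discarded by swapping
    // in a fresh manager built with the same configuration. Construction only
    // copies a few references, which is why it is considered lightweight;
    // a clear() method on the manager would be the alternative design.
    void resetWriterState(String tableBucket, File logTabletDir) {
        this.writerStateManager =
                new WriterStateManager(
                        tableBucket, logTabletDir, writerStateManager.writerExpirationMs());
    }
}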
Three resolved review threads on fluss-server/src/test/java/org/apache/fluss/server/log/LogLoaderTest.java
numUnflushed,
logSegments.getTableBucket());

int truncatedBytes = -1;
Do we need to be this aggressive here? Deleting all subsequent logSegments just because one cannot be repaired feels like it risks data loss. Also, we don't have test coverage for this logic.
This logic also follows Kafka. From my perspective, data loss is unlikely because the data is stored in multiple replicas. Once the file is truncated to the correct position, the replica can synchronize the latest data from the leader. If truncation is not carried out, the file remains unrecoverable, and if this host later becomes the leader, unforeseen problems might occur.
And I added a test to cover this.
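To illustrate the trade-off in this thread, here is a hedged sketch of the Kafka-style recovery loop: segments written after the last flush are re-validated, and once one has a corrupt suffix it is truncated and every later segment is deleted, on the assumption that the replica can re-fetch the lost range from the leader. All names here (LogSegment, recover(), deleteIfExists(), recoverLog) are simplified stand-ins, not the actual Fluss API.

import java.io.IOException;
import java.util.Iterator;
import java.util.List;

public class UncleanShutdownRecovery {

    // Simplified stand-in for a log segment; the real class also owns the
    // offset/time index files that recovery rebuilds.
    interface LogSegment {
        /**
         * Re-validates batches and rebuilds indexes; returns the number of
         * bytes truncated from a corrupt suffix (0 if the segment was clean).
         */
        int recover() throws IOException;

        void deleteIfExists() throws IOException;
    }

    /**
     * Recovers all segments written after the last flush. If a segment has a
     * corrupt suffix, it is truncated and every later segment is deleted; the
     * replica then re-fetches the missing range from the leader.
     */
    static int recoverLog(List<LogSegment> unflushedSegments) throws IOException {
        int truncatedBytes = 0;
        Iterator<LogSegment> it = unflushedSegments.iterator();
        while (it.hasNext()) {
            LogSegment segment = it.next();
            truncatedBytes = segment.recover();
            if (truncatedBytes > 0) {
                // Data after the corruption point cannot be trusted, so all
                // subsequent segments are dropped rather than repaired.
                while (it.hasNext()) {
                    it.next().deleteIfExists();
                }
                break;
            }
        }
        return truncatedBytes;
    }
}

The aggressive deletion is defensible only because replication lets the follower re-fetch the truncated suffix, which is exactly the argument made in the reply above.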
Force-pushed from ceaf704 to 1e8a582
Purpose
Linked issue: close #1716
Brief change log
Tests
API and Format
Documentation