-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix corrupted wal number when predecessor wal corrupts + minor cleanup #13359
base: main
Are you sure you want to change the base?
Conversation
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay in review. I haven't looked into the details of PredecessorWALInfo
creation and handling. Let me know if my comments are infeasible.
db/log_reader.cc
Outdated
@@ -364,6 +366,7 @@ void Reader::MaybeVerifyPredecessorWALInfo( | |||
stop_replay_for_corruption_) { | |||
return; | |||
} | |||
assert(corrupted_wal_number_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe add assertion for *corrupted_wal_number_ == kMaxSequenceNumber
during construction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
db/db_impl/db_impl_open.cc
Outdated
@@ -1400,7 +1403,7 @@ Status DBImpl::InitializeLogReader( | |||
immutable_db_options_.info_log, std::move(file_reader), reporter, | |||
true /*checksum*/, wal_number, | |||
immutable_db_options_.track_and_verify_wals, stop_replay_for_corruption, | |||
min_wal_number, predecessor_wal_info)); | |||
min_wal_number, predecessor_wal_info, corrupted_wal_number)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit unusual to pass this state tracking variable into a log Reader's constructor. It is in some sense a output argument, but there is no clear syntax for it.
Can we instead track this in Reporter
with an explicit API like GetCorruptedWalNumber
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we instead track this in Reporter
Agree - I decided to track the corrupted predecessor wal number in reporter instead. Fixed.
db/log_reader.cc
Outdated
@@ -403,6 +409,7 @@ void Reader::MaybeVerifyPredecessorWALInfo( | |||
std::to_string(observed_predecessor_wal_info_.GetSizeBytes()) + | |||
" bytes."; | |||
ReportCorruption(fragment.size(), reason.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking if we can use Reporter
as the source of truth for the corrupted wal number, we can also update this ReportCorruption
API to take an extra wal_number argument. It will either get the current wal number or the predecessor wal number passed depending on the type error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to chat more offline!
I'm thinking if we can use
Reporter
as the source of truth for the corrupted wal numbe
While it's a good direction to minimize the place we have "corrupted wal number", I found it difficult to use reporter as the source of truth for the corrupted_wal_number for the following reasons:
(1) Reporter, as the new name "DBOpenLogRecordReadReporter" suggests, it actually only tracks corruption per reading log record (e.g, physically corrupted log record, found corruption after reading the predecessor info record). But the corrupted_wal_number should be set even when there is log file level corruption e.g, fail to UpdatePredecessorWALInfo().
(2) Due to the messiness we still have with having multiple unconsolidated Status in ProcessLogFile(), I found it not easy to verify the log number at the time of ReportCorruption() will the be the same number as how we currently set corrupted_wal_number. This is to ensure no behavior change for all the corruption cases reported. That means it's better to accompany the change you suggested with status consolidation tech debt work.
Therefore, I decide to keep the "ugly" if-else statement in setting "corrupted_wal_number = ...." in HandleNonOkStatusOrOldLogRecord()
for now.
we can also update this ReportCorruption API to take an extra wal_number argument
Done with some notes.
I do agree with using Reporter to keep track of the corrupted log number though I eventually decided to not using the Reporter::Corruption() to take another wal_number, but another API to deal with the special case where it's the prior WAL being corrupted.
This is because most of the corruption doesn't need to track log number inside of the reporter because the reporter will be automatically associated with the corrupted log file if any corruption happens. So I feel the wal_number parameter is mostly useless and ends up with something like below
.../rocksdb/db/repair.cc
Status ConvertLogToTable(const std::string& wal_dir, uint64_t log) {
struct LogReporter : public log::Reader::Reporter {
Env* env;
std::shared_ptr<Logger> info_log;
uint64_t lognum;
void Corruption(size_t bytes, const Status& s, uint64_t log_number /* another log number!!!!! */) override {
// We print error messages for corruption, but continue repairing.
ROCKS_LOG_ERROR(info_log, "Log #%" PRIu64 ": dropping %d bytes; %s",
lognum, static_cast<int>(bytes), s.ToString().c_str());
}
.....
}
26b7aad
to
73aa9ee
Compare
@hx235 has updated the pull request. You must reimport the pull request before landing. |
73aa9ee
to
da73398
Compare
@hx235 has updated the pull request. You must reimport the pull request before landing. |
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@@ -162,15 +162,25 @@ class Directories { | |||
std::unique_ptr<FSDirectory> wal_dir_; | |||
}; | |||
|
|||
struct DBOpenLogReporter : public log::Reader::Reporter { | |||
struct DBOpenLogRecordReadReporter : public log::Reader::Reporter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename to better indicate what level of corruption it reports. The level may change in the future once we figure out how to consolidate all the error status in ProcessLogFile() (see existing TODOs)
Status log_read_status, bool* old_log_record, | ||
bool* stop_replay_for_corruption, uint64_t* corrupted_wal_number, | ||
bool* corrupted_wal_found); | ||
Status status, const DBOpenLogRecordReadReporter& reporter, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Status log_read_status
was a typo
@@ -1323,7 +1332,7 @@ Status DBImpl::ProcessLogFile( | |||
} | |||
|
|||
ROCKS_LOG_INFO(immutable_db_options_.info_log, | |||
"Recovered to log #%" PRIu64 " seq #%" PRIu64, wal_number, | |||
"Recovered to log #%" PRIu64 " next seq #%" PRIu64, wal_number, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"seq" should be "next seq"
FragmentBufferedReader(std::shared_ptr<Logger> info_log, | ||
std::unique_ptr<SequentialFileReader>&& _file, | ||
Reporter* reporter, bool checksum, uint64_t log_num) | ||
: Reader(info_log, std::move(_file), reporter, checksum, log_num, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only default values are used for these optional parameters in FragmentBufferedReader so removed them in the API
Context/Summary:
02b4197 recently added the ability to detect WAL hole presents in the predecessor WAL. It forgot to update the corrupted wal number to point to the predecessor WAL in that corruption case. This PR fixed it.
As a bonus, this PR also (1) fixed the
FragmentBufferedReader()
constructor API to expose less parameters as they are never explicitly passed in in the codebase (2) a INFO log wording (3) a parameter naming typo (4) the reporter namingTest: