-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bulk Loading Part 1: Transaction-Based User Interface #11343
base: main
Are you sure you want to change the base?
Conversation
192b0b1
to
4831a38
Compare
Result of foundationdb-pr on Linux CentOS 7
|
Result of foundationdb-pr-clang on Linux CentOS 7
|
Result of foundationdb-pr-clang-ide on Linux CentOS 7
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
|
4831a38
to
42f624b
Compare
Result of foundationdb-pr-macos on macOS Ventura 13.x
|
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
|
Result of foundationdb-pr-clang-ide on Linux CentOS 7
|
Result of foundationdb-pr-clang-ide on Linux CentOS 7
|
Result of foundationdb-pr-clang on Linux CentOS 7
|
Result of foundationdb-pr-clang on Linux CentOS 7
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
|
Result of foundationdb-pr on Linux CentOS 7
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
|
Result of foundationdb-pr on Linux CentOS 7
|
Result of foundationdb-pr-clang-ide on Linux CentOS 7
|
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
|
Result of foundationdb-pr on Linux CentOS 7
|
Result of foundationdb-pr-clang-ide on Linux CentOS 7
|
Result of foundationdb-pr-macos on macOS Ventura 13.x
|
Result of foundationdb-pr-clang on Linux CentOS 7
|
Result of foundationdb-pr-clang on Linux CentOS 7
|
Result of foundationdb-pr on Linux CentOS 7
|
Result of foundationdb-pr-clang-ide on Linux CentOS 7
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
|
Result of foundationdb-pr on Linux CentOS 7
|
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
|
Result of foundationdb-pr-clang on Linux CentOS 7
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New task: tr->set("\xff\xff/bulk_loading/task/", bulkLoadStateValue(range, path));
We don't want to read the key range from the SST file?
wait(tr->commit()); Succeed if no existing bulk loading task (in metadata) intersecting the input task range; Otherwise, throwing an error.
Will there be a check of the SST content on transaction commit to make sure it is wholesome? What if the SST does not exist? Will we get an error here?
What if the path-to-sst has keys that are not inside the specified range? Will we get an error on commit?
Cancel task: tr->clear("\xff\xff/bulk_loading/cancel/a", "\xff\xff/bulk_loading/cancel/b"); wait(tr->commit()); Cancel all tasks intersecting the input range.
This will cancel bulk loads that have 'succeeded' too? (It's a little confusing here around how successful bulk loads are handled.... we need to clear their task? Does above 'cancel' the successful bulk loads?)
Users can never overwrite a key range that is running a bulk loading task, unless the user has manually cancelled it.
What happens if the user does this? Is there protection? Will user get an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few notes...
@@ -74,6 +78,7 @@ std::unordered_map<SpecialKeySpace::MODULE, KeyRange> SpecialKeySpace::moduleToB | |||
{ SpecialKeySpace::MODULE::ACTOR_PROFILER_CONF, | |||
KeyRangeRef("\xff\xff/actor_profiler_conf/"_sr, "\xff\xff/actor_profiler_conf0"_sr) }, | |||
{ SpecialKeySpace::MODULE::CLUSTERID, singleKeyRange("\xff\xff/cluster_id"_sr) }, | |||
{ SpecialKeySpace::MODULE::BULKLOADING, KeyRangeRef("\xff\xff/bulk_loading/"_sr, "\xff\xff/bulk_loading0"_sr) }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think comment on what the content of this area in the special key space will contain will help.... here or probably better down in SystemData.
range.withPrefix(SpecialKeySpace::getModuleRange(SpecialKeySpace::MODULE::BULKLOADING).begin); | ||
auto ranges = ryw->getSpecialKeySpaceWriteMap().containedRanges(rangeToCheck); | ||
for (auto iter = ranges.begin(); iter != ranges.end(); ++iter) { | ||
if (iter->value().first && iter->value().second.present()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't need 'present' on value().first?
ShardedRocksDB = 3, | ||
}; | ||
|
||
struct BulkLoadState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments? Perhaps paste the PR design principles in here? How to use this API?
Future<Optional<std::string>> BulkLoadCancelImpl::commit(ReadYourWritesTransaction* ryw) { | ||
return BulkLoadingCancelCommitActor(ryw, getKeyRange()); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to write unit tests for this range 'math'; i.e. checking inside range with prefix, w/o prefix etc.
116780b
to
ef6d45c
Compare
Result of foundationdb-pr-clang-ide on Linux CentOS 7
|
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
|
Result of foundationdb-pr-macos on macOS Ventura 13.x
|
Result of foundationdb-pr on Linux CentOS 7
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
|
Result of foundationdb-pr-clang on Linux CentOS 7
|
Result of foundationdb-pr-clang-ide on Linux CentOS 7
|
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
|
Result of foundationdb-pr on Linux CentOS 7
|
Result of foundationdb-pr-macos on macOS Ventura 13.x
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
|
Result of foundationdb-pr-clang on Linux CentOS 7
|
ERROR( bulkload_add_task_input_error, 1086, "Bulk loading add task input is error" ) | ||
ERROR( bulkload_cancel_task_input_error, 1087, "Bulk loading cancel task input is error" ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For these 3 errors, is it possible to consolidate into one error type? The caller with the context should be able to figure out which task has failed.
ASSERT(bulkLoadState.isValid()); | ||
KeyRangeRef bulkLoadRange = bulkLoadState.range; | ||
ASSERT(!bulkLoadRange.empty()); | ||
if (bulkLoadRange.begin >= normalKeys.end || bulkLoadRange.end >= normalKeys.end) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can be simplified as
if (bulkLoadRange.begin >= normalKeys.end || bulkLoadRange.end >= normalKeys.end) { | |
if (bulkLoadRange.end >= normalKeys.end) { |
Design principles to make it simple and avoid error-prone:
User interface:
Say,
tr
is a readwrite transaction. Each transaction can only include one of the three following types of operation:tr->set("\xff\xff/bulk_loading/task/", bulkLoadStateValue(range, path)); wait(tr->commit());
Succeed if no existing bulk loading task (in metadata) intersecting the input task range; Otherwise, throwing an error.wait(tr->getRange("\xff\xff/bulk_loading/status/a", "\xff\xff/bulk_loading/status/b"));
Return all tasks's status intersecting the input range.tr->clear("\xff\xff/bulk_loading/cancel/a", "\xff\xff/bulk_loading/cancel/b"); wait(tr->commit());
Cancel all tasks intersecting the input range.Those operations to the special key space are mapped to read/write operation on "\xff/bulkLoad/" system key space, which in turn to control the actual bulk loading in FDB internal.
Code-Reviewer Section
The general pull request guidelines can be found here.
Please check each of the following things and check all boxes before accepting a PR.
For Release-Branches
If this PR is made against a release-branch, please also check the following:
release-branch
ormain
if this is the youngest branch)