-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add serialization and API changes for post_array_schema_from_rest
.
#5237
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also apply this change to https://github.com/TileDB-Inc/TileDB/blob/dev/tiledb/sm/cpp_api/config.h?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks!
30d4a81
to
1829491
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left an inline comment which we'll need to triple check before merging.
@@ -1264,14 +1264,14 @@ struct LoadEnumerationsResponse { | |||
struct LoadArraySchemaRequest { | |||
config @0 :Config; | |||
# Config | |||
|
|||
includeEnumerations @1 :Bool; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately I think you might risk crashing older clients if you do this. Please check
TileDB/tiledb/sm/serialization/array_schema.cc
Line 1909 in f84be6c
return LoadArraySchemaRequest(reader.getIncludeEnumerations()); |
which is read whithout checking if the field exists on the message :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah bummer, it was exposed in #4399. I added it back and used it for this option, I documented that future options like this should be stored in the serialized Config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a general note, whenever we see a field accessed without an if (has)
wrapped around it, we should at the minimum ask a question about it, these things bite hard and lock us into supporting fields forever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I added the hasConfig
check during deserialization. I reran tests this morning in #5181 with these changes and everything is passing, this is ready for review when you have some time.
tiledb/sm/c_api/tiledb.h
Outdated
* @param array_schema The array schema to be retrieved, or `NULL` upon error. | ||
* @return `TILEDB_OK` for success and `TILEDB_OOM` or `TILEDB_ERR` for error. | ||
*/ | ||
TILEDB_EXPORT int32_t tiledb_array_schema_load_with_options( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tiledb_array_schema_load_with_config
or load_v2
sounds like a better name. The word options
does not appear in other APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
tiledb/sm/c_api/tiledb.h
Outdated
* Retrieves the schema of an array from the disk, creating an array schema | ||
* struct. | ||
* struct. The schema retrieved will always be the latest schema for the array. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be Retrieves the latest schema of an array from the disk.
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
ss << "rest.load_enumerations_on_array_open false\n"; | ||
ss << "rest.load_metadata_on_array_open true\n"; | ||
ss << "rest.load_non_empty_domain_on_array_open true\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these options restricted to REST only? Especially the first two have use cases outside of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah also crossed my mind while working on this.. I didn't introduce any new config options in this PR but maybe sm.load_X_on_array_open
or sm.array_open.load_X
would be more fitting for these? I think they were each introduced to tune REST behavior and that's probably where the names originated.
I opened SC-53109 to consider renaming them, probably best to handle the renaming separately.
return config_.get<bool>( | ||
"rest.load_enumerations_on_array_open", config_.must_find); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we read this value on construction? There is no reason to copy a whole config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
tiledb/sm/c_api/tiledb.cc
Outdated
LOG_STATUS_NO_RETURN_VALUE(st); | ||
save_error(ctx, st); | ||
return TILEDB_OOM; | ||
throw CAPIStatusException("Failed to allocate TileDB array schema object"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can throw bad_alloc
, which will map to TILEDB_OOM
in the C API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
auto&& [st, array_schema_response] = | ||
rest_client.get_array_schema_from_rest(uri); | ||
throw_if_not_ok(st); | ||
return std::move(array_schema_response.value()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return std::move(array_schema_response.value()); | |
return std::move(array_schema_response).value(); |
Nit: This will have the same behavior and use the rvalue reference-qualified overload of value()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, nice!
@@ -123,7 +123,7 @@ shared_ptr<ArraySchema> ArrayDirectory::load_array_schema_latest( | |||
auto&& array_schema = load_array_schema_from_uri( | |||
resources_.get(), schema_uri, encryption_key, memory_tracker); | |||
|
|||
array_schema->set_array_uri(uri_); | |||
array_schema->set_array_uri(uri_.remove_trailing_slash()); | |||
|
|||
return std::move(array_schema); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had written a return std::move(x);
in the past and CI had failed with a warning-turned-error. I'm surprised it does not fail here.
279b18f
to
a588737
Compare
748eaea
to
3b206ae
Compare
tiledb/api/c_api/array/array_api.cc
Outdated
#include "tiledb/sm/enums/encryption_type.h" | ||
#include "tiledb/sm/rest/rest_client.h" | ||
|
||
namespace tiledb::api { | ||
|
||
capi_return_t tiledb_array_schema_load( | ||
int32_t tiledb_array_schema_load( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int32_t tiledb_array_schema_load( | |
capi_return_t tiledb_array_schema_load( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
tiledb/api/c_api/array/array_api.cc
Outdated
return TILEDB_OK; | ||
} | ||
|
||
int32_t tiledb_array_schema_load_with_config( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int32_t tiledb_array_schema_load_with_config( | |
capi_return_t tiledb_array_schema_load_with_config( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
* @param ctx TileDB Context. | ||
* @param uri The URI of the array to load schema. | ||
* @param config TileDB Config. If null, the context config will be used. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bekadavis9 do you think this belongs here? I'm not sure we want to add the dependency to array directory if we don't have to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is fine. This file is intended to separate out the I/O-based ArraySchema
operations and will undoubtedly rely on ArrayDirectory
at some point; it just hasn't needed to yet. The array
CAPI Handle will already need to link against capi_array_schema_stub
to make use of its internals, so there's no additional overhead here that I can think of. It could be argued that it's a potentially unnecessary change, and the body could stay as-is in the C API, but I have no real qualms about the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
…tion-update-load-schema
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, great work @shaunrd0!
@@ -1850,6 +1850,8 @@ void load_array_schema_request_to_capnp( | |||
const LoadArraySchemaRequest& req) { | |||
auto config_builder = builder.initConfig(); | |||
throw_if_not_ok(config_to_capnp(config, &config_builder)); | |||
// This boolean is only serialized to support clients using TileDB < 2.26. | |||
// Future options should only be serialized within the Config object above. | |||
builder.setIncludeEnumerations(req.include_enumerations()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is excellent, thank you.
post_array_schema_from_rest
.post_array_schema_from_rest
.
/backport to release-2.26 |
Started backporting to release-2.26: https://github.com/TileDB-Inc/TileDB/actions/runs/10584654319 |
@KiterLuc backporting to release-2.26 failed, the patch most likely resulted in conflicts: $ git am --3way --ignore-whitespace --keep-non-patch changes.patch
Applying: Serialization and C API handler changes.
Applying: Changes to loading enumerations.
Applying: Array schema load changes.
Applying: Add config docs to cpp_api/config.h
Applying: Fix CI failures.
Applying: Add test for loading enumerations for all schemas.
Applying: Changes from review.
Applying: Check hasConfig during deserialization.
Applying: Fix C API return types.
error: sha1 information is lacking or useless (tiledb/api/c_api/array/array_api.cc).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0009 Fix C API return types.
Error: The process '/usr/bin/git' failed with exit code 128 Please backport manually! |
@KiterLuc an error occurred while backporting to release-2.26, please check the run log for details! Error: git am failed, most likely due to a merge conflict. |
This factors out serialization and API changes in #5181 that are required for the HandleGetArraySchema route. These changes will need to be available on REST before we can enable the new route for loading the array schema.
There is a quick summary of the changes required in SC-52877.
TYPE: IMPROVEMENT
DESC: Add serialization and API changes for post_array_schema_from_rest.