GH-37072: [C++] MakeArrayOfNulls should respect Field::nullable #38252
Many tests are failing now.
4152f7a to 807b476
"Invalid: Values array invalid: Invalid: Expected 2 buffers in array "
"of type int32, got 3"),
Isn't it a bit weird to have "Invalid: " twice in the message?
It is, but this is the format of other child array validation failures (since the parent array's error message contains the child array's). Possibly a follow-up issue is in order to simplify those; I think it'd be most valuable to provide a field path to the child which failed validation, together with the "leaf" error message.
class NullArrayFactory {
 public:
  struct GetBufferLength {
Why delete this?
I didn't really delete it, just refactored its logic into NullArrayFactory. This way the code which requests a preallocated zero buffer is in the same function as the code which uses the zero buffer, which seemed clearer to me.
Some comments, questions and suggestions below.
The PR description should note that this is a behavior change.
2b2d779 to d73aa0b
Getting merged and me rebasing the LIST_VIEW PR would be a good idea.
d73aa0b to b34ace5
@@ -43,7 +43,7 @@ class ARROW_EXPORT ExtensionType : public DataType {
   static constexpr const char* type_name() { return "extension"; }

   /// \brief The type of array used to represent this extension type's data
-  const std::shared_ptr<DataType>& storage_type() const { return storage_type_; }
+  std::shared_ptr<DataType> storage_type() const override { return storage_type_; }
Why not also override storage_type_ref?
I intended to remove storage_type_ref altogether
cpp/src/arrow/array/array_test.cc
Outdated
auto req = [](auto type) { return field("", std::move(type), /*nullable=*/false); };

// union with no nullable fields cannot represent a null
ASSERT_RAISES(Invalid, MakeArrayOfNull(dense_union({req(int8())}), length));
Shouldn't this be successful if length is 0?
If the type cannot support any number of nulls, I would say that the special case of zero length is not worth allowing
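To make the reasoning in this thread concrete, here is a minimal, self-contained sketch of a "can this type represent a null at all?" check. It rests on the Arrow format's rule that union arrays carry no validity bitmap of their own, so a null union slot has to live in one of the children. The names `ToyKind`, `ToyType`, `ToyChild` and `CanRepresentNull` are hypothetical stand-ins, not Arrow classes, and this is not the code in this PR. Because the check looks only at the type and never at a length, a zero-length request is rejected the same way as any other, which matches the position taken above.

```cpp
#include <memory>
#include <vector>

// Toy stand-ins for illustration only; these are NOT Arrow's classes.
enum class ToyKind { kPrimitive, kStruct, kDenseUnion };

struct ToyType;

struct ToyChild {
  bool nullable;                  // plays the role of Field::nullable
  std::shared_ptr<ToyType> type;  // the child's type
};

struct ToyType {
  ToyKind kind;
  std::vector<ToyChild> children;
};

// Assumed rule: primitives and structs have their own validity bitmap, so a
// top-level null is always representable. A union has no validity bitmap, so
// it needs at least one nullable child whose type can itself hold a null.
bool CanRepresentNull(const ToyType& type) {
  if (type.kind != ToyKind::kDenseUnion) return true;
  for (const ToyChild& child : type.children) {
    if (child.nullable && CanRepresentNull(*child.type)) return true;
  }
  return false;
}
```

With this model, a dense union whose only child is a non-nullable int8 field is rejected, while the same union with a nullable child is accepted.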
Rationale for this change
MakeArrayOfNulls didn't examine Field::nullable, so it could produce nested arrays whose children were null even if the schema said they couldn't have nulls. Additionally, validation didn't look at Field::nullable, so these malformed arrays didn't fail tests.
Are these changes tested?
Yes, mostly by fixtures already in place.
Behavior
This PR includes breaking changes to public APIs.
The Field::nullable flag is now enforced in array validation, so (for example) an array with nulls that corresponds to a non-nullable field will now fail validation.
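As a rough illustration of the behavior change, the sketch below models a validation pass that rejects nulls in any child declared non-nullable. `ToyField`, `ToyArray` and `ValidateNullability` are hypothetical names for this example, the error text is invented, and the real Arrow validation differs in detail; only the rule itself comes from the description above.

```cpp
#include <cstddef>
#include <memory>
#include <optional>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; these are NOT Arrow's classes.
struct ToyField {
  std::string name;
  bool nullable;  // plays the role of Field::nullable
};

struct ToyArray {
  std::vector<bool> is_valid;                       // one validity flag per slot
  std::vector<ToyField> child_fields;               // declared child fields
  std::vector<std::shared_ptr<ToyArray>> children;  // parallel to child_fields
};

// Returns an error message if a child declared non-nullable contains a null,
// or std::nullopt if the array passes this single check.
std::optional<std::string> ValidateNullability(const ToyArray& array,
                                               const std::string& path = "") {
  for (std::size_t i = 0; i < array.children.size(); ++i) {
    const ToyField& field = array.child_fields[i];
    const ToyArray& child = *array.children[i];
    const std::string child_path =
        path.empty() ? field.name : path + "." + field.name;
    if (!field.nullable) {
      for (bool valid : child.is_valid) {
        if (!valid) {
          return "Invalid: non-nullable field '" + child_path +
                 "' contains a null";
        }
      }
    }
    // Recurse so that deeper non-nullable descendants are checked as well.
    if (auto error = ValidateNullability(child, child_path)) return error;
  }
  return std::nullopt;
}
```

In this model, a parent whose child field "a" is non-nullable but holds a null slot fails the check, and the same data passes once the field is declared nullable.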