-
Notifications
You must be signed in to change notification settings - Fork 835
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: add datatype new_list #4561
Conversation
Signed-off-by: fansehep <[email protected]>
@@ -576,6 +576,11 @@ impl DataType { | |||
_ => self == other, | |||
} | |||
} | |||
|
|||
/// Create a List DataType default name is "item" | |||
pub fn new_list(data_type: DataType, nullable: bool) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for consistency with Field::new_list
we should take the item name as an argument. Thoughts @alamb ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally think we should use the standard / convention "item" always (not take a new parameter) to encourage people to follow the convention. If people want a different name, they can construct the DataType directly.
In terms of consistency with, https://docs.rs/arrow/latest/arrow/datatypes/struct.Field.html#method.new_list I would recommend we remove the name parameter from that method, due to the same rationale above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
recommend we remove the name parameter from that method
It can't be removed, it is part of the field parameter that is passed in, which needs to be kept for nullability, metadata, etc... I just think it will be more confusing for users if some APIs specify "item" and some don't
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just think it will be more confusing for users if some APIs specify "item" and some don't
I don't think it will increase confusion as it is already there: "I am making a list, what name should I use for its child" when the name never is used
To reduce the confusion I think we should reduce the need for specifying that field name ever when making lists
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @fansehep
Signed-off-by: fansehep <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this method will be helpful and make downstream code working with Lists easier to manage. Thank you @fansehep
@tustvold and I had some conversation about this PR that I wanted to add as context I believe this PR is a step forward because the way So in my mind, it is a really bad user experience that users have know to cargo cult https://github.com/search?q=repo%3Aapache%2Farrow-datafusion%20%5C%22item%5C%22&type=code I believe that @tustvold is worried that hiding the details of While this is probably fine, it seems a bit incoherent to have some APIs in terms of DataType and some Field. I plan follow on PRs to:
|
Thanks again @fansehep |
Follow on: #4627 |
Which issue does this PR close?
Closes #4544
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?
add a simple func to and the default name is "item".