-
Notifications
You must be signed in to change notification settings - Fork 2.9k
support StructInternalRow.getVariant #14379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| if (value instanceof VariantVal) { | ||
| return (VariantVal) value; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[doubt] when can this have VariantVal ?
[optional] can we also add a test for it ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the backing StructLike already holds Spark’s native VariantVal, it goes this path.
|
also ping @aihuaxu @amogh-jahagirdar @nastra |
| private VariantVal getVariantInternal(int ordinal) { | ||
| Object value = struct.get(ordinal, Object.class); | ||
|
|
||
| if (value instanceof VariantVal) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand, we will only have VariantVal in InternalRow. In what case do we have Variant object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ParentVariantReader returns Iceberg Variant, and the DataTask row path can wrap those records in StructInternalRow, so we can see Variant there.
| throw new UnsupportedOperationException( | ||
| "Unsupported value for VARIANT in StructInternalRow: " + value.getClass()); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also need to add the support in the following get(int ordinal, DataType dataType) and private ArrayData collectionToArrayData(Type elementType, Collection<?> values) - seems for nested variant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added. Thanks
|
|
||
| private static VariantVal toVariantVal(Object value) { | ||
| if (value instanceof VariantVal) { | ||
| return (VariantVal) value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you help me understand what kind of data will be present ? From other types such as Map, Struct, seems we should only expect Variant here and then we are converting to VariantVal.
By checking the usage, seems we should expect Variant and we should not see VariantVal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In test, we can construct GenericRecord with Spark-native values.
Line 95 in 4d22aa4
| InternalRow row = new StructInternalRow(structType).setStruct(rec); |
I am not sure if there are any real usage. I am OK to remove this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aokolnychyi Can you help take a look? I think we can remove.
No description provided.