@huaxingao
Contributor

No description provided.

@github-actions github-actions bot added the spark label Oct 20, 2025
Comment on lines 247 to 249
if (value instanceof VariantVal) {
  return (VariantVal) value;
}

Contributor

[doubt] When can this value be a VariantVal?
[optional] Can we also add a test for it?

Contributor Author

When the backing StructLike already holds Spark’s native VariantVal, it takes this path.

@huaxingao huaxingao closed this Oct 22, 2025
@huaxingao huaxingao reopened this Oct 22, 2025
@huaxingao
Contributor Author

also ping @aihuaxu @amogh-jahagirdar @nastra

private VariantVal getVariantInternal(int ordinal) {
  Object value = struct.get(ordinal, Object.class);

  if (value instanceof VariantVal) {

Contributor

As I understand it, we will only have VariantVal in InternalRow. In what case would we have a Variant object?

Contributor Author

ParentVariantReader returns Iceberg Variant, and the DataTask row path can wrap those records in StructInternalRow, so we can see Variant there.
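
To make the two cases concrete, here is a rough sketch of the dispatch using stand-in types rather than the real classes. `IcebergVariantStandIn` and `SparkVariantStandIn` are hypothetical placeholders for Iceberg's Variant and Spark's VariantVal, and the assumption that the engine-neutral value can be flattened into two byte arrays is a simplification for illustration, not a description of the actual conversion in this PR.

```java
import java.nio.ByteBuffer;

/** Hypothetical stand-in for Iceberg's Variant: already-serialized metadata and value. */
final class IcebergVariantStandIn {
  final ByteBuffer metadata;
  final ByteBuffer value;

  IcebergVariantStandIn(ByteBuffer metadata, ByteBuffer value) {
    this.metadata = metadata;
    this.value = value;
  }
}

/** Hypothetical stand-in for Spark's VariantVal: value bytes plus metadata bytes. */
final class SparkVariantStandIn {
  final byte[] value;
  final byte[] metadata;

  SparkVariantStandIn(byte[] value, byte[] metadata) {
    this.value = value;
    this.metadata = metadata;
  }
}

final class VariantDispatchSketch {
  private VariantDispatchSketch() {}

  /** Returns the engine-native form, converting only when the row holds the other type. */
  static SparkVariantStandIn toVariantVal(Object value) {
    if (value instanceof SparkVariantStandIn) {
      // Case 1: the backing struct already holds the engine-native value.
      return (SparkVariantStandIn) value;
    }

    if (value instanceof IcebergVariantStandIn) {
      // Case 2: a reader produced the engine-neutral value, so copy its bytes over.
      IcebergVariantStandIn variant = (IcebergVariantStandIn) value;
      return new SparkVariantStandIn(toBytes(variant.value), toBytes(variant.metadata));
    }

    throw new UnsupportedOperationException(
        "Unsupported value for VARIANT: " + (value == null ? "null" : value.getClass()));
  }

  /** Copies the remaining bytes without moving the caller's buffer position. */
  private static byte[] toBytes(ByteBuffer buffer) {
    ByteBuffer copy = buffer.duplicate();
    byte[] bytes = new byte[copy.remaining()];
    copy.get(bytes);
    return bytes;
  }
}
```

In this sketch the native value is returned as the same instance, while the engine-neutral value is copied into a new object; any other type fails fast.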

  throw new UnsupportedOperationException(
      "Unsupported value for VARIANT in StructInternalRow: " + value.getClass());
}

Contributor

I think we also need to add support in get(int ordinal, DataType dataType) and in private ArrayData collectionToArrayData(Type elementType, Collection<?> values), which seems to be needed for nested variants.
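
For the nested case, the idea would be roughly the following. This is only a sketch with stand-in types: `SourceVariant`, `TargetVariant`, and `ElementKind` are hypothetical placeholders for Iceberg's Variant, Spark's VariantVal, and the Iceberg element type, and a plain `Object[]` stands in for Spark's ArrayData. It is not the code that was added to the PR.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class NestedVariantSketch {
  private NestedVariantSketch() {}

  /** Hypothetical stand-in for the engine-neutral variant held by the source record. */
  static final class SourceVariant {
    final String payload;

    SourceVariant(String payload) {
      this.payload = payload;
    }
  }

  /** Hypothetical stand-in for the engine-native variant the row must return. */
  static final class TargetVariant {
    final String payload;

    TargetVariant(String payload) {
      this.payload = payload;
    }
  }

  /** Hypothetical stand-in for the element type of the collection. */
  enum ElementKind {
    VARIANT,
    OTHER
  }

  static TargetVariant toTargetVariant(Object value) {
    if (value instanceof TargetVariant) {
      return (TargetVariant) value;
    }
    if (value instanceof SourceVariant) {
      return new TargetVariant(((SourceVariant) value).payload);
    }
    throw new UnsupportedOperationException("Unsupported variant element: " + value.getClass());
  }

  /** Converts each element when the element type is a variant; other types pass through. */
  static Object[] collectionToArray(ElementKind elementKind, Collection<?> values) {
    if (elementKind != ElementKind.VARIANT) {
      return values.toArray();
    }

    List<Object> converted = new ArrayList<>(values.size());
    for (Object value : values) {
      // Keep nulls as nulls so that nullable elements survive the conversion.
      converted.add(value == null ? null : toTargetVariant(value));
    }
    return converted.toArray();
  }
}
```

The point is that the per-element conversion reuses the same helper as the top-level accessor, so a variant inside a list is handled the same way as a top-level variant column.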

Contributor Author

Added. Thanks


private static VariantVal toVariantVal(Object value) {
  if (value instanceof VariantVal) {
    return (VariantVal) value;

Contributor

Can you help me understand what kind of data will be present here? Judging from other types such as Map and Struct, it seems we should only expect Variant here, which we then convert to VariantVal.

From checking the usage, it seems we should expect Variant and should never see VariantVal?

Contributor Author

In tests, we can construct a GenericRecord with Spark-native values:

InternalRow row = new StructInternalRow(structType).setStruct(rec);

I am not sure whether there is any real usage. I am OK with removing this.

Contributor

@aokolnychyi Can you help take a look? I think we can remove it.
