-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Open
Copy link
Labels
bugSomething isn't workingSomething isn't working
Description
Describe the bug
DataFusion's make_map function has an overly strict null check that fails to distinguish between two different scenarios:
- NULL map values (entire map is NULL) - should be allowed
- Null keys within a map - should be rejected
The current implementation rejects ANY null in the keys array at line 66, even when the null represents a valid NULL map value rather than an invalid null key.
This causes failures when:
- Directly calling
make_map_batchwith Arrow arrays containing NULL list elements - Processing map columns with NULL values through certain code paths
- Using constant evaluation scenarios where NULL maps are present
To Reproduce
Unit Test Reproduction (demonstrates the core issue):
fn test_make_map_with_null_maps() {
// Test that NULL map values (entire map is NULL) are correctly handled
// This test directly calls make_map_batch with a List containing NULL elements
// On main branch: Would fail at line 66 with "map key cannot be null"
// With fix: Correctly handles NULL maps by routing to make_map_array_internal
// Create a List<List<String>> where one element is NULL (representing a NULL map)
// keys = [['a'], NULL, ['b']]
let key_values: Vec<Option<Vec<Option<&str>>>> = vec![
Some(vec![Some("a")]),
None, // This is a NULL map, not a null key
Some(vec![Some("b")]),
];
let mut key_builder = arrow::array::ListBuilder::new(
arrow::array::StringBuilder::new()
);
for key_list in key_values {
match key_list {
Some(keys) => {
for key in keys {
key_builder.values().append_option(key);
}
key_builder.append(true);
}
None => {
key_builder.append(false); // NULL map
}
}
}
let keys_array = Arc::new(key_builder.finish());
// Create corresponding values = [[1], [2], [3]]
let value_values: Vec<Option<Vec<Option<i32>>>> = vec![
Some(vec![Some(1)]),
Some(vec![Some(2)]),
Some(vec![Some(3)]),
];
let mut value_builder = arrow::array::ListBuilder::new(
arrow::array::Int32Builder::new()
);
for value_list in value_values {
match value_list {
Some(values) => {
for value in values {
value_builder.values().append_option(value);
}
value_builder.append(true);
}
None => {
value_builder.append(false);
}
}
}
let values_array = Arc::new(value_builder.finish());
// Call make_map_batch - this should succeed with the fix
let result = make_map_batch(&[
ColumnarValue::Array(keys_array),
ColumnarValue::Array(values_array),
]);
assert!(result.is_ok(), "Should handle NULL maps correctly");
if let Ok(ColumnarValue::Array(map_array)) = result {
assert_eq!(map_array.len(), 3);
assert!(map_array.is_null(1), "Second map should be NULL");
assert!(!map_array.is_null(0), "First map should not be NULL");
assert!(!map_array.is_null(2), "Third map should not be NULL");
} else {
panic!("Expected Array result");
}
}Expected behavior
NULL map values (entire map is NULL) should be correctly handled
Null keys within maps should be properly rejected with clear error message
The validation should happen at the appropriate place in the code, after distinguishing between NULL maps and null keys
Additional context
Metadata
Metadata
Assignees
Labels
bugSomething isn't workingSomething isn't working