Skip to content

make_map incorrectly rejects NULL map values due to overly strict null check #18530

@flarion-weijun

Description

@flarion-weijun

Describe the bug

DataFusion's make_map function has an overly strict null check that fails to distinguish between two different scenarios:

  1. NULL map values (entire map is NULL) - should be allowed
  2. Null keys within a map - should be rejected

The current implementation rejects ANY null in the keys array at line 66, even when the null represents a valid NULL map value rather than an invalid null key.

This causes failures when:

  • Directly calling make_map_batch with Arrow arrays containing NULL list elements
  • Processing map columns with NULL values through certain code paths
  • Using constant evaluation scenarios where NULL maps are present

To Reproduce

Unit Test Reproduction (demonstrates the core issue):

    fn test_make_map_with_null_maps() {
        // Test that NULL map values (entire map is NULL) are correctly handled
        // This test directly calls make_map_batch with a List containing NULL elements
        // On main branch: Would fail at line 66 with "map key cannot be null"
        // With fix: Correctly handles NULL maps by routing to make_map_array_internal

        // Create a List<List<String>> where one element is NULL (representing a NULL map)
        // keys = [['a'], NULL, ['b']]
        let key_values: Vec<Option<Vec<Option<&str>>>> = vec![
            Some(vec![Some("a")]),
            None,  // This is a NULL map, not a null key
            Some(vec![Some("b")]),
        ];

        let mut key_builder = arrow::array::ListBuilder::new(
            arrow::array::StringBuilder::new()
        );

        for key_list in key_values {
            match key_list {
                Some(keys) => {
                    for key in keys {
                        key_builder.values().append_option(key);
                    }
                    key_builder.append(true);
                }
                None => {
                    key_builder.append(false);  // NULL map
                }
            }
        }
        let keys_array = Arc::new(key_builder.finish());

        // Create corresponding values = [[1], [2], [3]]
        let value_values: Vec<Option<Vec<Option<i32>>>> = vec![
            Some(vec![Some(1)]),
            Some(vec![Some(2)]),
            Some(vec![Some(3)]),
        ];

        let mut value_builder = arrow::array::ListBuilder::new(
            arrow::array::Int32Builder::new()
        );

        for value_list in value_values {
            match value_list {
                Some(values) => {
                    for value in values {
                        value_builder.values().append_option(value);
                    }
                    value_builder.append(true);
                }
                None => {
                    value_builder.append(false);
                }
            }
        }
        let values_array = Arc::new(value_builder.finish());

        // Call make_map_batch - this should succeed with the fix
        let result = make_map_batch(&[
            ColumnarValue::Array(keys_array),
            ColumnarValue::Array(values_array),
        ]);

        assert!(result.is_ok(), "Should handle NULL maps correctly");

        if let Ok(ColumnarValue::Array(map_array)) = result {
            assert_eq!(map_array.len(), 3);
            assert!(map_array.is_null(1), "Second map should be NULL");
            assert!(!map_array.is_null(0), "First map should not be NULL");
            assert!(!map_array.is_null(2), "Third map should not be NULL");
        } else {
            panic!("Expected Array result");
        }
    }

Expected behavior

NULL map values (entire map is NULL) should be correctly handled
Null keys within maps should be properly rejected with clear error message
The validation should happen at the appropriate place in the code, after distinguishing between NULL maps and null keys

Additional context

Metadata

Metadata

Assignees

Labels

bugSomething isn't working

Type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions