Conversation


@vegarsti vegarsti commented Nov 5, 2025

Rationale for this change

Noticed while doing #18424 that the list types List and FixedSizeList use MutableArrayData to build the reversed array. Using take turns out to be a lot faster: ~70% for List and ~76% for FixedSizeList. This PR also reworks the benchmark added in #18425; below are the results of that benchmark compared against the implementation on main:

# cargo bench --bench array_reverse
   Compiling datafusion-functions-nested v50.3.0 (/Users/vegard/dev/datafusion/datafusion/functions-nested)
    Finished `bench` profile [optimized] target(s) in 42.08s
     Running benches/array_reverse.rs (target/release/deps/array_reverse-2c473eed34a53d0a)
Gnuplot not found, using plotters backend
Benchmarking array_reverse_list: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.3s, or reduce sample count to 70.
array_reverse_list      time:   [62.201 ms 62.551 ms 62.946 ms]
                        change: [−70.137% −69.965% −69.785%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

Benchmarking array_reverse_list_view: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.3s, or reduce sample count to 70.
array_reverse_list_view time:   [61.649 ms 61.905 ms 62.185 ms]
                        change: [−16.122% −15.623% −15.087%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

array_reverse_fixed_size_list
                        time:   [4.7936 ms 4.8292 ms 4.8741 ms]
                        change: [−76.435% −76.196% −75.951%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 20 outliers among 100 measurements (20.00%)
  8 (8.00%) low mild
  5 (5.00%) high mild
  7 (7.00%) high severe
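The gist of the take-based approach: instead of copying ranges with MutableArrayData, build an index vector that lists each child value in per-list reverse order, then hand it to the take kernel. A minimal sketch of the index construction in plain Rust (hypothetical helper, not the PR's exact code):

```rust
/// Given list offsets, produce child-value indices that reverse the elements
/// inside each list while keeping the lists themselves in order.
/// These indices would then be fed to arrow's `take` kernel.
fn reversed_child_indices(offsets: &[usize]) -> Vec<usize> {
    let mut indices = Vec::with_capacity(*offsets.last().unwrap_or(&0));
    for w in offsets.windows(2) {
        let (start, end) = (w[0], w[1]);
        // Emit this list's value indices highest-first.
        indices.extend((start..end).rev());
    }
    indices
}

fn main() {
    // Offsets [0, 3, 5] describe [[a, b, c], [d, e]];
    // the reversed child indices are [2, 1, 0, 4, 3].
    println!("{:?}", reversed_child_indices(&[0, 3, 5]));
}
```

A single `take` over these indices materializes all reversed lists in one pass, which is where the speedup over per-range MutableArrayData copies comes from.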

Are these changes tested?

Covered by existing sqllogic tests, and one new test for FixedSizeList.


vegarsti commented Nov 5, 2025

cc @Jefffrey 👀

Comment on lines +178 to +193
// Materialize values from underlying array with take
let indices_array: ArrayRef = if O::IS_LARGE {
    Arc::new(UInt64Array::from(
        indices
            .iter()
            .map(|i| i.as_usize() as u64)
            .collect::<Vec<_>>(),
    ))
} else {
    Arc::new(UInt32Array::from(
        indices
            .iter()
            .map(|i| i.as_usize() as u32)
            .collect::<Vec<_>>(),
    ))
};
vegarsti (Contributor Author) commented:
This is duplicated for ListView. I figured twice was not enough to extract to a function, but if we find a nicer way to do it, we can improve both.

Jefffrey (Contributor) commented:
It would be nice if we figured out a nicer way of doing this but I still can't figure it out 😅

I went a bit crazy with generics in one attempt, but it didn't pan out. It works as-is, and the if branch isn't a big deal, though I wonder if it would be more efficient to just have Int32/Int64 arrays instead of their unsigned variants, to avoid needing the map -> collect 🤔

@vegarsti vegarsti marked this pull request as ready for review November 5, 2025 18:32
@vegarsti vegarsti changed the title Make array_reverse 5x faster for List, 2.5x for FixedSizeList, by using take Make array_reverse a lot faster for List and FixedSizeList Nov 5, 2025
@vegarsti vegarsti marked this pull request as draft November 5, 2025 18:42
@vegarsti vegarsti marked this pull request as ready for review November 5, 2025 18:54
@vegarsti vegarsti changed the title Make array_reverse a lot faster for List and FixedSizeList Make array_reverse faster for List and FixedSizeList Nov 5, 2025
@vegarsti vegarsti changed the title Make array_reverse faster for List and FixedSizeList Make array_reverse faster for List (~5x) and FixedSizeList (~2.5x) Nov 5, 2025
@vegarsti vegarsti changed the title Make array_reverse faster for List (~5x) and FixedSizeList (~2.5x) Make array_reverse faster for List and FixedSizeList Nov 5, 2025
@vegarsti vegarsti force-pushed the improve-perf-list-reverse branch from 7f84c0e to 38af34a Compare November 5, 2025 19:55
@Jefffrey Jefffrey left a comment


A note regarding benchmarks: we might need to add a bit of randomness and null spread for more accurate benchmarking. As they currently stand, I believe they always create arrays with the same fixed offsets for every child list.


let mut mutable =
    MutableArrayData::with_capacities(vec![&original_data], false, capacity);
let value_length = array.value_length() as usize;
let mut indices: Vec<u64> = Vec::with_capacity(values.len());
Jefffrey (Contributor) commented:

I do wonder whether, for FSLs, we can take advantage of knowing it's a fixed-size list to do this more efficiently. For example, we know that an FSL like this:

FSL with fixed length 3:
[[a, b, c], [d, e, f]]

Would always have reverse offsets like so:

[2, 1, 0, 5, 4, 3]

Without even needing to iterate the FSL array itself 🤔
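That pattern can indeed be generated from the list count and the fixed length alone. A sketch of the idea (hypothetical helper name, not the PR's exact code):

```rust
/// For a FixedSizeList with `value_length` elements per list, the reversed
/// child indices follow a fixed stride pattern and need no per-row offset
/// lookups into the array itself.
fn fsl_reversed_indices(num_lists: usize, value_length: usize) -> Vec<u64> {
    (0..num_lists)
        .flat_map(|row| {
            let start = (row * value_length) as u64;
            // Indices within this row, highest first: e.g. [2, 1, 0] for length 3.
            (0..value_length as u64).rev().map(move |i| start + i)
        })
        .collect()
}

fn main() {
    // Two lists of fixed length 3 -> [2, 1, 0, 5, 4, 3], as described above.
    println!("{:?}", fsl_reversed_indices(2, 3));
}
```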

vegarsti (Contributor Author) replied:

Very elegant! Included in 41e1044


vegarsti commented Nov 6, 2025

> A note regarding benchmarks, we might need to add a bit of randomness + null spread for more accurate benchmarking perhaps. As they currently are I believe they always create arrays of fixed offsets for every child list.

I've started on this but didn't finish -- I couldn't figure out how to properly set a random seed. Any pointers? 🤔


Jefffrey commented Nov 7, 2025

> > A note regarding benchmarks, we might need to add a bit of randomness + null spread for more accurate benchmarking perhaps. As they currently are I believe they always create arrays of fixed offsets for every child list.
>
> I've started on this but didn't finish -- I couldn't figure out how to properly set a random seed. Any pointers? 🤔

I don't know exactly what you mean, but maybe you can use the map benchmark as a reference?

let mut rng = rand::rng();
let keys = keys(&mut rng);
let values = values(&mut rng);

They don't seem to set the seed.


vegarsti commented Nov 7, 2025

> > > A note regarding benchmarks, we might need to add a bit of randomness + null spread for more accurate benchmarking perhaps. As they currently are I believe they always create arrays of fixed offsets for every child list.
> >
> > I've started on this but didn't finish -- I couldn't figure out how to properly set a random seed. Any pointers? 🤔
>
> I don't know exactly what you mean, but maybe you can use the map benchmark as a reference?
>
> let mut rng = rand::rng();
> let keys = keys(&mut rng);
> let values = values(&mut rng);
>
> They don't seem to set the seed.

Thanks for the pointer! I thought I would need to set the seed because I was getting very variable results.
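For reference, rand does support deterministic seeding via the SeedableRng trait (e.g. StdRng::seed_from_u64(42)) when reproducible benchmark data is wanted. As a dependency-free illustration of the same idea, here is a tiny seeded generator (SplitMix64): identical seeds always yield identical sequences, which is what makes benchmark data reproducible across runs.

```rust
/// Minimal deterministic PRNG (SplitMix64), for illustration only; a real
/// benchmark would use rand's `StdRng::seed_from_u64` instead.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        // Standard SplitMix64 step: advance state, then mix it into an output.
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

fn main() {
    // Same seed -> same sequence, run after run.
    let mut a = SplitMix64::new(42);
    let mut b = SplitMix64::new(42);
    assert_eq!(a.next_u64(), b.next_u64());
    println!("deterministic");
}
```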


vegarsti commented Nov 8, 2025

Updated benchmark results after reworking the benchmarks, cc @Jefffrey. The baseline is the main implementation of reverse, but with the new benchmark. I didn't end up using rng, just variable array sizes and nullability (see 388ab48); I reckon that's fine! Reversing the indices directly for FixedSizeList was ~10% better (from 41e1044) than what I previously had, so thanks @Jefffrey!

08:49 ~/dev/datafusion improve-perf-list-reverse # cargo bench --bench array_reverse
   Compiling datafusion-functions-nested v50.3.0 (/Users/vegard/dev/datafusion/datafusion/functions-nested)
    Finished `bench` profile [optimized] target(s) in 42.08s
     Running benches/array_reverse.rs (target/release/deps/array_reverse-2c473eed34a53d0a)
Gnuplot not found, using plotters backend
Benchmarking array_reverse_list: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.3s, or reduce sample count to 70.
array_reverse_list      time:   [62.201 ms 62.551 ms 62.946 ms]
                        change: [−70.137% −69.965% −69.785%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

Benchmarking array_reverse_list_view: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.3s, or reduce sample count to 70.
array_reverse_list_view time:   [61.649 ms 61.905 ms 62.185 ms]
                        change: [−16.122% −15.623% −15.087%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

array_reverse_fixed_size_list
                        time:   [4.7936 ms 4.8292 ms 4.8741 ms]
                        change: [−76.435% −76.196% −75.951%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 20 outliers among 100 measurements (20.00%)
  8 (8.00%) low mild
  5 (5.00%) high mild
  7 (7.00%) high severe


vegarsti commented Nov 8, 2025

Thanks!
