Skip to content

Conversation

@vegarsti
Copy link
Contributor

@vegarsti vegarsti commented Nov 1, 2025

There's no benchmarks for array_reverse. I used this while working on #18424 to confirm take was faster than MutableData for ListView. That might be the case for other List types as well, which are currently using MutableData.

The benchmark can be run with cargo bench --bench array_reverse.

@vegarsti

This comment was marked as outdated.

@vegarsti vegarsti force-pushed the list-view-reverse-benchmark branch from 44ad697 to 4460c17 Compare November 5, 2025 07:45
@vegarsti vegarsti changed the title Add benchmark for array_reverse of list view Add benchmark for array_reverse Nov 5, 2025
@vegarsti vegarsti force-pushed the list-view-reverse-benchmark branch 2 times, most recently from 341c1b7 to b89c6ad Compare November 5, 2025 08:10
@vegarsti vegarsti marked this pull request as ready for review November 5, 2025 08:10
@vegarsti
Copy link
Contributor Author

vegarsti commented Nov 5, 2025

@Jefffrey Here's the benchmark I used when deciding to use take for ListView: #18424 (comment)

@vegarsti
Copy link
Contributor Author

vegarsti commented Nov 5, 2025

Results from my machine

# cargo bench --bench array_reverse
array_reverse_list      time:   [440.73 µs 441.98 µs 443.74 µs]
array_reverse_fixed_size_list time:   [456.60 µs 460.60 µs 465.08 µs]
array_reverse_list_view time:   [70.679 µs 71.772 µs 73.142 µs]

@vegarsti vegarsti force-pushed the list-view-reverse-benchmark branch from b89c6ad to 49d3271 Compare November 5, 2025 08:14
@Jefffrey
Copy link
Contributor

Jefffrey commented Nov 5, 2025

I'm no expert on writing benchmark code, but I though usually we include a black_box hint somewhere? 🤔

@vegarsti
Copy link
Contributor Author

vegarsti commented Nov 5, 2025

I'm no expert on writing benchmark code, but I though usually we include a black_box hint somewhere? 🤔

Thank you! Updated

@vegarsti
Copy link
Contributor Author

vegarsti commented Nov 5, 2025

The CI step that failed was cargo examples (amd64) with

warning: spurious network error (1 try remaining): [28] Timeout was reached (Operation too slow. Less than 10 bytes/sec transferred the last 30 seconds)

@Jefffrey
Copy link
Contributor

Jefffrey commented Nov 5, 2025

The CI step that failed was cargo examples (amd64) with

warning: spurious network error (1 try remaining): [28] Timeout was reached (Operation too slow. Less than 10 bytes/sec transferred the last 30 seconds)

Rerunning the job

@Jefffrey Jefffrey added this pull request to the merge queue Nov 5, 2025
Merged via the queue into apache:main with commit 7b5685b Nov 5, 2025
54 of 55 checks passed
@vegarsti vegarsti deleted the list-view-reverse-benchmark branch November 5, 2025 11:41
@vegarsti
Copy link
Contributor Author

vegarsti commented Nov 5, 2025

Speedups! #18500

github-merge-queue bot pushed a commit that referenced this pull request Nov 9, 2025
## Rationale for this change

Noticed while doing #18424 that the list types `List` and
`FixedSizeList` uses `MutableData` to build the reverse array. Using
`take` turns out to be a lot faster, ~70% for both `List` and
`FixedSizeList`. This PR also reworks the benchmark added in #18425, and
these are the results on that compared to the implementation on main:

```
# cargo bench --bench array_reverse
   Compiling datafusion-functions-nested v50.3.0 (/Users/vegard/dev/datafusion/datafusion/functions-nested)
    Finished `bench` profile [optimized] target(s) in 42.08s
     Running benches/array_reverse.rs (target/release/deps/array_reverse-2c473eed34a53d0a)
Gnuplot not found, using plotters backend
Benchmarking array_reverse_list: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.3s, or reduce sample count to 70.
array_reverse_list      time:   [62.201 ms 62.551 ms 62.946 ms]
                        change: [−70.137% −69.965% −69.785%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

Benchmarking array_reverse_list_view: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.3s, or reduce sample count to 70.
array_reverse_list_view time:   [61.649 ms 61.905 ms 62.185 ms]
                        change: [−16.122% −15.623% −15.087%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

array_reverse_fixed_size_list
                        time:   [4.7936 ms 4.8292 ms 4.8741 ms]
                        change: [−76.435% −76.196% −75.951%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 20 outliers among 100 measurements (20.00%)
  8 (8.00%) low mild
  5 (5.00%) high mild
  7 (7.00%) high severe
```

## Are these changes tested?
Covered by existing sqllogic tests, and one new test for
`FixedSizeList`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants