chore(dev/benchmarks): Reorganize benchmarks such that they can build/run against previous versions #398

paleolimbot · 2024-03-07T21:54:04Z

I imagine there are a few ways to go about this, but I found moving the benchmarks to their own subdirectory and using FetchContent to build against various versions/source checkouts to be an intuitive way to do this. This also nicely separates benchmark-related CMake from non-benchmark related CMake and provides a nice way to benchmark locally against a few previous versions (via build presets). If we add more benchmarks in the future (or discover a flaw in an existing benchmark), it also provides a nice way to retrospectively run them against previous releases.

I've added a more verbose description of the setup to the benchmarks README, but the general idea is:

- Benchmarks are documented using Doxygen, which is really good at parsing documentation. Reading the XML is a bit of a pain but is better than undocumented or difficult-to-locate benchmarks and better than parsing source files yourself.
- Configurations are CMake build presets, and CMake handles pulling a previous or local nanoarrow using `FetchContent`. This means that the only action needed on release to update the report is to add a configure preset.
- The provided `benchmark-run-all.sh` effectively reuses build directories for minimal rebuilding during benchmark development.
- The report is a Quarto document that renders to markdown. It is not the flashiest of reports but gets the job done. It could be replaced by something like conbench in the future.

Example report in details below:

Benchmark Report

Configurations

These benchmarks were run with the following configurations:

preset_name	preset_description
local	Uses the nanoarrow C sources from this checkout.
v0.4.0	Uses the nanoarrow C sources from the 0.4.0 release.

Summary

A quick and dirty summary of benchmark results between this checkout and
the last released version.

benchmark_label	v0.4.0	local	change	pct_change
ArrayViewGetIntUnsafeInt16	635.33µs	631.47µs	1ns	-0.6%
ArrayViewGetIntUnsafeInt32	635.96µs	636.71µs	753.7ns	0.1%
ArrayViewGetIntUnsafeInt64	669.22µs	680.5µs	11.3µs	1.7%
ArrayViewGetIntUnsafeInt64CheckNull	1.03ms	1.21ms	178.7µs	17.4%
ArrayViewGetIntUnsafeInt8	948.13µs	946.34µs	1ns	-0.2%
SchemaInitWideStruct	1.04ms	1.02ms	1ns	-2.1%
SchemaViewInitWideStruct	106.08µs	104.56µs	1ns	-1.4%

ArrowArrayView-related benchmarks

Benchmarks for consuming ArrowArrays using the ArrowArrayViewXXX()
functions.

ArrayViewGetIntUnsafeInt8

Use ArrowArrayViewGetIntUnsafe() to consume an int8 array.

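[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L108-L110)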

preset_name	iterations	real_time	cpu_time	items_per_second
local	746	946µs	945µs	1,058,678,610
v0.4.0	745	948µs	947µs	1,056,345,018

ArrayViewGetIntUnsafeInt16

Use ArrowArrayViewGetIntUnsafe() to consume an int16 array.

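[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L113-L115)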

preset_name	iterations	real_time	cpu_time	items_per_second
local	1115	631µs	630µs	1,586,161,276
v0.4.0	1110	635µs	634µs	1,576,482,853

ArrayViewGetIntUnsafeInt32

Use ArrowArrayViewGetIntUnsafe() to consume an int32 array.

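[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L118-L120)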

preset_name	iterations	real_time	cpu_time	items_per_second
local	1106	637µs	636µs	1,572,865,930
v0.4.0	1116	636µs	635µs	1,574,396,587

ArrayViewGetIntUnsafeInt64

Use ArrowArrayViewGetIntUnsafe() to consume an int64 array.

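[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L123-L125)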

preset_name	iterations	real_time	cpu_time	items_per_second
local	1036	680µs	680µs	1,471,241,907
v0.4.0	1039	669µs	668µs	1,496,471,266

ArrayViewGetIntUnsafeInt64CheckNull

Use ArrowArrayViewGetIntUnsafe() to consume an int64 array (checking for
nulls)

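[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L128-L130)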

preset_name	iterations	real_time	cpu_time	items_per_second
local	581	1.21ms	1.2ms	830,641,968
v0.4.0	697	1.03ms	1.02ms	976,185,007

Schema-related benchmarks

Benchmarks for producing and consuming ArrowSchema.

SchemaInitWideStruct

Benchmark ArrowSchema creation for very wide tables.

Simulates part of the process of creating a very wide table with a
simple column type (integer).

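[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/schema_benchmark.cc#L45-L56)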

preset_name	iterations	real_time	cpu_time	items_per_second
local	684	1.02ms	1.02ms	9,788,166
v0.4.0	686	1.04ms	1.04ms	9,606,888

SchemaViewInitWideStruct

Benchmark ArrowSchema parsing for very wide tables.

Simulates part of the process of consuming a very wide table. Typically
the ArrowSchemaViewInit() is done by ArrowArrayViewInit() but uses a
similar pattern.

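[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/schema_benchmark.cc#L78-L91)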

preset_name	iterations	real_time	cpu_time	items_per_second
local	6753	105µs	104µs	95,812,784
v0.4.0	6762	106µs	106µs	94,630,337

codecov-commenter · 2024-03-08T01:03:39Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.74%. Comparing base (5756b76) to head (a0363c3).
Report is 1 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #398   +/-   ##
=======================================
  Coverage   88.74%   88.74%           
=======================================
  Files          81       81           
  Lines       14398    14398           
=======================================
  Hits        12778    12778           
  Misses       1620     1620


assignUser

Very nice! I really like the clean structure, usage of presets, etc. I ran it locally and, apart from 2 minor fixes (shebang and remote fix), it worked great!

assignUser · 2024-03-14T03:59:05Z

.github/workflows/benchmarks.yaml

+      - 'dev/benchmarks/**'
+
+permissions:
+  contents: read


assignUser · 2024-03-14T03:59:44Z

dev/benchmarks/.gitignore

sometimes git is so weird xD

assignUser · 2024-03-14T04:01:30Z

dev/benchmarks/CMakeLists.txt

+project(nanoarrow_benchmarks)
+
+if(NOT DEFINED CMAKE_C_STANDARD)
+  set(CMAKE_C_STANDARD 99)


You probably want to add C(XX)_STANDARD_REQUIRED otherwise this can decay to a previous standard.
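
A minimal sketch of what this suggestion could look like, assuming the standard is set through the `CMAKE_C_STANDARD` variable as in the hunk above. `CMAKE_C_STANDARD_REQUIRED` is the CMake variable that initializes the `C_STANDARD_REQUIRED` target property; the `LANGUAGES` list and the minimum version are assumptions for illustration, not taken from the PR.

```cmake
cmake_minimum_required(VERSION 3.14)  # assumed minimum, for illustration only
project(nanoarrow_benchmarks LANGUAGES C CXX)

if(NOT DEFINED CMAKE_C_STANDARD)
  set(CMAKE_C_STANDARD 99)
  # Without this, CMake treats the standard as a preference and may fall
  # back to an older one if the compiler does not support the request.
  set(CMAKE_C_STANDARD_REQUIRED ON)
endif()
```

The C++ counterpart follows the same pattern with `CMAKE_CXX_STANDARD` and `CMAKE_CXX_STANDARD_REQUIRED`.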

assignUser · 2024-03-14T04:05:27Z

dev/benchmarks/CMakeLists.txt

+option(NANOARROW_BENCHMARK_VERSION "nanoarrow version to benchmark" OFF)
+option(NANOARROW_BENCHMARK_SOURCE_DIR "path to a nanoarrow source checkout to benchmark"


This is a bit funky, as options can only be boolean but you are also using it as a string. If you set it as a cache variable it would behave similarly to an option (a passed -DNANOARROW... would overwrite it) but be more idiomatic. I do wish string options/selections were a thing (arrow has a custom function for them).

This was a 🤯 for me and explains a lot of the very strange errors I have had with CMake over the last few years!
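
A sketch of the cache-variable approach described above, reusing the variable names and help strings from the hunk. The empty-string defaults and the `message()` calls are assumptions made for illustration; the PR's final code may differ.

```cmake
# String-valued cache entries instead of option(), which is meant for ON/OFF.
# A value passed as -DNANOARROW_BENCHMARK_VERSION=... on the command line
# takes precedence over the default given here.
set(NANOARROW_BENCHMARK_VERSION
    ""
    CACHE STRING "nanoarrow version to benchmark")
set(NANOARROW_BENCHMARK_SOURCE_DIR
    ""
    CACHE PATH "path to a nanoarrow source checkout to benchmark")

# An empty string is false in if(), so unset values can still be tested.
if(NANOARROW_BENCHMARK_SOURCE_DIR)
  message(STATUS "Benchmarking sources in ${NANOARROW_BENCHMARK_SOURCE_DIR}")
elseif(NANOARROW_BENCHMARK_VERSION)
  message(STATUS "Benchmarking release ${NANOARROW_BENCHMARK_VERSION}")
endif()
```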

assignUser · 2024-03-14T04:08:44Z

dev/benchmarks/CMakeLists.txt

+  fetchcontent_makeavailable(nanoarrow)
+elseif(NANOARROW_BENCHMARK_VERSION)
+  fetchcontent_declare(nanoarrow
+                       URL https://github.com/apache/arrow-nanoarrow/archive/refs/tags/apache-arrow-nanoarrow-${NANOARROW_BENCHMARK_VERSION}.zip


fyi: You can actually use URL for a local dir too!

A URL may be an ordinary path in the local file system (in which case it must be the only URL provided) or any downloadable URL supported by the file(DOWNLOAD) command. A local filesystem path may refer to either an existing directory or to an archive file, whereas a URL is expected to point to a file which can be treated as an archive.

I couldn't get this to work (the configure step hangs), but I like parameterizing this way since you could easily run benchmarks against any branch on any fork or any local checkout without much trouble so I emulated it with if(IS_DIRECTORY ...).
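
A rough sketch of the `if(IS_DIRECTORY ...)` emulation mentioned above, assuming a single cache variable that holds either a local path or a downloadable archive URL. The variable name `NANOARROW_BENCHMARK_SOURCE_URL` is hypothetical here and is not confirmed by this thread.

```cmake
include(FetchContent)

# Hypothetical variable: a local checkout path or an archive URL.
set(NANOARROW_BENCHMARK_SOURCE_URL
    ""
    CACHE STRING "path or URL of the nanoarrow sources to benchmark")

if(IS_DIRECTORY "${NANOARROW_BENCHMARK_SOURCE_URL}")
  # Local checkout: use the directory in place, with no download step.
  FetchContent_Declare(nanoarrow SOURCE_DIR "${NANOARROW_BENCHMARK_SOURCE_URL}")
  FetchContent_MakeAvailable(nanoarrow)
elseif(NOT "${NANOARROW_BENCHMARK_SOURCE_URL}" STREQUAL "")
  # Anything else is treated as an archive to download and unpack.
  FetchContent_Declare(nanoarrow URL "${NANOARROW_BENCHMARK_SOURCE_URL}")
  FetchContent_MakeAvailable(nanoarrow)
else()
  message(FATAL_ERROR "NANOARROW_BENCHMARK_SOURCE_URL is not set")
endif()
```

`IS_DIRECTORY` is only well defined for full paths, so a local checkout should be passed as an absolute path.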

assignUser · 2024-03-14T04:10:13Z

dev/benchmarks/CMakeLists.txt

+include(CTest)
+enable_testing()


This is technically redundant as including CTest also enables it but I have had mixed results with that so I also usually call it explicitly :)

The tests don't show up in VSCode unless I put it there 🤷
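
For context, a small sketch of how the two calls relate. `include(CTest)` defines the `BUILD_TESTING` option (ON by default) and enables testing only when that option is ON, whereas `enable_testing()` is unconditional. The target and source file names below are hypothetical.

```cmake
include(CTest)    # defines BUILD_TESTING; enables testing only if it is ON
enable_testing()  # explicit call, as kept in the hunk above

# Hypothetical benchmark target registered as a test so tooling can find it.
add_executable(example_benchmark example_benchmark.cc)
add_test(NAME example_benchmark COMMAND example_benchmark)
```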

assignUser · 2024-03-14T04:14:40Z

dev/benchmarks/benchmark-run-all.sh

Just overall very nice, short and sweet 🍦


assignUser · 2024-03-14T04:25:37Z

dev/benchmarks/benchmark-report.qmd

+if (github_ref == "main") {
+  github_repo <- "apache/arrow-nanoarrow"
+} else {
+  remote <- gert::git_remote_info()


This actually errors for me but that is due to the way gh pr checkout sets up the branch (no remote apparently?). Probably not something that will happen in PRs?

I wrapped this in try() just in case. The source links aren't that useful if you're in a branch, anyway, but at least the report will run (and just give possibly inaccurate source links).

assignUser · 2024-03-14T04:30:36Z

dev/benchmarks/README.md

+```shell
+./benchmark-run-all.sh
+cd apidoc && doxygen && cd ..
+quarto render benchmark-report.qmd


Is quarto known outside of the R/posit sphere? Otherwise a quick link on how to get it might be good?
https://quarto.org/docs/get-started/

(I missed the link to quarto.org further up 🤦 )


paleolimbot marked this pull request as ready for review March 8, 2024 15:26

paleolimbot changed the title ~~chore(dev/benchmarks): Reorganize benchmarks such that they can build/run against arbitrary commits~~ chore(dev/benchmarks): Reorganize benchmarks such that they can build/run against previous versions Mar 8, 2024

assignUser approved these changes Mar 14, 2024


paleolimbot added 15 commits March 14, 2024 11:14

separate benchmarks

da961d1

maybe CI for benchmarks

c7d67d0

add documentation

3adca9d

format

7b700e7

maybe use presets

e259531

consolidate to script

825d85e

dods

188375a

maybe make the build dir

65756f8

reporting

9cdc50e

format

02b3c6e

maybe fix rat

778aff0

flush out report

9dea3c6

one more rat

b60187c

readme

6684560

whitespace

42944e0

paleolimbot force-pushed the c-more-benchmarks branch from 73ee6c3 to 42944e0 Compare March 14, 2024 14:14

paleolimbot and others added 2 commits March 14, 2024 11:15

Update dev/benchmarks/benchmark-run-all.sh

a0363c3

Co-authored-by: Jacob Wujciak-Jens

review edits

4de0d7e

paleolimbot merged commit c7a1236 into apache:main Mar 14, 2024
32 checks passed

paleolimbot deleted the c-more-benchmarks branch March 14, 2024 20:06

paleolimbot added this to the nanoarrow 0.5.0 milestone May 22, 2024
