Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(dev/benchmarks): Reorganize benchmarks such that they can build/run against previous versions #398

Merged
merged 17 commits into from
Mar 14, 2024

Conversation

paleolimbot
Copy link
Member

@paleolimbot paleolimbot commented Mar 7, 2024

I imagine there are a few ways to go about this, but I found moving the benchmarks to their own subdirectory and using FetchContent to build against various versions/source checkouts to be an intuitive way to do this. This also nicely separates benchmark-related CMake from non-benchmark related CMake and provides a nice way to benchmark locally against a few previous versions (via build presets). If we add more benchmarks in the future (or discover a flaw in an existing benchmark), it also provides a nice way to retrospectively run them against previous releases.

I've added a more verbose description of the setup to the benchmarks README, but the general idea is:

  • Benchmarks are documented using Doxygen, which is really good at parsing documentation. Reading the XML is a bit of a pain but is better than undocumented or difficult-to-locate benchmarks and better than parsing source files yourself.
  • Configurations are CMake build presets, and CMake handles pulling a previous or local nanoarrow using FetchContent. This means that the only action needed on release to update the report is to add a configure preset.
  • The provided benchmark-run-all.sh effectively reuses build directories for minimal rebuilding during benchmark development.
  • The report is a Quarto document that renders to markdown. It is not the flashiest of reports but gets the job done. It could be replaced by something like conbench in the future.

Example report in details below:

Benchmark Report

Configurations

These benchmarks were run with the following configurations:

preset_name preset_description
local Uses the nanoarrow C sources from this checkout.
v0.4.0 Uses the nanoarrow C sources the 0.4.0 release.

Summary

A quick and dirty summary of benchmark results between this checkout and
the last released version.

benchmark_label v0.4.0 local change pct_change
ArrayViewGetIntUnsafeInt16 635.33µs 631.47µs 1ns -0.6%
ArrayViewGetIntUnsafeInt32 635.96µs 636.71µs 753.7ns 0.1%
ArrayViewGetIntUnsafeInt64 669.22µs 680.5µs 11.3µs 1.7%
ArrayViewGetIntUnsafeInt64CheckNull 1.03ms 1.21ms 178.7µs 17.4%
ArrayViewGetIntUnsafeInt8 948.13µs 946.34µs 1ns -0.2%
SchemaInitWideStruct 1.04ms 1.02ms 1ns -2.1%
SchemaViewInitWideStruct 106.08µs 104.56µs 1ns -1.4%

ArrowArrayView-related benchmarks

Benchmarks for consuming ArrowArrays using the ArrowArrayViewXXX()
functions.

ArrayViewGetIntUnsafeInt8

Use ArrowArrayViewGetIntUnsafe() to consume an int8 array.

View
Source

preset_name iterations real_time cpu_time items_per_second
local 746 946µs 945µs 1,058,678,610
v0.4.0 745 948µs 947µs 1,056,345,018

ArrayViewGetIntUnsafeInt16

Use ArrowArrayViewGetIntUnsafe() to consume an int16 array.

View
Source

preset_name iterations real_time cpu_time items_per_second
local 1115 631µs 630µs 1,586,161,276
v0.4.0 1110 635µs 634µs 1,576,482,853

ArrayViewGetIntUnsafeInt32

Use ArrowArrayViewGetIntUnsafe() to consume an int32 array.

View
Source

preset_name iterations real_time cpu_time items_per_second
local 1106 637µs 636µs 1,572,865,930
v0.4.0 1116 636µs 635µs 1,574,396,587

ArrayViewGetIntUnsafeInt64

Use ArrowArrayViewGetIntUnsafe() to consume an int64 array.

View
Source

preset_name iterations real_time cpu_time items_per_second
local 1036 680µs 680µs 1,471,241,907
v0.4.0 1039 669µs 668µs 1,496,471,266

ArrayViewGetIntUnsafeInt64CheckNull

Use ArrowArrayViewGetIntUnsafe() to consume an int64 array (checking for
nulls)

View
Source

preset_name iterations real_time cpu_time items_per_second
local 581 1.21ms 1.2ms 830,641,968
v0.4.0 697 1.03ms 1.02ms 976,185,007

Schema-related benchmarks

Benchmarks for producing and consuming ArrowSchema.

SchemaInitWideStruct

Benchmark ArrowSchema creation for very wide tables.

Simulates part of the process of creating a very wide table with a
simple column type (integer).

View
Source

preset_name iterations real_time cpu_time items_per_second
local 684 1.02ms 1.02ms 9,788,166
v0.4.0 686 1.04ms 1.04ms 9,606,888

SchemaViewInitWideStruct

Benchmark ArrowSchema parsing for very wide tables.

Simulates part of the process of consuming a very wide table. Typically
the ArrowSchemaViewInit() is done by ArrowArrayViewInit() but uses a
similar pattern.

View
Source

preset_name iterations real_time cpu_time items_per_second
local 6753 105µs 104µs 95,812,784
v0.4.0 6762 106µs 106µs 94,630,337

@codecov-commenter
Copy link

codecov-commenter commented Mar 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.74%. Comparing base (5756b76) to head (a0363c3).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #398   +/-   ##
=======================================
  Coverage   88.74%   88.74%           
=======================================
  Files          81       81           
  Lines       14398    14398           
=======================================
  Hits        12778    12778           
  Misses       1620     1620           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@paleolimbot paleolimbot marked this pull request as ready for review March 8, 2024 15:26
@paleolimbot paleolimbot changed the title chore(dev/benchmarks): Reorganize benchmarks such that they can build/run against arbitrary commits chore(dev/benchmarks): Reorganize benchmarks such that they can build/run against previous versions Mar 8, 2024
Copy link
Member

@assignUser assignUser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! I really like the clean structure, usage of presets etc.. I ran it locally and outside of 2 minor fixes (shebang and remote fix) it worked great!

- 'dev/benchmarks/**'

permissions:
contents: read
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sometimes git is so weird xD

project(nanoarrow_benchmarks)

if(NOT DEFINED CMAKE_C_STANDARD)
set(CMAKE_C_STANDARD 99)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably want to add C(XX)_STANDARD_REQUIRED otherwise this can decay to a previous standard.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Comment on lines 32 to 33
option(NANOARROW_BENCHMARK_VERSION "nanoarrow version to benchmark" OFF)
option(NANOARROW_BENCHMARK_SOURCE_DIR "path to a nanoarrow source checkout to benchmark"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit funky as options can only be boolean but you are also using it as a string. If you set it as a cache variable it would behave similar to an option (a passed -DNANORARROW... would overwrite it) but be more idiomatic. I do wish string options/selections would be a thing (arrow has a custom function for them).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a 🤯 for me and explains a lot of the very strange errors I have had with CMake over the last few years!

fetchcontent_makeavailable(nanoarrow)
elseif(NANOARROW_BENCHMARK_VERSION)
fetchcontent_declare(nanoarrow
URL https://github.com/apache/arrow-nanoarrow/archive/refs/tags/apache-arrow-nanoarrow-${NANOARROW_BENCHMARK_VERSION}.zip
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fyi: You can actually use URL for a local dir too!

A URL may be an ordinary path in the local file system (in which case it must be the only URL provided) or any downloadable URL supported by the file(DOWNLOAD) command. A local filesystem path may refer to either an existing directory or to an archive file, whereas a URL is expected to point to a file which can be treated as an archive.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't get this to work (the configure step hangs), but I like parameterizing this way since you could easily run benchmarks against any branch on any fork or any local checkout without much trouble so I emulated it with if(IS_DIRECTORY ...).

Comment on lines +75 to +76
include(CTest)
enable_testing()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is technically redundant as including CTest also enables it but I have had mixed results with that so I also usually call it explicitly :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests don't show up in VSCode unless I put it there 🤷

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just overall very nice, short and sweet 🍦

if (github_ref == "main") {
github_repo <- "apache/arrow-nanoarrow"
} else {
remote <- gert::git_remote_info()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually errors for me but that is due to the way gh pr checkout sets up the branch (no remote apparently?). Probably not something that will happen in PRs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrapped this in try() just in case. The source links aren't that useful if you're in a branch, anyway, but at least the report will run (and just give possibly inaccurate source links).

```shell
./benchmark-run-all.sh
cd apidoc && doxygen && cd ..
quarto render benchmark-report.qmd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is quarto known outside of the R/posit sphere? Otherwise a quick link on how to get it might be good?
https://quarto.org/docs/get-started/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I missed the link to quarto.org further up 🤦 )

@paleolimbot paleolimbot merged commit c7a1236 into apache:main Mar 14, 2024
32 checks passed
@paleolimbot paleolimbot deleted the c-more-benchmarks branch March 14, 2024 20:06
eddelbuettel pushed a commit to eddelbuettel/arrow-nanoarrow that referenced this pull request Apr 10, 2024
…/run against previous versions (apache#398)

I imagine there are a few ways to go about this, but I found moving the
benchmarks to their own subdirectory and using `FetchContent` to build
against various versions/source checkouts to be an intuitive way to do
this. This also nicely separates benchmark-related CMake from
non-benchmark related CMake and provides a nice way to benchmark locally
against a few previous versions (via build presets). If we add more
benchmarks in the future (or discover a flaw in an existing benchmark),
it also provides a nice way to retrospectively run them against previous
releases.

I've added a more verbose description of the setup to the benchmarks
README, but the general idea is:

- Benchmarks are documented using Doxygen, which is really good at
parsing documentation. Reading the XML is a bit of a pain but is better
than undocumented or difficult-to-locate benchmarks and better than
parsing source files yourself.
- Configurations are CMake build presets, and CMake handles pulling a
previous or local nanoarrow using `FetchContent`. This means that the
only action needed on release to update the report is to add a configure
preset.
- The provided `benchmark-run-all.sh` effectively reuses build
directories for minimal rebuilding during benchmark development.
- The report is a [Quarto](https://quarto.org) document that renders to
markdown. It is not the flashiest of reports but gets the job done. It
could be replaced by something like
[conbench](https://github.com/conbench/conbench) in the future.

Example report in details below:

<details>

# Benchmark Report

## Configurations

These benchmarks were run with the following configurations:

| preset_name | preset_description                               |
|:------------|:-------------------------------------------------|
| local       | Uses the nanoarrow C sources from this checkout. |
| v0.4.0      | Uses the nanoarrow C sources the 0.4.0 release.  |

## Summary

A quick and dirty summary of benchmark results between this checkout and
the last released version.

| benchmark_label | v0.4.0 | local | change | pct_change |

|:----------------------------------------------------------------------------|---------:|---------:|--------:|-----------:|
| [ArrayViewGetIntUnsafeInt16](#arrayviewgetintunsafeint16) | 635.33µs |
631.47µs | 1ns | -0.6% |
| [ArrayViewGetIntUnsafeInt32](#arrayviewgetintunsafeint32) | 635.96µs |
636.71µs | 753.7ns | 0.1% |
| [ArrayViewGetIntUnsafeInt64](#arrayviewgetintunsafeint64) | 669.22µs |
680.5µs | 11.3µs | 1.7% |
|
[ArrayViewGetIntUnsafeInt64CheckNull](#arrayviewgetintunsafeint64checknull)
| 1.03ms | 1.21ms | 178.7µs | 17.4% |
| [ArrayViewGetIntUnsafeInt8](#arrayviewgetintunsafeint8) | 948.13µs |
946.34µs | 1ns | -0.2% |
| [SchemaInitWideStruct](#schemainitwidestruct) | 1.04ms | 1.02ms | 1ns
| -2.1% |
| [SchemaViewInitWideStruct](#schemaviewinitwidestruct) | 106.08µs |
104.56µs | 1ns | -1.4% |

## ArrowArrayView-related benchmarks

Benchmarks for consuming ArrowArrays using the ArrowArrayViewXXX()
functions.

### ArrayViewGetIntUnsafeInt8

Use ArrowArrayViewGetIntUnsafe() to consume an int8 array.

[View

Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L108-L110)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |        746 |     946µs |    945µs |    1,058,678,610 |
| v0.4.0      |        745 |     948µs |    947µs |    1,056,345,018 |

### ArrayViewGetIntUnsafeInt16

Use ArrowArrayViewGetIntUnsafe() to consume an int16 array.

[View

Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L113-L115)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |       1115 |     631µs |    630µs |    1,586,161,276 |
| v0.4.0      |       1110 |     635µs |    634µs |    1,576,482,853 |

### ArrayViewGetIntUnsafeInt32

Use ArrowArrayViewGetIntUnsafe() to consume an int32 array.

[View

Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L118-L120)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |       1106 |     637µs |    636µs |    1,572,865,930 |
| v0.4.0      |       1116 |     636µs |    635µs |    1,574,396,587 |

### ArrayViewGetIntUnsafeInt64

Use ArrowArrayViewGetIntUnsafe() to consume an int64 array.

[View

Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L123-L125)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |       1036 |     680µs |    680µs |    1,471,241,907 |
| v0.4.0      |       1039 |     669µs |    668µs |    1,496,471,266 |

### ArrayViewGetIntUnsafeInt64CheckNull

Use ArrowArrayViewGetIntUnsafe() to consume an int64 array (checking for
nulls)

[View

Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/array_benchmark.cc#L128-L130)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |        581 |    1.21ms |    1.2ms |      830,641,968 |
| v0.4.0      |        697 |    1.03ms |   1.02ms |      976,185,007 |

## Schema-related benchmarks

Benchmarks for producing and consuming ArrowSchema.

### SchemaInitWideStruct

Benchmark ArrowSchema creation for very wide tables.

Simulates part of the process of creating a very wide table with a
simple column type (integer).

[View

Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/schema_benchmark.cc#L45-L56)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |        684 |    1.02ms |   1.02ms |        9,788,166 |
| v0.4.0      |        686 |    1.04ms |   1.04ms |        9,606,888 |

### SchemaViewInitWideStruct

Benchmark ArrowSchema parsing for very wide tables.

Simulates part of the process of consuming a very wide table. Typically
the ArrowSchemaViewInit() is done by ArrowArrayViewInit() but uses a
similar pattern.

[View

Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/c-more-benchmarks/dev/benchmarks/c/schema_benchmark.cc#L78-L91)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local       |       6753 |     105µs |    104µs |       95,812,784 |
| v0.4.0      |       6762 |     106µs |    106µs |       94,630,337 |


</details>

---------

Co-authored-by: Jacob Wujciak-Jens <[email protected]>
@paleolimbot paleolimbot added this to the nanoarrow 0.5.0 milestone May 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants