chore(dev/benchmarks): Add benchmarks for ArrowArrayAppend()
#401
A few nits which might just be style differences
static void BenchmarkArrayAppendInt8(benchmark::State& state) {
  BaseBenchmarkArrayAppendInt<int8_t, NANOARROW_TYPE_INT8>(state);
}
FWIW, you can use BENCHMARK with template instantiations:
BENCHMARK(BenchmarkArrayAppend<int8_t, NANOARROW_TYPE_INT8>);
I didn't know that...thanks! (The reporting is simplified slightly by the Doxygen comment above each of these at the moment and so I left the wrapper function for now.)
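To illustrate the two options discussed here, the following is a minimal sketch that compiles without nanoarrow or Google Benchmark. Everything prefixed with `Sketch` is a hypothetical stand-in (not an API of either library): `SketchState` plays the role of `benchmark::State`, and `RunSketch()` plays the role of whatever registers and runs a benchmark function. The point is only that an explicit template instantiation is an ordinary function, so it can be passed wherever the wrapper can.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins so the sketch compiles on its own; they are not
// part of nanoarrow or Google Benchmark.
enum SketchType { SKETCH_TYPE_INT8, SKETCH_TYPE_INT16 };
struct SketchState {
  int64_t items_processed = 0;
};
using SketchBenchmarkFn = void (*)(SketchState&);

// Shared body, parameterized on the C type and the type identifier.
template <typename CType, SketchType type>
void BaseSketchAppendInt(SketchState& state) {
  std::vector<CType> values;
  for (int i = 0; i < 1000; i++) {
    values.push_back(static_cast<CType>(i % 100));
  }
  state.items_processed += static_cast<int64_t>(values.size());
}

// Option 1: a thin named wrapper, which gives a doc comment a place to live.
/// \brief Append int8 values
void SketchAppendInt8(SketchState& state) {
  BaseSketchAppendInt<int8_t, SKETCH_TYPE_INT8>(state);
}

// Option 2: pass the instantiation directly. This runner accepts either form
// because both are plain functions with the same signature.
inline int64_t RunSketch(SketchBenchmarkFn fn) {
  assert(fn != nullptr);
  SketchState state;
  fn(state);
  return state.items_processed;
}
```

Calling `RunSketch(SketchAppendInt8)` and `RunSketch(BaseSketchAppendInt<int16_t, SKETCH_TYPE_INT16>)` exercises the wrapper and the direct instantiation respectively. The wrapper costs a few extra lines per type but keeps a named function for per-benchmark documentation, which is the trade-off mentioned in the reply above.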
  state.SetItemsProcessed(n_values * state.iterations());
}

/// \brief Use ArrowArrayViewGetStringUnsafe() to consume a string array
Is style here ending with a period or not? Worth checking all added comments.
Good catch! (I think the style is not ending with a period for \brief, at least in everything else I added)
dev/benchmarks/c/array_benchmark.cc
Outdated
int64_t n_values = kNumItemsPrettyBig;
int64_t value_size = 7;
std::string alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
I see multiple alphabets, perhaps make them into a well-known constant?
Done!
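As a rough sketch of what hoisting the alphabet into a shared constant could look like: the names `kSketchAlphabet` and `MakeSketchStrings()` are hypothetical, and the way the strings are filled (walking the alphabet and wrapping around) is an assumption for illustration, not necessarily what the benchmark does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical name; the constant actually added in the PR may differ.
static const std::string kSketchAlphabet =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Build n_values strings of value_size characters each by walking the shared
// alphabet and wrapping around when the end is reached.
inline std::vector<std::string> MakeSketchStrings(int64_t n_values,
                                                  int64_t value_size) {
  assert(n_values >= 0 && value_size >= 0);
  std::vector<std::string> out;
  out.reserve(static_cast<size_t>(n_values));
  size_t pos = 0;
  for (int64_t i = 0; i < n_values; i++) {
    std::string value;
    for (int64_t j = 0; j < value_size; j++) {
      value.push_back(kSketchAlphabet[pos % kSketchAlphabet.size()]);
      pos++;
    }
    out.push_back(value);
  }
  return out;
}
```

With a single constant, each benchmark that needs string data can call one helper such as `MakeSketchStrings(n_values, 7)` instead of repeating the literal.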
Co-authored-by: Benjamin Kietzman <[email protected]>
…e#401)

This PR adds a set of benchmarks for building arrays using `ArrowArrayAppendXXX()` and adds a few missing ones for `ArrowArrayView` like `ArrowArrayViewGetString()`.

(Report output in details)

<details>

# Benchmark Report

## Configurations

These benchmarks were run with the following configurations:

| preset_name | preset_description |
|:------------|:-------------------------------------------------|
| local | Uses the nanoarrow C sources from this checkout. |
| v0.4.0 | Uses the nanoarrow C sources the 0.4.0 release. |

## Summary

A quick and dirty summary of benchmark results between this checkout and the last released version.

| benchmark_label | v0.4.0 | local | change | pct_change |
|:----------------------------------------------------------|---------:|---------:|-------:|-----------:|
| [ArrayAppendInt16](#arrayappendint16) | 2.68ms | 2.66ms | 1ns | -0.9% |
| [ArrayAppendInt32](#arrayappendint32) | 3.12ms | 3.08ms | 1ns | -1.3% |
| [ArrayAppendInt64](#arrayappendint64) | 3.79ms | 3.47ms | 1ns | -8.4% |
| [ArrayAppendInt8](#arrayappendint8) | 2.39ms | 2.38ms | 1ns | -0.1% |
| [ArrayAppendNulls](#arrayappendnulls) | 12.05ms | 12.04ms | 1ns | -0.1% |
| [ArrayAppendString](#arrayappendstring) | 8.96ms | 8.67ms | 1ns | -3.2% |
| [ArrayViewGetInt16](#arrayviewgetint16) | 628.79µs | 627.1µs | 1ns | -0.3% |
| [ArrayViewGetInt32](#arrayviewgetint32) | 634.21µs | 625.86µs | 1ns | -1.3% |
| [ArrayViewGetInt64](#arrayviewgetint64) | 672.81µs | 676.99µs | 4.18µs | 0.6% |
| [ArrayViewGetInt8](#arrayviewgetint8) | 783.55µs | 784.61µs | 1.05µs | 0.1% |
| [ArrayViewGetString](#arrayviewgetstring) | 1.26ms | 1.25ms | 1ns | -0.4% |
| [ArrayViewIsNull](#arrayviewisnull) | 1.21ms | 1.19ms | 1ns | -1.8% |
| [ArrayViewIsNullNonNullable](#arrayviewisnullnonnullable) | 938.36µs | 940.65µs | 2.28µs | 0.2% |
| [SchemaInitWideStruct](#schemainitwidestruct) | 1.02ms | 1.02ms | 1ns | -0.2% |
| [SchemaViewInitWideStruct](#schemaviewinitwidestruct) | 103.62µs | 103.53µs | 1ns | -0.1% |

## ArrowArray-related benchmarks

Benchmarks for producing ArrowArrays using the ArrowArrayXXX() functions.

### ArrayAppendString

Use ArrowArrayAppendString() to build a string array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/benchmarks-read-create/dev/benchmarks/c/array_benchmark.cc#L288-L315)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local | 83 | 8.67ms | 8.64ms | 115,712,019 |
| v0.4.0 | 77 | 8.96ms | 8.81ms | 113,455,364 |

### ArrayAppendInt8

Use ArrowArrayAppendInt() to build an int8 array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/benchmarks-read-create/dev/benchmarks/c/array_benchmark.cc#L339-L341)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local | 292 | 2.38ms | 2.38ms | 420,186,810 |
| v0.4.0 | 296 | 2.39ms | 2.38ms | 419,740,272 |

### ArrayAppendInt16

Use ArrowArrayAppendInt() to build an int16 array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/benchmarks-read-create/dev/benchmarks/c/array_benchmark.cc#L344-L346)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local | 264 | 2.66ms | 2.66ms | 376,369,150 |
| v0.4.0 | 261 | 2.68ms | 2.68ms | 373,079,925 |

### ArrayAppendInt32

Use ArrowArrayAppendInt() to build an int32 array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/benchmarks-read-create/dev/benchmarks/c/array_benchmark.cc#L349-L351)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local | 228 | 3.08ms | 3.08ms | 324,738,215 |
| v0.4.0 | 225 | 3.12ms | 3.12ms | 320,760,473 |

### ArrayAppendInt64

Use ArrowArrayAppendInt() to build an int64 array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/benchmarks-read-create/dev/benchmarks/c/array_benchmark.cc#L354-L356)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local | 206 | 3.47ms | 3.46ms | 289,089,536 |
| v0.4.0 | 186 | 3.79ms | 3.77ms | 265,070,543 |

### ArrayAppendNulls

Use ArrowArrayAppendNulls() to build an int32 array that contains 80% null values.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/benchmarks-read-create/dev/benchmarks/c/array_benchmark.cc#L379-L401)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local | 59 | 12ms | 12ms | 83,199,603 |
| v0.4.0 | 58 | 12ms | 12ms | 83,135,409 |

## ArrowArrayView-related benchmarks

Benchmarks for consuming ArrowArrays using the ArrowArrayViewXXX() functions.

### ArrayViewGetInt8

Use ArrowArrayViewGet() to consume an int8 array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/benchmarks-read-create/dev/benchmarks/c/array_benchmark.cc#L118-L120)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local | 893 | 785µs | 784µs | 1,276,321,450 |
| v0.4.0 | 894 | 784µs | 782µs | 1,278,021,040 |

### ArrayViewGetInt16

Use ArrowArrayViewGet() to consume an int16 array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/benchmarks-read-create/dev/benchmarks/c/array_benchmark.cc#L123-L125)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local | 1114 | 627µs | 626µs | 1,597,100,560 |
| v0.4.0 | 1115 | 629µs | 628µs | 1,593,178,054 |

### ArrayViewGetInt32

Use ArrowArrayViewGet() to consume an int32 array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/benchmarks-read-create/dev/benchmarks/c/array_benchmark.cc#L128-L130)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local | 1115 | 626µs | 625µs | 1,600,061,993 |
| v0.4.0 | 1114 | 634µs | 633µs | 1,580,536,418 |

### ArrayViewGetInt64

Use ArrowArrayViewGet() to consume an int64 array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/benchmarks-read-create/dev/benchmarks/c/array_benchmark.cc#L133-L135)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local | 1023 | 677µs | 676µs | 1,480,375,260 |
| v0.4.0 | 1018 | 673µs | 671µs | 1,490,177,709 |

### ArrayViewIsNullNonNullable

Use ArrowArrayViewIsNull() to check for nulls while consuming an int32 array that does not contain a validity buffer.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/benchmarks-read-create/dev/benchmarks/c/array_benchmark.cc#L139-L168)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local | 746 | 941µs | 940µs | 1,064,112,037 |
| v0.4.0 | 745 | 938µs | 937µs | 1,066,931,705 |

### ArrayViewIsNull

Use ArrowArrayViewIsNull() to check for nulls while consuming an int32 array that contains 20% nulls.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/benchmarks-read-create/dev/benchmarks/c/array_benchmark.cc#L172-L211)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local | 588 | 1.19ms | 1.19ms | 842,447,913 |
| v0.4.0 | 588 | 1.21ms | 1.2ms | 830,223,525 |

### ArrayViewGetString

Use ArrowArrayViewGetStringUnsafe() to consume a string array.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/benchmarks-read-create/dev/benchmarks/c/array_benchmark.cc#L214-L245)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local | 557 | 1.25ms | 1.25ms | 800,060,902 |
| v0.4.0 | 546 | 1.26ms | 1.25ms | 797,048,875 |

## Schema-related benchmarks

Benchmarks for producing and consuming ArrowSchema.

### SchemaInitWideStruct

Benchmark ArrowSchema creation for very wide tables.

Simulates part of the process of creating a very wide table with a simple column type (integer).

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/benchmarks-read-create/dev/benchmarks/c/schema_benchmark.cc#L45-L56)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local | 690 | 1.02ms | 1.02ms | 9,843,783 |
| v0.4.0 | 683 | 1.02ms | 1.02ms | 9,831,837 |

### SchemaViewInitWideStruct

Benchmark ArrowSchema parsing for very wide tables.

Simulates part of the process of consuming a very wide table. Typically the ArrowSchemaViewInit() is done by ArrowArrayViewInit() but uses a similar pattern.

[View Source](https://github.com/paleolimbot/arrow-nanoarrow/blob/benchmarks-read-create/dev/benchmarks/c/schema_benchmark.cc#L78-L91)

| preset_name | iterations | real_time | cpu_time | items_per_second |
|:------------|-----------:|----------:|---------:|-----------------:|
| local | 6772 | 104µs | 103µs | 96,669,664 |
| v0.4.0 | 6749 | 104µs | 103µs | 96,625,343 |

</details>

---------

Co-authored-by: Benjamin Kietzman <[email protected]>
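For readers interpreting the report columns, here is a small sketch of the arithmetic that appears to sit behind `items_per_second` and `pct_change`. It assumes that the rate is the item count passed to `state.SetItemsProcessed(n_values * state.iterations())` divided by the total elapsed time, and that `pct_change` is the relative difference of the local time against the v0.4.0 time. Both function names are hypothetical, and the report script may compute these from unrounded values.

```cpp
#include <cassert>
#include <cstdint>

// Items per second as a rate: total items reported over all iterations
// divided by the total elapsed time (in seconds) for those iterations.
inline double SketchItemsPerSecond(int64_t n_values, int64_t iterations,
                                   double total_seconds) {
  assert(total_seconds > 0);
  return static_cast<double>(n_values * iterations) / total_seconds;
}

// Relative change of `local` with respect to `baseline`, in percent.
// A negative value means the local checkout took less time.
inline double SketchPctChange(double baseline, double local) {
  assert(baseline != 0);
  return (local - baseline) / baseline * 100.0;
}
```

As a check against the summary table, `SketchPctChange(3.79, 3.47)` evaluates to roughly -8.4, which agrees with the ArrayAppendInt64 row. Rows whose rounded times differ only in the last digit can deviate slightly from the reported percentages.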