-
Notifications
You must be signed in to change notification settings - Fork 822
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add JSON writer benchmarks (#5314) #5317
Conversation
1eecd07
to
7f3c74c
Compare
7f3c74c
to
3c90e2b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me -- thank you @tustvold
I also applaud your data driven approach
I had some comments / suggestions but nothing I think is required
fn create_mixed(len: usize) -> RecordBatch { | ||
let c1 = Arc::new(create_string_array::<i32>(len, 0.)); | ||
let c2 = Arc::new(create_primitive_array::<Int32Type>(len, 0.)); | ||
let c3 = Arc::new(create_primitive_array::<UInt32Type>(len, 0.)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should also include a 64 bit type (like U64IntType
rather than U32IntType)?
arrow/benches/json_writer.rs
Outdated
} | ||
|
||
fn bench_dict_array(c: &mut Criterion) { | ||
let c1 = Arc::new(create_string_array::<i32>(NUM_ROWS, 0.)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to include a string array in a benchmark for dictionary ? Maybe this is already covered by the separate benchmark for just strings?
do_bench(c, "bench_struct", &batch) | ||
} | ||
|
||
fn bench_nullable_struct(c: &mut Criterion) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯 for including structs
Though again here I wonder if it is important to also have a string array or if we should just have a single struct array column)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to have more than one column at the struct level, to break branch prediction
} | ||
|
||
fn bench_primitive(c: &mut Criterion) { | ||
let c1 = Arc::new(create_primitive_array::<Float32Type>(NUM_ROWS, 0.)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason to lump them together or would it make more sense to have a separate one for ints and flots?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Habit, this is a good shout as float formatting is much more complex for JSON
Which issue does this PR close?
Part of #5314
Rationale for this change
We should get benchmarks before making changes to the serialization logic.
What changes are included in this PR?
Are there any user-facing changes?