tests/integration/: Add support for decoding binaries #67
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #67      +/-   ##
==========================================
+ Coverage   77.22%   80.52%   +3.29%
==========================================
  Files          34       30       -4
  Lines        3302     3106     -196
==========================================
- Hits         2550     2501      -49
+ Misses        752      605     -147
fn binaries_to_uint8_list(batch: RecordBatch) -> RecordBatch {
    RecordBatch::try_new(
        Arc::new(Schema::new(
            batch
                .schema()
                .fields
                .into_iter()
                .map(|field| {
                    Field::new(
                        field.name(),
                        match field.data_type() {
                            DataType::Binary => DataType::List(Arc::new(Field::new(
                                "value",
                                DataType::UInt8,
                                false,
                            ))),
                            DataType::LargeBinary => DataType::LargeList(Arc::new(Field::new(
                                "value",
                                DataType::UInt8,
                                false,
                            ))),
                            data_type => data_type.clone(),
                        },
                        field.is_nullable(),
                    )
                })
                .collect::<Vec<_>>(),
        )),
        batch
            .columns()
            .iter()
            .map(|array| match array.as_binary_opt() {
                Some(array) => {
                    let (offsets, values, nulls) = array.clone().into_parts();
                    ListArray::try_new(
                        Arc::new(Field::new("value", DataType::UInt8, false)),
                        offsets,
                        Arc::new(PrimitiveArray::<UInt8Type>::new(values.into(), None)),
                        nulls,
                    )
                    .map(Arc::new)
                    .expect("Could not create ListArray")
                }
                None => array.clone(),
            })
            .collect(),
    )
    .expect("Could not rebuild RecordBatch")
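The column conversion in the excerpt works because a variable-length binary column and a list-of-bytes column share the same physical layout: a flat byte buffer, an offsets buffer with one more entry than there are rows, and an optional validity mask. The sketch below illustrates that idea using only the Rust standard library. `BinaryColumn`, `U8ListColumn`, and `binary_to_u8_list` are hypothetical stand-ins invented for illustration; they are not arrow-rs types and do not reflect how arrow-rs stores its buffers internally.

```rust
/// Hypothetical stand-in for a binary column (not an arrow-rs type):
/// row `i` occupies `values[offsets[i]..offsets[i + 1]]`.
struct BinaryColumn {
    offsets: Vec<i32>,
    values: Vec<u8>,
    /// `None` means every row is valid, mirroring an absent null buffer.
    validity: Option<Vec<bool>>,
}

/// Hypothetical stand-in for a list-of-u8 column with the same layout.
struct U8ListColumn {
    offsets: Vec<i32>,
    values: Vec<u8>,
    validity: Option<Vec<bool>>,
}

/// Re-labels the three buffers without touching any byte: the offsets and
/// validity carry over unchanged, and the byte buffer becomes the child values.
fn binary_to_u8_list(col: BinaryColumn) -> U8ListColumn {
    U8ListColumn {
        offsets: col.offsets,
        values: col.values,
        validity: col.validity,
    }
}

impl U8ListColumn {
    /// Number of rows: one fewer than the number of offsets.
    fn len(&self) -> usize {
        self.offsets.len().saturating_sub(1)
    }

    /// Returns the bytes of row `i`, or `None` if the row is null.
    fn get(&self, i: usize) -> Option<&[u8]> {
        let valid = self.validity.as_ref().map_or(true, |mask| mask[i]);
        if !valid {
            return None;
        }
        let start = self.offsets[i] as usize;
        let end = self.offsets[i + 1] as usize;
        Some(&self.values[start..end])
    }
}
```

For example, a column with offsets `[0, 2, 2, 5]`, values `[1, 2, 3, 4, 5]`, and validity `[true, false, true]` holds three rows: the bytes `[1, 2]`, a null, and the bytes `[3, 4, 5]`. The rows read the same way before and after the conversion, which is why the excerpt can pass the offsets and null buffer straight through to the list array.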
I wonder if there's a way to use cast kernels to achieve the same effect instead of building a new RecordBatch 🤔
FYI, will go with this approach: #73. So we won't have to muck about with JSON parsing.
Sounds good!
This goes in the direction opposite to #66, but I already had the code so it doesn't hurt. (At worst it can be deleted when implementing #66)
Also fixes the comment on `decimal`.