tests/integration/: Add support for decoding binaries #67

Closed
progval wants to merge 1 commit

Contributor

@progval progval commented Mar 13, 2024

This goes in the direction opposite to #66, but I already had the code so it doesn't hurt. (At worst it can be deleted when implementing #66)

Also fixes the comment on decimal

codecov bot commented Mar 13, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.52%. Comparing base (424b021) to head (873c07a).
Report is 46 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #67      +/-   ##
==========================================
+ Coverage   77.22%   80.52%   +3.29%     
==========================================
  Files          34       30       -4     
  Lines        3302     3106     -196     
==========================================
- Hits         2550     2501      -49     
+ Misses        752      605     -147     

Comment on lines +71 to +118
fn binaries_to_uint8_list(batch: RecordBatch) -> RecordBatch {
    RecordBatch::try_new(
        Arc::new(Schema::new(
            batch
                .schema()
                .fields
                .into_iter()
                .map(|field| {
                    Field::new(
                        field.name(),
                        match field.data_type() {
                            DataType::Binary => DataType::List(Arc::new(Field::new(
                                "value",
                                DataType::UInt8,
                                false,
                            ))),
                            DataType::LargeBinary => DataType::LargeList(Arc::new(Field::new(
                                "value",
                                DataType::UInt8,
                                false,
                            ))),
                            data_type => data_type.clone(),
                        },
                        field.is_nullable(),
                    )
                })
                .collect::<Vec<_>>(),
        )),
        batch
            .columns()
            .iter()
            .map(|array| match array.as_binary_opt() {
                Some(array) => {
                    let (offsets, values, nulls) = array.clone().into_parts();
                    ListArray::try_new(
                        Arc::new(Field::new("value", DataType::UInt8, false)),
                        offsets,
                        Arc::new(PrimitiveArray::<UInt8Type>::new(values.into(), None)),
                        nulls,
                    )
                    .map(Arc::new)
                    .expect("Could not create ListArray")
                }
                None => array.clone(),
            })
            .collect(),
    )
    .expect("Could not rebuild RecordBatch")
Collaborator

I wonder if there's a way to use cast kernels to achieve the same effect instead of recreating a new recordbatch 🤔
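For context, the quoted function can rebuild each column cheaply because a binary column and a list-of-`UInt8` column share the same physical layout: an offsets buffer, a flat byte buffer, and an optional validity mask. The sketch below illustrates that idea with the standard library only. `BinaryColumn`, `UInt8ListColumn`, and `binary_to_uint8_list` are hypothetical stand-ins invented for this illustration, not arrow-rs types, and the validity mask is simplified to a `Vec<bool>`.

```rust
/// Hypothetical stand-in for a variable-length binary column:
/// row `i` is the byte range `offsets[i]..offsets[i + 1]` of `values`.
struct BinaryColumn {
    offsets: Vec<i32>,
    values: Vec<u8>,
    /// `true` marks a valid (non-null) row; `None` means no row is null.
    nulls: Option<Vec<bool>>,
}

/// Hypothetical stand-in for a list-of-u8 column with the same layout.
struct UInt8ListColumn {
    offsets: Vec<i32>,
    values: Vec<u8>,
    nulls: Option<Vec<bool>>,
}

/// Moves the three buffers from one wrapper to the other, loosely mirroring
/// the `into_parts()` + `ListArray::try_new(...)` step in the quoted code.
/// No bytes are copied; only the interpretation of the buffers changes.
fn binary_to_uint8_list(col: BinaryColumn) -> UInt8ListColumn {
    let BinaryColumn { offsets, values, nulls } = col;
    UInt8ListColumn { offsets, values, nulls }
}

impl UInt8ListColumn {
    /// Number of rows: one fewer than the number of offsets.
    fn len(&self) -> usize {
        self.offsets.len().saturating_sub(1)
    }

    /// Returns row `i` as a byte slice, or `None` if the row is null.
    fn row(&self, i: usize) -> Option<&[u8]> {
        if let Some(nulls) = &self.nulls {
            if !nulls[i] {
                return None;
            }
        }
        let start = self.offsets[i] as usize;
        let end = self.offsets[i + 1] as usize;
        Some(&self.values[start..end])
    }
}
```

With offsets `[0, 2, 2, 5]` and a mask marking the middle row as null, the three rows read back as a two-byte list, a null, and a three-byte list. Whether a cast kernel can perform the same reinterpretation is the open question raised in the comment above; this sketch does not answer it.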

@Jefffrey
Collaborator

FYI will go with this approach #73

So won't have to muck about with JSON parsing

@progval
Contributor Author

progval commented Mar 23, 2024

Sounds good!

@progval progval closed this Mar 23, 2024