[arrow-cast] Support cast from Numeric (Int, UInt, etc) to Utf8View #6719

Closed
wants to merge 9 commits into from
45 changes: 40 additions & 5 deletions arrow-array/src/builder/generic_bytes_view_builder.rs
@@ -466,24 +466,59 @@ impl<T: ByteViewType + ?Sized, V: AsRef<T::Native>> Extend<Option<V>>
/// Array builder for [`StringViewArray`][crate::StringViewArray]
///
/// Values can be appended using [`GenericByteViewBuilder::append_value`], and nulls with
/// [`GenericByteViewBuilder::append_null`] as normal.
/// [`GenericByteViewBuilder::append_null`].
///
/// # Example
/// This builder also implements [`std::fmt::Write`] with any written data
/// included in the next appended value. This allows using [`std::fmt::Display`]
/// with standard Rust idioms like `write!` and `writeln!` to write data
/// directly to the builder without intermediate allocations.
///
/// # Example writing strings with `append_value`
/// ```
/// # use arrow_array::builder::StringViewBuilder;
/// # use arrow_array::StringViewArray;
/// let mut builder = StringViewBuilder::new();
/// builder.append_value("hello");
/// builder.append_null();
/// builder.append_value("world");
/// builder.append_value("hello"); // row 0 is a value
/// builder.append_null(); // row 1 is null
/// builder.append_value("world"); // row 2 is a value
/// let array = builder.finish();
///
/// let expected = vec![Some("hello"), None, Some("world")];
/// let actual: Vec<_> = array.iter().collect();
/// assert_eq!(expected, actual);
/// ```
///
/// # Example incrementally writing strings with `std::fmt::Write`
/// ```
/// # use std::fmt::Write;
/// # use arrow_array::builder::StringViewBuilder;
/// let mut builder = StringViewBuilder::new();
///
/// // Write data in multiple `write!` calls
/// write!(builder, "foo").unwrap();
/// write!(builder, "bar").unwrap();
/// // The next call to append_value finishes the current string
/// // including all previously written strings.
/// builder.append_value("baz");
///
/// // Write second value with a single write call
/// write!(builder, "v2").unwrap();
/// // finish the value by calling append_value with an empty string
/// builder.append_value("");
///
/// let array = builder.finish();
/// assert_eq!(array.value(0), "foobarbaz");
/// assert_eq!(array.value(1), "v2");
/// ```
pub type StringViewBuilder = GenericByteViewBuilder<StringViewType>;

impl std::fmt::Write for StringViewBuilder {
Contributor

this is also what is contemplated by #6373 (aka I think this PR fixes that ticket as well)

fn write_str(&mut self, s: &str) -> std::fmt::Result {
self.append_value(s);
Contributor @alamb, Nov 22, 2024

I was writing some tests for this, and it turns out this behaves differently from StringBuilder (GenericStringBuilder):

https://docs.rs/arrow/latest/arrow/array/builder/type.GenericStringBuilder.html#example-incrementally-writing-strings-with-stdfmtwrite

Specifically, in StringBuilder calling write_str doesn't complete the row, whereas here each write_str call goes through append_value and so appends a complete value 🤔

I made a PR showing the problem: tlm365#1
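
To make the difference concrete, here is a minimal, self-contained sketch using two toy builders. `BufferingBuilder` and `EagerBuilder` are hypothetical stand-ins written only for illustration; they are not arrow-rs types and do not reflect how the real builders store data. The first models a `write_str` that only buffers text until `append_value` is called, which is the behavior the linked GenericStringBuilder example describes. The second models a `write_str` that forwards to `append_value`, as in the diff above.

```rust
use std::fmt::Write;

/// Hypothetical toy builder: `write_str` buffers text and
/// `append_value` completes the current row.
#[derive(Default)]
struct BufferingBuilder {
    in_progress: String,
    rows: Vec<String>,
}

impl BufferingBuilder {
    fn append_value(&mut self, s: &str) {
        // Anything written earlier becomes the prefix of this row.
        self.in_progress.push_str(s);
        self.rows.push(std::mem::take(&mut self.in_progress));
    }
}

impl Write for BufferingBuilder {
    fn write_str(&mut self, s: &str) -> std::fmt::Result {
        self.in_progress.push_str(s); // does not complete the row
        Ok(())
    }
}

/// Hypothetical toy builder: `write_str` forwards to `append_value`,
/// so every write produces its own row.
#[derive(Default)]
struct EagerBuilder {
    rows: Vec<String>,
}

impl EagerBuilder {
    fn append_value(&mut self, s: &str) {
        self.rows.push(s.to_string());
    }
}

impl Write for EagerBuilder {
    fn write_str(&mut self, s: &str) -> std::fmt::Result {
        self.append_value(s); // completes a row on every write
        Ok(())
    }
}

/// Runs the "foo", "bar", then append "baz" sequence on the buffering builder.
fn buffering_rows() -> Vec<String> {
    let mut b = BufferingBuilder::default();
    write!(b, "foo").unwrap();
    write!(b, "bar").unwrap();
    b.append_value("baz");
    b.rows
}

/// Runs the same sequence on the eager builder.
fn eager_rows() -> Vec<String> {
    let mut b = EagerBuilder::default();
    write!(b, "foo").unwrap();
    write!(b, "bar").unwrap();
    b.append_value("baz");
    b.rows
}

fn main() {
    println!("buffering: {:?}", buffering_rows());
    println!("eager:     {:?}", eager_rows());
}
```

With the buffering model the sequence yields a single row, which matches the `"foobarbaz"` assertion in the new doc example. With the eager model the same calls yield three separate rows, so that doc example would not hold.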

Contributor

I am working on a potential solution so we can unblock this PR

Ok(())
}
}

/// Array builder for [`BinaryViewArray`][crate::BinaryViewArray]
///
/// Values can be appended using [`GenericByteViewBuilder::append_value`], and nulls with