[Help] Implement arrow_serialize/deserialize for newtype over num_complex #94

Closed
graybc opened this issue Jan 24, 2023 · 2 comments

graybc commented Jan 24, 2023

I'd like to create a newtype struct that contains num_complex::Complex32 and can convert to/from arrow.

struct Phasor(num_complex::Complex32);

I'm having trouble understanding the documentation for implementing ArrowSerialize and ArrowDeserialize with types other than primitives (as shown in complex_example.rs). Can anyone help point me in the right direction? Here is what I have so far, though I know I'm not approaching this right...

impl arrow2_convert::field::ArrowField for Phasor {
    type Type = Self;

    fn data_type() -> arrow2::datatypes::DataType {
        arrow2::datatypes::DataType::Extension(
            "phasor".to_string(),
            Box::new(arrow2::datatypes::DataType::Struct(vec![
                Field::new("re", arrow2::datatypes::DataType::Float32, false),
                Field::new("im", arrow2::datatypes::DataType::Float32, false),
            ])),
            None,
        )
    }
}

impl arrow2_convert::serialize::ArrowSerialize for Phasor {
    type MutableArrayType = arrow2::array::MutableStructArray;

    fn new_array() -> Self::MutableArrayType {
        Self::MutableArrayType::new(
            <Self as arrow2_convert::field::ArrowField>::data_type(),
            vec![],
        )
    }

    fn arrow_serialize(v: &Self, array: &mut Self::MutableArrayType) -> arrow2::error::Result<()> {
        let real: &mut MutablePrimitiveArray<PhasorType> = array.value(0).unwrap();
        real.try_push(Some(v.re()));
        let imag: &mut MutablePrimitiveArray<PhasorType> = array.value(1).unwrap();
        imag.try_push(Some(v.im()));

        array.push(true);
        Ok(())
    }
}

impl arrow2_convert::deserialize::ArrowDeserialize for Phasor {
    type ArrayType = arrow2::array::StructArray;

    fn arrow_deserialize(v: Option<???>) -> Option<Self> {
        v.map(|t| Phasor::new(t.get(0).unwrap(), t.get(1).unwrap()))
    }
}

arrow2_convert::arrow_enable_vec_for_type!(Phasor);

I've seen issue #79 reference adding support for remote types, but until then and to better understand arrow2 I'd like to understand how to do this manually. Thanks in advance!

Collaborator

ncpenke commented Jan 25, 2023

@graybc thanks for raising this. You're right: customizing non-primitive types needs more work. In the long term, in addition to #79, some of the rework of the ArrowSerialize and ArrowDeserialize traits in the next release should support this use case better.

For the short term, the code below is a possible workaround. It manually implements part of what we would do for a remote type: Phasor delegates every trait method to a local struct, LocalComplex32, which mirrors the fields of Complex32 and derives the arrow2_convert traits. However, because of the way data types are specified, it might be tricky to reproduce the exact data type from your example. It should be possible by manually implementing the ArrowField trait for the LocalComplex32 struct below, but I haven't tried it.

#[test]
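fn test_complex() {
    // These traits must be in scope for data_type(), new_array(),
    // try_into_arrow() and try_into_collection() used below.
    use arrow2::array::Array;
    use arrow2_convert::deserialize::{ArrowDeserialize, TryIntoCollection};
    use arrow2_convert::field::ArrowField;
    use arrow2_convert::serialize::{ArrowSerialize, TryIntoArrow};
    use num_complex::Complex32;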

    #[derive(Debug, PartialEq)]
    struct Phasor(num_complex::Complex32);

    #[derive(arrow2_convert::ArrowField, arrow2_convert::ArrowSerialize, arrow2_convert::ArrowDeserialize)]
    struct LocalComplex32 {
        re: f32,
        im: f32,
    }

    impl From<&num_complex::Complex32> for LocalComplex32 {
        fn from(c: &num_complex::Complex32) -> Self {
            Self { re: c.re, im: c.im }
        }
    }

    impl From<LocalComplex32> for num_complex::Complex32 {
        fn from(c: LocalComplex32) -> Self {
            Self::new(c.re, c.im)
        }
    }

    impl arrow2_convert::field::ArrowField for Phasor {
        type Type = Self;
    
        fn data_type() -> arrow2::datatypes::DataType {
            LocalComplex32::data_type()
        }
    }
    
    impl arrow2_convert::serialize::ArrowSerialize for Phasor {
        type MutableArrayType = <LocalComplex32 as arrow2_convert::serialize::ArrowSerialize>::MutableArrayType;
    
        fn new_array() -> Self::MutableArrayType {
            LocalComplex32::new_array()
        }
    
        fn arrow_serialize(v: &Self, array: &mut Self::MutableArrayType) -> arrow2::error::Result<()> {
            let local: LocalComplex32 = (&v.0).into();
            LocalComplex32::arrow_serialize(&local, array)
        }
    }
    
    impl arrow2_convert::deserialize::ArrowDeserialize for Phasor {
        type ArrayType = <LocalComplex32 as arrow2_convert::deserialize::ArrowDeserialize>::ArrayType;
    
        fn arrow_deserialize(v: <&Self::ArrayType as IntoIterator>::Item) -> Option<Self> {
            v.map(|t| Phasor(t.into()))
        }
    }
    
    arrow2_convert::arrow_enable_vec_for_type!(Phasor);

    let original = vec![
        Phasor(Complex32::new(10., 10.)),
        Phasor(Complex32::new(20., 20.)),
        Phasor(Complex32::new(30., 30.)),
    ];
    let array: Box<dyn Array> = original.try_into_arrow().unwrap();
    let round_trip: Vec<Phasor> = array.try_into_collection().unwrap();
    assert_eq!(round_trip, original);
}
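
The delegation pattern in the test above can also be looked at in isolation. The following is a minimal sketch that uses only the standard library, so none of it is the arrow2_convert API: `Complex32` here is a local stand-in for `num_complex::Complex32`, and `ColumnCodec`, `ComplexColumns` and `round_trip` are hypothetical names invented for illustration. It only shows how a newtype can reuse a mirror struct's implementation by converting on the way in and on the way out.

```rust
// Stand-in for `num_complex::Complex32` so the sketch needs no external crates.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Complex32 {
    re: f32,
    im: f32,
}

// Newtype over the "remote" type, as in the issue.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Phasor(Complex32);

// Local mirror of the remote type's fields. In the real workaround this is
// the struct that carries the derived Arrow traits.
#[derive(Debug, Clone, Copy, PartialEq)]
struct LocalComplex32 {
    re: f32,
    im: f32,
}

impl From<&Complex32> for LocalComplex32 {
    fn from(c: &Complex32) -> Self {
        Self { re: c.re, im: c.im }
    }
}

impl From<LocalComplex32> for Complex32 {
    fn from(c: LocalComplex32) -> Self {
        Self { re: c.re, im: c.im }
    }
}

// Hypothetical columnar storage: one vector per field.
#[derive(Default)]
struct ComplexColumns {
    re: Vec<f32>,
    im: Vec<f32>,
}

// Hypothetical toy trait playing the role of the serialize/deserialize pair.
trait ColumnCodec: Sized {
    type Columns: Default;
    fn push(value: &Self, columns: &mut Self::Columns);
    fn get(columns: &Self::Columns, index: usize) -> Option<Self>;
}

impl ColumnCodec for LocalComplex32 {
    type Columns = ComplexColumns;

    fn push(value: &Self, columns: &mut Self::Columns) {
        columns.re.push(value.re);
        columns.im.push(value.im);
    }

    fn get(columns: &Self::Columns, index: usize) -> Option<Self> {
        let re = *columns.re.get(index)?;
        let im = *columns.im.get(index)?;
        Some(Self { re, im })
    }
}

// The newtype reuses the mirror's storage and logic, converting at the edges.
impl ColumnCodec for Phasor {
    type Columns = <LocalComplex32 as ColumnCodec>::Columns;

    fn push(value: &Self, columns: &mut Self::Columns) {
        let local: LocalComplex32 = (&value.0).into();
        LocalComplex32::push(&local, columns);
    }

    fn get(columns: &Self::Columns, index: usize) -> Option<Self> {
        LocalComplex32::get(columns, index).map(|local| Phasor(local.into()))
    }
}

// Writes every value into the columns, then reads them back in order.
fn round_trip(values: &[Phasor]) -> Vec<Phasor> {
    let mut columns = ComplexColumns::default();
    for value in values {
        Phasor::push(value, &mut columns);
    }
    (0..values.len())
        .filter_map(|i| Phasor::get(&columns, i))
        .collect()
}
```

Calling `round_trip` on a slice of `Phasor` values should return an equal `Vec<Phasor>`, which mirrors the assertion at the end of the test above. The cost of this approach is one extra copy per value through the mirror struct, which is small for a pair of `f32` fields.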

Author

graybc commented Jan 26, 2023

Hi @ncpenke,

Thanks so much, that makes a lot of sense! I'll keep an eye out for the next release as well. This is a very cool project, and it has really helped me get started with Apache Arrow and arrow2.

graybc closed this as completed Jan 26, 2023