New Rust and Python APIs #9
change arrow converter make datatypes work again
add python lib python structure with stubs remove config file remove config file
1af2b01 to 8b7137d
…o have fastformat for python
…egant for compatibility with Python but you have to use ArrowArray and not Rust Vec
Hello Enzo! Thanks a lot for this! After spending a bit of time working on images, I think it would be nice if we avoided complex Arrow types such as UnionArray. I understand your effort to make Arrow messages compact, but I think we really need to keep things simple if we want this to be accessible to master's students. Can we make a couple of changes:
The Hashmap will enable us to support arbitrary data formats (jpeg, yuv422, ...) without having to worry about breaking the definition signature and introducing a breaking change. I'm sorry if this requires some refactoring, but I think that making it easy for people to build their own Arrow.Array + Hashmap without using fastformat should be a feature, and we should not expect contributors to know Arrow.UnionArray.

import numpy as np
from fastformat.datatypes import Image

metadata = {"width": 1, "height": 1, "encoding": "bgr8"}
bgr8_image = Image.new_bgr8(np.array([0, 0, 0], dtype=np.uint8), metadata)
# Or even
bgr8_image = Image.new(np.array([0, 0, 0]), metadata)
array_data, metadata = bgr8_image.into_arrow()  # Note the additional metadata return value
# array_data is a simple one-dimensional storage array.
reconstructed_image = Image.from_arrow(array_data, metadata)

This is going to be slightly less performant, but much more readable for beginners.
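To make the proposal concrete, here is a minimal, self-contained sketch of the "flat array plus metadata dictionary" idea. It is not fastformat's implementation: the `Image` class below is a hypothetical stand-in, and the standard-library `array` type is used in place of a real Arrow array so the snippet runs without any dependencies. Only the method names `new`, `into_arrow`, and `from_arrow` come from the comment above; the fields are assumptions.

```python
from array import array
from dataclasses import dataclass


@dataclass
class Image:
    """Hypothetical stand-in for the proposed Image type (not fastformat's code)."""

    data: array    # flat, one-dimensional uint8 storage (stand-in for an Arrow array)
    width: int
    height: int
    encoding: str

    @classmethod
    def new(cls, pixels, metadata):
        # Everything that is not pixel data travels in the metadata dict.
        return cls(array("B", pixels), metadata["width"],
                   metadata["height"], metadata["encoding"])

    def into_arrow(self):
        # Returns (flat storage, metadata), mirroring the proposed signature.
        metadata = {"width": self.width, "height": self.height,
                    "encoding": self.encoding}
        return self.data, metadata

    @classmethod
    def from_arrow(cls, array_data, metadata):
        return cls.new(array_data, metadata)


image = Image.new([0, 0, 0], {"width": 1, "height": 1, "encoding": "bgr8"})
array_data, metadata = image.into_arrow()
reconstructed = Image.from_arrow(array_data, metadata)
```

Because the encoding lives in the dictionary rather than in the type signature, supporting a new format in this sketch only means accepting a new `"encoding"` value.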
I'll just say that, unfortunately, a lot of roboticists simply don't know how to code, so our expectations of what is acceptable for dora should be extremely low.
🚀 New Rust and Python APIs for Custom and Integrated Datatypes with Arrow Format
Hello! 😊 Here’s the latest PR update. I was meant to work on fastformat and release sooner, but schoolwork piled up, as it does!
🎯 Objective
This PR introduces a streamlined and more user-friendly Rust API for creating custom types and using integrated datatypes with Arrow. We’re adding three new traits: IntoArrow, FromArrow, and ViewArrow. More details and examples are provided below.
Additionally, this PR brings in the Python API! 🎉 Now you can use Rust-coded, integrated datatypes directly within Python, and even create custom types in Python. Since everything is built on Arrow format, types defined in Python can seamlessly interact with Rust and vice-versa.
Let’s dive into the details! 🔍
🦀 The New Rust API
The Rust API is packaged as a single crate in the apis/rust folder, complete with a prelude module for ease of use.
Creating custom types compatible with Arrow is straightforward. Here’s a quick example from examples/consume-arrow:
First, define a basic Rust data type:
To make this datatype compatible with arrow::ArrayData, implement the IntoArrow/FromArrow traits:
In cases where consuming the data is not feasible (e.g., when all buffers are in a single large allocation), you can use the ViewArrow trait and modify the CustomDataType structure slightly:
Then integrate conversion from a viewer that manages the lifetime:
However, this approach to viewing Arrow objects is not compatible with PyO3 for datatype portability. While you can create and use the structure in Rust and Python, it’s recommended to code everything on the Rust side and port the structure with PyO3. To handle Python's shared ownership, Rust structures need to use Arrow Arrays instead of Rust Vec and Cow:
And of course, you can still use our integrated datatypes:
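As an aside on the consuming-versus-viewing distinction described earlier in this section, the following is a rough Python analogy, not the crate's Rust API. It assumes a single `bytearray` standing in for one large Arrow allocation: the "consuming" style copies its bytes out and owns them, while the "viewing" style borrows a slice through `memoryview` and therefore depends on the allocation staying alive.

```python
# One large allocation holding two logical buffers back to back
# (a stand-in for Arrow buffers that share a single allocation).
allocation = bytearray([1, 2, 3, 4, 5, 6])

# "Consuming" style: the datatype owns its bytes, so they are copied
# out of the shared allocation.
owned = bytes(allocation[0:3])

# "Viewing" style: the datatype borrows a slice; no copy is made and the
# allocation must outlive the view.
view = memoryview(allocation)[3:6]

# Writes to the allocation are visible through the view but not the copy.
allocation[0] = 99
allocation[3] = 42
print(owned[0])  # 1  (the copy is unaffected)
print(view[0])   # 42 (the view shares memory with the allocation)
```

In Rust the borrow is tracked through lifetimes at compile time, which is why a separate viewer object has to manage it; Python's reference counting plays that role here.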
🐍 Python API
The Python API is now live! Check out our python-view-arrow example for a quick start.
With this update, you can define custom datatypes in Python, making them fully compatible with Arrow for cross-language support!
Here’s how it works:
Define a simple Python dataclass:
Add two methods for Arrow compatibility:
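The original snippets for these two steps are not reproduced here, so the following is only a sketch of the pattern under stated assumptions. The field names, the method names `into_arrow` and `from_arrow`, and the use of a plain dictionary of standard-library `array` columns in place of real Arrow arrays are all hypothetical; the python-view-arrow example in the repository is the authoritative reference.

```python
from array import array
from dataclasses import dataclass


@dataclass
class CustomDataType:
    # Hypothetical fields, chosen only for illustration.
    size: int
    label: str
    ranges: list

    def into_arrow(self):
        # One named column per field; `array` stands in for an Arrow array.
        return {
            "size": array("I", [self.size]),
            "label": [self.label],
            "ranges": array("B", self.ranges),
        }

    @classmethod
    def from_arrow(cls, columns):
        # Rebuild the dataclass from the named columns.
        return cls(
            size=columns["size"][0],
            label=columns["label"][0],
            ranges=list(columns["ranges"]),
        )


original = CustomDataType(size=3, label="something", ranges=[10, 20, 30])
restored = CustomDataType.from_arrow(original.into_arrow())
```

Keeping one named column per field is what would let a Rust type with the same field names read the data back, which is the cross-language property described above.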
And of course, similar to Rust, you can use integrated datatypes:
🛠️ Roadmap to Close This PR:
… datatypes module for access to integrated modules.
… IntoArrow trait into two traits (Into and From, for better clarity).
… (pyfastformat instead of fastformat).
).🧐 Current Limitations
… custom datatype conversion to Arrow is entirely in Python, without Rust integration. Future updates may replace this with a Rust-based implementation.