Commit 31a88c7

Merge pull request #131 from alamb/alamb/update_readme
Add examples for `arrow` crate to readme and library documentation
2 parents 4480ca5 + 55747a4 commit 31a88c7

2 files changed, +109 -5 lines changed

Readme.md

+66 -4
@@ -32,7 +32,53 @@ arrays, and deserialization from arrays to Rust structs.
 [datafusion]: https://github.com/apache/arrow-datafusion/
 
 ## Example
+Given this Rust structure
+```rust
+#[derive(Serialize, Deserialize)]
+struct Record {
+    a: f32,
+    b: i32,
+}
 
+let records = vec![
+    Record { a: 1.0, b: 1 },
+    Record { a: 2.0, b: 2 },
+    Record { a: 3.0, b: 3 },
+];
+```
+
+### Serialize to `arrow` `RecordBatch`
+```rust
+use serde_arrow::schema::{TracingOptions, SerdeArrowSchema};
+
+// Determine Arrow schema
+let fields =
+    SerdeArrowSchema::from_type::<Record>(TracingOptions::default())?
+    .to_arrow_fields()
+
+// Convert to Arrow arrays
+let arrays = serde_arrow::to_arrow(&fields, &records)?;
+
+// Form a RecordBatch
+let schema = Schema::new(&fields);
+let batch = RecordBatch::try_new(schema.into(), arrays)?;
+```
+
+This `RecordBatch` can now be written to disk using [ArrowWriter] from the [parquet] crate.
+
+[ArrowWriter]: https://docs.rs/parquet/latest/parquet/arrow/arrow_writer/struct.ArrowWriter.html
+[parquet]: https://docs.rs/parquet/latest/parquet/
+
+
+```rust
+let file = File::create("example.pq");
+let mut writer = ArrowWriter::try_new(file, batch.schema(), None)?;
+writer.write(&batch)?;
+writer.close()?;
+```
+
+
+### Serialize to `arrow2` arrays
 ```rust
 use serde_arrow::schema::{TracingOptions, SerdeArrowSchema};
 
@@ -69,16 +115,32 @@ write_chunk(
 )?;
 ```
 
+### Usage from python
+
 The written file can now be read in Python via
 
 ```python
 # using polars
-import polars as pl
-pl.read_parquet("example.pq")
+>>> import polars as pl
+>>> pl.read_parquet("example.pq")
+shape: (3, 2)
+┌─────┬─────┐
+│ a   ┆ b   │
+│ --- ┆ --- │
+│ f32 ┆ i32 │
+╞═════╪═════╡
+│ 1.0 ┆ 1   │
+│ 2.0 ┆ 2   │
+│ 3.0 ┆ 3   │
+└─────┴─────┘
 
 # using pandas
-import pandas as pd
-pd.read_parquet("example.pq")
+>>> import pandas as pd
+>>> pd.read_parquet("example.pq")
+     a  b
+0  1.0  1
+1  2.0  2
+2  3.0  3
 ```
 
 [arrow2-guide]: https://jorgecarleitao.github.io/arrow2
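
For readers of this diff, the "Convert to Arrow arrays" step in the README example can be pictured as turning row-oriented records into one array per field. The following is only a conceptual sketch in plain standard-library Rust: it does not use `serde_arrow` or `arrow`, and the `Columns` type and `to_columns` function are hypothetical names introduced here for illustration, not part of this commit or of either crate.

```rust
// Conceptual sketch only: plain std Rust, not the serde_arrow or arrow API.
// `Columns` and `to_columns` are hypothetical names used for illustration.

#[derive(Debug, Clone, PartialEq)]
struct Record {
    a: f32,
    b: i32,
}

/// Column-oriented view of a slice of `Record`s: one vector per field.
#[derive(Debug, PartialEq)]
struct Columns {
    a: Vec<f32>,
    b: Vec<i32>,
}

/// Transpose row-oriented records into one contiguous vector per field.
fn to_columns(records: &[Record]) -> Columns {
    let mut columns = Columns {
        a: Vec::with_capacity(records.len()),
        b: Vec::with_capacity(records.len()),
    };
    for record in records {
        columns.a.push(record.a);
        columns.b.push(record.b);
    }
    columns
}

fn main() {
    let records = vec![
        Record { a: 1.0, b: 1 },
        Record { a: 2.0, b: 2 },
        Record { a: 3.0, b: 3 },
    ];
    let columns = to_columns(&records);
    // Each struct field becomes one column holding all of its values.
    println!("a = {:?}", columns.a);
    println!("b = {:?}", columns.b);
}
```

In the actual example, `serde_arrow::to_arrow` plays this role generically: per the README text, it takes the traced fields and the records and returns the arrays that are then passed to `RecordBatch::try_new`.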

serde_arrow/src/lib.rs

+43 -1
@@ -45,7 +45,49 @@
 //! - the [status summary][_impl::docs::status] for an overview over the
 //!   supported Arrow and Rust constructs
 //!
-//! ## Example
+//! ## `arrow` Example
+//! ```rust
+//! # use serde::{Deserialize, Serialize};
+//! # #[cfg(feature = "has_arrow")]
+//! # fn main() -> serde_arrow::Result<()> {
+//! use arrow::datatypes::Schema;
+//! use arrow::record_batch::RecordBatch;
+//! use serde_arrow::schema::{TracingOptions, SerdeArrowSchema};
+//!
+//! ##[derive(Serialize, Deserialize)]
+//! struct Record {
+//!     a: f32,
+//!     b: i32,
+//! }
+//!
+//! let records = vec![
+//!     Record { a: 1.0, b: 1 },
+//!     Record { a: 2.0, b: 2 },
+//!     Record { a: 3.0, b: 3 },
+//! ];
+//!
+//! // Determine Arrow schema
+//! let fields = Vec::<Field>::from_type::<Record>(TracingOptions::default())?;
+//!
+//! // Convert Rust records to Arrow arrays
+//! let arrays = serde_arrow::to_arrow(&fields, &records)?;
+//!
+//! // Create RecordBatch
+//! let schema = Schema::new(fields);
+//! let batch = RecordBatch::try_new(schema, arrays)?;
+//! # Ok(())
+//! # }
+//! # #[cfg(not(feature = "has_arrow"))]
+//! # fn main() { }
+//! ```
+//!
+//! The `RecordBatch` can then be written to disk, e.g., as parquet using
+//! the [`ArrowWriter`] from the [`parquet`] crate.
+//!
+//! [`ArrowWriter`]: https://docs.rs/parquet/latest/parquet/arrow/arrow_writer/struct.ArrowWriter.html
+//! [`parquet`]: https://docs.rs/parquet/latest/parquet/
+//!
+//! ## `arrow2` Example
 //!
 //! Requires one of `arrow2` feature (see below).
 //!
