fixed size list type is not retained when writing to parquet

When I create a parquet file from an arrow table with a fixed size array as one of the columns, then read back the resulting parquet, the column is no longer a fixed size array, but instead a dynamically sized array.

Example:
```python
import datafusion as df
import pyarrow as pa

FILENAME = "/tmp/fixed_array_example.parquet"
ctx = df.SessionContext()

array = pa.array([[1.0, 2.0], [3.0, 4.0]], type=pa.list_(pa.float32(), 2))
table = pa.Table.from_pydict({"array": array})
df_table = ctx.from_arrow(table)
print("original schema:")
print(df_table.schema())

df_table.write_parquet(FILENAME)
print("roundtrip schema:")
print(ctx.read_parquet(FILENAME).schema())
```
Output:
```
original schema:
array: fixed_size_list<item: float>[2]
  child 0, item: float
roundtrip schema:
array: list<item: float>
  child 0, item: float
```

As the output demonstrates, the datafusion dataframe that is written out has the proper schema. Nevertheless, the file that is read back does not.

If instead of datafusion, I use pyarrow to write the parquet file, I do get the expected schema when I read it back using datafusion.
```python
import datafusion as df
import pyarrow as pa
import pyarrow.parquet as pq

FILENAME = "/tmp/fixed_array_example_pyarrow.parquet"
ctx = df.SessionContext()

array = pa.array([[1.0, 2.0], [3.0, 4.0]], type=pa.list_(pa.float32(), 2))
table = pa.Table.from_pydict({"array": array})

print("original schema:")
print(table.schema)

pq.write_table(table, FILENAME)
print("roundtrip schema:")
print(ctx.read_parquet(FILENAME).schema())
```
output:
```
original schema:
array: fixed_size_list<item: float>[2]
  child 0, item: float
roundtrip schema:
array: fixed_size_list<element: float>[2]
  child 0, element: float
```

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fixed size list type is not retained when writing to parquet #957

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

fixed size list type is not retained when writing to parquet #957

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions