[Python] PyArrow allows creating non-nullable columns containing nulls #43145

Closed
adamreeve opened this issue Jul 4, 2024 · 1 comment

Comments

@adamreeve
Contributor

adamreeve commented Jul 4, 2024

Describe the bug, including details regarding any error messages, version, and platform.

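Reproduced using PyArrow 16.1.0 and Python 3.12 on Fedora 39 Linux. Creating a table with a null in a column declared non-nullable succeeds without any error. The null values are still displayed as null when printing the table, but after a round trip through Parquet they are silently replaced with 0: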

import pyarrow as pa
import pyarrow.parquet as pq

# Declare "x" as a non-nullable int64 column.
schema = pa.schema([
    pa.field("x", pa.int64(), nullable=False),
])

# The data contains a null, yet table creation does not fail.
table = pa.Table.from_pydict({
    "x": [1, None, 3],
}, schema=schema)

print(f"Original table:\n{table}\n")

# Round trip through Parquet.
pq.write_table(table, 'data.parquet')
read = pq.read_table('data.parquet')

print(f"Table from Parquet:\n{read}")

This outputs:

Original table:
pyarrow.Table
x: int64 not null
----
x: [[1,null,3]]

Table from Parquet:
pyarrow.Table
x: int64 not null
----
x: [[1,0,3]]

Is this expected behaviour? I would have thought this should raise an exception when the table is created.

Component(s)

Python

@adamreeve
Contributor Author

Closing this as a duplicate of #41667

adamreeve closed this as not planned on Dec 2, 2024