feat(python): add schema conversion of FixedSizeBinaryArray and FixedSizeListType #2005
Conversation
ACTION NEEDED
delta-rs follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.
@balbok0 can you also extend the tests to cover this FixedSizeBinary case?
I changed one test case to include a check for fixed-size binary. I think that should be enough unit-test-wise, since it's a really small change, but let me know if you would like more checks, just in case. For something more integration-like, is there a test case with repeated write to a
Actually, I just found the same exact issue with |
You need to set the type hints properly in the functions and also define the stubs |
Hi, sorry for being slow on resolving issues. I've fixed a typo on one of the tests I added. Just to be sure:
@balbok0 no, the test failures are related; I suggest you run the tests locally to debug. Decimal is hit as a false positive during the schema conversion, so the logic needs fixing.
Ok, switched it to |
Thanks! @balbok0
…SizeListType (delta-io#2005)

# Description
Map `FixedSizeBinaryType` to `BinaryArray`, since Delta does not support fixed arrays.

# Related Issue(s)
None.

# Documentation
N/A

# Minimal Example
I've noticed this error when doing subsequent calls to like so:
```
import deltalake as dl
import pyarrow as pa

schema = pa.schema([
    ("field_a", pa.binary(4)),
    # To simulate fix, switch this line to: ("field_a", pa.binary()),
])
table = pa.Table.from_pylist(
    [
        {"field_a": val.to_bytes(4, "little")}
        for val in range(0, 100)
    ],
    schema=schema
)

# This works
dl.write_deltalake(
    "bad_table",
    data=table,
    mode="append",
)

# This fails
dl.write_deltalake(
    "bad_table",
    data=table,
    mode="append",
)
```
with error:
```
ValueError: Schema of data does not match table schema
Data schema:
field_a: fixed_size_binary[4]
Table Schema:
field_a: binary
```

---------

Co-authored-by: Jakub Filipek <[email protected]>
Co-authored-by: Jakub Filipek <[email protected]>
Description
Map `FixedSizeBinaryType` to `BinaryArray`, since Delta does not support fixed arrays.
Related Issue(s)
None.
Documentation
N/A
Minimal Example
I've noticed this error when doing subsequent calls to like so:
with error: