Support add columns with all nulls #3044

wjones127 · 2024-10-24T21:35:43Z

Users should be able to write:

dataset.add_columns([pa.field("new_col1", pa.int32()), pa.field("new_col2", pa.string())])

and this would append new_col1 and new_col2 to the end of the schema. The values would all be null.

Once #3016 is finished, this can be a metadata-only operation: it would just add the new fields to the schema and not modify the data files.
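To make the metadata-only idea concrete, here is a minimal sketch in plain Python. It is a toy model, not the Lance implementation: `ToyDataset`, `Fragment`, `add_null_columns`, and `to_rows` are hypothetical names invented for illustration. It assumes, in the spirit of #3016, that a data file may omit columns that appear in the schema and that a reader fills those columns with nulls.

```python
from dataclasses import dataclass, field


@dataclass
class Fragment:
    """Hypothetical stand-in for one data file: column name -> values."""
    columns: dict
    num_rows: int


@dataclass
class ToyDataset:
    """Hypothetical toy model of a dataset; not the Lance implementation."""
    schema: list  # list of (name, type_name) pairs
    fragments: list = field(default_factory=list)

    def add_null_columns(self, new_fields):
        # Metadata-only: validate names, then extend the schema.
        # No fragment (data file) is read or rewritten.
        existing = {name for name, _ in self.schema}
        for name, _ in new_fields:
            if name in existing:
                raise ValueError(f"column {name!r} already exists")
            existing.add(name)
        self.schema = self.schema + list(new_fields)

    def to_rows(self):
        # On read, a schema column missing from a fragment is filled with None.
        rows = []
        for frag in self.fragments:
            for i in range(frag.num_rows):
                rows.append({
                    name: frag.columns[name][i] if name in frag.columns else None
                    for name, _ in self.schema
                })
        return rows


ds = ToyDataset(
    schema=[("id", "int64")],
    fragments=[Fragment({"id": [1, 2]}, 2), Fragment({"id": [3]}, 1)],
)
ds.add_null_columns([("new_col1", "int32"), ("new_col2", "string")])
rows = ds.to_rows()
```

In this sketch the cost of adding the columns does not depend on the number of rows, since only the schema list changes; the nulls are materialized only when rows are read.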


westonpace · 2024-10-24T22:02:27Z

Is this feature the only motivation for #3016? All-null data files should be quite small regardless of the data type, so they might not be so bad.

wjones127 · 2024-10-25T18:03:02Z

Not at all. The motivation for #3016 is being able to create a schema and then omit columns when inserting later.

wjones127 added the enhancement label on Oct 24, 2024

wjones127 mentioned this issue Oct 24, 2024

Allow inserting subschemas #3016

