This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

Trouble creating a modified schema #507

Open
tschaub opened this issue Jun 21, 2023 · 1 comment

Comments

@tschaub
Contributor

tschaub commented Jun 21, 2023

I'm trying to use this module to read an existing Parquet file and write out a modified Parquet file. For example, I would like to transform a single column from one type to another. In other cases, I would like to change the compression used when writing.

I've had some luck creating a writer using the schema from an input file:

writerConfig, _ := parquet.NewWriterConfig(input.Schema())
writer := parquet.NewGenericWriter[any](output, writerConfig)

And then I later write a modified version of the rows (e.g. after transforming some values to a different type):

writer.WriteRows(modifiedRows)

This works, except that the writer still uses the input file's schema, so the output schema is wrong for the columns whose values I have transformed.

I recognize that this may not be how this module is intended to be used. If it does sound like a reasonable thing to do, can anyone provide advice on how I might clone an existing schema and make modifications to one or more of the field types?
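For illustration, here is a minimal, library-independent sketch of the "clone the schema and override some field types" idea. `Field`, `Schema`, and `CloneWithTypes` are hypothetical stand-ins and are not part of this module. With the module itself, the same loop could presumably walk `input.Schema().Fields()`, collect the nodes into a `parquet.Group`, and pass the result to `parquet.NewSchema`, but that mapping is an assumption rather than a confirmed recipe.

```go
package main

import "fmt"

// Field is a hypothetical stand-in for one schema column: a name plus a type label.
type Field struct {
	Name string
	Type string
}

// Schema is a hypothetical stand-in for a Parquet schema: an ordered list of fields.
type Schema struct {
	Name   string
	Fields []Field
}

// CloneWithTypes returns a copy of src in which every field named in
// overrides gets the replacement type. src itself is left untouched.
func CloneWithTypes(src Schema, overrides map[string]string) (Schema, error) {
	dst := Schema{Name: src.Name, Fields: make([]Field, 0, len(src.Fields))}
	seen := make(map[string]bool, len(overrides))
	for _, f := range src.Fields {
		if t, ok := overrides[f.Name]; ok {
			f.Type = t // f is a copy of the element, so src is not modified
			seen[f.Name] = true
		}
		dst.Fields = append(dst.Fields, f)
	}
	// Reject overrides that name a column the input schema does not have.
	for name := range overrides {
		if !seen[name] {
			return Schema{}, fmt.Errorf("no field named %q in schema %q", name, src.Name)
		}
	}
	return dst, nil
}

func main() {
	input := Schema{Name: "input", Fields: []Field{
		{Name: "id", Type: "INT32"},
		{Name: "price", Type: "BYTE_ARRAY"},
	}}
	output, err := CloneWithTypes(input, map[string]string{"price": "DOUBLE"})
	if err != nil {
		panic(err)
	}
	fmt.Println(input.Fields)  // the input schema is unchanged
	fmt.Println(output.Fields) // only "price" has a new type
}
```

The point of the sketch is that the writer would be configured from the cloned schema rather than from `input.Schema()`, so the declared column types match the transformed values.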

@kevinburkesegment
Contributor

Apologies for making more work for you, but we've decided to move development of this project to a new organization at https://github.com/parquet-go/parquet-go to ensure its long-term success. We appreciate your contribution and would be grateful if you could reopen this ticket there if it is still relevant.
