Parquet file doesn't mark date and timestamp columns #7
Comments
Yes, these Parquet files are just converted from the source data. What would be the easiest way to do this?
It would definitely be useful to include this information in the Parquet file so that I don't have to explicitly convert it on load, especially for a stateless table engine.
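To illustrate the explicit conversion on load mentioned above, here is a minimal sketch. It assumes, without confirmation from this thread, that `EventDate` is stored as days since the Unix epoch and `EventTime` as seconds since the Unix epoch; the helper names are hypothetical.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH = date(1970, 1, 1)


def days_to_date(days: int) -> date:
    """Interpret a raw integer as days since 1970-01-01 (assumed encoding)."""
    return EPOCH + timedelta(days=days)


def seconds_to_timestamp(seconds: int) -> datetime:
    """Interpret a raw integer as seconds since the Unix epoch, in UTC (assumed encoding)."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

With logical types in the file metadata, a reader could skip this step because the column type would already say how to interpret the raw values.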
The same happens with strings: they are stored as plain binary and are not marked as strings. Apache Pinot is quite strict about this, so when importing data from Parquet (which is considerably faster than using TSV), the data is imported as a byte array rather than as UTF-8, because that is what the Parquet file says. For example:
The Parquet file metadata doesn't include logical type information that (for instance) EventDate is a date and EventTime is a timestamp.
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md