-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Core: Fix numeric overflow of timestamp nano literal #11775
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -104,16 +104,14 @@ public void testByteBufferConversions() { | |||||
.isEqualTo(new byte[] {-128, 26, 6, 0, 0, 0, 0, 0}); | ||||||
assertThat(Literal.of(400000L).to(TimestampType.withZone()).toByteBuffer().array()) | ||||||
.isEqualTo(new byte[] {-128, 26, 6, 0, 0, 0, 0, 0}); | ||||||
// values passed to assertConversion and Literal.of differ because Literal.of(...) assumes | ||||||
// the value is in micros, which gets converted when to(TimestampNanoType) is called | ||||||
assertConversion( | ||||||
400000000L, TimestampNanoType.withoutZone(), new byte[] {0, -124, -41, 23, 0, 0, 0, 0}); | ||||||
400000L, TimestampNanoType.withoutZone(), new byte[] {-128, 26, 6, 0, 0, 0, 0, 0}); | ||||||
assertConversion( | ||||||
400000000L, TimestampNanoType.withZone(), new byte[] {0, -124, -41, 23, 0, 0, 0, 0}); | ||||||
400000L, TimestampNanoType.withZone(), new byte[] {-128, 26, 6, 0, 0, 0, 0, 0}); | ||||||
assertThat(Literal.of(400000L).to(TimestampNanoType.withoutZone()).toByteBuffer().array()) | ||||||
.isEqualTo(new byte[] {0, -124, -41, 23, 0, 0, 0, 0}); | ||||||
.isEqualTo(new byte[] {-128, 26, 6, 0, 0, 0, 0, 0}); | ||||||
assertThat(Literal.of(400000L).to(TimestampNanoType.withZone()).toByteBuffer().array()) | ||||||
Comment on lines
111
to
113
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm a bit confused how the original assertion was passing? Shouldn't have this always been equivalent to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the cause is the original logic called iceberg/api/src/main/java/org/apache/iceberg/expressions/Literals.java Lines 444 to 445 in b9b61b1
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah I see the comment on line 107/108. Could we update the |
||||||
.isEqualTo(new byte[] {0, -124, -41, 23, 0, 0, 0, 0}); | ||||||
.isEqualTo(new byte[] {-128, 26, 6, 0, 0, 0, 0, 0}); | ||||||
|
||||||
// strings are stored as UTF-8 bytes (without length) | ||||||
// 'A' -> 65, 'B' -> 66, 'C' -> 67 | ||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change seems correct to me, the previous behavior for this was to assume the value was in microseconds and then pass that through to TimestampLiteral but that can overflow and does not actually represent a nanosecond timestamp!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I still think this is correct but it's worth getting other's perspective on this since we are changing one of the assumptions of how value is interpreted when the type to convert to is nanoseconds.
CC @nastra @epgif @jacobmarble @rdblue thoughts?