Skip to content

Commit

Permalink
Fix Presto's date_trunc function when dealing with daylight savings (#…
Browse files Browse the repository at this point in the history
…11509)

Summary:
Pull Request resolved: #11509

date_trunc truncates a datetime to a unit (second/day/year/etc.). It is currently implemented for
TSwTZ to do this on the local time and convert it back to a system time which is used as the
timestamp portion of the result TSwTZ. Much like date_add and date_diff, it runs into an issue with
ambiguous times when the unit is less than a day. In the case of ambigous times it will return the
earliest system time that can correspond to the local time, Presto Java however behaves as if the
calculation was done on the system time, meaning there's no ambiguity and it can return either of the
two or more ambiguous times depending on the original timestamp.

To fix this, again like with date_add and date_diff, I've split the logic based on whether the unit is less
than a day or at least a day. If it's less than a day, we convert the timestamp to a local time and do
the truncation then compute the number of milliseconds between the original local timestamp and
the local truncated timestamp.  We then subtract that value from the original timestamp's system
time. This removes the ambiguity, and gets behavior consistent with Presto Java.

If the unit is at least a day, we still do the computation on the local time and convert it back to a
system time as we did originally. This is necessary to handle cases like the days when we switch to
or from daylight savings time being 23 and 25 hours respectively. Again, this is consistent with
Presto Java's behavior.

Reviewed By: kgpai, zacw7

Differential Revision: D65791388

fbshipit-source-id: ac80c97b0e4a83b68d52d8785e92346d52a87882
  • Loading branch information
Kevin Wilfong authored and facebook-github-bot committed Nov 13, 2024
1 parent 08dba0f commit beada2b
Show file tree
Hide file tree
Showing 2 changed files with 91 additions and 6 deletions.
33 changes: 27 additions & 6 deletions velox/functions/prestosql/DateTimeFunctions.h
Original file line number Diff line number Diff line change
Expand Up @@ -1158,21 +1158,42 @@ struct DateTruncFunction : public TimestampWithTimezoneSupport<T> {
}

if (unit == DateTimeUnit::kSecond) {
auto utcTimestamp = unpackTimestampUtc(*timestampWithTimezone);
const auto utcTimestamp = unpackTimestampUtc(*timestampWithTimezone);
result = pack(
utcTimestamp.getSeconds() * 1000,
unpackZoneKeyId(*timestampWithTimezone));
return;
}

auto timestamp = this->toTimestamp(timestampWithTimezone);
const auto timestamp = this->toTimestamp(timestampWithTimezone);
auto dateTime = getDateTime(timestamp, nullptr);
adjustDateTime(dateTime, unit);
timestamp =
Timestamp::fromMillis(Timestamp::calendarUtcToEpoch(dateTime) * 1000);
timestamp.toGMT(*tz::locateZone(unpackZoneKeyId(*timestampWithTimezone)));

result = pack(timestamp, unpackZoneKeyId(*timestampWithTimezone));
uint64_t resultMillis;

if (unit < DateTimeUnit::kDay) {
// If the unit is less than a day, we compute the difference in
// milliseconds between the local timestamp and the truncated local
// timestamp. We then subtract this difference from the UTC timestamp,
// this handles things like ambiguous timestamps in the local time zone.
const auto millisDifference =
timestamp.toMillis() - Timestamp::calendarUtcToEpoch(dateTime) * 1000;

resultMillis = unpackMillisUtc(*timestampWithTimezone) - millisDifference;
} else {
// If the unit is at least a day, we do the truncation on the local
// timestamp and then convert it to a system time directly. This handles
// cases like when a time zone has daylight savings time, a "day" can be
// 25 or 23 hours at the transition points.
auto updatedTimestamp =
Timestamp::fromMillis(Timestamp::calendarUtcToEpoch(dateTime) * 1000);
updatedTimestamp.toGMT(
*tz::locateZone(unpackZoneKeyId(*timestampWithTimezone)));

resultMillis = updatedTimestamp.toMillis();
}

result = pack(resultMillis, unpackZoneKeyId(*timestampWithTimezone));
}

private:
Expand Down
64 changes: 64 additions & 0 deletions velox/functions/prestosql/tests/DateTimeFunctionsTest.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1923,6 +1923,70 @@ TEST_F(DateTimeFunctionsTest, dateTruncTimestampWithTimezone) {
evaluateDateTrunc("year", 123456789000, "+14:00", 94644000000);
evaluateDateTrunc("year", -123456789000, "-09:30", -126196200000);

// Test cases that land on an ambiguous time.
// The first 1 AM
// 11/3/2024 01:01:01.01 AM GMT-07:00
evaluateDateTrunc(
"second", 1730620861100, "America/Los_Angeles", 1730620861000);
evaluateDateTrunc(
"minute", 1730620861100, "America/Los_Angeles", 1730620860000);
evaluateDateTrunc(
"hour", 1730620861100, "America/Los_Angeles", 1730620800000);

// The second 1AM
// 11/3/2024 01:01:01.01 AM GMT-08:00
evaluateDateTrunc(
"second", 1730624461100, "America/Los_Angeles", 1730624461000);
evaluateDateTrunc(
"minute", 1730624461100, "America/Los_Angeles", 1730624460000);
evaluateDateTrunc(
"hour", 1730624461100, "America/Los_Angeles", 1730624400000);

// Test cases that go back across a "fall back" daylight savings time
// boundary. (GMT-07:00 -> GMT-08:00)
// 11/3/2024 01:01:01.01 AM GMT-08:00
evaluateDateTrunc("day", 1730624461100, "America/Los_Angeles", 1730617200000);
evaluateDateTrunc(
"month", 1730624461100, "America/Los_Angeles", 1730444400000);
evaluateDateTrunc(
"quarter", 1730624461100, "America/Los_Angeles", 1727766000000);
// Technically this circles back again to the same daylight savings time zone,
// but just to make sure we're covered (and it also test leap years).
evaluateDateTrunc(
"year", 1730624461100, "America/Los_Angeles", 1704096000000);

// Test cases that go back across a "spring forward" daylight savings time
// boundary. (GMT-08:00 -> GMT-07:00)
// 3/10/2024 03:00:00 AM GMT-08:00
evaluateDateTrunc("day", 1710064800000, "America/Los_Angeles", 1710057600000);
evaluateDateTrunc(
"month", 1710064800000, "America/Los_Angeles", 1709280000000);
evaluateDateTrunc(
"quarter", 1710064800000, "America/Los_Angeles", 1704096000000);
// Technically this circles back again to the same daylight savings time zone,
// but just to make sure we're covered (and it also test leap years).
evaluateDateTrunc(
"year", 1710064800000, "America/Los_Angeles", 1704096000000);

// Test some cases that are close to hours that don't exist due to DST (it's
// impossible to truncate to a time in the hour that doesn't exist, so we
// don't test that case).
// 3/10/2024 03:01:01.01 AM GMT-08:00
evaluateDateTrunc(
"second", 1710064861100, "America/Los_Angeles", 1710064861000);
evaluateDateTrunc(
"minute", 1710064861100, "America/Los_Angeles", 1710064860000);
evaluateDateTrunc(
"hour", 1710064861100, "America/Los_Angeles", 1710064800000);

// 3/10/2024 01:59:59.999AM GMT-07:00
evaluateDateTrunc(
"second", 1710064799999, "America/Los_Angeles", 1710064799000);
evaluateDateTrunc(
"minute", 1710064799999, "America/Los_Angeles", 1710064740000);
evaluateDateTrunc(
"hour", 1710064799999, "America/Los_Angeles", 1710061200000);

const auto evaluateDateTruncFromStrings = [&](const std::string& truncUnit,
const std::string&
inputTimestamp,
Expand Down

0 comments on commit beada2b

Please sign in to comment.