parse_datetime() should allow UTC|UCT|GMT|GMT0 as 'Z'
Summary:
The Joda library accepts UTC, UCT, GMT, and GMT0 as input for the 'Z'
(uppercase) identifier, even though uppercase Z denotes a timezone offset. This
change adds support for them for compatibility with Joda, which Presto Java uses.

Reviewed By: mbasmanova

Differential Revision: D54346046

fbshipit-source-id: 82af2fea7eee47bd0eaf854b16c693e3d43d059a
pedroerp authored and facebook-github-bot committed Feb 29, 2024
1 parent 93b2544 commit 2745069
Showing 3 changed files with 36 additions and 2 deletions.
7 changes: 6 additions & 1 deletion velox/docs/functions/presto/datetime.rst
@@ -227,7 +227,12 @@ The functions in this section leverage a native cpp implementation that follows
 a format string compatible with JodaTime’s `DateTimeFormat
 <http://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormat.html>`_
 pattern format. The symbols currently supported are ``y``, ``Y``, ``M`` , ``d``,
-``H``, ``m``, ``s``, ``S``, and ``Z``.
+``H``, ``m``, ``s``, ``S``, ``z`` and ``Z``.
+
+``z`` represents a timezone name (3-letter format), and ``Z`` a timezone offset
+specified using the format ``+00``, ``+00:00`` or ``+0000`` (or ``-``). ``Z``
+also accepts ``UTC``, ``UCT``, ``GMT``, and ``GMT0`` as valid representations
+of GMT.

 .. function:: parse_datetime(string, format) -> timestamp with time zone
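
The numeric offset shapes named in the documentation text (``+00``, ``+00:00``, ``+0000``, and their ``-`` variants) can be illustrated with a small standalone sketch. `offsetToMinutes` is a hypothetical helper written for this note, not a Velox function; it assumes exactly two digits for hours and for minutes, and it does not range-check the hour value.

```cpp
#include <cctype>
#include <optional>
#include <string_view>

// Hypothetical helper (not part of Velox): converts "+HH", "+HHMM" or
// "+HH:MM" (or the "-" variants) into minutes east of UTC.
// Returns std::nullopt when the text has none of those shapes.
std::optional<int> offsetToMinutes(std::string_view text) {
  if (text.size() < 3 || (text[0] != '+' && text[0] != '-')) {
    return std::nullopt;
  }
  const int sign = (text[0] == '-') ? -1 : 1;
  const std::string_view digits = text.substr(1);

  // Split into hour and minute fields according to the shape.
  const std::string_view hh = digits.substr(0, 2);
  std::string_view mm;
  if (digits.size() == 2) {
    mm = "00"; // "+HH": minutes default to zero.
  } else if (digits.size() == 4) {
    mm = digits.substr(2, 2); // "+HHMM"
  } else if (digits.size() == 5 && digits[2] == ':') {
    mm = digits.substr(3, 2); // "+HH:MM"
  } else {
    return std::nullopt;
  }

  // Parses exactly two decimal digits; returns -1 on any non-digit.
  auto twoDigits = [](std::string_view s) -> int {
    if (!std::isdigit(static_cast<unsigned char>(s[0])) ||
        !std::isdigit(static_cast<unsigned char>(s[1]))) {
      return -1;
    }
    return (s[0] - '0') * 10 + (s[1] - '0');
  };

  const int hours = twoDigits(hh);
  const int minutes = twoDigits(mm);
  if (hours < 0 || minutes < 0 || minutes > 59) {
    return std::nullopt;
  }
  return sign * (hours * 60 + minutes);
}
```

Under these assumptions, `offsetToMinutes("+05:30")` yields 330 and `offsetToMinutes("-0800")` yields -480, while a name such as `"PST"` is rejected.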

16 changes: 15 additions & 1 deletion velox/functions/lib/DateTimeFormatter.cpp
@@ -410,11 +410,25 @@ int64_t parseTimezoneOffset(const char* cur, const char* end, Date& date) {
         return 3;
       }
     }
-    // Single 'Z' character maps to GMT
+    // Single 'Z' character maps to GMT.
     else if (*cur == 'Z') {
       date.timezoneId = 0;
       return 1;
     }
+    // "UTC", "UCT", "GMT" and "GMT0" are also accepted by Joda.
+    else if ((end - cur) >= 3) {
+      if (std::strncmp(cur, "UTC", 3) == 0 ||
+          std::strncmp(cur, "UCT", 3) == 0) {
+        date.timezoneId = 0;
+        return 3;
+      } else if (std::strncmp(cur, "GMT", 3) == 0) {
+        date.timezoneId = 0;
+        if ((end - cur) >= 4 && *(cur + 3) == '0') {
+          return 4;
+        }
+        return 3;
+      }
+    }
   }
   return -1;
 }
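
The alias branch in this hunk is a prefix match that reports how many characters it consumed. A minimal standalone sketch of the same idea follows; `matchGmtAlias` and its `std::string_view` signature are hypothetical and chosen for illustration, whereas the actual change works on a `cur`/`end` pointer pair and also sets `date.timezoneId`.

```cpp
#include <string_view>

// Hypothetical standalone helper (name and signature are not Velox's).
// Returns the number of leading characters of `input` that form a GMT
// alias ("Z", "UTC", "UCT", "GMT", "GMT0"), or -1 when none matches.
int matchGmtAlias(std::string_view input) {
  if (input.empty()) {
    return -1;
  }
  if (input.front() == 'Z') {
    return 1;
  }
  // substr() clamps to the available length, so short inputs simply
  // fail the comparisons below.
  const std::string_view head = input.substr(0, 3);
  if (head == "UTC" || head == "UCT") {
    return 3;
  }
  if (head == "GMT") {
    // Prefer the longer alias "GMT0" when the next character is '0'.
    return (input.size() >= 4 && input[3] == '0') ? 4 : 3;
  }
  return -1;
}
```

Because only a prefix is matched, an input such as `"GMT+1"` consumes three characters and leaves the remainder for the caller to handle, which mirrors the `return 3` path in the diff.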
15 changes: 15 additions & 0 deletions velox/functions/prestosql/tests/DateTimeFunctionsTest.cpp
@@ -2638,6 +2638,21 @@ TEST_F(DateTimeFunctionsTest, parseDatetime) {
   EXPECT_EQ(
       TimestampWithTimezone(-66600000, util::getTimeZoneID("+02:00")),
       parseDatetime("1969-12-31+07:30+02:00", "YYYY-MM-dd+HH:mmZZ"));
+
+  // Joda also allows 'Z' to be UTC|UCT|GMT|GMT0.
+  auto ts = TimestampWithTimezone(1708840800000, util::getTimeZoneID("GMT"));
+  EXPECT_EQ(
+      ts, parseDatetime("2024-02-25+06:00:99 GMT", "yyyy-MM-dd+HH:mm:99 ZZZ"));
+  EXPECT_EQ(
+      ts, parseDatetime("2024-02-25+06:00:99 GMT0", "yyyy-MM-dd+HH:mm:99 ZZZ"));
+  EXPECT_EQ(
+      ts, parseDatetime("2024-02-25+06:00:99 UTC", "yyyy-MM-dd+HH:mm:99 ZZZ"));
+  EXPECT_EQ(
+      ts, parseDatetime("2024-02-25+06:00:99 UCT", "yyyy-MM-dd+HH:mm:99 ZZZ"));
+
+  VELOX_ASSERT_THROW(
+      parseDatetime("2024-02-25+06:00:99 PST", "yyyy-MM-dd+HH:mm:99 ZZZ"),
+      "Invalid format: \"2024-02-25+06:00:99 PST\" is malformed at \"PST\"");
 }

 TEST_F(DateTimeFunctionsTest, formatDateTime) {
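
The expected values in the `parseDatetime` test can be checked by hand. The sketch below recomputes them with a widely used days-from-civil calculation; `daysFromCivil` and `epochMillisUtc` are hypothetical helpers written for this note and are unrelated to Velox's own time utilities. It assumes that the `:99` in the format string is matched as literal text, which is consistent with the expected timestamp corresponding to 06:00:00.

```cpp
#include <cstdint>

// Hypothetical helper: days between 1970-01-01 and the given proleptic
// Gregorian date (negative for earlier dates).
constexpr int64_t daysFromCivil(int64_t y, int64_t m, int64_t d) {
  y -= (m <= 2) ? 1 : 0; // Treat Jan/Feb as months 11/12 of the prior year.
  const int64_t era = (y >= 0 ? y : y - 399) / 400;
  const int64_t yoe = y - era * 400; // Year of era, in [0, 399].
  const int64_t mp = (m + 9) % 12; // March = 0, ..., February = 11.
  const int64_t doy = (153 * mp + 2) / 5 + d - 1; // Day of shifted year.
  const int64_t doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
  return era * 146097 + doe - 719468;
}

// Hypothetical helper: milliseconds since the Unix epoch for a UTC
// wall-clock time.
constexpr int64_t epochMillisUtc(
    int64_t y, int64_t m, int64_t d, int64_t hh, int64_t mm, int64_t ss) {
  return ((daysFromCivil(y, m, d) * 24 + hh) * 60 + mm) * 60 * 1000 +
      ss * 1000;
}

// 2024-02-25 06:00:00 at GMT matches the value used for `ts` in the test.
static_assert(epochMillisUtc(2024, 2, 25, 6, 0, 0) == 1708840800000LL, "");

// 1969-12-31 07:30 at +02:00 is 05:30 UTC, matching -66600000.
static_assert(epochMillisUtc(1969, 12, 31, 5, 30, 0) == -66600000LL, "");
```

Both `static_assert` checks compile, so the two timestamps in the test agree with the wall-clock times in their input strings.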