[SPARK-51423][SQL] Add the current_time() function for TIME datatype #50336

Closed
wants to merge 6 commits into from

@the-sakthi the-sakthi commented Mar 20, 2025

What changes were proposed in this pull request?

This PR adds support for a new function current_time() which returns the current time at the start of query evaluation.

# happy cases
scala> spark.sql("SELECT current_time(0);").show()
+---------------+
|current_time(0)|
+---------------+
|       17:11:26|
+---------------+

scala> spark.sql("SELECT current_time(3);").show()
+---------------+
|current_time(3)|
+---------------+
|   17:11:50.225|
+---------------+

scala> spark.sql("SELECT current_time(6);").show()
+---------------+
|current_time(6)|
+---------------+
|17:12:00.734735|
+---------------+

# No parentheses and empty parentheses
scala> spark.sql("SELECT current_time;").show()
+---------------+
|current_time(6)|
+---------------+
|17:12:23.132088|
+---------------+


scala> spark.sql("SELECT current_time();").show()
+---------------+
|current_time(6)|
+---------------+
|17:12:26.718602|
+---------------+

# foldability
## Nested Arithmetic
scala> spark.sql("SELECT current_time((4 - 2) * (1 + 1));").show()
+---------------------------------+
|current_time(((4 - 2) * (1 + 1)))|
+---------------------------------+
|                    17:13:04.4647|
+---------------------------------+

## Casting String literals
scala> spark.sql("SELECT current_time(CAST(' 0005 ' AS INT));").show()
+---------------------------------+
|current_time(CAST( 0005  AS INT))|
+---------------------------------+
|                   17:13:26.28039|
+---------------------------------+

scala> spark.sql("SELECT current_time('5');").show()
+---------------+
|current_time(5)|
+---------------+
| 22:34:07.65007|
+---------------+


## Combine Cast and Arithmetic
scala> spark.sql("SELECT current_time(CAST('4' AS INT) * CAST('1' AS INT));").show()
+-----------------------------------------------+
|current_time((CAST(4 AS INT) * CAST(1 AS INT)))|
+-----------------------------------------------+
|                                  17:14:06.7151|
+-----------------------------------------------+

# failure cases
scala> spark.sql("SELECT current_time(-1);").show()
org.apache.spark.sql.catalyst.ExtendedAnalysisException: [DATATYPE_MISMATCH.VALUE_OUT_OF_RANGE] Cannot resolve "current_time(-1)" due to data type mismatch: The `precision` must be between [0, 6] (current value = -1). SQLSTATE: 42K09; line 1 pos 7;
'Project [unresolvedalias(current_time(-1))]
+- OneRowRelation'

scala> spark.sql("SELECT current_time('foo');").show()
org.apache.spark.SparkNumberFormatException: [CAST_INVALID_INPUT] The value 'foo' of the type "STRING" cannot be cast to "INT" because it is malformed. Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead. SQLSTATE: 22018
== SQL (line 1, position 8) ==
SELECT current_time('foo');

scala> spark.sql("SELECT current_time(2,2);").show()
org.apache.spark.sql.AnalysisException: [WRONG_NUM_ARGS.WITHOUT_SUGGESTION] The `current_time` requires [0, 1] parameters but the actual number is 2. Please, refer to 'https://spark.apache.org/docs/latest/sql-ref-functions.html' for a fix. SQLSTATE: 42605; line 1 pos 7


# All calls of current_time within the same query should return the same value.
scala> val df = spark.sql("""
     |   SELECT
     |     current_time AS col1,
     |     current_time() AS col2,
     |  current_time(0) AS col3,
     |  current_time(1) AS col4,
     |  current_time(2) AS col5,
     |  current_time(3) AS col6,
     |  current_time(4) AS col7,
     |  current_time(5) AS col8,
     |  current_time(6) AS col9,
     |     current_time AS col10
     | """)
val df: org.apache.spark.sql.DataFrame = [col1: time(6), col2: time(6) ... 8 more fields]

scala> df.show()
+---------------+---------------+--------+----------+-----------+-----------+-------------+--------------+---------------+---------------+
|           col1|           col2|    col3|      col4|       col5|       col6|         col7|          col8|           col9|          col10|
+---------------+---------------+--------+----------+-----------+-----------+-------------+--------------+---------------+---------------+
|17:15:47.680648|17:15:47.680648|17:15:47|17:15:47.6|17:15:47.68|17:15:47.68|17:15:47.6806|17:15:47.68064|17:15:47.680648|17:15:47.680648|
+---------------+---------------+--------+----------+-----------+-----------+-------------+--------------+---------------+---------------+
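As a rough illustration of the precision handling in the transcript above, the sketch below captures one time value and derives every precision from it. It uses plain `java.time` rather than Spark. `TimePrecisionSketch` and `truncate` are hypothetical names, and the choice of truncation over rounding is inferred from the `col4` and `col8` values shown above, not taken from the implementation.

```scala
import java.time.LocalTime

object TimePrecisionSketch {
  // Hypothetical helper (not Spark code): keep `precision` fractional-second
  // digits and zero out the rest. This truncates; it does not round.
  def truncate(t: LocalTime, precision: Int): LocalTime = {
    require(precision >= 0 && precision <= 6,
      s"precision must be in [0, 6], got $precision")
    // Nanoseconds per unit at the requested precision: 10^(9 - precision).
    val nanosPerUnit = Seq.fill(9 - precision)(10).product
    t.withNano(t.getNano / nanosPerUnit * nanosPerUnit)
  }

  def main(args: Array[String]): Unit = {
    // Read the clock once, then derive every precision from that single value,
    // so all columns agree on the underlying instant.
    val queryStart = LocalTime.now()
    (0 to 6).foreach { p =>
      println(s"precision $p -> nano-of-second = ${truncate(queryStart, p).getNano}")
    }
  }
}
```

Note that `LocalTime.toString` pads fractions to 3, 6, or 9 digits, so its printed form will not match the trimmed display in the Spark output above.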

Why are the changes needed?

Adds a built-in current_time([n]) function returning just the time portion (in a TIME(n) type). This aligns Spark with other SQL systems offering a native time function, improves convenience for time-only queries, and complements existing functions like current_date and current_timestamp.

Does this PR introduce any user-facing change?

Yes, adds a new function. Users can now get the current time using this function.

How was this patch tested?

Manual testing as shown above, plus the unit tests added in this PR:

$ build/sbt "test:testOnly *TimeExpressionsSuite.scala"
$ build/sbt "test:testOnly *ComputeCurrentTimeSuite.scala"
$ build/sbt "test:testOnly *ResolveInlineTablesSuite.scala"
$ build/sbt "test:testOnly *AnalysisSuite.scala"

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Mar 20, 2025

@MaxGekk MaxGekk left a comment

... which returns the current time at the start of query evaluation.

I think the current implementation doesn't satisfy this requirement: every call in a query returns a new current time. Could you look at CurrentTimestampLike and the rule:

/**
* Computes the current date and time to make sure we return the same result in a single query.
*/
object ComputeCurrentTime extends Rule[LogicalPlan] {
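
For readers unfamiliar with that rule, the general idea can be sketched on a toy expression tree: read the clock once, then replace every current-time call with a literal holding that one value. Everything below (`ComputeOnceSketch`, `CurrentTimeCall`, `TimeLiteral`, `Project`, the `now` parameter) is a hypothetical stand-in and not Catalyst's API. Truncation to the requested precision is left out for brevity.

```scala
import java.time.LocalTime

object ComputeOnceSketch {
  // Toy expression tree; illustrative stand-ins, not Catalyst classes.
  sealed trait Expr
  case class CurrentTimeCall(precision: Int) extends Expr
  case class TimeLiteral(value: LocalTime, precision: Int) extends Expr
  case class Project(columns: List[Expr]) extends Expr

  // Reads the clock exactly once, then rewrites the whole tree so that every
  // CurrentTimeCall becomes a literal carrying that same value.
  def computeCurrentTime(
      plan: Expr,
      now: () => LocalTime = () => LocalTime.now()): Expr = {
    val instant = now()
    def rewrite(e: Expr): Expr = e match {
      case CurrentTimeCall(p) => TimeLiteral(instant, p)
      case Project(cols)      => Project(cols.map(rewrite))
      case lit: TimeLiteral   => lit
    }
    rewrite(plan)
  }
}
```

Passing the clock as a function makes the "read once" property easy to check in a test by counting invocations.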

@the-sakthi

Ahh sure! Thanks for the early review, Max. I will refer to CurrentTimestampLike to better meet the requirement.

@the-sakthi the-sakthi marked this pull request as ready for review March 31, 2025 12:07
@the-sakthi the-sakthi requested a review from MaxGekk March 31, 2025 12:07
@the-sakthi

Thanks for the feedback, @MaxGekk. Let me know how this one looks!

@the-sakthi the-sakthi changed the title [WIP][SPARK-51423][SQL] Add the current_time() function for TIME datatype [SPARK-51423][SQL] Add the current_time() function for TIME datatype Mar 31, 2025
@the-sakthi

Thanks for the feedback, @MaxGekk. Let me know how this revision looks! Hopefully it is better aligned with the review comments.

@MaxGekk

MaxGekk commented Apr 3, 2025

@the-sakthi Thank you for the ping. I am traveling right now. I will look at the PR a bit later.

@the-sakthi

Thanks for the response, Max, even while traveling. Will await your review!

@MaxGekk MaxGekk left a comment

lgtm in general. Let me review it again tomorrow.

@the-sakthi

Sure thanks @MaxGekk ! Will wait.

@MaxGekk MaxGekk left a comment

LGTM except for one comment.

@MaxGekk

MaxGekk commented Apr 9, 2025

@the-sakthi Please, resolve conflicts.

- Handling foldable expressions via multiple constructors approach
  rather than expressionBuilder
- Added more tests
- Split lines to avoid using scalastyle line limit config
- Other review comments
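
The "multiple constructors" approach in the first bullet can be sketched roughly as follows: auxiliary constructors cover the no-argument form and the foldable-expression form, and both delegate to a primary constructor that validates the precision. This is a toy model under stated assumptions; `FoldablePrecisionSketch`, `CurrentTimeSketch`, and the small `Expr` hierarchy are hypothetical and do not reflect Spark's `Expression` API or the merged code.

```scala
object FoldablePrecisionSketch {
  // Toy foldable expressions; illustrative stand-ins, not Spark classes.
  sealed trait Expr { def eval: Int }
  case class IntLit(v: Int) extends Expr { def eval: Int = v }
  // Models CAST(<string> AS INT): trims whitespace, throws on malformed input.
  case class StrLit(s: String) extends Expr { def eval: Int = s.trim.toInt }
  case class Add(l: Expr, r: Expr) extends Expr { def eval: Int = l.eval + r.eval }
  case class Sub(l: Expr, r: Expr) extends Expr { def eval: Int = l.eval - r.eval }
  case class Mul(l: Expr, r: Expr) extends Expr { def eval: Int = l.eval * r.eval }

  case class CurrentTimeSketch(precision: Int) {
    require(precision >= 0 && precision <= 6,
      s"precision must be in [0, 6], got $precision")

    // current_time and current_time(): default to microsecond precision.
    def this() = this(6)

    // current_time(<foldable expr>): fold the argument to an Int up front.
    def this(child: Expr) = this(child.eval)
  }
}
```

For example, `new CurrentTimeSketch(Mul(Sub(IntLit(4), IntLit(2)), Add(IntLit(1), IntLit(1))))` folds to precision 4, mirroring the nested-arithmetic case in the PR description, while `new CurrentTimeSketch(IntLit(-1))` fails the range check.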
@the-sakthi

@MaxGekk The latest push updates the PR description and is rebased onto the latest main branch. Let me know!
Thanks.

@the-sakthi the-sakthi requested a review from MaxGekk April 9, 2025 21:13
@the-sakthi the-sakthi requested a review from MaxGekk April 10, 2025 08:57
@MaxGekk

MaxGekk commented Apr 11, 2025

+1, LGTM. Merging to master.
Thank you, @the-sakthi.

@MaxGekk MaxGekk closed this in c577ae7 Apr 11, 2025
@the-sakthi

Thank you very much @MaxGekk !
