Introduce `UserDefinedLogicalNodeUnparser` for User-defined Logical Plan unparsing #13880

goldmedal · 2024-12-22T10:07:45Z

Which issue does this PR close?

Closes #13753.

Rationale for this change

See the previous discussion for the design: #13753 (comment)

UserDefinedLogicalNodeUnparser provides two APIs for user-defined behavior:

unparse: Unparse the custom logical node to SQL within a statement.
unparse_to_statement: Unparse the custom logical node to a statement.
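The split between the two entry points can be pictured with a simplified, dependency-free model. This is only a sketch: `Node`, `SelectSketch`, `NodeUnparserSketch`, and `ParquetScanUnparser` are hypothetical stand-ins invented for illustration, not DataFusion types, and the actual trait methods take the unparser and AST builders rather than plain strings.

```rust
/// Hypothetical stand-in for a user-defined logical node.
#[derive(Debug)]
pub struct Node {
    pub name: String,
    pub path: String,
}

/// Hypothetical stand-in for a SELECT builder: it only records the FROM target.
#[derive(Debug, Default)]
pub struct SelectSketch {
    pub from: Option<String>,
}

/// Simplified model of the two hooks (assumed shape, not the real signatures).
pub trait NodeUnparserSketch {
    /// Contribute to an enclosing statement; `Ok(true)` means the node was handled.
    fn unparse(&self, node: &Node, select: &mut SelectSketch) -> Result<bool, String>;

    /// Produce a whole statement; `Ok(None)` means the node is not supported here.
    fn unparse_to_statement(&self, node: &Node) -> Result<Option<String>, String>;
}

/// Hypothetical unparser for a node that reads from a parquet file.
pub struct ParquetScanUnparser;

impl NodeUnparserSketch for ParquetScanUnparser {
    fn unparse(&self, node: &Node, select: &mut SelectSketch) -> Result<bool, String> {
        if node.name != "parquet_scan" {
            return Ok(false);
        }
        // Only fill in the part of the statement this node is responsible for.
        select.from = Some(format!("'{}'", node.path));
        Ok(true)
    }

    fn unparse_to_statement(&self, node: &Node) -> Result<Option<String>, String> {
        if node.name != "parquet_scan" {
            return Ok(None);
        }
        // The node is the whole statement, so emit complete SQL text.
        Ok(Some(format!("SELECT * FROM '{}'", node.path)))
    }
}
```

In this model, `unparse` mutates a builder owned by the caller, while `unparse_to_statement` returns the finished statement itself.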

What changes are included in this PR?

Introduce UserDefinedLogicalNodeUnparser
Make the AST builders public
Add examples for unparsing custom logical plan

Are these changes tested?

yes

Are there any user-facing changes?

New trait and API

goldmedal · 2024-12-22T12:24:58Z

Could @phillipleblanc or @sgrebnov take a look at this? Thanks

sgrebnov · 2024-12-22T20:32:56Z

datafusion/sql/src/unparser/mod.rs

+    /// The child unparsers are called iteratively.
+    /// There are two methods in [`Unparser`] will be called:
+    /// - `extension_to_statement`: This method is called when the custom logical node is a custom statement.
+    ///     If multiple child unparsers return a non-None value, the last unparsing result will be returned.


@goldmedal - I'm not sure using the last unparsing result is the expected behavior. As a user, I would expect to get the result from the first udlp_unparser that supports this node and stop checking the remaining udlp_unparsers instead.

Is there a specific use case / reason for using the last supported udlp_unparser? They can be dynamically registered and the last one should override previously registered ones? Or is it to match the unparse behavior, where we don't know/track whether unparsing was applied, so we always apply all?

Yeah, I would also expect this to short-circuit and have the first one win.

Is there a specific use case / reason for using the last supported udlp_unparser? They can be dynamically registered and the last one should override previously registered ones.

Actually, I don't have a real case for it, but yes, I imagined the user could append a new unparser to override the previous one.
However, I guess that would be a rare use case.

As a user, I would expect to get the result from the first udlp_unparser that supports this node and stop checking the remaining udlp_unparsers instead.

Maybe you guys are right. We should make the first one win. It's more efficient and simpler.
It's also how ExprPlanner worked in the planner.

datafusion/datafusion/sql/src/expr/mod.rs

Lines 222 to 229 in 75202b5

    for planner in self.context_provider.get_expr_planners() {
        match planner.plan_extract(extract_args)? {
            PlannerResult::Planned(expr) => return Ok(expr),
            PlannerResult::Original(args) => {
                extract_args = args;
            }
        }
    }

Anyway, I'll change it. Thanks!
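The "first one wins" behavior agreed on in this thread can be sketched without any DataFusion types. Everything below is hypothetical: `StatementUnparser`, `first_match`, and `sample_unparsers` are illustrative names, and nodes are modeled as plain strings.

```rust
/// Hypothetical unparser: returns `Some(sql)` when it supports the node name.
pub type StatementUnparser = Box<dyn Fn(&str) -> Option<String>>;

/// Try each registered unparser in order and stop at the first one that matches.
/// Later unparsers are never called once an earlier one returns `Some`.
pub fn first_match(unparsers: &[StatementUnparser], node: &str) -> Option<String> {
    unparsers.iter().find_map(|unparse| unparse(node))
}

/// Two overlapping unparsers: both support node "a", only the second supports "b".
pub fn sample_unparsers() -> Vec<StatementUnparser> {
    vec![
        Box::new(|n: &str| (n == "a").then(|| "SELECT 1".to_string())),
        Box::new(|n: &str| (n == "a" || n == "b").then(|| "SELECT 2".to_string())),
    ]
}
```

With these sample unparsers, node `a` resolves to the first registration even though the second one also supports it, which is the short-circuit order the reviewers asked for.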

sgrebnov · 2024-12-22T20:44:58Z

datafusion/sql/src/unparser/plan.rs

+        select: &mut Option<&mut SelectBuilder>,
+        relation: &mut Option<&mut RelationBuilder>,
+    ) -> Result<()> {
+        for unparser in &self.udlp_unparsers {


It might be good to add an indication that unparse was applied, to be consistent with unparse_to_statement, and throw an error if none of the registered udlps applied / successfully processed the node.

Yeah, if none of the extension unparsers were able to process it, it should throw an error IMO

Sounds great. I think we could provide an enum for it, like

UnparsedResult::Unparsed() UnparsedResult::Original() ..

I'll try it
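A rough sketch of that idea, assuming an enum shaped like the one floated above: each unparser reports whether it handled the node, and the caller raises an error when none did. The names `NodeUnparserFn`, `unparse_or_error`, and `scan_only` are hypothetical, and nodes and SQL are plain strings here rather than DataFusion types.

```rust
/// Hypothetical result type following the naming suggested in this thread.
#[derive(Debug, PartialEq)]
pub enum UnparsedResult {
    /// The unparser handled the node and produced SQL text.
    Unparsed(String),
    /// The unparser did not recognise the node; the next one should be tried.
    Original,
}

/// Hypothetical unparser signature: node name in, result out.
pub type NodeUnparserFn = fn(&str) -> UnparsedResult;

/// Return the first successful result, or an error if no unparser handled the node.
pub fn unparse_or_error(unparsers: &[NodeUnparserFn], node: &str) -> Result<String, String> {
    for unparse in unparsers {
        if let UnparsedResult::Unparsed(sql) = unparse(node) {
            return Ok(sql);
        }
    }
    Err(format!("no registered extension unparser handled node `{}`", node))
}

/// Example unparser that only understands a node named "scan".
pub fn scan_only(node: &str) -> UnparsedResult {
    if node == "scan" {
        UnparsedResult::Unparsed("SELECT * FROM t".to_string())
    } else {
        UnparsedResult::Original
    }
}
```

Returning an explicit "not handled" variant lets the caller distinguish an unsupported node from a node that was unparsed, instead of silently producing an incomplete statement.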

sgrebnov · 2024-12-22T20:45:39Z

@goldmedal - looks great, two minor questions/comments

phillipleblanc

Thanks @goldmedal! I have a few minor comments, but this is a good upgrade for the unparser!

phillipleblanc · 2024-12-23T02:08:35Z

datafusion/sql/src/unparser/mod.rs

+    ///     If multiple child unparsers return a non-None value, the last unparsing result will be returned.
+    /// - `extension_to_sql`: This method is called when the custom logical node is part of a statement.
+    ///    If multiple child unparsers are registered for the same custom logical node, all of them will be called in order.
+    pub fn with_udlp_unparsers(


Not a fan of this name - udlp takes effort to understand what it means. How about renaming udlp_* to extension_*? i.e. with_extension_unparsers. It conveys the same meaning in an easier to understand way.

Sounds great. I'll rename it.

phillipleblanc · 2024-12-23T02:09:35Z

datafusion/sql/src/unparser/mod.rs

+    /// The child unparsers are called iteratively.
+    /// There are two methods in [`Unparser`] will be called:
+    /// - `extension_to_statement`: This method is called when the custom logical node is a custom statement.
+    ///     If multiple child unparsers return a non-None value, the last unparsing result will be returned.


Yeah, I would also expect this to short-circuit and have the first one win.

phillipleblanc · 2024-12-23T02:10:10Z

datafusion/sql/src/unparser/mod.rs

+    /// - `extension_to_statement`: This method is called when the custom logical node is a custom statement.
+    ///     If multiple child unparsers return a non-None value, the last unparsing result will be returned.
+    /// - `extension_to_sql`: This method is called when the custom logical node is part of a statement.
+    ///    If multiple child unparsers are registered for the same custom logical node, all of them will be called in order.


I think this should also short-circuit and only do the first one?

phillipleblanc · 2024-12-23T02:17:46Z

datafusion/sql/src/unparser/plan.rs

+        select: &mut Option<&mut SelectBuilder>,
+        relation: &mut Option<&mut RelationBuilder>,
+    ) -> Result<()> {
+        for unparser in &self.udlp_unparsers {


Yeah, if none of the extension unparsers were able to process it, it should throw an error IMO

phillipleblanc · 2024-12-23T02:18:26Z

datafusion/sql/src/unparser/udlp_unparser.rs

I would also rename this file to extension_unparser.rs

phillipleblanc · 2024-12-23T02:21:27Z

datafusion/sql/tests/cases/plan_to_sql.rs

+    if let Some(err) = plan_to_sql(&plan).err() {
+        assert_eq!(
+            err.to_string(),
+            "External error: `relation` must be initialized"


This error is expected?

It's the original error for an unsupported relation plan 🤔 I guess it could be improved by #13880 (comment)

phillipleblanc

✅

phillipleblanc · 2024-12-23T13:55:59Z

datafusion/sql/src/unparser/extension_unparser.rs

+}
+
+/// The result of unparsing a custom logical node.
+pub enum UnparseResult {


I would use two separate enums here since one variant can never be emitted in one of the functions, and vice-versa - better type safety, and removes one arm of the match clause that just returns an error anyway.

Also perhaps Unmodified or Unmatched instead of Original? It's not immediately obvious what an UnparseResult::Original means, but that is quite minor - also fine to leave as is.

Sounds great. It can avoid the potential user error. I did it in 7b6c37f. Thanks!
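The type-safety argument for two enums can be illustrated with a small sketch. The names `WithinStatementResult`, `ToStatementResult`, `unparse_within`, and `unparse_statement` are hypothetical and chosen for this example only; the statement is modeled as a `String` rather than a real AST type.

```rust
/// Outcome of contributing to an enclosing statement: the builders were
/// mutated in place, so there is nothing to carry back.
#[derive(Debug, PartialEq)]
pub enum WithinStatementResult {
    Modified,
    Unmodified,
}

/// Outcome of producing a whole statement: the statement text is carried
/// in the variant itself.
#[derive(Debug, PartialEq)]
pub enum ToStatementResult {
    Modified(String),
    Unmodified,
}

/// Hypothetical within-statement hook: fills in the FROM target for "scan".
pub fn unparse_within(node: &str, from: &mut Option<String>) -> WithinStatementResult {
    if node == "scan" {
        *from = Some("t".to_string());
        WithinStatementResult::Modified
    } else {
        WithinStatementResult::Unmodified
    }
}

/// Hypothetical to-statement hook: returns complete SQL for "scan".
pub fn unparse_statement(node: &str) -> ToStatementResult {
    match node {
        "scan" => ToStatementResult::Modified("SELECT * FROM t".to_string()),
        _ => ToStatementResult::Unmodified,
    }
}
```

Because each function has its own result type, a caller never has to write a match arm for a variant that the function cannot produce, which is the extra arm the review comment wanted to eliminate.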

phillipleblanc

LGTM!

alamb

I took a very brief look at this PR and it looks really nice -- thank you @goldmedal

I suspect part of why it looks great is due to the reviews from @phillipleblanc and @sgrebnov -- thank you both as well for helping to keep the code flowing

alamb · 2024-12-24T14:26:08Z

datafusion-examples/examples/plan_to_sql.rs

+/// This example demonstrates how to unparse a custom logical plan as a statement.
+/// The custom logical plan is a simple extension of the logical plan that reads from a parquet file.
+/// It can be unparse as a statement that reads from the same parquet file.
+async fn unparse_my_logical_plan_as_statement() -> Result<()> {


this is very cool

alamb · 2024-12-25T12:21:25Z

I merged up this PR to resolve a conflict

goldmedal · 2024-12-25T12:46:57Z

I merged up this PR to resolve a conflict

Thanks!

goldmedal · 2024-12-25T14:25:38Z

Thanks @phillipleblanc , @sgrebnov and @alamb for the reviews 🙇

goldmedal added 5 commits December 22, 2024 14:33

make ast builder public

c76dbae

introduce udlp unparser

2335276

add documents

640bc93

add examples

4a32991

add negative tests and fmt

35dac96

github-actions bot added the sql SQL Planner label Dec 22, 2024

goldmedal mentioned this pull request Dec 22, 2024

Support unparsing LogicalPlan::Extension to SQL text #13753

Closed

fix the doc

85fb3a4

goldmedal marked this pull request as ready for review December 22, 2024 12:23

sgrebnov reviewed Dec 22, 2024

sgrebnov approved these changes Dec 22, 2024

phillipleblanc reviewed Dec 23, 2024

goldmedal marked this pull request as draft December 23, 2024 09:14

goldmedal added 3 commits December 23, 2024 17:22

rename udlp to extension

abc23f0

apply the first unparsing result only

5233e4a

improve the doc

856a5f5

goldmedal marked this pull request as ready for review December 23, 2024 12:01

phillipleblanc approved these changes Dec 23, 2024

goldmedal added 2 commits December 24, 2024 19:18

seperate the enum for the unparsing result

7b6c37f

fix the doc

b2654df

phillipleblanc approved these changes Dec 24, 2024

alamb approved these changes Dec 24, 2024

Merge remote-tracking branch 'apache/main' into feature/13753-udlp-un…

0bae30a

…parser

goldmedal merged commit 482b489 into apache:main Dec 25, 2024
24 checks passed

goldmedal deleted the feature/13753-udlp-unparser branch December 25, 2024 14:25
