bvolpato
Contributor

Which issue does this PR close?

Rationale for this change

The validation added in #15334 started causing problems for some LEFT JOIN queries that have no join condition, for example:

SELECT a, (SELECT MAX(b) FROM data2) as max_b FROM data

This query gets optimized as a left join without conditions.

This fix follows the same approach as #15334:

vec![Expr::Literal(ScalarValue::Boolean(Some(true)), None)],

What changes are included in this PR?

For joins without conditions, use cross_join if the join type is inner; otherwise use a filter with a literal Boolean(true).
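
A minimal sketch of that branching, for illustration only. The types `JoinType`, `Expr`, and `JoinPlan` and the function `plan_unconditioned_join` are hypothetical stand-ins, not the real DataFusion API; the actual change works on the logical plan builder shown in the diff.

```rust
// Hypothetical stand-in types; these are NOT DataFusion's definitions.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum JoinType {
    Inner,
    Left,
    Right,
    Full,
}

#[derive(Debug, PartialEq)]
enum Expr {
    /// Stand-in for a literal boolean expression.
    BooleanLiteral(bool),
}

#[derive(Debug, PartialEq)]
enum JoinPlan {
    /// An inner join with no condition is planned as a cross join.
    Cross,
    /// Any other join type keeps its type and gets an always-true filter.
    Filtered { join_type: JoinType, filter: Expr },
}

/// Decide how to plan a join that arrived without any condition.
fn plan_unconditioned_join(join_type: JoinType) -> JoinPlan {
    if join_type == JoinType::Inner {
        JoinPlan::Cross
    } else {
        JoinPlan::Filtered {
            join_type,
            filter: Expr::BooleanLiteral(true),
        }
    }
}
```

Under these assumptions, `plan_unconditioned_join(JoinType::Left)` keeps the left join and attaches the always-true filter, which is the case the scalar subquery example above runs into.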

Are these changes tested?

Added a roundtrip test case.

Are there any user-facing changes?

n/a

@github-actions bot added the substrait (Changes to the substrait crate) label on Oct 15, 2025
@bvolpato-dd force-pushed the fix/substrait-scalar-subquery branch from 88a4027 to 5cbfa7b on October 15, 2025 04:19
Comment on lines +79 to +82
// For joins without conditions, use cross_join if inner, otherwise use a filter with Boolean(true)
if join_type == JoinType::Inner {
left.cross_join(right.build()?)?.build()
} else {
Contributor

One thing to take into account is that Postgres does not allow this. In Postgres, it's invalid to provide an inner join without a join condition (although other DBs like MySQL allow it).

My impression is that the change in #15334 was trying to adhere to this standard, which is a fair decision. That suggests the Substrait path should adhere to it as well, unless there's a strong reason to diverge.

I don't have a strong opinion about whether we should automatically convert inner joins without conditions to cross joins, but I'd advocate for consistency between the normal SQL path and the Substrait path.

Development

Successfully merging this pull request may close these issues.

[substrait] Scalar subquery in select not supported