Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Section 9.2, scalar subquery coercion is incorrect for row value expressions. #5

Open
rchowell opened this issue Nov 1, 2023 · 0 comments

Comments

@rchowell
Copy link
Contributor

rchowell commented Nov 1, 2023

Page 34 states,

SELECT VALUE {'foo': v.foo}
FROM anotherDataSet v
WHERE (v.a, v.b) = (SELECT VALUE [w.c, w,d]    -- call this `subq`
                    FROM someDataSet w
                    WHERE w.sth = v.sthelse)

The SELECT VALUE constructor is an array, which means the type of subq is <bag<list<w.c,w.d>>. This is incorrect and not SQL compatible. What this section should read is something along the lines of

> A subquery with the SQL-SELECT coerces into an array when it is the rhs (respectively, lhs) of 
a comparison operator whose other argument is an array.

The value is coerced to an array via COLL_TO_ROW which asserts the bag has one tuple and each binding tuple 
value becomes an array element in order via `{ a_0: v_0, ...., a:_n: v_n } -> [ v_0, ..., v_n ]`.  

We must also include static assertion of row value comparison compatibility.

Update

After implementing as a planning step, I have missed the importance of " (as opposed to a subquery starting with SELECT VALUE or PIVOT)" from section 9 which essentially says only coerce SELECT subqueries. For compatibility with SQL and row-value queries, I suggest the rewrite of COLL_SCALAR is generalized to

-- example 1, lhs is scalar

1 = (SELECT a FROM T)

1 = ONE(SELECT VALUE a FROM T)           -- this is different than normal SELECT to SELECT-VALUE rewrite

-- example 2, lhs is array

(1,2) = (SELECT a, b FROM T)             -- we can check degree before applying the rewrite

(1,2) = ONE(SELECT VALUE [a, b] FROM T)  -- 

(1,2) = ONE(SELECT * FROM T)             -- we are able to check degree here with the resolved plan

-- `ONE` returns the one and only element of the collection, throwing a runtime exception if more than one
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant