-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support tuples as types #11896
Support tuples as types #11896
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why
However, InListExpr is not able to compare with nested type. fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
let num_rows = batch.num_rows();
let value = self.expr.evaluate(batch)?;
let r = match &self.static_filter {
Some(f) => f.contains(value.into_array(num_rows)?.as_ref(), self.negated)?,
None => {
let value = value.into_array(num_rows)?;
let found = self.list.iter().map(|expr| expr.evaluate(batch)).try_fold(
BooleanArray::new(BooleanBuffer::new_unset(num_rows), None),
|result, expr| -> Result<BooleanArray> {
Ok(or_kleene(
&result,
&eq(&value, &expr?.into_array(num_rows)?)?,
)?)
},
)?;
if self.negated {
not(&found)?
} else {
found
}
}
};
Ok(ColumnarValue::Array(Arc::new(r)))
} We need to change the |
Thanks so much for the pointer @jayzhan211, this should now be fixed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @samuelcolvin -- this looks very impressive to me. Beautifully simple
I had some small suggestions, but I don't think they are required to merge
Very nicely done
@@ -780,6 +781,31 @@ fn coerce_numeric_type_to_decimal256(numeric_type: &DataType) -> Option<DataType | |||
} | |||
} | |||
|
|||
fn struct_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please document what the intended rules for coercion are, in English, as a comment? That would help me understand if the code implements the intent correctly
I think the result is if all the types can be cocerced, then the output is a struct with fields named c1
, c2
, etc
I wonder if it would be better / easier to understand if the names of the fields were preserved (aka the field names from lhs_fields)
self.parse_struct(schema, planner_context, values, vec![]) | ||
} | ||
None => not_impl_err!("Empty tuple not supported yet"), | ||
_ => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to figure out how to write a negative test for this case, however anything I could try seems to work
DataFusion CLI v41.0.0
create table foo as values (1);
> select * from foo where (column1) IN ((column1+1), (2));
+---------+
| column1 |
+---------+
+---------+
0 row(s) fetched.
Elapsed 0.008 seconds.
> select * from foo where (column1) IN ((column1+1), (2), (1));
+---------+
| column1 |
+---------+
| 1 |
+---------+
1 row(s) fetched.
Elapsed 0.010 seconds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
query error DataFusion error: This feature is not implemented: Only identifiers and literals are supported in tuples
select a from values where (a, c) IN ((abs(a), 1));
Query success in duckdb
D select a from t where (a, b) in ((1, abs(a)));
┌────────┐
│ a │
│ int32 │
├────────┤
│ 0 rows │
└────────┘
We could support abs()
and other complex type in tuple in follow PR, but we should check not only first
element of the values but all the values in this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will file a ticket to track the feature
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
query error DataFusion error: This feature is not implemented: Only identifiers and literals are supported in tuples select a from values where (a, c) IN ((abs(a), 1));Query success in duckdb
D select a from t where (a, b) in ((1, abs(a))); ┌────────┐ │ a │ │ int32 │ ├────────┤ │ 0 rows │ └────────┘
We could support
abs()
and other complex type in tuple in follow PR, but we should check not onlyfirst
element of the values but all the values in this PR
I am not sure about whether our evaulation support struct array like {(struct(Expr1{column1}, 1)), struct (Expr2{column2}, 2))}
?
true | ||
|
||
query I | ||
select a from values where (a, c) = (1, 'a'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend we also do a negative test, like
select a from values where (a, c) = (100, 'a');
.into_iter() | ||
.enumerate() | ||
.map(|(i, datatype)| { | ||
Arc::new(Field::new(format!("c{i}"), datatype, true)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should preserve field name and nullable, only type is updated
planner_context: &mut PlannerContext, | ||
values: Vec<SQLExpr>, | ||
) -> Result<Expr> { | ||
match values.first() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check all
Thanks @samuelcolvin. This is exactly what I wanted to do. |
I took the liberty of merging up from main to resolve a conflict -- I think we could merge this PR and address the comments as a follow on PR, or we can do it prior to merging this one. Please let me know what you prefer @samuelcolvin |
Thanks again everyone 🚀 |
Which issue does this PR close?
Closes #6635, continues from #8590.
Rationale for this change
See #6635 — I want to be able to use tuples as per that issue
What changes are included in this PR?
Are these changes tested?
yes.
Are there any user-facing changes?
More SQL supported.
(Fixed, ignore this)
This is mostly working and(a, b) in (('x', 'y'))
is working correctly, however if you have more than one value in the lookup, e.g.('x', 'y') IN (('x', 'y'), ('y', 'x'))
an error occurs:There are two tests that should fail.