clarify type casting in CESQL spec #1281

Merged
23 changes: 23 additions & 0 deletions cesql/spec.md
@@ -278,6 +278,10 @@ For example, the pattern `_b*` will accept values `ab`, `abc`, `abcd1` but won't
Both `%` and `_` can be escaped with `\`, in order to be matched literally. For example, the pattern `abc\%` will match
`abc%` but won't match `abcd`.

In cases where the left operand is not a `String`, it MUST be cast to a `String` before the comparison is made.
The pattern of the `LIKE` operator (that is, the right operand of the operator) MUST be a valid string predicate,
otherwise the parser MUST return a parse error.
Collaborator

Just to be clear... subject LIKE TRUE isn't valid because TRUE isn't a valid string, however, we talk about how there's implicit casting all over the place... so someone may wonder why TRUE isn't implicitly converted into "true". Am I correct in this thinking? Should we be explicit and say that casting isn't allowed in this case? Or perhaps should we allow it for consistency? @jskeet thoughts?

Contributor

It would definitely be good to be clear. Even without the "LIKE" aspect, I can't tell offhand whether "TRUE" = true is casting the LHS to Boolean, or the RHS to String. I suspect it's the former, based on bullet 2 in the list in 3.7.

Contributor Author

@duglin I think the main problem is that currently the LIKE expression isn't defined with an expression as the right operand. It is defined as:

expression NOT? LIKE stringLiteral

So, any value on the right operand which is not a string literal MUST be a parse error currently. We can change this, but I'm not sure if it makes sense. The whole point of a LIKE expression is to compare against a pattern, not a specific value. If someone were to compare against another string value, wouldn't it just make sense to use = or something like that?

Contributor Author

Type casting applies whenever an operator is defined for values, but the LIKE operator is currently only defined for a literal, not a value. I'm not 100% sure why this was the initial decision, but that's how it currently works.

Collaborator (@duglin, May 22, 2024)

I think it's ok to leave it as "must be a string literal w/o casting", but let's be extra clear in the text that implicit type casting isn't allowed in this one spot. I agree defining it as stringLiteral sort of implies it, but saying that things should be implicitly cast in all other places (even when the spec says the arg is a string) could be unclear to people.
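
The LIKE behaviour discussed in this thread can be sketched in Python. This is a minimal illustration, not any SDK's implementation: the names `like`, `like_pattern_to_regex` and `to_cesql_string` are hypothetical, `%` is assumed to match zero or more characters and `_` exactly one (as in SQL), and a non-string pattern is rejected with a `ValueError` that stands in for the parse error.

```python
import re


def to_cesql_string(value):
    """Cast a left operand to a string before the comparison is made."""
    if isinstance(value, str):
        return value
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    raise TypeError(f"unsupported operand type: {type(value).__name__}")


def like_pattern_to_regex(pattern):
    """Translate a LIKE pattern into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern) and pattern[i + 1] in "%_":
            # An escaped wildcard is matched literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            parts.append(".*")  # assumption: zero or more characters
        elif ch == "_":
            parts.append(".")  # assumption: exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def like(left, pattern):
    """Evaluate `left LIKE pattern`; the pattern is never implicitly cast."""
    if not isinstance(pattern, str):
        # Stand-in for the parse error: the grammar only allows a string literal.
        raise ValueError("LIKE pattern must be a string literal")
    regex = like_pattern_to_regex(pattern)
    return regex.fullmatch(to_cesql_string(left)) is not None
```

Under these assumptions, `like("abc%", r"abc\%")` is true and `like("abcd", r"abc\%")` is false, matching the escape example in the spec text, while `like("subject", True)` is rejected instead of treating `TRUE` as `"true"`.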


#### 3.4.4. Exists operator

| Definition | Semantics |
@@ -353,6 +357,25 @@ left operand of the OR operation evaluates to `true`, the right operand MUST NOT b

#### 3.7. Type casting

A CESQL engine MUST support the following type casts:
Contributor

This talks about "returning" an error - elsewhere, the spec talks about "raising" an error but also returning a value. I suspect we need to be consistent here.

Contributor Author

Yeah I see your point. Personally, I think "returning" makes more sense because it's not like an exception that is raised can be caught somewhere - it is just a second return value you get on every evaluation (which may be null/nil). WDYT? If you think "returning" makes more sense, I'll open a PR to change that everywhere, otherwise I'll switch these to be "raising" an error.

Contributor

Sorry, I wasn't clear - I'm fine with the "raising" terminology, but the problem is that these casts don't say what the non-error part of the return is when there's a problem. If I cast "BOGUS" to Boolean, an error will be raised... but what's the value of the expression?

Contributor Author

Oh I see your point - let me add some clarification here.

Contributor Author

I added some clarification, could you recheck when you have time @jskeet ?

Contributor

Yup, that definitely works.


| Definition | Semantics |
| -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `Integer -> String` | Returns the string representation of the integer value in base 10. If the value is less than 0, the '-' character is prepended to the result. |
| `String -> Integer` | Returns the result of interpreting the string as a 32 bit base 10 integer. The string may begin with a leading sign '+' or '-'. If the result will overflow, an error is returned. |
| `String -> Boolean` | Returns `true` or `false` if the lower case representation of the string is exactly "true" or "false", respectively. Otherwise returns an error. |
| `Boolean -> String` | Returns `"true"` if the boolean is `true`, and `"false"` if the boolean is `false`. |
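
As a rough illustration of the four casts in the table, here is a minimal Python sketch. It follows the "error as a second return value" model described in the review thread by returning a `(value, error)` pair. The function names are hypothetical, treating a malformed integer string as an error is an assumption (the table only mentions overflow), and the value that accompanies an error is left as `None` because the table as shown here does not specify it.

```python
import re

INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1
_INTEGER_RE = re.compile(r"[+-]?[0-9]+")  # optional sign, then ASCII digits only


def integer_to_string(value):
    # Python's str() already prepends '-' for negative values.
    return str(value), None


def string_to_integer(text):
    if _INTEGER_RE.fullmatch(text) is None:
        # Assumption: malformed input is an error, like overflow.
        return None, f"cannot cast {text!r} to Integer"
    number = int(text, 10)
    if not INT32_MIN <= number <= INT32_MAX:
        return None, f"{text!r} overflows a 32 bit integer"
    return number, None


def string_to_boolean(text):
    lowered = text.lower()
    if lowered == "true":
        return True, None
    if lowered == "false":
        return False, None
    return None, f"cannot cast {text!r} to Boolean"


def boolean_to_string(value):
    return ("true" if value else "false"), None
```

For example, `string_to_boolean("TrUe")` yields `(True, None)`, while `string_to_boolean("BOGUS")` yields `None` plus an error message.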

An example of how _Boolean_ values cast to _String_ combines with the case insensitivity of CESQL keywords is that:
```
TRUE = "true" AND FALSE = "false"
```
will evaluate to `true`, while
```
TRUE = "TRUE" OR FALSE = "FALSE"
```
will evaluate to `false`.
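
To make the cast in this example concrete, here is a minimal Python sketch. It assumes, as the sentence introducing the example says, that the Boolean operand is cast to String, and additionally that string equality is case-sensitive. `equals` and `boolean_to_string` are hypothetical helper names.

```python
def boolean_to_string(value):
    """Boolean -> String cast from the table above."""
    return "true" if value else "false"


def equals(left, right):
    """Sketch of `=` for a Boolean left operand and a String right operand."""
    if isinstance(left, bool) and isinstance(right, str):
        # Assumption: the Boolean side is cast to String before comparing.
        left = boolean_to_string(left)
    # Assumption: string comparison is case-sensitive.
    return left == right
```

Under these assumptions, the keyword `TRUE` compares equal to the literal `"true"` but not to `"TRUE"`. The keyword itself is case-insensitive, but the string it casts to is always lower case.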

When the argument types of an operator/function invocation don't match the signature of the operator/function being invoked, the CESQL engine MUST try to perform an implicit cast.

This section defines an **ambiguous** operator/function as an operator/function that is overloaded with another