Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: trim string before rowsort #137

Merged
merged 1 commit into from
Dec 30, 2022
Merged

fix: trim string before rowsort #137

merged 1 commit into from
Dec 30, 2022

Conversation

wangrunji0408
Copy link
Member

@wangrunji0408 wangrunji0408 commented Dec 30, 2022

Signed-off-by: Runji Wang [email protected]

If the engine returns a value with leading spaces, it will affect the result of rowsort, but won't be recognized in error message since we trim the string before reporting error but after the rowsort. This PR trims the string before rowsort to avoid this problem. Despite this, engines should not return a value with leading or tailing space.

Signed-off-by: Runji Wang <[email protected]>
@TennyZhuang TennyZhuang merged commit 693583b into main Dec 30, 2022
@TennyZhuang TennyZhuang deleted the wrj/trim-before-sort branch December 30, 2022 07:24
xxchan added a commit that referenced this pull request Dec 30, 2022
@xxchan
Copy link
Member

xxchan commented Dec 30, 2022

Hmmm I think this is wrong. First spaces should be respected. Trimming spaces affects correctness.

As for the problem you mentioned, I didn't get it. Can you give an example?

@wangrunji0408
Copy link
Member Author

wangrunji0408 commented Dec 30, 2022

Suppose the engine somehow gives a result: [["1"], [" 2"]] (casted from INT). The rows being compared after rowsort would be [[" 2", "1"]]. However, the error message would look like this:

 (-expected|+actual)
+ 2
  1
- 2

I know it looks weird, but it's possible. Any better solution for this problem? 🤔

First spaces should be respected. Trimming spaces affects correctness.

I'm afraid string values with leading or tailing spaces are not even supported for now. They are not representable in the script.

@xxchan
Copy link
Member

xxchan commented Dec 30, 2022

I'm afraid string values with leading or tailing spaces are not even supported for now. They are not representable in the script.

They are actually supported. Note that validator is fn(actual: &[Vec<String>], expected: &[String]). expected is the whole line AS-IS. So if we use an agreed separator, e.g., \t, and split by \t in a custom validator, we can compare the spaces. (This is duckdb's approach) 😇 related issue #109

@xxchan
Copy link
Member

xxchan commented Dec 30, 2022

casted from INT

What does the "cast" mean? 🤔

And is your example a real case? IIRC the error message should contain original rows instead of normalized rows.

@wangrunji0408
Copy link
Member Author

wangrunji0408 commented Dec 30, 2022

What does the "cast" mean? 🤔

The value type is INT, which is then converted to string.

And is your example a real case?

The case was found by @TennyZhuang when developing the JDBC driver.
You can find the original error message here.

BTW, @TennyZhuang Does this change fix your problem? 👀

@xxchan
Copy link
Member

xxchan commented Dec 30, 2022

That failure is because of this:

image

There's only one row, so sorting does nothing for actual results...

@xxchan
Copy link
Member

xxchan commented Dec 30, 2022

Let me fix it

xxchan added a commit that referenced this pull request Dec 30, 2022
This reverts commit 693583b.

Signed-off-by: xxchan <[email protected]>
wangrunji0408 pushed a commit that referenced this pull request Dec 30, 2022
* Revert "trim string before rowsort (#137)"

This reverts commit 693583b.

Signed-off-by: xxchan <[email protected]>

* fix: use Vec<Vec<String>> for external engine

Signed-off-by: xxchan <[email protected]>

* fix doctest

Signed-off-by: xxchan <[email protected]>

Signed-off-by: xxchan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants