Support unordered row_number() on Snowflake, MSSQL, and Teradata #1331

fh-mthomson · 2023-07-08T19:43:12Z

Snowflake: Closes #1332 (Support row_number() on Snowflake).
MSSQL: Expands and revises the logic in #1316 (MS SQL Fix Translation of Unordered row_number()); closes the row_number() portion of #1233 (MSSQL Incorrect Translation of Boolean and row_number while Filtering Zero Rows).
Teradata: Removes Teradata-specific handling (see #913, add in Teradata statements).

Of note: following the second comment on https://stackoverflow.com/questions/44105691/row-number-without-order-by, I opted to use SELECT NULL over SELECT 1 (which also aligns with the implementation for Teradata).

fh-mthomson · 2023-07-08T19:52:11Z

tests/testthat/_snaps/backend-teradata.md

-      SELECT `df`.*, ROW_NUMBER() OVER (PARTITION BY `y` ORDER BY `y`) AS `rown`
+      SELECT
+        `df`.*,
+        ROW_NUMBER() OVER (PARTITION BY `y` ORDER BY (SELECT NULL)) AS `rown`


Interestingly, Teradata (via win_rank_tdata()) previously defaulted to ordering by the result of win_current_group() (added in 4109051#diff-c4f086980e5692d16e3337acd7bfefe3a9315135688133ee4f24ea2944ccba3bR174 by @overmar).

Because this PR generalizes the behavior to fall back to ORDER BY (SELECT NULL) for all backends (including Snowflake and MSSQL), Teradata no longer defaults to ordering by the group.

I'm likely not familiar enough with (1) PARTITION BY vs. ORDER BY and (2) Teradata to determine whether this would have adverse impacts, so I would welcome input from more knowledgeable folks!
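
On point (1): within PARTITION BY `y`, every row in a partition shares the same `y`, so ORDER BY `y` leaves all rows tied and should not impose any real ordering. The base R sketch below illustrates that idea on a toy data frame; the data and variable names are made up for illustration, and unlike R (which keeps input order for ties) a database is free to number tied rows in any order under either form.

```r
# Toy data (hypothetical): `y` plays the role of the partition column.
df <- data.frame(x = c(10, 20, 30, 40, 50), y = c("a", "b", "a", "b", "a"))
idx <- seq_len(nrow(df))

# Row numbers within each group with no ordering at all (input order is kept).
rn_unordered <- ave(idx, df$y, FUN = seq_along)

# Row numbers within each group after "ordering" by y itself. Every row in a
# group has the same y, so all rows tie and rank() falls back to input order.
rn_by_group <- ave(
  idx, df$y,
  FUN = function(i) rank(df$y[i], ties.method = "first")
)

print(rn_unordered)
print(all(rn_unordered == rn_by_group))
```

If that reasoning holds, swapping ORDER BY `y` for ORDER BY (SELECT NULL) changes the generated SQL but not the set of numberings the database is allowed to return.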

fh-mthomson · 2023-07-14T22:25:40Z

@mgirlich would you be open to reviewing? Thank you!

fh-mthomson · 2023-08-08T22:54:42Z

@mgirlich @hadley would you be open to including this PR in the 2.4.0 release?

Apologies for the direct pester, but it's the last (known) Snowflake <> dbplyr blocker for us; we would love to close it out if the timing works for y'all!

hadley · 2023-08-09

Thanks for working on this!

@mgirlich I'm ok with allowing an argument-less row_number(); it doesn't really make a lot of sense for databases, but it's still sometimes useful.

hadley · 2023-08-09T12:49:09Z

R/backend-teradata.R

-
-#' @export
-#' @rdname win_over
-win_rank_tdata <- function(f) {


Was this exported in a released dbplyr?

fh-mthomson · 2023-08-10

It was, as of 2.3.0 (added in 4109051#diff-c4f086980e5692d16e3337acd7bfefe3a9315135688133ee4f24ea2944ccba3bR167). I'd guess it was primarily to enable row_number() for the Teradata backend.

Open to any recommended patterns for deprecation!

hadley · 2023-08-09T12:49:33Z

R/translate-sql-window.R

 #' @rdname win_over
 #' @export
-win_rank <- function(f) {
+win_rank <- function(f, use_default_order_null = FALSE) {


This argument name feels a bit long to me, and needs to be documented. Maybe empty_order = TRUE or order_by = c("required", "optional"), or something along those lines?

fh-mthomson · 2023-08-10

Changed to empty_order (defaulting to FALSE so that it does not apply to every backend) and added some docs!
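
For readers following along, here is a minimal base R sketch of how an opt-in flag like this could work. It is not dbplyr's actual win_rank(): `win_rank_sketch()` and the string-based SQL assembly are hypothetical stand-ins, and the assumption that the ORDER BY clause is simply omitted when `empty_order = FALSE` is mine.

```r
# Hypothetical stand-in for illustration; NOT dbplyr's real win_rank().
# Returns a function that builds a ranking-window expression as a string.
win_rank_sketch <- function(f, empty_order = FALSE) {
  force(f)
  force(empty_order)
  function(partition = NULL, order = NULL) {
    # Backends that opted in get a placeholder ordering when none is supplied.
    if (is.null(order) && empty_order) {
      order <- "(SELECT NULL)"
    }
    clauses <- c(
      if (!is.null(partition)) paste("PARTITION BY", paste(partition, collapse = ", ")),
      if (!is.null(order)) paste("ORDER BY", paste(order, collapse = ", "))
    )
    paste0(f, "() OVER (", paste(clauses, collapse = " "), ")")
  }
}

# A backend that needs an ORDER BY opts in; other backends keep the default.
row_number_opt_in <- win_rank_sketch("ROW_NUMBER", empty_order = TRUE)
row_number_default <- win_rank_sketch("ROW_NUMBER")

print(row_number_opt_in(partition = "y"))
print(row_number_default(partition = "y"))
print(row_number_opt_in(partition = "y", order = "x"))
```

Keeping the default at FALSE means only backends that explicitly opt in see the placeholder ordering, and an explicit order always takes precedence over the fallback.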

mgirlich · 2023-08-22T10:00:25Z

Thanks for the PR! 😄

fh-mthomson added 3 commits July 8, 2023 11:48

add default for all backends

b474de1

add tests

6c9c880

placeholders

f94bc04

add issue

b8fd9a4

fh-mthomson mentioned this pull request Jul 13, 2023

Respect na.rm = TRUE in pmin() and pmax() for Snowflake #1330

Merged

fh-mthomson and others added 2 commits July 14, 2023 08:27

Merge branch 'main' into mthomson/row_number_snowflake

41e70c7

redo snaps

3ceadf4

hadley reviewed Aug 9, 2023

fh-mthomson and others added 6 commits August 9, 2023 20:39

Merge remote-tracking branch 'origin/main' into mthomson/row_number_snowflake

45d7d49

Update R/translate-sql-window.R

1a623ee

Co-authored-by: Hadley Wickham <[email protected]>

pull in latest news

af5ccba

tweak news

f4447fd

update and document arg

3da7cf7

nit comment

1c647c2

fh-mthomson mentioned this pull request Aug 11, 2023

Fix glue_sql2() uses #1352

Merged

mgirlich added 3 commits August 22, 2023 08:28

Merge commit 'c4df0de58582f3c8edbad6ee9d70de6f1adf6497'

2874b88

#Conflicts: # man/win_over.Rd

check empty_order argument

c2a1911

Re-add win_rank_tdata()

12c71f1

mgirlich merged commit 9ff37e0 into tidyverse:main Aug 22, 2023
14 checks passed

This was referenced Aug 22, 2023

MSSQL Incorrect Translation of Boolean and row_number while Filtering Zero Rows #1233

Closed

MS SQL Fix Translation of Unordered row_number() #1316

Closed

fh-mthomson deleted the mthomson/row_number_snowflake branch October 24, 2023 21:21
