Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forward compatibility with dplyr 2.5.0 #338

Closed
wants to merge 1 commit into from

Conversation

hadley
Copy link

@hadley hadley commented Feb 21, 2024

This seems like a pretty uncompelling set of changes and I feel like it's likely to substantially negatively impact the usability of your package. Do you agree? If so, I'll see what I can change in dbplyr to make 2.5.0 more appealing.

@hadley
Copy link
Author

hadley commented Feb 22, 2024

Hmmmm, maybe I misinterpreted what's going on here because it looks like you are suppling a custom translation via sql_translation.wfsConnection so your functions should already be registered.

So the problem is likely due to tidyverse/dbplyr#1430 — because dbplyr translates all the arguments to a function before passing it to the function. But in this case, you want the arguments to the function to be regular R objects because your function is going to do the translation. Hmmmmmmmmmmmmmmm.

@hadley
Copy link
Author

hadley commented Feb 22, 2024

Yeah, so a better fix for (e.g.) WITHIN would be:

bcdc_query_geodata("bc-airports") %>%
  filter(WITHIN(local(local))) %>%
  collect()

And unfortunately !! doesn't work here (possibly it should, but I'm not sure it's worth reasoning out the implications).

So the problem here is not declaring additional translations but declaring that WITHIN and friends expect a local object, not a database column. That unfortunately makes them different to the existing dbplyr translations so I don't have a good model for how this should work.

But I think this means I'm being too aggressive at erroring on unknown types in partial_eval_sym() and the best option will just be to revert to the previous behaviour. If I do attempt to tackle this again in the future, it would probably be better to do type checking in individual translation functions so that we can be more specific about what the expected inputs are (e.g. either a whole number or a variable name).

hadley added a commit to tidyverse/dbplyr that referenced this pull request Feb 22, 2024
Partially reverts #1368. See bcgov/bcdata#338 for details. If we do this again in the future, I think we have to do argument checking in individual translations. That's a lot of work but allows a whole suite of potentially useful translations for more complex R objects.
@ateucher
Copy link
Collaborator

Thanks @hadley for taking a look at this. Yeah, those uppercase predicates (WITHIN, INTERSECTS) etc. take a local sf object and translate it to a string to pass as a query parameter in the GET/POST request to the WFS server.

We have precedent for asking users to use local() to force local evaluation of functions (https://bcgov.github.io/bcdata/articles/local-filter.html), so it wouldn't be the end of the world to enforce the same thing for local (non-function) objects inside WITHIN and friends, but it would be a breaking change.

It is a shame that !! doesn't work, as it's a bit neater (but I understand not wanting to go down a new rabbit hole). EDIT: It does work with an atomic vector (see below with a bbox object, which is a double of length 4).

Your idea of being able to register a function as expecting a local object would be really nice.

Putting this here mostly for my edification:

library(bcdata)
library(sf)

x <- bcdc_query_geodata('d1aff64e-dbfe-45a6-af97-582b7f6418b9') |> 
  filter(ADMIN_AREA_NAME == "Cariboo Regional District") |> 
  collect()

airports <- '76b1b7a3-2112-4444-857a-afccf7b20da8'

# original behaviour, no longer works with dev dbplyr 2.5.0
res <- bcdc_query_geodata(airports) |> 
  filter(WITHIN(x)) |> 
  collect()
#> Error:
#> ! Cannot translate a data.frame to SQL.
#> ℹ Do you want to force evaluation in R with (e.g.) `!!x$x` or `local(x$x)`?

# works with dev dbplyr 2.5.0
res <- bcdc_query_geodata(airports) |> 
  filter(!!WITHIN(x)) |> 
  collect()

# works with dev dbplyr 2.5.0
res <- bcdc_query_geodata(airports) |> 
  filter(local(WITHIN(x))) |> 
  collect()

# works with dev dbplyr 2.5.0
res <- bcdc_query_geodata(airports) |> 
  filter(WITHIN(local(x))) |> 
  collect()

# does not work with dev dbplyr 2.5.0
res <- bcdc_query_geodata(airports) |> 
  filter(WITHIN(!!x)) |> 
  collect()
#> Error in `FUN()`:
#> ! Unknown input type: list

# !! does work with an atomic vector:
x_bbox <- st_bbox(x)
res <- bcdc_query_geodata(airports) |> 
  filter(WITHIN(!!x_bbox)) |> 
  collect()

Created on 2024-02-22 with reprex v2.1.0

@ateucher
Copy link
Collaborator

ateucher commented Feb 22, 2024

Just looking at the testthat failures here, there is something funny going on with how messages are emitted when using !! vs local() - The expected message is printed to the console but not detected by testthat::expect_message():

library(bcdata)
library(testthat)

x <- bcdc_query_geodata('d1aff64e-dbfe-45a6-af97-582b7f6418b9') %>% 
  filter(ADMIN_AREA_NAME == "Cariboo Regional District") %>% 
  collect()

airports <- '76b1b7a3-2112-4444-857a-afccf7b20da8'

# fine
expect_message(
  bcdc_query_geodata(airports) %>% 
    filter(WITHIN(local(x))) %>% 
    collect()
)

# says no message produced
expect_message(
  bcdc_query_geodata(airports) %>% 
    filter(!!WITHIN(x)) %>% 
    collect()
)
#> The object is too large to perform exact spatial operations using bcdata.
#> Object size: 697696 bytes
#> BC Data Threshold: 500000 bytes
#> Exceedance: 197696 bytes
#> See ?bcdc_check_geom_size for more details
#> A bounding box was drawn around the object passed to WITHIN and all features within the box will be returned.
#> Error: ``%>%`(...)` did not produce any messages.

Created on 2024-02-22 with reprex v2.1.0

@hadley
Copy link
Author

hadley commented Feb 23, 2024

FWIW I'm going to back this change out of dbplyr so that bcdata doesn't break. (But mostly because it is a large, unanticipated, change to the semantics of dbplyr function translation)

@hadley
Copy link
Author

hadley commented Feb 23, 2024

Closing this in favour of tidyverse/dbplyr#1466

Thanks for the feedback!

@hadley hadley closed this Feb 23, 2024
@ateucher
Copy link
Collaborator

Thanks @hadley!

hadley added a commit to tidyverse/dbplyr that referenced this pull request Feb 28, 2024
Revert to previous escaping method (i.e. reverts #1368). See bcgov/bcdata#338 for details. If we do this again in the future, I think we have to do argument checking in individual translations. That's a lot of work but allows a whole suite of potentially useful translations for more complex R objects.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants