Strange behaviour of filter and == #6920
Comments
It seems `==` is applied element by element here, which differs from `%in%`:
library(tidyverse)
as_tibble(mtcars, rownames = "Model") |>
  mutate(
    equal = cyl == c(4, 6),
    equal_2 = cyl == c(6, 4),
    operation_in = cyl %in% c(4, 6)
  ) |>
  select(Model, cyl, equal:operation_in)
#> # A tibble: 32 × 5
#>    Model               cyl equal equal_2 operation_in
#>    <chr>             <dbl> <lgl> <lgl>   <lgl>
#>  1 Mazda RX4             6 FALSE TRUE    TRUE
#>  2 Mazda RX4 Wag         6 TRUE  FALSE   TRUE
#>  3 Datsun 710            4 TRUE  FALSE   TRUE
#>  4 Hornet 4 Drive        6 TRUE  FALSE   TRUE
#>  5 Hornet Sportabout     8 FALSE FALSE   FALSE
#>  6 Valiant               6 TRUE  FALSE   TRUE
#>  7 Duster 360            8 FALSE FALSE   FALSE
#>  8 Merc 240D             4 FALSE TRUE    TRUE
#>  9 Merc 230              4 TRUE  FALSE   TRUE
#> 10 Merc 280              6 TRUE  FALSE   TRUE
#> # ℹ 22 more rows

Created on 2023-08-29 with reprex v2.0.2
This is just R's default vector recycling behavior. If you run the comparison on its own, outside of dplyr, you'll see that it generates a perfectly "valid" logical vector of the correct length by recycling the shorter vector.
While it may seem odd, some people probably do rely on this behavior in order to do things intentionally. In this instance, someone may genuinely want that element-by-element comparison. Whether dplyr attempts to check for this and either warn or stop the user is a judgement call that the package maintainers would have to weigh in on.
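The recycling described above can be checked in base R alone. This is a minimal sketch; the variable names (`cyl`, `recycled`, `by_recycling`, `msg`) are only illustrative. It also shows that R stays silent here only because 32 is a multiple of 2: a right-hand side whose length does not divide the column length is still recycled, but with a warning.

```r
# Base R only: `mtcars` ships with the datasets package.
cyl <- mtcars$cyl

# c(4, 6) is recycled to length 32: 4, 6, 4, 6, ...
recycled <- rep(c(4, 6), length.out = length(cyl))
by_recycling <- cyl == c(4, 6)

# The implicit and the explicit comparison give the same result.
identical(by_recycling, cyl == recycled)

length(by_recycling)    #> 32, one value per row
sum(by_recycling)       #> 10, rows that happen to line up with the pattern
sum(cyl %in% c(4, 6))   #> 18, rows with 4 or 6 cylinders

# A length that does not divide 32 is still recycled, but R warns.
# Here the warning message is captured instead of printed.
msg <- tryCatch(cyl == c(4, 6, 8), warning = function(w) conditionMessage(w))
```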
I don't expect dplyr to warn on everything, and even less to raise an error for something outside the scope of dplyr. And as @joranE says, "some people probably do rely on this behavior"; another example could be a vector on the left-hand side.
Thanks @joranE, you've given the right explanation. I don't believe dplyr should error or warn on this, even though it is typically a user mistake. Unfortunately I think this one is too much of a slippery slope.
If you really did want to try and avoid this, you could do:
`==` <- function(x, y) {
  vctrs::vec_equal(x, y)
}
dplyr::mutate(mtcars, cyl == c(4, 6))
#> Error in `dplyr::mutate()`:
#> ℹ In argument: `cyl == c(4, 6)`.
#> Caused by error in `vctrs::vec_equal()`:
#> ! Can't recycle `..1` (size 32) to match `..2` (size 2).
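Redefining `==` as above affects every comparison evaluated from the global environment. A narrower option is an explicit length check in a separate helper. The sketch below assumes a hypothetical function named `strict_equal()`, which is not part of dplyr or vctrs, and uses base R only:

```r
# Hypothetical helper, not part of dplyr or vctrs: allow only a
# length-1 or same-length right-hand side, then defer to base `==`.
strict_equal <- function(x, y) {
  if (length(y) != 1L && length(y) != length(x)) {
    stop(
      "`y` must have length 1 or ", length(x), ", not ", length(y), ".",
      call. = FALSE
    )
  }
  x == y
}

# A scalar comparison works as usual.
six_cyl <- mtcars[strict_equal(mtcars$cyl, 6), ]

# A length-2 right-hand side now fails instead of silently recycling.
# The error message is captured so the script keeps running.
res <- tryCatch(
  strict_equal(mtcars$cyl, c(4, 6)),
  error = function(e) conditionMessage(e)
)
```

The helper should also work inside `dplyr::filter()` or `dplyr::mutate()`, since those evaluate ordinary R expressions, but it only protects calls that use it explicitly.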
Hello,
I found a strange behaviour of the function `filter` when using the operator `==` and a vector. In the next example I would expect the function to give me an error because I should use `%in%`:
Instead, it is working and filtering 10 elements from `mtcars`. However, the total number of elements with 4 or 6 cylinders in the `mtcars` dataset is 18. Why is this happening?
Best,
Cidre
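The example from the post is not preserved here. The two counts it mentions can be reproduced with a sketch like the following, assuming the call had the form `filter(cyl == c(4, 6))`; the names `with_equal` and `with_in` are only illustrative.

```r
library(dplyr)

# Assumed form of the call from the post: `==` recycles c(4, 6)
# along the column, so only rows matching the alternating pattern pass.
with_equal <- mtcars |> filter(cyl == c(4, 6))

# Set membership: keeps every row whose cyl is 4 or 6.
with_in <- mtcars |> filter(cyl %in% c(4, 6))

nrow(with_equal)  #> 10
nrow(with_in)     #> 18
```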