New function to drop instead of keep observations (anti filter)

There are a couple of problems I regularly run into when subsetting data with multiple conditions. 

The first has been brought up several times ([including in this open issue](https://github.com/tidyverse/dplyr/issues/6560)): `filter()` keeps only rows where the condition is `TRUE`, so any observation where the condition evaluates to `NA` (typically because a variable used in the condition is missing) is dropped.

A less obvious issue is readability. I often have to drop observations where any of several conditions is true. Writing this using `filter()` usually requires complex use of negation and parentheses, even if there are no missings to be concerned about. Understanding and explaining these statements in my code is fraught to say the least.

I propose a new `dplyr` function called `drop_if()`. It would drop all rows meeting any of the specified conditions. This would greatly reduce the need to use `!()` and, by default, would keep observations that are missing any of the referenced variables in the resulting data set.

The way I wrote this out for now is straightforward to understand but likely not computationally efficient: filter to keep the observations matching the condition, then use `anti_join()` to drop them from the original data set.

```r
library(tidyverse)

mpg_miss <- mpg %>% 
  mutate(
    class = na_if(class, "pickup"),
    cty   = na_if(cty, 18)
  )

mpg_miss_rown <- mpg_miss %>% 
  mutate(rown = row_number())

# if we want to drop where class is "suv" or cty < 15, build a temp df of these rows
to_drop <- mpg_miss_rown %>% 
  filter(
    class == "suv" | cty < 15
  ) %>% 
  select(rown)

to_drop %>% 
  nrow()
## [1] 89

# anti-join removes the matched rows; rows where the condition is NA are kept
kept <- mpg_miss_rown %>% 
  anti_join(to_drop) %>% 
  select(-rown)
## Joining, by = "rown"

kept %>% nrow()
## [1] 145

# drops if class or cty is NA
mpg_miss %>% 
  filter(!(class == "suv" | cty < 15)) %>% 
  nrow()
## [1] 114

# annoying syntax but produces desired result
mpg_miss %>% 
  filter(!(class == "suv" | cty < 15) | (is.na(class) & !cty < 15) | (class != "suv" & is.na(cty)))  %>%
  nrow()
## [1] 145

## Not run:
# What syntax could look like:
mpg_miss %>% 
  drop_if(class == "suv" | cty < 15)
## End(Not run)
```
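For discussion, here is a minimal sketch of how such a function might be written on top of existing `dplyr` and `rlang` tools, without the `anti_join()` step. `drop_if()` is hypothetical and not part of `dplyr`; the sketch assumes that a condition evaluating to `NA` should count as "not met", so those rows are kept. The small `df` tibble is made up for illustration.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): drop rows where ANY condition is TRUE.
# Rows where a condition is FALSE or NA are kept.
drop_if <- function(.data, ...) {
  conds <- rlang::enquos(...)
  # Turn each condition into "not TRUE", treating NA as FALSE via coalesce()
  keep_exprs <- unname(lapply(
    conds,
    function(q) rlang::expr(!dplyr::coalesce(!!q, FALSE))
  ))
  # filter() combines multiple expressions with AND:
  # keep a row only if none of the conditions is TRUE
  dplyr::filter(.data, !!!keep_exprs)
}

# Made-up example data
df <- tibble(
  class = c("suv", "compact", NA, "compact", NA),
  cty   = c(20, 10, 20, NA, 10)
)

# Keeps rows 3 and 4, where the condition is NA rather than TRUE
res_single <- drop_if(df, class == "suv" | cty < 15)

# Same result with the conditions passed separately
res_multi <- drop_if(df, class == "suv", cty < 15)
```

Using `coalesce(cond, FALSE)` is just one way to encode "keep when unknown"; whether that should be the default, or controlled by an argument, is open to discussion.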


New function to drop instead of keep observations (anti filter) #6888
