Update fct_lump() usage
hadley committed Apr 18, 2021
1 parent 4c17173 commit 42bbae5
Showing 1 changed file with 10 additions and 10 deletions.
20 changes: 10 additions & 10 deletions factors.Rmd
@@ -336,22 +336,21 @@ gss_cat %>%
```

Sometimes you just want to lump together all the small groups to make a plot or table simpler.
-That's the job of `fct_lump()`:
+That's the job of the `fct_lump_*()` family of functions.
+`fct_lump_lowfreq()` is a simple starting point that progressively lumps the smallest categories into "Other", always keeping "Other" as the smallest category.

```{r}
 gss_cat %>%
-  mutate(relig = fct_lump(relig)) %>%
+  mutate(relig = fct_lump_lowfreq(relig)) %>%
   count(relig)
```

-The default behaviour is to progressively lump together the smallest groups, ensuring that the aggregate is still the smallest group.
-In this case it's not very helpful: it is true that the majority of Americans in this survey are Protestant, but we've probably over collapsed.
-
-Instead, we can use the `n` parameter to specify how many groups (excluding other) we want to keep:
+In this case it's not very helpful: it is true that the majority of Americans in this survey are Protestant, but we'd probably like to see some more details!
+Instead, we can use `fct_lump_n()` to specify that we want exactly 10 groups:

```{r}
 gss_cat %>%
-  mutate(relig = fct_lump(relig, n = 10)) %>%
+  mutate(relig = fct_lump_n(relig, n = 10)) %>%
   count(relig, sort = TRUE) %>%
   print(n = Inf)
```
@@ -360,7 +359,8 @@ gss_cat %>%

1. How have the proportions of people identifying as Democrat, Republican, and Independent changed over time?

-1. How could you collapse `rincome` into a small set of categories?

-1. Notice there are 9 groups (excluding other) in the `fct_lump` example above. Why not 10? (Hint: type `?fct_lump`, and find the default for the argument `other_level` is "Other".)
+2. How could you collapse `rincome` into a small set of categories?

+3. Notice there are 9 groups (excluding other) in the `fct_lump` example above.
+   Why not 10?
+   (Hint: type `?fct_lump`, and find the default for the argument `other_level` is "Other".)
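
To see how the two functions this commit switches to differ, here is a minimal sketch on a small made-up factor. The vector `x` and the names `lowfreq` and `top2` are illustrative assumptions, not part of `factors.Rmd`; only `fct_lump_lowfreq()` and `fct_lump_n()` come from forcats.

```r
library(forcats)

# Hypothetical toy data (not from gss_cat): one dominant level and a long tail.
x <- factor(c(rep("A", 10), rep("B", 5), rep("C", 2), "D", "E"))

# fct_lump_lowfreq() lumps the rarest levels together for as long as
# "Other" remains the smallest group.
lowfreq <- fct_lump_lowfreq(x)
table(lowfreq)

# fct_lump_n() keeps the n most frequent levels and lumps everything else.
top2 <- fct_lump_n(x, n = 2)
table(top2)
```

With this toy data, `fct_lump_n(x, n = 2)` should keep `A` and `B` and fold the remaining levels into "Other", whereas `fct_lump_lowfreq()` chooses the cut-off itself, which is why it can collapse more than you might want when one level dominates.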
