reframe() is dropping columns when you use rowwise() #6903

mutahiwachira · 2023-08-04T15:07:11Z

Minimal reprex

library(dplyr)
library(tidyr)  # needed for unnest() in the "expected" example below
quantile_df <- function(x, probs = c(0.25, 0.5, 0.75)) {
  tibble(
    val = quantile(x, probs, na.rm = TRUE),
    quant = probs
  )
}

# Actual behaviour: drops the original columns and returns only the columns from quantile_df()
starwars %>%
  rowwise() %>% 
  reframe(quantile_df(height)) %>%
  ungroup()
  # 261 rows, 2 cols from quantile_df

# Expected behaviour: preserve the original columns, as this code does
starwars %>% 
  rowwise() %>% 
  mutate(quantiles = list(quantile_df(height))) %>% 
  unnest(quantiles) %>%
  ungroup()
  # 261 rows, all columns preserved

I want to be able to go from 1 row to multiple rows while keeping the previous information.

The use case is described below.

Use case

I was doing a simulation of the German Tank problem following a very functional/list-column heavy workflow. I have a dataframe called sensitivities which has columns that I need to generate and describe my samples. It looks like this:

# A tibble: 18 × 4
#    pop_size prop_of_pop all_tanks     sample_size
#       <dbl>       <dbl> <list>              <dbl>
#  1     1000         0.1 <int [1,000]>         100
#  2     1000         0.2 <int [1,000]>         200
#  3     1000         0.3 <int [1,000]>         300
#  4     1000         0.4 <int [1,000]>         400
#  5     1000         0.5 <int [1,000]>         500
#  6     1000         0.6 <int [1,000]>         600
#  7     1000         0.7 <int [1,000]>         700
#  8     1000         0.8 <int [1,000]>         800
#  9     1000         0.9 <int [1,000]>         900
# 10     2000         0.1 <int [2,000]>         200
# 11     2000         0.2 <int [2,000]>         400
# 12     2000         0.3 <int [2,000]>         600
# 13     2000         0.4 <int [2,000]>         800
# 14     2000         0.5 <int [2,000]>        1000
# 15     2000         0.6 <int [2,000]>        1200
# 16     2000         0.7 <int [2,000]>        1400
# 17     2000         0.8 <int [2,000]>        1600
# 18     2000         0.9 <int [2,000]>        1800

I have a function called simulate_samples. Each row of the above df defines one sensitivity. Given one sensitivity, simulate_samples generates a data frame of 100 samples with a sample_id <int> column and a sample <list[int]> list column: one row in, many rows out. So I used reframe() and got the same behaviour as in the minimal reprex.


DavisVaughan · 2023-08-04T15:16:38Z

Do you just want this then?

starwars %>%
  rowwise(everything()) %>% 
  reframe(quantile_df(height)) %>%
  ungroup()

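reframe() works more like summarise() than like mutate(): it keeps only the grouping columns, and a bare rowwise() has none. rowwise(everything()) makes every column a grouping column, so they are all preserved.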
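
To illustrate the difference, here is a minimal sketch on a made-up two-row tibble. The `df` data and the `expand_row()` helper are hypothetical and only there for illustration; the dplyr calls are the same ones used above.

```r
library(dplyr)

# Hypothetical toy data, not from the issue
df <- tibble(id = c("a", "b"), x = c(10, 20))

# Hypothetical helper: turns one value into a two-row tibble
expand_row <- function(x) {
  tibble(step = 1:2, value = x * c(1, 2))
}

# Bare rowwise(): no grouping columns, so only `step` and `value` come back
dropped <- df %>%
  rowwise() %>%
  reframe(expand_row(x))

# rowwise(everything()): every column is a grouping column and is kept
kept <- df %>%
  rowwise(everything()) %>%
  reframe(expand_row(x))

names(dropped)
names(kept)
```

Note that reframe() always returns an ungrouped data frame, so the trailing ungroup() in the snippets above should not be needed.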

mutahiwachira · 2023-08-04T20:06:03Z

That's perfect. I also read the docs and I see that the simulation case is mentioned.
Thanks so much, this is very useful. I look forward to seeing what you guys do with by-row operations in future, as they are very useful for simulations and for nesting calculations without loops.
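
For the simulation use case, the same pattern might look like the sketch below. This is not the actual code from my project: the `sensitivities` tibble is cut down to two rows without the `all_tanks` list column, and this `simulate_samples()` is a hypothetical stand-in that draws 3 samples per row instead of 100.

```r
library(dplyr)

# Hypothetical, reduced version of the sensitivities table
sensitivities <- tibble(
  pop_size    = c(1000, 2000),
  prop_of_pop = c(0.1, 0.1),
  sample_size = c(100, 200)
)

# Hypothetical stand-in: draw `n_samples` samples of tank serial numbers,
# without replacement, from 1..pop_size
simulate_samples <- function(pop_size, sample_size, n_samples = 3) {
  tibble(
    sample_id = seq_len(n_samples),
    sample = lapply(
      seq_len(n_samples),
      function(i) sample.int(pop_size, size = sample_size)
    )
  )
}

set.seed(1)

# One row per sensitivity in, n_samples rows per sensitivity out,
# with the sensitivity columns carried along as grouping columns
simulated <- sensitivities %>%
  rowwise(everything()) %>%
  reframe(simulate_samples(pop_size, sample_size))
```

Each sensitivity row expands to one row per sample, with `pop_size`, `prop_of_pop` and `sample_size` repeated alongside `sample_id` and the `sample` list column.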

This solves my issue so I will close it.

mutahiwachira mentioned this issue Aug 4, 2023

mutate(.by_row =), reframe(.by_row =), and possibly filter(.by_row =) #6660

Open

mutahiwachira closed this as completed Aug 4, 2023
