Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: split funs which use add_combo_facet don't work in row splits (do work in cols) #768

Open
3 tasks done
gmbecker opened this issue Oct 31, 2023 · 2 comments
Open
3 tasks done

Comments

@gmbecker
Copy link
Collaborator

gmbecker commented Oct 31, 2023

What happened?

Some of the split-time validation thats happening for row splits is brittle in a way that it shouldn't be, and is running afoul of a buglet in add_combo_facet

Specifically, add_combo_facet seems to be creating a split result object whose labels component is missing the name (name is "") for the added facet. This is a bug, but not a very important one, because as you can see in the columns case, the core tabulation machinery doesn't care about the names on the labels vector (apparently).

Something is tripping it up in .checkvarsok, though.

As a side note, it seems that all our tests and examples only use add_combo_facet in column splits...

Relevant log output

mysplitfun <- make_split_fun(post = list(add_overall_facet("all", "all of them")))
lyt <- basic_table() %>% 
    split_rows_by("STRATA1", split_fun = mysplitfun) %>% 
    analyze("AGE")

build_table(lyt, DM)
Error in .checkvarsok(spl, df) : 
   variable(s) [AGE] not present in data. (AnalyzeVarSplit)
In addition: Warning message:
In rep(vals, length.out = nrow(full_parent_df[[1]])) :
  first element used of 'length.out' argument

 lyt2 <- basic_table() %>%
   split_cols_by("STRATA1", split_fun = mysplitfun) %>% 
   analyze("AGE")

 build_table(lyt2, DM)
         A       B       C     all of them
——————————————————————————————————————————
Mean   33.74   34.10   34.79      34.22   

Code of Conduct

  • I agree to follow this project's Code of Conduct.

Contribution Guidelines

  • I agree to follow this project's Contribution Guidelines.

Security Policy

  • I agree to follow this project's Security Policy.
@gmbecker
Copy link
Collaborator Author

also, @cicdguy it said in the form that the log output portion would be formatted as code, but that didn't seem to have worked properly when the output starts with ">"
(as things copied from an R session are wont to do).

@Melkiades Melkiades added the sme label Nov 1, 2023
@gmbecker
Copy link
Collaborator Author

gmbecker commented Nov 1, 2023

So, a couple things:

I was wrong about (harmless, it turns out) buglet being related to the error. .fixupvals does indeed fail to correctly name the label portion of the split result, but as I had initially expected, that doesn't actually cause any problems in practice.

The real culprit is that unlike add_combo_levels, add_combo_facet is not currently smart enough to handle the AllLevelsSentinel value.

Because add_overall_facet just calls directly down to add_combo_facet with the levels value of select_all_levels (ie the sentinel), it fails to do anything, resulting in a partition that has no data, thus causing all the downstream problems.

There are 2 solutions to this: fix add_combo_facet so it is checking for the sentinel and doing the right thing in that case, or decoupling add_overall_facet from add_combo_facet and implementing the behavior directly there.

I prefer the former from a design perspective but it doesn't really matter much in practice.

The behavior itself is trivial in both cases (modulo an if statement) because post-processing behavior functions accept fulldf as one of their arguments:

## implementation of the less desirable option 2
real_add_overall_facet <- function(name, label) {
  function(ret, spl, .spl_context, fulldf) {
    add_to_split_result(ret, values = name, datasplit = setNames(list(fulldf), name),
                        labels = setNames(label, name))
  }
}

@ayogasekaram ayogasekaram self-assigned this Nov 3, 2023
@ayogasekaram ayogasekaram removed their assignment Feb 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants