-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Getting 'number stored as text' error in mixed columns #93
Comments
Hi Cat, thanks for highlighting this. Are you able to provide a reproducible example of the issue? I've created a small example below that generates a workbook with a table (in sheet 'Table_1') that has two columns: one column ( (Sidenote: as it happens, I may be converting {a11ytables} to use {openxlsx2} instead of {openxlsx} in future (see #70). This issue may be another reason to make the change.) # Prepare supporting tables
support_tables <- list(
cover_df = tibble::tribble(
~subsection_title, ~subsection_content,
"Purpose", "Example results for something.",
"Workbook properties", "Some placeholder information.",
"Contact", "Placeholder email"
),
contents_df = tibble::tribble(
~"Sheet name", ~"Sheet title",
"Notes", "Notes",
"Table_1", "Example sheet title"
),
notes_df = tibble::tribble(
~"Note number", ~"Note text",
"[note 1]", "Placeholder note.",
"[note 2]", "Placeholder note."
)
)
# Prepare example dataset
table_df <- mtcars
table_df[["car [note 1]"]] <- row.names(mtcars) # row names to named column
row.names(table_df) <- NULL # remove row names
table_df <- table_df[1:5, c("car [note 1]", "mpg", "cyl")] # add note to header
table_df[2, "mpg"] <- "[c]" # make a value confidential in the mpg column
table_df["Notes"] <- c("[note 2]", rep(NA_character_, 4)) # include note column
# Create new a11ytable
my_a11ytable <-
a11ytables::create_a11ytable(
tab_titles = c(
"Cover",
"Contents",
"Notes",
"Table_1"
),
sheet_types = c(
"cover",
"contents",
"notes",
"tables"
),
sheet_titles = c(
"Cover title (example)",
"Contents",
"Notes",
"Example sheet title"
),
blank_cells = c(
NA_character_,
NA_character_,
NA_character_,
"Blank cells mean that a row does not have a note."
),
sources = c(
NA_character_,
NA_character_,
NA_character_,
"Example source."
),
tables = list(
support_tables$cover_df,
support_tables$contents_df,
support_tables$notes_df,
table_df
)
)
# Generate workbook from a11ytable
my_wb <- a11ytables::generate_workbook(my_a11ytable)
# Preview output
openxlsx::openXL(my_wb) |
Thanks for the quick reply Matt! I think the problem is that all my 'numeric' columns are actually text by the time I create my a11ytable - I have a function which rounds and suppresses all of the figures, so they are all stored as strings after using that function. I read that openxlx bases the excel column type on the column type in the input data, so I'm assuming that this is what is causing the error. Was hoping there was some kind of workaround but I guess not haha. I have loads of tables in my publication so getting rid of the error manually isn't awesome. openxlsx2 sounds very promising though! For info, this is essentially how that function works: library(dplyr)
table_df <- mtcars %>%
mutate(across(where(is.numeric), ~case_when(
# in this example, figures less than 5 are suppressed
.x < 5 ~ "[c]",
# figures less than 6 are 'unreliable' so are shown but marked with 'u'
.x < 6 ~ paste(janitor::round_half_up(.x, 1), "[u]"),
TRUE ~ as.character(janitor::round_half_up(.x, 0))))) |
Okey dokey, I think I understand: columns get converted to character class even if they remain unsuppressed. I think there's a few ways to convert numbers-only columns back to numeric en masse. Below is some demo code that uses suppressPackageStartupMessages(library(tidyverse))
library(janitor, warn.conflicts = FALSE)
set.seed(1337)
tbl <- tibble(
id = paste0("id_", 1:10),
x = runif(10, 0, 10), # will contain values to be suppressed
y = runif(10, 6, 10) # won't need any suppression
)
tbl
#> # A tibble: 10 × 3
#> id x y
#> <chr> <dbl> <dbl>
#> 1 id_1 5.76 9.92
#> 2 id_2 5.65 9.97
#> 3 id_3 0.740 9.31
#> 4 id_4 4.54 6.78
#> 5 id_5 3.73 9.93
#> 6 id_6 3.31 6.10
#> 7 id_7 9.48 9.89
#> 8 id_8 2.81 9.70
#> 9 id_9 2.45 7.36
#> 10 id_10 1.46 6.99
tbl_supp <- tbl |>
mutate(
across(
where(is.numeric),
\(value) case_when(
value < 5 ~ "[c]", # suppress
value < 6 ~ paste(round_half_up(value, 1), "[u]"), # 'unreliable' flag
TRUE ~ as.character(round_half_up(value, 0)) # otherwise round
)
)
)
tbl_supp # note 'y' is character but contains only numbers
#> # A tibble: 10 × 3
#> id x y
#> <chr> <chr> <chr>
#> 1 id_1 5.8 [u] 10
#> 2 id_2 5.6 [u] 10
#> 3 id_3 [c] 9
#> 4 id_4 [c] 7
#> 5 id_5 [c] 10
#> 6 id_6 [c] 6
#> 7 id_7 9 10
#> 8 id_8 [c] 10
#> 9 id_9 [c] 7
#> 10 id_10 [c] 7
type.convert(tbl_supp, as.is = TRUE) # note 'y' is now numeric
#> # A tibble: 10 × 3
#> id x y
#> <chr> <chr> <int>
#> 1 id_1 5.8 [u] 10
#> 2 id_2 5.6 [u] 10
#> 3 id_3 [c] 9
#> 4 id_4 [c] 7
#> 5 id_5 [c] 10
#> 6 id_6 [c] 6
#> 7 id_7 9 10
#> 8 id_8 [c] 10
#> 9 id_9 [c] 7
#> 10 id_10 [c] 7 |
Thank Matt, I didn't know about that function 😃 will give that a go |
Hi there, Have been using this package for a while now. I was wondering whether there was any plans to have it so columns with both strings and numbers could be stored as numbers where appropriate. Alternatively is there is a neat way to do it with openxslx that would be great to see :) Thanks, Jake |
@jake-pitchers I have a temporary fix for this, stole it from the rapid.spreadsheets package:
|
I'm exporting rounded and suppressed numbers and am getting the 'number stored as text' error in all of my numeric columns, even for columns where there are no rounded or suppressed numbers. I tried using numFmt = "#,##0" in the openxlsx createStyle function and it hasn't helped.
It looks like it might be an openxlsx issue and seems like it's solved in openxlsx2.
Can anyone help?
The text was updated successfully, but these errors were encountered: