Separate scripts for correcting typos and renaming domains #173
Comments
@andkov, for the renaming part of the script (currently at line 172), consider pulling that out into a metadata CSV with three columns: It may not be worth messing with now, unless there are multiple
Good point, thank you, @wibeasley. I would very much like a registry of names of model components. This would especially be useful for different tiers of coordination:
The next work-through of the existing scripts will help me identify where the renaming you've mentioned should be the most organic.
Cool. Then here's a regex script that will pull out those values and put them into a CSV. Copy & paste the meat of that into a string:
column_renames <- '
# general model information
"study_name" = "`study_name`"
, "model_number" = "`model_number`"
, "subgroup" = "`subgroup`"
, "model_type" = "`model_type`"
...
, "b_gamma_16_se" = "`b_GAMMA_16_se`"
, "b_gamma_16_wald" = "`b_GAMMA_16_wald`"
, "b_gamma_16_pval" = "`b_GAMMA_16_pval`"
'
Then run this and rename/move the resulting CSV:
library(magrittr) # provides %>%
pattern <- '(?s).+?"(\\w+)"\\s+=\\s*"`(\\w+)`".*?'
rearranged <- gsub(pattern, "\\2,\\1,\n", column_renames, perl=TRUE)
rearranged
ds <- rearranged %>%
  readr::read_csv(col_names = c("name_old", "name_new", "comments"))
readr::write_csv(ds, "./column-renames.csv")
This is a handy little script for converting code into proper metadata. I'm surprised we haven't needed to write something like this yet.
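To see what the regex does, here is a small self-contained sketch that runs it on a two-entry stand-in for the pasted block. The sample string is a shortened, hypothetical excerpt, and base R's `read.csv()` is swapped in for `readr::read_csv()` only to keep the sketch free of package dependencies.

```r
# Two-entry stand-in for the pasted renaming block (hypothetical sample).
column_renames <- '
# general model information
  "study_name"      = "`study_name`"
, "b_gamma_16_se"   = "`b_GAMMA_16_se`"
'

# (?s) lets "." span newlines; group 1 is the new name, group 2 the old one.
pattern    <- '(?s).+?"(\\w+)"\\s+=\\s*"`(\\w+)`".*?'
rearranged <- gsub(pattern, "\\2,\\1,\n", column_renames, perl = TRUE)

# Each match becomes one "old,new," line; the trailing comma leaves an
# empty third field that is read as the (still blank) comments column.
ds <- read.csv(
  text             = rearranged,
  header           = FALSE,
  col.names        = c("name_old", "name_new", "comments"),
  stringsAsFactors = FALSE
)
print(ds)
```

The comment line inside the string is swallowed by the leading lazy `.+?`, so only the quoted pairs survive into the data frame.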
This is the code that should work (I haven't tested it) when you read the metadata and apply the column name changes.
library(magrittr) # provides %>%
ds <- readr::read_csv("./column-renames.csv")
renaming_vector <- ds$name_old
names(renaming_vector) <- ds$name_new
ds_names_new <- ds_names_old %>%
  dplyr::rename(dplyr::all_of(renaming_vector)) # rename_() is deprecated in current dplyr
Edit: and don't be afraid to add extra columns to this, if it helps anything.
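Since the snippet above is described as untested, a base-R sketch of the same idea with an explicit check may be useful. The two data frames here are tiny hypothetical stand-ins (in practice `ds` would come from `./column-renames.csv`), and the early `stop()` is an added assumption about how a mismatch should be handled, not something the thread specifies.

```r
# Hypothetical stand-ins for the metadata and for the data to be renamed.
ds <- data.frame(
  name_old = c("study_name", "b_GAMMA_16_se"),
  name_new = c("study_name", "b_gamma_16_se"),
  stringsAsFactors = FALSE
)
ds_names_old <- data.frame(study_name = "eas", b_GAMMA_16_se = 0.12, extra = 1L)

# Fail early if the metadata mentions a column the data does not have.
missing_columns <- setdiff(ds$name_old, names(ds_names_old))
if (length(missing_columns) > 0L) {
  stop("Columns not found: ", paste(missing_columns, collapse = ", "))
}

# Look up each current name; columns absent from the metadata keep their name.
position     <- match(names(ds_names_old), ds$name_old)
ds_names_new <- ds_names_old
names(ds_names_new) <- ifelse(
  is.na(position),
  names(ds_names_old),
  ds$name_new[position]
)
print(names(ds_names_new))
```

Matching by name rather than by position means the CSV rows can be reordered or extended with extra columns without affecting the result.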
Great regex example for studying. I've finally gotten over the initial scare of using it and can learn more elaborate applications. Can't imagine efficient data manipulation without regexes anymore. Thanks for pushing me down that hill!
Currently these two tasks are accomplished by a single script, `./manipulation/rename-classify.R`.
Such practice is far from optimal for the following reasons:
For these and other reasons, it is advisable to develop a function that takes in a catalog and the external CSV with grouping instructions, so that this procedure can be applied immediately before table or graph production and NOT during the manipulation phase.
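One possible shape for such a function is sketched below. Everything specific in it is an assumption made for illustration: the function name `apply_grouping()`, the `domain_raw` and `domain_new` columns, the sample domain labels, and the reuse of the `name_old`/`name_new` layout for the grouping CSV.

```r
# Sketch only: the catalog columns and the CSV layout are hypothetical.
apply_grouping <- function(catalog, path_grouping, column = "domain_raw") {
  grouping <- read.csv(path_grouping, stringsAsFactors = FALSE)
  stopifnot(all(c("name_old", "name_new") %in% names(grouping)))
  stopifnot(column %in% names(catalog))

  # Values without an entry in the CSV are flagged instead of silently dropped.
  position                 <- match(catalog[[column]], grouping$name_old)
  catalog$domain_new       <- grouping$name_new[position]
  catalog$domain_unmatched <- is.na(position)
  catalog
}

# Usage with a throwaway CSV, applied right before producing a table.
path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(name_old = c("Fluid", "fluid reasoning"), name_new = "fluid"),
  path,
  row.names = FALSE
)
catalog <- data.frame(
  model_number = 1:3,
  domain_raw   = c("Fluid", "fluid reasoning", "memory"),
  stringsAsFactors = FALSE
)
catalog_grouped <- apply_grouping(catalog, path)
print(catalog_grouped)
```

Because the raw column is left untouched and the grouping lives in a separate file, the same catalog could be regrouped for a different table or graph by pointing the function at a different CSV.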