Merge pull request #52 from epi-sam/dev
Dev - covid ihme newly versioned dir utils
epi-sam authored Sep 30, 2024
2 parents a6f1d7f + f3d7a14 commit aa1bdbe
Showing 24 changed files with 396 additions and 16 deletions.
19 changes: 19 additions & 0 deletions ChangeLog.md
@@ -2,6 +2,19 @@ ChangeLog for SamsElves Package

--------------------------------------------------------------------------------

## 2024-09-30 v.0.3.4

- added:
- utils_io.R from ihme.covid package
- `make_new_output_dir` - create a version-incremented run-date folder based on a 'YYYY_MM_DD.VV' run-date folder structure
- supported by `get_latest_output_date_index` and `get_new_output_dir`
- `get_latest_output_dir` - get the latest output directory based on a 'YYYY_MM_DD.VV' run-date folder structure
- added tests, updated deprecated methods
- documented:
- some previously undocumented helper functions for various methods (not exported)



## 2024-09-23 v.0.3.3

- updated:
@@ -13,13 +26,16 @@ ChangeLog for SamsElves Package
- retains original file extension



## 2024-09-19 v.0.3.2

- updated:
- `read_file`
- now includes option for custom csv reading function since `data.table::fread` can have quotation-doubling issues
- also includes `...` arg to pass additional user-desired args to the reader function (works for any underlying reader function)



## 2024-09-18

- deprecated:
@@ -34,6 +50,7 @@ ChangeLog for SamsElves Package
- now includes a selection of sessionInfo for R version, package versions, etc. for pipeline provenance



## 2024-09-06

- added:
@@ -42,6 +59,8 @@ ChangeLog for SamsElves Package
- `submit_job` & `submit_job_array`
- added console-style log option (combine stderr and stdout)



## 2023-12-04

- deprecated:
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,7 +1,7 @@
Type: Package
Package: SamsElves
Title: Helper functions for the data science at IHME
-Version: 0.3.3
+Version: 0.3.4
Author: Sam Byrne ([email protected])
Description: Helper functions for the data science at IHME
License: none
4 changes: 4 additions & 0 deletions NAMESPACE
@@ -13,10 +13,14 @@ export(datetime_stamp)
export(extract_sessionInfo)
export(extract_submission_commands)
export(find_file_extension)
export(get_latest_output_date_index)
export(get_latest_output_dir)
export(get_new_output_dir)
export(increment_file_version)
export(is_empty)
export(is_sequential_int_vec)
export(make_directory)
export(make_new_output_dir)
export(make_versioned_dir)
export(msg_multiline)
export(msg_prt)
13 changes: 10 additions & 3 deletions R/children_of_parents.R
@@ -68,7 +68,13 @@ children_of_parents <- function(

}

-#' @description Helper function for children_of_parents.
+#' Helper function for children_of_parents.
#'
#' @param parent_loc_ids [int] ihme location ids
#' @param output [character] output options
#' @param hierarchy [data.table] ihme location hierarchy
#'
#' @return [none] stop on failure
validate_children_of_parents_inputs = function(parent_loc_ids, output, hierarchy){
# Check for valid parent_locs_ids type
if (!is.vector(parent_loc_ids) | !is.numeric(parent_loc_ids)) {
@@ -87,14 +93,15 @@ validate_children_of_parents_inputs = function(parent_loc_ids, output, hierarchy
}
}

-#' @description Helper function for children_of_parents.
+#' Helper function for children_of_parents.
#'
#' Given a single parent_id and a path_to_top_parent,
#' returns TRUE if that parent_id is in the path.
#'
#' @param parent_id [int] Location ID of parent to test
#' @param path_to_top_parent [character] String of path to top parent from hierarchy
#'
-#' @return boolean
+#' @return [lgl] TRUE if parent_id is in path_to_top_parent
is_child_of_parent = function(parent_id, path_to_top_parent){
path_to_top_parent = as.integer(unlist(strsplit(path_to_top_parent, ",")))
return(parent_id %in% path_to_top_parent)
8 changes: 7 additions & 1 deletion R/parents_of_children.R
@@ -53,7 +53,13 @@ parent_of_child <- function(
parent_level %i", child_location_id, head(hierarchy), parent_level))
}

-#' @description Helper function to validate inputs to function
+#' Helper function to validate inputs to function
#'
#' @param child_location_id [int] ihme location_id
#' @param hierarchy [data.table] ihme location hierarchy
#' @param parent_level [int] ihme location level
#'
#' @return [none] stop on failure
validate_parents_of_children_inputs <- function(child_location_id, hierarchy, parent_level){
# Check for valid parent_level
if(length(parent_level) != 1){
113 changes: 111 additions & 2 deletions R/utils_io.R
@@ -115,9 +115,9 @@ save_file <- function(object, f_path, forbid_overwrite = TRUE, verbose = FALSE){

#' Read a file of an arbitrary type
#'
-#' @param path_to_file [chr] full path with extenstion
+#' @param path_to_file [chr] full path with extension
#' @param verbose [lgl] noisy or quiet function?
-#' @param csv_opt [chr] namespaced function call for csv reads (default `"data.table::fread"`)
+#' @param csv_opt [chr] name spaced function call for csv reads (default `"data.table::fread"`)
#' @param ... [any] additional arguments to pass to the reader function
#'
#' @return [file] an object of appropriate file type
@@ -189,3 +189,112 @@ increment_file_version <- function(outpath){

return(outpath_new)
}


#' Get the latest version index for a given output dir and date
#'
#' Directories are assumed to be named in YYYY_MM_DD.VV format with sane
#' year/month/date/version values.
#'
#' @param dir [chr] path to directory with versioned dirs
#' @param date [chr] character date in YYYY_MM_DD format
#'
#' @return [dbl] largest version in the directory tree, or 0 if there are no
#' versions OR the directory tree does not exist
#' @export
#'
#' @examples
#' get_latest_output_date_index("tests/testthat/fixtures/versioned-dirs/nested/1999_09_09", date = "1999_09_09") # expect 2
get_latest_output_date_index <- function(dir, date) {
  currentfolders <- list.files(dir)

  # subset to date
  pat <- sprintf("^%s[.]\\d{2}$", date)
  date_dirs <- grep(pat, currentfolders, value = TRUE)

  if (length(date_dirs) == 0) {
    return(0)
  }

  # get the index after day
  date_list <- strsplit(date_dirs, "[.]")

  inds <- unlist(lapply(date_list, function(x) x[2]))
  if (is.na(max(inds, na.rm = TRUE))) inds <- 0

  return(max(as.numeric(inds)))
}


#' Find the latest output directory with format YYYY_MM_DD.VV
#'
#' @param root [chr] path to root of output results
#'
#' @return [chr] path to latest output directory
#' @export
#'
#' @examples
#' get_latest_output_dir("tests/testthat/fixtures/versioned-dirs/nested/1999_09_09") # expect "tests/testthat/fixtures/versioned-dirs/nested/1999_09_09/1999_09_09.02"
get_latest_output_dir <- function(root) {
  if (!dir.exists(root)) {
    stop(sprintf("root %s does not exist", root))
  }
  raw <- list.dirs(root, full.names = FALSE, recursive = FALSE)
  valid.idx <- grep("^\\d{4}_\\d{2}_\\d{2}[.]\\d{2}$", raw)
  if (length(valid.idx) == 0) {
    stop(sprintf("No YYYY_MM_DD.VV directories in %s", root))
  }
  return(file.path(root, max(raw[valid.idx])))
}



#' Increment a new output folder date-version
#'
#' Get a new directory path, but don't make it
#'
#' @param root [chr] path to root of output results
#' @param date [chr] character date in form of "YYYY_MM_DD" or "today". "today" will be interpreted as today's date.
#'
#' @return [chr] path to new output directory
#' @export
#'
#' @examples
#' get_new_output_dir(root = tempdir(), date = "today")
get_new_output_dir <- function(root, date){
  if (date == "today") {
    date <- format(Sys.Date(), "%Y_%m_%d")
  }
  cur.version <- get_latest_output_date_index(root, date = date)

  dir.name <- sprintf("%s.%02i", date, cur.version + 1)
  dir.path <- file.path(root, dir.name)
  return(dir.path)
}


#' Get output directory for results to save in
#'
#' Returns an appropriate path to save results in, creating it if necessary.
#'
#' @param root [chr] path to root of output results
#' @param date [chr] character date in form of "YYYY_MM_DD" or "today". "today" will be interpreted as today's date.
#'
#' @return [chr] path to new output directory
#' @export
#'
#' @examples
#' \dontrun{
#' make_new_output_dir("my/root/folder", date = "today")
#' }
make_new_output_dir <- function(root, date) {
  dir.path <- get_new_output_dir(root, date)
  if (!dir.exists(dir.path)) {
    # handle quirk with singularity image default umask
    old.umask <- Sys.umask()
    Sys.umask("002")
    dir.create(dir.path, showWarnings = FALSE, recursive = TRUE, mode = "0777")
    Sys.umask(old.umask)
  }
  return(dir.path)
}
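
The following is not part of the commit. It is a minimal standalone sketch of the 'YYYY_MM_DD.VV' versioning scheme that the helpers added to R/utils_io.R implement, written with base R only so it runs without the package installed. The function name `next_versioned_dir` and the demo folder name are hypothetical, chosen for illustration; the sketch mirrors the scan-then-increment logic of `get_latest_output_date_index` and `get_new_output_dir` but is not their actual source.

```r
# Hypothetical helper (not a SamsElves export): find the highest existing
# <date>.VV folder under `root` and return the path for the next version.
next_versioned_dir <- function(root, date) {
  existing <- list.files(root)
  # keep only entries named exactly <date>.<two digits>
  hits <- grep(sprintf("^%s[.]\\d{2}$", date), existing, value = TRUE)
  # strip everything up to the final dot to get the version suffix
  latest <- if (length(hits) == 0) 0L else max(as.integer(sub("^.*[.]", "", hits)))
  file.path(root, sprintf("%s.%02d", date, latest + 1L))
}

# Demo in a scratch directory so nothing outside tempdir() is touched
root <- file.path(tempdir(), "versioned-dirs-demo")
unlink(root, recursive = TRUE)
dir.create(root, recursive = TRUE)

first <- next_versioned_dir(root, "1999_09_09")   # no versions exist yet
dir.create(first)
second <- next_versioned_dir(root, "1999_09_09")  # one version now exists

print(basename(first))
print(basename(second))
```

In this sketch the path is computed without being created, matching the split in the commit between `get_new_output_dir` (compute only) and `make_new_output_dir` (compute and create).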
4 changes: 3 additions & 1 deletion R/wait_on_slurm_job_id.R
@@ -214,12 +214,14 @@ wait_on_slurm_job_id <-
print(paste0("Job(s) ", job_id_msg, " no longer PENDING, RUNNING, or FAILED. Time elapsed: ", job.runtime, " seconds"))
}

#' Helper function for wait_on_slurm_job_id - how do you want jobs to break and display user messages?
#'
#' @param cmd_fail [chr]
#' @param cmd_fail_feedback [chr]
#' @param job_id_regex_raw [regex]
#' @param filter_by [chr]
#'
-#' @description Helper function for wait_on_slurm_job_id - how do you want jobs to break and display user messages?
+#' @return [none] stop on failure
break_for_failed_jobs <-
function(
cmd_fail,
23 changes: 23 additions & 0 deletions man/break_for_failed_jobs.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

17 changes: 14 additions & 3 deletions man/get_latest_output_date_index.Rd


20 changes: 20 additions & 0 deletions man/get_latest_output_dir.Rd


22 changes: 22 additions & 0 deletions man/get_new_output_dir.Rd


20 changes: 20 additions & 0 deletions man/is_child_of_parent.Rd
