Add Initial data spec #39

Merged
merged 29 commits into from
Jul 12, 2024
Changes from 16 commits
8a47d52
update renv
aclark02-arcus May 21, 2024
57cf626
sync with dev
aclark02-arcus May 30, 2024
47664dc
add a new vignette to outline the data specification
aclark02-arcus May 30, 2024
91645d8
update data_spec
aclark02-arcus Jun 12, 2024
35e90ff
Merge branch 'dev' into ac-initial
aclark02-arcus Jun 13, 2024
8e32b82
commit stuff
aclark02-arcus Jun 26, 2024
85d45b1
added 'under construction' note to vignette
aclark02-arcus Jun 27, 2024
bfac0e2
update Rd files
aclark02-arcus Jun 27, 2024
072b8fd
Merge branch 'dev' into ac-21-data-spec
aclark02-arcus Jun 27, 2024
797bb43
updating run_app.rd
aclark02-arcus Jun 27, 2024
12752a3
Merge branch 'ac-21-data-spec' of https://github.com/openpharma/clins…
aclark02-arcus Jun 27, 2024
1de747d
remove unneeded changes
aclark02-arcus Jun 27, 2024
402fa51
update run_app.Rd
aclark02-arcus Jun 27, 2024
1aa41b2
Increment version number to 0.0.0.9003
aclark02-arcus Jun 27, 2024
3b9f32c
update run_app() documentation now that arguments have been shifted t…
aclark02-arcus Jun 27, 2024
9a15f5f
fix typo issues
aclark02-arcus Jun 27, 2024
9fab16e
update run_app.Rd to include 'meta_data' info
aclark02-arcus Jun 28, 2024
96864ab
update golem-config pkg version
aclark02-arcus Jun 28, 2024
a8c3691
move data config information over to vignette instead of run_app docs
aclark02-arcus Jul 1, 2024
dad8b58
update data spec to include section for raw data
aclark02-arcus Jul 1, 2024
802fb2c
get all the right peices into the vignette before testing them out
aclark02-arcus Jul 9, 2024
617749f
making more progress
aclark02-arcus Jul 9, 2024
cc3c0dc
Merge branch 'dev' into ac-21-data-spec
aclark02-arcus Jul 10, 2024
44bfe61
Merge branch 'ac-21-data-spec' of https://github.com/openpharma/clins…
aclark02-arcus Jul 12, 2024
d0fb860
Fix vignette
LDSamson Jul 12, 2024
9e74540
Update clinsightful_data description
LDSamson Jul 12, 2024
8b45810
Provide updated yaml
LDSamson Jul 12, 2024
1b9a240
Add a few clarifications
LDSamson Jul 12, 2024
5e77543
Update version
LDSamson Jul 12, 2024
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: clinsight
Title: ClinSight
Version: 0.0.0.9002
Version: 0.0.0.9003
Authors@R: c(
person("Leonard Daniël", "Samson", , "[email protected]", role = c("cre", "aut")),
person("GCP-Service International Ltd.& Co. KG", role = "fnd")
4 changes: 2 additions & 2 deletions NEWS.md
@@ -7,12 +7,12 @@
- Created two renv profiles, one for development and one for production. Goal is
to minimize the package dependencies of the production version.
- Removed development package dependencies (for example devtools) that were not needed to run the application.
- Improved data anonimization.
- Improved data anonymization.
- Changed license.

- Updated Description file.
- Improved reading of data files within clinsight::run_app()
- Improved creating test result report.
- Added data specification to `run_app()` documentation

## Bug fixes

12 changes: 10 additions & 2 deletions R/data.R
@@ -42,9 +42,17 @@
#'
"col_palette"


#' Clinical Trial test data
#'
#' A data frame containing randomly created clinical trial data. Used for
#' testing purposes.
#' A data.frame containing randomly created clinical trial data. Acceptable for
#' the `data` argument in `run_app()` and used for testing purposes.
#'
#' @format a data.frame with 6,483 rows and 24 variables.
#'
#' @source Created with `data-raw/create_random_data.R`
"clinsightful_data"




99 changes: 98 additions & 1 deletion R/run_app.R
@@ -1,5 +1,5 @@
#' Run the Shiny Application
#'
#'
#' @param data_folder Character string. The folder in which all data resides is
#' usually set in the config.yml file. However, this can be overwritten if a
#' path is set in this argument. Useful for testing purposes.
@@ -9,6 +9,103 @@
#' @param ... arguments to pass to golem_opts. See `?golem::get_golem_options`
#' for more details.
#' @inheritParams shiny::shinyApp
#'
#' @details
#' There are several elements defined in `golem-config.yml` that require
#' configuration before launching the application for the first time. To name a
#' few:
#'
#' - `user_db` Character string. Path to the app database.
#' If it does not exist, one will be created based on app data and metadata,
#' with all data labeled as 'new'/not yet reviewed.
#' - `credentials_db` Character string. Path to the credentials database.
#'
#' The other two are `meta_data` and `study_data`: file paths to the RDS files
#' that hold the app's primary source of data and are needed for successful app
#' deployment. The data specifications for these objects follow:
#'
#' Column specs for the `study_data` RDS object:
#' - `site_code`: character or integer, identifier for the study site. If an
#' integer, adding the prefix "Site" is recommended, as this will display more
#' intuitively in the application's UI
#' - `subject_id`: character, unique identifier for a subject
#' - `event_repeat`: integer, helps keep track of unique `event_id` for a single
#' `subject_id` and `event_date`
#' - `event_id`: character, names that help classify types of `event_name`s
#' into like-groups, generally characterized by site visits. For example,
#' "SCR" for the screening visit, "VIS" for Visit X (where X is some integer),
#' and "EXIT" for when the patient exits the study trial. However, some
#' `event_id`s track events that could apply outside of any visit, like AE,
#' ConMed, Medical History, etc.
#' - `event_name`: character, an "event" generally characterizes some sort of
#' site visit, whether that be a "Screening", "Visit X" (where X is some
#' integer), "Exit", or "Any Visit".
#' - `event_date`: Date, the date associated with `event_name`
#' - `form_id`: character, a unique identifier for the form the `item_name` metric
#' and `item_value` were pulled from. Note: when `item_type` is continuous,
#' `form_id` can contain several different `item_group`s. However, when
#' `item_type` is 'other', `item_group` can be made up of several `form_id`
#' values.
#' - `form_repeat`: integer, helps keep track of unique `item_name`s collected
#' from a specific `form_id` for a given `subject_id`. `form_repeat` is
#' particularly helpful when consolidating data like Adverse Events into this
#' data format. Specifically, if more than one AE is collected on a patient,
#' they'll have more than one `form_repeat`.
#' - `edit_date_time`: datetime (POSIXct), the last time this record was edited
#' - `db_update_time`: datetime (POSIXct), the last time the database storing this
#' record was updated.
#' - `region`: character, describing the region code that `site_code` falls under
#' - `day`: difftime, meaning it contains both a number and a unit of
#' time. It measures the number of days between screening and each visit
#' - `vis_day`: numeric, a numeric representation of `day`
#' - `vis_num`: numeric, a numeric representation of `event_name`
#' - `event_label`: character, an abbreviation of `event_name`
#' - `item_name`: character, describes a metric or parameter of interest.
#' - `item_type`: character, classifies `item_name`s into either 'continuous'
#' or 'other', where continuous types are those generally associated with the
#' CDISC "basic data structure" (BDS). That is, each `item_name` metric is
#' collected over time at a patient visit (`event_name`). The 'other' type
#' represents all non-time dependent measures, like demographic info, adverse
#' events, Medications, medical history, etc.
#' - `item_group`: character, provides a high-level category that groups
#' like-`item_name`s together. For example, an `item_group` = 'Vital Signs'
#' will group together pertinent `item_name` metrics like BMI, Pulse, Blood
#' pressure, etc.
#' - `item_value`: character, the measurement collected for a given `item_name`.
#' The value collected may be a number like 150 (when collecting a patient's
#' weight) or a word (such as 'white' for the subject's race).
#' - `item_unit`: character, tracking the unit of measurement for `item_name`
#' and `item_value`.
#' - `lower_lim`: numeric, some `item_name`s (particularly the 'continuous' type)
#' have a pre-defined range of values that are considered normal. This is the
#' lower limit to that range.
#' - `upper_lim`: numeric, some `item_name`s (particularly the 'continuous' type)
#' have a pre-defined range of values that are considered normal. This is the
#' upper limit to that range.
#' - `significance`: character, either 'CS' which means 'Clinically Significant'
#' or 'NCS' which means 'Not Clinically Significant'
#' - `reason_notdone`: character, an effort to describe why the `item_value`
#' field is `NA` / missing.
#'
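The `study_data` column specification above can be turned into a quick structural check before a file is handed to the app. The following is a minimal sketch in base R, not part of clinsight: `study_data_spec`, `type_checks`, and `check_study_data()` are hypothetical names, and every value in `example_row` (including the `form_id` "VS") is invented for illustration.

```r
# Column names and expected types, transcribed from the spec above.
study_data_spec <- c(
  site_code = "character_or_integer", subject_id = "character",
  event_repeat = "integer", event_id = "character",
  event_name = "character", event_date = "Date",
  form_id = "character", form_repeat = "integer",
  edit_date_time = "POSIXct", db_update_time = "POSIXct",
  region = "character", day = "difftime",
  vis_day = "numeric", vis_num = "numeric",
  event_label = "character", item_name = "character",
  item_type = "character", item_group = "character",
  item_value = "character", item_unit = "character",
  lower_lim = "numeric", upper_lim = "numeric",
  significance = "character", reason_notdone = "character"
)

# One predicate per type label used in the spec.
type_checks <- list(
  character = is.character,
  integer = is.integer,
  numeric = is.numeric,
  Date = function(x) inherits(x, "Date"),
  POSIXct = function(x) inherits(x, "POSIXct"),
  difftime = function(x) inherits(x, "difftime"),
  character_or_integer = function(x) is.character(x) || is.integer(x)
)

# Hypothetical helper: returns a character vector describing problems.
# An empty vector means all columns are present with the expected types.
check_study_data <- function(df, spec = study_data_spec) {
  missing_cols <- setdiff(names(spec), names(df))
  present <- intersect(names(spec), names(df))
  type_ok <- vapply(
    present,
    function(col) type_checks[[spec[[col]]]](df[[col]]),
    logical(1)
  )
  wrong_type <- present[!type_ok]
  c(
    character(0),
    if (length(missing_cols) > 0) paste("missing column:", missing_cols),
    if (length(wrong_type) > 0) {
      paste0("unexpected type for ", wrong_type, ": expected ", spec[wrong_type])
    }
  )
}

# A single invented record that follows the spec.
example_row <- data.frame(
  site_code = "Site 01", subject_id = "SUBJ_001",
  event_repeat = 1L, event_id = "SCR", event_name = "Screening",
  event_date = as.Date("2024-01-15"),
  form_id = "VS", form_repeat = 1L,
  edit_date_time = as.POSIXct("2024-01-15 10:00:00", tz = "UTC"),
  db_update_time = as.POSIXct("2024-01-16 08:00:00", tz = "UTC"),
  region = "NLD", day = as.difftime(0, units = "days"),
  vis_day = 0, vis_num = 0, event_label = "SCR",
  item_name = "Weight", item_type = "continuous",
  item_group = "Vital signs", item_value = "70", item_unit = "kg",
  lower_lim = 40, upper_lim = 150,
  significance = NA_character_, reason_notdone = NA_character_,
  stringsAsFactors = FALSE
)

problems <- check_study_data(example_row)
```

Here `problems` is empty because every column matches. Dropping a column or storing `event_date` as text would add one message per issue. The check covers structure only; it does not validate values such as the allowed `item_type` levels.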
#'
#' Specifications for list items that may be included in the `meta_data` RDS:
#'
#' `column_specs` a data.frame
#'
#' `events` a data.frame
#'
#' `common_forms` a data.frame
#'
#' `study_forms` a data.frame
#'
#' `general` a data.frame
#'
#' `groups` a data.frame
#'
#' `table_names` a data.frame
#'
#' `items_expanded` a data.frame
#'
#'
#' @export
#'
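The `meta_data` documentation above lists eight named elements, each a data.frame, that may be included in the list. A minimal sketch of a shape check follows, assuming the object has already been read with `readRDS()`. `check_meta_data()` and the contents of `toy_meta` are hypothetical and not part of clinsight.

```r
# Element names taken from the meta_data documentation above.
expected_meta_elements <- c(
  "column_specs", "events", "common_forms", "study_forms",
  "general", "groups", "table_names", "items_expanded"
)

# Hypothetical helper: reports which documented elements are absent and
# which present elements are not data frames. Absent elements are only
# reported, since the documentation says they "may be included".
check_meta_data <- function(meta) {
  stopifnot(is.list(meta))
  present <- intersect(expected_meta_elements, names(meta))
  is_df <- vapply(meta[present], is.data.frame, logical(1))
  list(
    absent = setdiff(expected_meta_elements, names(meta)),
    not_data_frame = present[!is_df]
  )
}

# Toy input: one valid element and one element of the wrong type.
toy_meta <- list(
  events = data.frame(event_id = "SCR", stringsAsFactors = FALSE),
  general = "not a data frame"
)
res <- check_meta_data(toy_meta)
```

With this toy input, `res$not_data_frame` contains `"general"`, and `res$absent` lists the six elements that were not supplied. For a real file the call would look like `check_meta_data(readRDS(path))`, where `path` is the location configured under `meta_data`.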
72 changes: 72 additions & 0 deletions dev/02_dev.R
@@ -11,6 +11,77 @@
#### CURRENT FILE: DEV SCRIPT #####
###################################

# Exploring for insights on data spec
library(dplyr) # provides arrange(), filter(), group_by() and summarize() used below
unique(clinsightful_data[,c("site_code","region")]) |> arrange(site_code)
unique(clinsightful_data[,c("event_id","event_name")])
unique(clinsightful_data[,c("form_id","form_repeat")])
unique(clinsightful_data[,c("event_repeat","event_id","event_name")])
unique(clinsightful_data[,c("event_repeat","event_id","event_name","form_id","form_repeat")])
unique(clinsightful_data[,c("event_repeat","event_id","event_name", "day")])
unique(clinsightful_data[,c("event_repeat","event_id","event_name", "event_label")])

# form exploration
unique(clinsightful_data[,c("form_id")])
unique(clinsightful_data[,c("form_id","item_group", "item_type")])

# other
unique(clinsightful_data[,c("item_group","item_type")])
other <- clinsightful_data |>
filter(item_type == "other")
unique(other[,c("item_group","item_type","item_name")]) |>
print(n=40)

# bds
unique(clinsightful_data[,c("item_group","item_type")])
bds <- clinsightful_data |>
filter(item_type != "other")
unique(bds[,c("item_group","item_type","item_name")]) |>
print(n=40)


unique(clinsightful_data[,c("item_group","item_type")])
other <- clinsightful_data |>
filter(item_group == "Vital signs")
unique(other[,c("item_group","item_type","item_name")]) |>
print(n=40)

# dirty?
library(clinsight)
data("clinsightful_data")
dirty <- clinsightful_data |>
filter(item_value %in% c("µg/h","µg/ml"))

str(clinsightful_data$item_value)

# day
class(clinsightful_data$day)
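The data spec describes `day` as a difftime counting days from screening and `vis_day` as its numeric representation. A minimal sketch of how such columns could be derived is shown here, using invented visit dates. How the package actually computes them is not shown in this diff, so the derivation itself is an assumption.

```r
# Hypothetical visit dates for one subject; the first is the screening visit.
visits <- data.frame(
  event_name = c("Screening", "Visit 1", "Visit 2"),
  event_date = as.Date(c("2024-01-15", "2024-01-29", "2024-02-26")),
  stringsAsFactors = FALSE
)

# `day`: time elapsed between screening and each visit, stored with its unit.
screening_date <- visits$event_date[visits$event_name == "Screening"]
visits$day <- difftime(visits$event_date, screening_date, units = "days")

# `vis_day`: the same quantity as a plain number, without the unit.
visits$vis_day <- as.numeric(visits$day, units = "days")

day_class <- class(visits$day)
day_units <- units(visits$day)
```

`day_class` is `"difftime"` and `day_units` is `"days"`, while `visits$vis_day` holds the plain numbers 0, 14 and 42.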

# exploring events
data("clinsightful_data")
d_1pat <- clinsightful_data |>
filter(subject_id == "BEL_04_772") |>
filter(event_id == "COMMON_CM")

# exploring event_repeat & event_date
library(dplyr)
clinsightful_data |>
filter(subject_id == "BEL_04_772") |>
group_by(subject_id, event_id, event_repeat, event_date) |>
summarize(n = n()) |>
# filter(event_repeat != form_repeat) |>
print(n = 36)

d_1pat <- clinsightful_data |>
filter(subject_id == "BEL_04_772") |>
filter(event_id == "COMMON_AE")

clinsightful_data |>
filter(subject_id == "BEL_04_772") |>
group_by(subject_id, event_id, event_repeat, event_date, form_repeat) |>
summarize(n = n()) |>
filter(event_repeat != form_repeat) |>
print(n = 36)
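The grouped counts above probe how `form_repeat` separates repeated records, such as multiple adverse events for one subject. The same idea can be sketched on a toy data frame in base R, without dplyr. All values are invented, and the `form_id` "AE" and the item names are assumptions.

```r
# Hypothetical long-format records: one subject reports two adverse events,
# each contributing two items, distinguished by form_repeat.
ae <- data.frame(
  subject_id = "SUBJ_001",
  event_id = "COMMON_AE",
  form_id = "AE",
  form_repeat = c(1L, 1L, 2L, 2L),
  item_name = c("AE Name", "AE start date", "AE Name", "AE start date"),
  item_value = c("Headache", "2024-02-01", "Nausea", "2024-03-10"),
  stringsAsFactors = FALSE
)

# Count items per subject, form and form_repeat: a base R counterpart of
# group_by(subject_id, form_id, form_repeat) followed by summarize(n = n()).
counts <- aggregate(
  ae["item_name"],
  by = ae[c("subject_id", "form_id", "form_repeat")],
  FUN = length
)
names(counts)[names(counts) == "item_name"] <- "n"

# Each distinct form_repeat corresponds to one adverse event for this subject.
n_adverse_events <- length(unique(ae$form_repeat))
```

`counts` has one row per `form_repeat`, each with `n` equal to 2, and `n_adverse_events` is 2.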

# Engineering

## Dependencies ----
@@ -47,6 +118,7 @@ usethis::use_test("app")

## Vignette ----
usethis::use_vignette("testgolem")
usethis::use_vignette(name = "data_spec", title = "Input Data Specification")
#devtools::build_vignettes()

## Code Coverage----
6 changes: 3 additions & 3 deletions man/clinsightful_data.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

98 changes: 98 additions & 0 deletions man/run_app.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
