diff --git a/404.html b/404.html index 71aa4ccc..4c13e88b 100644 --- a/404.html +++ b/404.html @@ -24,7 +24,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/articles/delim_files.html b/articles/delim_files.html index 12ebe9d2..914c13b9 100644 --- a/articles/delim_files.html +++ b/articles/delim_files.html @@ -24,7 +24,7 @@ plmmr - 4.2.0 + 4.1.0.1 @@ -99,7 +99,7 @@ Process the data#> Please make sure you have addressed missingness before you proceed. #> #> process_plink() completed -#> Processed files now saved as /tmp/RtmpQdavVL/processed_colon2.rds +#> Processed files now saved as /tmp/RtmpZ5yqi0/processed_colon2.rds # look at what is created colon <- readRDS(colon_dat) @@ -149,7 +149,7 @@ Create a design#> There are 0 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... -#> Standardization completed at 2024-12-13 20:58:41 +#> Standardization completed at 2024-12-17 19:29:26 #> Done with standardization. File formatting in progress As with process_delim(), the create_design() function returns a filepath: . The output @@ -178,7 +178,7 @@ Create a design#> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr "FileBacked" #> .. .. ..$ filename : chr "std_colon2.bk" -#> .. .. ..$ dirname : chr "/tmp/RtmpQdavVL/" +#> .. .. ..$ dirname : chr "/tmp/RtmpZ5yqi0/" #> .. .. ..$ totalRows : int 62 #> .. .. ..$ totalCols : int 2001 #> .. .. ..$ rowOffset : num [1:2] 0 62 @@ -204,18 +204,18 @@ Fit a modelcolon_fit <- plmm(design = colon_design, return_fit = TRUE, trace = TRUE) #> Note: The design matrix is being returned as a file-backed big.matrix object -- see bigmemory::big.matrix() documentation for details. #> Reminder: the X that is returned here is column-standardized -#> Input data passed all checks at 2024-12-13 20:58:42 +#> Input data passed all checks at 2024-12-17 19:29:26 #> Starting decomposition. 
#> Calculating the eigendecomposition of K -#> Eigendecomposition finished at 2024-12-13 20:58:42 +#> Eigendecomposition finished at 2024-12-17 19:29:26 #> Beginning rotation ('preconditioning'). -#> Rotation (preconditiong) finished at 2024-12-13 20:58:42 +#> Rotation (preconditiong) finished at 2024-12-17 19:29:26 #> Setting up lambda/preparing for model fitting. #> Beginning model fitting. -#> Model fitting finished at 2024-12-13 20:58:42 +#> Model fitting finished at 2024-12-17 19:29:27 #> Beta values are estimated -- almost done! #> Formatting results (backtransforming coefs. to original scale). -#> Model ready at 2024-12-13 20:58:42 +#> Model ready at 2024-12-17 19:29:27 Notice the messages that are printed out – this documentation may be optionally saved to another .log file using the logfile argument. diff --git a/articles/getting-started.html b/articles/getting-started.html index 2abceebf..20d451a7 100644 --- a/articles/getting-started.html +++ b/articles/getting-started.html @@ -24,7 +24,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/articles/index.html b/articles/index.html index 09354326..fc6079a1 100644 --- a/articles/index.html +++ b/articles/index.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/articles/matrix_data.html b/articles/matrix_data.html index c6e5a785..995d11f9 100644 --- a/articles/matrix_data.html +++ b/articles/matrix_data.html @@ -24,7 +24,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/articles/notation.html b/articles/notation.html index e7ab66ce..a2a58239 100644 --- a/articles/notation.html +++ b/articles/notation.html @@ -24,7 +24,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/articles/plink_files.html b/articles/plink_files.html index 2225de80..39797679 100644 --- a/articles/plink_files.html +++ b/articles/plink_files.html @@ -24,7 +24,7 @@ plmmr - 4.2.0 + 4.1.0.1 @@ -106,7 +106,7 @@ Processing PLINK files temp_dir <- tempdir() # using a temp dir -- change to fit your preference unzip_example_data(outdir = temp_dir) -#> Unzipped files 
are saved in /tmp/Rtmph6hzBv +#> Unzipped files are saved in /tmp/RtmpzM7QaI For GWAS data, we have to tell plmmr how to combine information across all three PLINK files (the .bed, .bim, and .fam files). We do this with @@ -139,7 +139,7 @@ Processing PLINK files#> Imputing the missing (genotype) values using mode method #> #> process_plink() completed -#> Processed files now saved as /tmp/Rtmph6hzBv/imputed_penncath_lite.rds +#> Processed files now saved as /tmp/RtmpzM7QaI/imputed_penncath_lite.rds You’ll see a lot of messages printed to the console here … the result of all this is the creation of 3 files: imputed_penncath_lite.rds and @@ -162,7 +162,7 @@ Processing PLINK files#> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr "FileBacked" #> .. .. ..$ filename : chr "imputed_penncath_lite.bk" -#> .. .. ..$ dirname : chr "/tmp/Rtmph6hzBv/" +#> .. .. ..$ dirname : chr "/tmp/RtmpzM7QaI/" #> .. .. ..$ totalRows : int 1401 #> .. .. ..$ totalCols : int 4367 #> .. .. ..$ rowOffset : num [1:2] 0 1401 @@ -231,7 +231,7 @@ Creating a design#> There are 62 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... -#> Standardization completed at 2024-12-13 20:58:57 +#> Standardization completed at 2024-12-17 19:29:42 #> Done with standardization. File formatting in progress # examine the design - notice the components of this object @@ -253,7 +253,7 @@ Creating a design#> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr "FileBacked" #> .. .. ..$ filename : chr "std_penncath_lite.bk" -#> .. .. ..$ dirname : chr "/tmp/Rtmph6hzBv/" +#> .. .. ..$ dirname : chr "/tmp/RtmpzM7QaI/" #> .. .. ..$ totalRows : int 1401 #> .. .. ..$ totalCols : int 4305 #> .. .. 
..$ rowOffset : num [1:2] 0 1401 @@ -300,18 +300,18 @@ Fitting a model return_fit = T) #> Note: The design matrix is being returned as a file-backed big.matrix object -- see bigmemory::big.matrix() documentation for details. #> Reminder: the X that is returned here is column-standardized -#> Input data passed all checks at 2024-12-13 20:58:58 +#> Input data passed all checks at 2024-12-17 19:29:43 #> Starting decomposition. #> Calculating the eigendecomposition of K -#> Eigendecomposition finished at 2024-12-13 20:59:00 +#> Eigendecomposition finished at 2024-12-17 19:29:45 #> Beginning rotation ('preconditioning'). -#> Rotation (preconditiong) finished at 2024-12-13 20:59:00 +#> Rotation (preconditiong) finished at 2024-12-17 19:29:45 #> Setting up lambda/preparing for model fitting. #> Beginning model fitting. -#> Model fitting finished at 2024-12-13 20:59:03 +#> Model fitting finished at 2024-12-17 19:29:48 #> Beta values are estimated -- almost done! #> Formatting results (backtransforming coefs. to original scale). -#> Model ready at 2024-12-13 20:59:03 +#> Model ready at 2024-12-17 19:29:48 # you can turn off the trace messages by letting trace = F (default) We examine our model results below: @@ -342,10 +342,10 @@ Cross validation#> Starting decomposition. #> Calculating the eigendecomposition of K #> Beginning rotation ('preconditioning'). -#> Rotation (preconditiong) finished at 2024-12-13 20:59:05 +#> Rotation (preconditiong) finished at 2024-12-17 19:29:50 #> Setting up lambda/preparing for model fitting. #> Beginning model fitting. -#> Model fitting finished at 2024-12-13 20:59:08 +#> Model fitting finished at 2024-12-17 19:29:53 #> 'Fold' argument is either NULL or missing; assigning folds randomly (by default). #> #> To specify folds for each observation, supply a vector with fold assignments. @@ -356,42 +356,42 @@ Cross validation#> Calculating the eigendecomposition of K #> Fitting model in fold 1 : #> Beginning rotation ('preconditioning'). 
-#> Rotation (preconditiong) finished at 2024-12-13 20:59:09 +#> Rotation (preconditiong) finished at 2024-12-17 19:29:54 #> Beginning model fitting. -#> Model fitting finished at 2024-12-13 20:59:11 +#> Model fitting finished at 2024-12-17 19:29:56 #> | |============== | 20% #> Beginning eigendecomposition in fold 2 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 2 : #> Beginning rotation ('preconditioning'). -#> Rotation (preconditiong) finished at 2024-12-13 20:59:13 +#> Rotation (preconditiong) finished at 2024-12-17 19:29:58 #> Beginning model fitting. -#> Model fitting finished at 2024-12-13 20:59:15 +#> Model fitting finished at 2024-12-17 19:30:00 #> | |============================ | 40%Beginning eigendecomposition in fold 3 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 3 : #> Beginning rotation ('preconditioning'). -#> Rotation (preconditiong) finished at 2024-12-13 20:59:16 +#> Rotation (preconditiong) finished at 2024-12-17 19:30:01 #> Beginning model fitting. -#> Model fitting finished at 2024-12-13 20:59:18 +#> Model fitting finished at 2024-12-17 19:30:04 #> | |========================================== | 60%Beginning eigendecomposition in fold 4 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 4 : #> Beginning rotation ('preconditioning'). -#> Rotation (preconditiong) finished at 2024-12-13 20:59:20 +#> Rotation (preconditiong) finished at 2024-12-17 19:30:05 #> Beginning model fitting. -#> Model fitting finished at 2024-12-13 20:59:22 +#> Model fitting finished at 2024-12-17 19:30:07 #> | |======================================================== | 80%Beginning eigendecomposition in fold 5 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 5 : #> Beginning rotation ('preconditioning'). 
-#> Rotation (preconditiong) finished at 2024-12-13 20:59:23 +#> Rotation (preconditiong) finished at 2024-12-17 19:30:08 #> Beginning model fitting. -#> Model fitting finished at 2024-12-13 20:59:25 +#> Model fitting finished at 2024-12-17 19:30:11 #> | |======================================================================| 100% There are plot and summary methods for CV models as well: diff --git a/authors.html b/authors.html index b923f7dc..8e349b2d 100644 --- a/authors.html +++ b/authors.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/index.html b/index.html index 7f38a1f0..3804dc34 100644 --- a/index.html +++ b/index.html @@ -26,7 +26,7 @@ plmmr - 4.2.0 + 4.1.0.1 @@ -142,7 +142,7 @@ Developers Dev status - + diff --git a/news/index.html b/news/index.html index c242b485..c1d216c9 100644 --- a/news/index.html +++ b/news/index.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/pkgdown.yml b/pkgdown.yml index 6ba67070..10c68f32 100644 --- a/pkgdown.yml +++ b/pkgdown.yml @@ -7,7 +7,7 @@ articles: articles/matrix_data: matrix_data.html articles/notation: notation.html articles/plink_files: plink_files.html -last_built: 2024-12-13T20:58Z +last_built: 2024-12-17T19:28Z urls: reference: https://pbreheny.github.io/plmmr/reference article: https://pbreheny.github.io/plmmr/articles diff --git a/reference/MCP.html b/reference/MCP.html index f97ecbe0..5d7253e8 100644 --- a/reference/MCP.html +++ b/reference/MCP.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/SCAD.html b/reference/SCAD.html index 63cfe2aa..97150aab 100644 --- a/reference/SCAD.html +++ b/reference/SCAD.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/add_predictors.html b/reference/add_predictors.html index a67f7750..936c6a8f 100644 --- a/reference/add_predictors.html +++ b/reference/add_predictors.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/adjust_beta_dimension.html b/reference/adjust_beta_dimension.html index 1a4afb70..ad20a312 
100644 --- a/reference/adjust_beta_dimension.html +++ b/reference/adjust_beta_dimension.html @@ -15,7 +15,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/admix.html b/reference/admix.html index 5ed67f61..5d921268 100644 --- a/reference/admix.html +++ b/reference/admix.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/align_ids.html b/reference/align_ids.html index f72b1b27..6db9b28a 100644 --- a/reference/align_ids.html +++ b/reference/align_ids.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/big_cbind.html b/reference/big_cbind.html index 8fa08a4e..ecb00d6e 100644 --- a/reference/big_cbind.html +++ b/reference/big_cbind.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/check_for_file_extension.html b/reference/check_for_file_extension.html index 49461d0b..419902f9 100644 --- a/reference/check_for_file_extension.html +++ b/reference/check_for_file_extension.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/coef.cv_plmm.html b/reference/coef.cv_plmm.html index 22efb4d2..7963aaac 100644 --- a/reference/coef.cv_plmm.html +++ b/reference/coef.cv_plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/coef.plmm.html b/reference/coef.plmm.html index 115ca66f..c779752a 100644 --- a/reference/coef.plmm.html +++ b/reference/coef.plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/construct_variance.html b/reference/construct_variance.html index d881b812..6af5a0d4 100644 --- a/reference/construct_variance.html +++ b/reference/construct_variance.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/count_constant_features.html b/reference/count_constant_features.html index a8b314b6..3235eac6 100644 --- a/reference/count_constant_features.html +++ b/reference/count_constant_features.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/count_cores.html b/reference/count_cores.html index 55c1fb11..80238e99 100644 --- 
a/reference/count_cores.html +++ b/reference/count_cores.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/create_design.html b/reference/create_design.html index 301ca7ba..3677a267 100644 --- a/reference/create_design.html +++ b/reference/create_design.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 @@ -132,7 +132,7 @@ Examples#> Please make sure you have addressed missingness before you proceed. #> #> process_plink() completed -#> Processed files now saved as /tmp/Rtmpm57oVf/processed_colon2.rds +#> Processed files now saved as /tmp/RtmpuyGgXe/processed_colon2.rds # prepare outcome data colon_outcome <- read.delim(find_example_data(path = "colon2_outcome.txt")) @@ -145,7 +145,7 @@ Examples#> There are 0 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... -#> Standardization completed at 2024-12-13 20:58:19 +#> Standardization completed at 2024-12-17 19:29:03 #> Done with standardization. File formatting in progress # look at the results @@ -168,7 +168,7 @@ Examples#> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr "FileBacked" #> .. .. ..$ filename : chr "std_colon2.bk" -#> .. .. ..$ dirname : chr "/tmp/Rtmpm57oVf/" +#> .. .. ..$ dirname : chr "/tmp/RtmpuyGgXe/" #> .. .. ..$ totalRows : int 62 #> .. .. ..$ totalCols : int 2001 #> .. .. 
..$ rowOffset : num [1:2] 0 62 @@ -191,7 +191,7 @@ Examples# process PLINK data temp_dir <- tempdir() unzip_example_data(outdir = temp_dir) -#> Unzipped files are saved in /tmp/Rtmpm57oVf +#> Unzipped files are saved in /tmp/RtmpuyGgXe plink_data <- process_plink(data_dir = temp_dir, data_prefix = "penncath_lite", @@ -215,7 +215,7 @@ Examples#> Imputing the missing (genotype) values using mode method #> #> process_plink() completed -#> Processed files now saved as /tmp/Rtmpm57oVf/imputed_penncath_lite.rds +#> Processed files now saved as /tmp/RtmpuyGgXe/imputed_penncath_lite.rds # get outcome data penncath_pheno <- read.csv(find_example_data(path = 'penncath_clinical.csv')) @@ -250,7 +250,7 @@ Examples#> There are 62 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... -#> Standardization completed at 2024-12-13 20:58:21 +#> Standardization completed at 2024-12-17 19:29:06 #> Done with standardization. 
File formatting in progress # examine the design - notice the components of this object diff --git a/reference/create_design_filebacked.html b/reference/create_design_filebacked.html index 29f33c0f..a37f7c22 100644 --- a/reference/create_design_filebacked.html +++ b/reference/create_design_filebacked.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/create_design_in_memory.html b/reference/create_design_in_memory.html index 85c43c21..c1c141fe 100644 --- a/reference/create_design_in_memory.html +++ b/reference/create_design_in_memory.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/create_log.html b/reference/create_log.html index d6caf862..1e7f4576 100644 --- a/reference/create_log.html +++ b/reference/create_log.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/cv-plmm.log b/reference/cv-plmm.log index 9a0c657c..af69c49a 100644 --- a/reference/cv-plmm.log +++ b/reference/cv-plmm.log @@ -1,21 +1,21 @@ ### plmmr log file ### Logging to ./cv-plmm.log -Host: fv-az797-383 +Host: fv-az1074-827 Current working directory: /home/runner/work/plmmr/plmmr/docs/reference -Start log at: 2024-12-13 20:58:36 +Start log at: 2024-12-17 19:29:20 Call: cv_plmm(design = admix_design) -Input data passed all checks at 2024-12-13 20:58:36 +Input data passed all checks at 2024-12-17 19:29:20 -Eigendecomposition finished at 2024-12-13 20:58:36 +Eigendecomposition finished at 2024-12-17 19:29:20 -Full model fit finished at 2024-12-13 20:58:36 +Full model fit finished at 2024-12-17 19:29:20 -Formatting for full model finished at 2024-12-13 20:58:36 +Formatting for full model finished at 2024-12-17 19:29:20 -Cross validation started at: 2024-12-13 20:58:36 -Started fold 1 at 2024-12-13 20:58:36 -Started fold 2 at 2024-12-13 20:58:36 -Started fold 3 at 2024-12-13 20:58:37 -Started fold 4 at 2024-12-13 20:58:37 -Started fold 5 at 2024-12-13 20:58:37 +Cross validation started at: 2024-12-17 19:29:21 +Started fold 1 at 2024-12-17 19:29:21 
+Started fold 2 at 2024-12-17 19:29:21 +Started fold 3 at 2024-12-17 19:29:21 +Started fold 4 at 2024-12-17 19:29:21 +Started fold 5 at 2024-12-17 19:29:21 diff --git a/reference/cv_plmm.html b/reference/cv_plmm.html index 9ff63694..82f73eab 100644 --- a/reference/cv_plmm.html +++ b/reference/cv_plmm.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/cvf.html b/reference/cvf.html index 339aff41..d36f525a 100644 --- a/reference/cvf.html +++ b/reference/cvf.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/eigen_K.html b/reference/eigen_K.html index bd1b73f2..1d8cea63 100644 --- a/reference/eigen_K.html +++ b/reference/eigen_K.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/estimate_eta.html b/reference/estimate_eta.html index 7b7b3140..e01f5022 100644 --- a/reference/estimate_eta.html +++ b/reference/estimate_eta.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/fbm2bm.html b/reference/fbm2bm.html index 126285e4..8ab3cd5f 100644 --- a/reference/fbm2bm.html +++ b/reference/fbm2bm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/file_sans_ext.html b/reference/file_sans_ext.html index a708d22a..5b94581e 100644 --- a/reference/file_sans_ext.html +++ b/reference/file_sans_ext.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/find_example_data.html b/reference/find_example_data.html index c9771bcb..970fcf14 100644 --- a/reference/find_example_data.html +++ b/reference/find_example_data.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/get_data.html b/reference/get_data.html index bc7173a1..9b0f99ec 100644 --- a/reference/get_data.html +++ b/reference/get_data.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/get_hostname.html b/reference/get_hostname.html index f5889bca..061b0837 100644 --- a/reference/get_hostname.html +++ b/reference/get_hostname.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git 
a/reference/impute_snp_data.html b/reference/impute_snp_data.html index 25baac5a..816e4a82 100644 --- a/reference/impute_snp_data.html +++ b/reference/impute_snp_data.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/index.html b/reference/index.html index 60181768..8361ef20 100644 --- a/reference/index.html +++ b/reference/index.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/index_samples.html b/reference/index_samples.html index 7dd267e2..d1a78b0a 100644 --- a/reference/index_samples.html +++ b/reference/index_samples.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/index_std_X.html b/reference/index_std_X.html index 6b1a0dd5..5e13056f 100644 --- a/reference/index_std_X.html +++ b/reference/index_std_X.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/lam_names.html b/reference/lam_names.html index 8cb3a10c..6a3a25f5 100644 --- a/reference/lam_names.html +++ b/reference/lam_names.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/lasso.html b/reference/lasso.html index 866b21cd..ec2f4f1a 100644 --- a/reference/lasso.html +++ b/reference/lasso.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/log_lik.html b/reference/log_lik.html index 5c683426..5b1f59b5 100644 --- a/reference/log_lik.html +++ b/reference/log_lik.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/name_and_count_bigsnp.html b/reference/name_and_count_bigsnp.html index 05e0e53c..481e37ad 100644 --- a/reference/name_and_count_bigsnp.html +++ b/reference/name_and_count_bigsnp.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plmm.html b/reference/plmm.html index 9ead38bb..8308c76e 100644 --- a/reference/plmm.html +++ b/reference/plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plmm.log b/reference/plmm.log index 450ddcbf..2844046d 100644 --- a/reference/plmm.log +++ b/reference/plmm.log @@ -1,11 +1,11 @@ ### plmmr log file ### Logging 
to ./plmm.log -Host: fv-az797-383 +Host: fv-az1074-827 Current working directory: /home/runner/work/plmmr/plmmr/docs/reference -Start log at: 2024-12-13 20:58:38 +Start log at: 2024-12-17 19:29:22 Call: plmm(design = admix_design) -Input data passed all checks at 2024-12-13 20:58:38 +Input data passed all checks at 2024-12-17 19:29:22 -Eigendecomposition finished at 2024-12-13 20:58:38 +Eigendecomposition finished at 2024-12-17 19:29:22 -Model ready at 2024-12-13 20:58:38 +Model ready at 2024-12-17 19:29:22 diff --git a/reference/plmm_checks.html b/reference/plmm_checks.html index 722bfd7b..3ddf7c82 100644 --- a/reference/plmm_checks.html +++ b/reference/plmm_checks.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plmm_fit.html b/reference/plmm_fit.html index 875dc69f..7c9c0b6d 100644 --- a/reference/plmm_fit.html +++ b/reference/plmm_fit.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plmm_format.html b/reference/plmm_format.html index 00812a22..1498c488 100644 --- a/reference/plmm_format.html +++ b/reference/plmm_format.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plmm_loss.html b/reference/plmm_loss.html index efeaf083..f7c5b1ad 100644 --- a/reference/plmm_loss.html +++ b/reference/plmm_loss.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plmm_prep.html b/reference/plmm_prep.html index 94979bab..67edf2f1 100644 --- a/reference/plmm_prep.html +++ b/reference/plmm_prep.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plmmr-package.html b/reference/plmmr-package.html index 142e9a2a..89d9769b 100644 --- a/reference/plmmr-package.html +++ b/reference/plmmr-package.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plot.cv_plmm.html b/reference/plot.cv_plmm.html index 39b3fa0e..300cda4e 100644 --- a/reference/plot.cv_plmm.html +++ b/reference/plot.cv_plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plot.plmm.html 
b/reference/plot.plmm.html index 084ac612..cebd5419 100644 --- a/reference/plot.plmm.html +++ b/reference/plot.plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/predict.plmm.html b/reference/predict.plmm.html index c9d1a0b6..4fc374b8 100644 --- a/reference/predict.plmm.html +++ b/reference/predict.plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/predict_within_cv.html b/reference/predict_within_cv.html index cd297447..3356b8dc 100644 --- a/reference/predict_within_cv.html +++ b/reference/predict_within_cv.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/pretty_time.html b/reference/pretty_time.html index 575cad2a..9948ad16 100644 --- a/reference/pretty_time.html +++ b/reference/pretty_time.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/print.summary.cv_plmm.html b/reference/print.summary.cv_plmm.html index ae0229b1..a229d685 100644 --- a/reference/print.summary.cv_plmm.html +++ b/reference/print.summary.cv_plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/print.summary.plmm.html b/reference/print.summary.plmm.html index 3f777aeb..6249b7ea 100644 --- a/reference/print.summary.plmm.html +++ b/reference/print.summary.plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/process_delim.html b/reference/process_delim.html index 591d232e..a9bdb06c 100644 --- a/reference/process_delim.html +++ b/reference/process_delim.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 @@ -125,7 +125,7 @@ Examples#> Please make sure you have addressed missingness before you proceed. #> #> process_plink() completed -#> Processed files now saved as /tmp/Rtmpm57oVf/processed_colon2.rds +#> Processed files now saved as /tmp/RtmpuyGgXe/processed_colon2.rds colon2 <- readRDS(colon_dat) str(colon2) @@ -134,7 +134,7 @@ Examples#> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr "FileBacked" #> .. .. ..$ filename : chr "processed_colon2.bk" -#> .. .. 
..$ dirname : chr "/tmp/Rtmpm57oVf/" +#> .. .. ..$ dirname : chr "/tmp/RtmpuyGgXe/" #> .. .. ..$ totalRows : int 62 #> .. .. ..$ totalCols : int 2001 #> .. .. ..$ rowOffset : num [1:2] 0 62 diff --git a/reference/process_plink.html b/reference/process_plink.html index 12c8f844..194ad4bb 100644 --- a/reference/process_plink.html +++ b/reference/process_plink.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/read_data_files.html b/reference/read_data_files.html index 6cb0b73f..ee5459a5 100644 --- a/reference/read_data_files.html +++ b/reference/read_data_files.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/read_plink_files.html b/reference/read_plink_files.html index 6d9c0e2a..b7c7c783 100644 --- a/reference/read_plink_files.html +++ b/reference/read_plink_files.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/relatedness_mat.html b/reference/relatedness_mat.html index 9993520b..70ff9d07 100644 --- a/reference/relatedness_mat.html +++ b/reference/relatedness_mat.html @@ -13,7 +13,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/rotate_filebacked.html b/reference/rotate_filebacked.html index e4b449cd..e32718ed 100644 --- a/reference/rotate_filebacked.html +++ b/reference/rotate_filebacked.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/setup_lambda.html b/reference/setup_lambda.html index faf74199..aba7b89a 100644 --- a/reference/setup_lambda.html +++ b/reference/setup_lambda.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/standardize_filebacked.html b/reference/standardize_filebacked.html index c8c9fd87..c41de5b6 100644 --- a/reference/standardize_filebacked.html +++ b/reference/standardize_filebacked.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/standardize_in_memory.html b/reference/standardize_in_memory.html index 608fcb53..1caec0d8 100644 --- a/reference/standardize_in_memory.html +++ b/reference/standardize_in_memory.html @@ -7,7 +7,7 @@ 
plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/subset_filebacked.html b/reference/subset_filebacked.html index de66a96f..6b9a0bd3 100644 --- a/reference/subset_filebacked.html +++ b/reference/subset_filebacked.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/summary.cv_plmm.html b/reference/summary.cv_plmm.html index e1126ca9..00859686 100644 --- a/reference/summary.cv_plmm.html +++ b/reference/summary.cv_plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/summary.plmm.html b/reference/summary.plmm.html index 8a9f02ed..3e8425b2 100644 --- a/reference/summary.plmm.html +++ b/reference/summary.plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/untransform.html b/reference/untransform.html index c0d04515..9686b990 100644 --- a/reference/untransform.html +++ b/reference/untransform.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/untransform_delim.html b/reference/untransform_delim.html index 8b0d0f78..5364e87b 100644 --- a/reference/untransform_delim.html +++ b/reference/untransform_delim.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/untransform_in_memory.html b/reference/untransform_in_memory.html index 69739408..667e85e0 100644 --- a/reference/untransform_in_memory.html +++ b/reference/untransform_in_memory.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/untransform_plink.html b/reference/untransform_plink.html index a4aedd2c..227249a7 100644 --- a/reference/untransform_plink.html +++ b/reference/untransform_plink.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/unzip_example_data.html b/reference/unzip_example_data.html index fc874259..97d1ddd2 100644 --- a/reference/unzip_example_data.html +++ b/reference/unzip_example_data.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/search.json b/search.json index df7d0a58..1ee43d7c 100644 --- a/search.json +++ b/search.json @@ -1 +1 @@ 
-[{"path":"https://pbreheny.github.io/plmmr/articles/delim_files.html","id":"process-the-data","dir":"Articles","previous_headings":"","what":"Process the data","title":"If your data is in a delimited file","text":"output messages indicate data processed. call created 2 files, one .rds file corresponding .bk file. .bk file special type binary file can used store large data sets. .rds file contains pointer .bk file, along meta-data. Note returned process_delim() character string filepath: .","code":"# I will create the processed data files in a temporary directory; # fill in the `rds_dir` argument with the directory of your choice temp_dir <- tempdir() colon_dat <- process_delim(data_file = \"colon2.txt\", data_dir = find_example_data(parent = TRUE), rds_dir = temp_dir, rds_prefix = \"processed_colon2\", sep = \"\\t\", overwrite = TRUE, header = TRUE) #> There are 62 observations and 2001 features in the specified data files. #> At this time, plmmr::process_delim() does not not handle missing values in delimited data. #> Please make sure you have addressed missingness before you proceed. #> #> process_plink() completed #> Processed files now saved as /tmp/RtmpQdavVL/processed_colon2.rds # look at what is created colon <- readRDS(colon_dat)"},{"path":"https://pbreheny.github.io/plmmr/articles/delim_files.html","id":"create-a-design","dir":"Articles","previous_headings":"","what":"Create a design","title":"If your data is in a delimited file","text":"Creating design ensures data uniform format prior analysis. delimited files, two main processes happening create_design(): (1) standardization columns (2) construction penalty factor vector. Standardization columns ensures features evaluated model uniform scale; done transforming column design matrix mean 0 variance 1. penalty factor vector indicator vector 0 represents feature always model – feature unpenalized. specify columns want unpenalized, use ‘unpen’ argument. example, choosing make ‘sex’ unpenalized covariate. 
side note unpenalized covariates: delimited file data, features want include model – penalized unpenalized features – must included delimited file. differs PLINK file data analyzed; look create_design() documentation details examples. process_delim(), create_design() function returns filepath: . output messages document steps create design procedure, messages saved text file colon_design.log rds_dir folder. didactic purposes, can look design:","code":"# prepare outcome data colon_outcome <- read.delim(find_example_data(path = \"colon2_outcome.txt\")) # create a design colon_design <- create_design(data_file = colon_dat, rds_dir = temp_dir, new_file = \"std_colon2\", add_outcome = colon_outcome, outcome_id = \"ID\", outcome_col = \"y\", unpen = \"sex\", # this will keep 'sex' in the final model logfile = \"colon_design\") #> No feature_id supplied; will assume data X are in same row-order as add_outcome. #> There are 0 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... #> Standardization completed at 2024-12-13 20:58:41 #> Done with standardization. File formatting in progress # look at the results colon_rds <- readRDS(colon_design) str(colon_rds) #> List of 18 #> $ X_colnames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> $ X_rownames : chr [1:62] \"row1\" \"row2\" \"row3\" \"row4\" ... #> $ n : num 62 #> $ p : num 2001 #> $ is_plink : logi FALSE #> $ outcome_idx : int [1:62] 1 2 3 4 5 6 7 8 9 10 ... #> $ y : int [1:62] 1 0 1 0 1 0 1 0 1 0 ... #> $ std_X_rownames: chr [1:62] \"row1\" \"row2\" \"row3\" \"row4\" ... #> $ unpen : int 1 #> $ unpen_colnames: chr \"sex\" #> $ ns : int [1:2001] 1 2 3 4 5 6 7 8 9 10 ... #> $ std_X_colnames: chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> $ std_X :Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. 
..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"std_colon2.bk\" #> .. .. ..$ dirname : chr \"/tmp/RtmpQdavVL/\" #> .. .. ..$ totalRows : int 62 #> .. .. ..$ totalCols : int 2001 #> .. .. ..$ rowOffset : num [1:2] 0 62 #> .. .. ..$ colOffset : num [1:2] 0 2001 #> .. .. ..$ nrow : num 62 #> .. .. ..$ ncol : num 2001 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> .. .. ..$ type : chr \"double\" #> .. .. ..$ separated : logi FALSE #> $ std_X_n : num 62 #> $ std_X_p : num 2001 #> $ std_X_center : num [1:2001] 1.47 7015.79 4966.96 4094.73 3987.79 ... #> $ std_X_scale : num [1:2001] 0.499 3067.926 2171.166 1803.359 2002.738 ... #> $ penalty_factor: num [1:2001] 0 1 1 1 1 1 1 1 1 1 ... #> - attr(*, \"class\")= chr \"plmm_design\""},{"path":"https://pbreheny.github.io/plmmr/articles/delim_files.html","id":"fit-a-model","dir":"Articles","previous_headings":"","what":"Fit a model","title":"If your data is in a delimited file","text":"fit model using design follows: Notice messages printed – documentation may optionally saved another .log file using logfile argument. can examine results specific \\lambda value: may also plot paths estimated coefficients:","code":"colon_fit <- plmm(design = colon_design, return_fit = TRUE, trace = TRUE) #> Note: The design matrix is being returned as a file-backed big.matrix object -- see bigmemory::big.matrix() documentation for details. #> Reminder: the X that is returned here is column-standardized #> Input data passed all checks at 2024-12-13 20:58:42 #> Starting decomposition. #> Calculating the eigendecomposition of K #> Eigendecomposition finished at 2024-12-13 20:58:42 #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-13 20:58:42 #> Setting up lambda/preparing for model fitting. #> Beginning model fitting. #> Model fitting finished at 2024-12-13 20:58:42 #> Beta values are estimated -- almost done! 
#> Formatting results (backtransforming coefs. to original scale). #> Model ready at 2024-12-13 20:58:42 summary(colon_fit, idx = 50) #> lasso-penalized regression model with n=62, p=2002 at lambda=0.0597 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 30 #> ------------------------------------------------- plot(colon_fit)"},{"path":"https://pbreheny.github.io/plmmr/articles/delim_files.html","id":"prediction-for-filebacked-data","dir":"Articles","previous_headings":"","what":"Prediction for filebacked data","title":"If your data is in a delimited file","text":"This example shows an experimental option, wherein we are working to add a prediction method for filebacked data outside of cross-validation.","code":"# linear predictor yhat_lp <- predict(object = colon_fit, newX = attach.big.matrix(colon$X), type = \"lp\") # best linear unbiased predictor yhat_blup <- predict(object = colon_fit, newX = attach.big.matrix(colon$X), type = \"blup\") # look at mean squared prediction error mspe_lp <- apply(yhat_lp, 2, function(c){crossprod(colon_outcome$y - c)/length(c)}) mspe_blup <- apply(yhat_blup, 2, function(c){crossprod(colon_outcome$y - c)/length(c)}) min(mspe_lp) #> [1] 0.007659158 min(mspe_blup) #> [1] 0.00617254"},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Getting started with plmmr","text":"plmmr is a package for fitting Penalized Linear Mixed Models in R. The package was created for the purpose of fitting penalized regression models to high dimensional data in which the observations are correlated. For instance, this kind of data arises often in the context of genetics (e.g., GWAS with population structure and/or family grouping). The novelties of plmmr are: Integration: plmmr combines the functionality of several packages in order to do quality control, model fitting/analysis, and data visualization in one package. 
For example, with GWAS data, plmmr will take you from PLINK files all the way to a list of SNPs for downstream analysis. Accessibility: plmmr can be run in an R session on a typical desktop or laptop computer. The user does not need access to a supercomputer or experience with the command line in order to fit models with plmmr. Handling correlation: plmmr uses a transformation that (1) measures correlation among samples and (2) uses that correlation measurement to improve predictions (via the best linear unbiased predictor, BLUP). This means that with plmm(), there’s no need to filter data down to a ‘maximum subset of unrelated samples.’","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"minimal-example","dir":"Articles","previous_headings":"","what":"Minimal example","title":"Getting started with plmmr","text":"Below is a minimal reproducible example of how plmmr can be used:","code":"# library(plmmr) fit <- plmm(admix$X, admix$y) # admix data ships with package plot(fit) cvfit <- cv_plmm(admix$X, admix$y) plot(cvfit) summary(cvfit) #> lasso-penalized model with n=197 and p=101 #> At minimum cross-validation error (lambda=0.2325): #> ------------------------------------------------- #> Nonzero coefficients: 8 #> Cross-validation error (deviance): 2.12 #> Scale estimate (sigma): 1.455"},{"path":[]},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"file-backing","dir":"Articles","previous_headings":"Computational capability","what":"File-backing","title":"Getting started with plmmr","text":"In many applications of high dimensional data analysis, the dataset is too large to read into R – the session will crash for lack of memory. This is particularly common when analyzing data from genome-wide association studies (GWAS). To analyze such large datasets, plmmr is equipped to analyze data using filebacking - a strategy that lets R ‘point’ to a file on disk, rather than reading the file into the R session. Many packages use this technique - bigstatsr and biglasso are two examples of packages that use the filebacking technique. The package plmmr uses to create and store filebacked objects is bigmemory. The filebacked computation relies on the biglasso package by Yaohui Zeng et al. and bigalgebra by Michael Kane et al. 
For processing PLINK files, we use methods from the bigsnpr package by Florian Privé.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"numeric-outcomes-only","dir":"Articles","previous_headings":"Computational capability","what":"Numeric outcomes only","title":"Getting started with plmmr","text":"At this time, the package is designed for linear regression – that is, we are considering only continuous (numeric) outcomes. We maintain that treating binary outcomes as numeric values is appropriate in some contexts, as described in Hastie et al.’s Elements of Statistical Learning, chapter 4. In the future, we would like to extend this package to handle dichotomous outcomes via logistic regression; the theoretical work underlying this is an open problem.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"types-of-penalization","dir":"Articles","previous_headings":"Computational capability","what":"3 types of penalization","title":"Getting started with plmmr","text":"Since this is a package focused on penalized regression, plmmr offers 3 choices of penalty: the minimax concave (MCP), the smoothly clipped absolute deviation (SCAD), and the least absolute shrinkage and selection operator (LASSO). The implementation of these penalties is built on the concepts/techniques provided in the ncvreg package.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"data-size-and-dimensionality","dir":"Articles","previous_headings":"Computational capability","what":"Data size and dimensionality","title":"Getting started with plmmr","text":"We distinguish between the data attributes ‘big’ and ‘high dimensional.’ ‘Big’ describes the amount of space the data takes up on a computer, while ‘high dimensional’ describes a context in which the ratio of features (also called ‘variables’ or ‘predictors’) to observations (e.g., samples) is high. For instance, data with 100 samples and 100 variables is high dimensional, but not big. By contrast, data with 10 million observations and 100 variables is big, but not high dimensional. plmmr is optimized for data that is high dimensional – the methods we use to estimate relatedness among observations perform best when there is a high number of features relative to the number of observations. 
plmmr is also designed to accommodate data that is too large to analyze in-memory. We accommodate such data through file-backing (described above). The current analysis pipeline works well for data files up to about 40 Gb in size. In practice, this means that plmmr is equipped to analyze GWAS data, but not biobank-sized data.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"data-input-types","dir":"Articles","previous_headings":"","what":"Data input types","title":"Getting started with plmmr","text":"plmmr currently works with three types of data input: Data stored in-memory as a matrix or data frame Data stored in PLINK files Data stored in delimited files","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"example-data-sets","dir":"Articles","previous_headings":"Data input types","what":"Example data sets","title":"Getting started with plmmr","text":"plmmr currently includes three example data sets, one for each type of data input. The admix data is an example of matrix input data. admix is a small data set (197 observations, 100 SNPs) that describes individuals of different ancestry groups. The outcome of admix is simulated to include population structure effects (i.e. race/ethnicity have an impact on SNP associations). This data set is available whenever library(plmmr) is called. An example analysis with the admix data is available in vignette('matrix_data', package = \"plmmr\"). The penncath_lite data is an example of PLINK input data. penncath_lite (data on coronary artery disease from the PennCath study) is a high dimensional data set (1401 observations, 4217 SNPs) with several health outcomes as well as age and sex information. The features in this data set represent a small subset of a much larger GWAS data set (the original data has 800K SNPs). For more information on this data set, refer to the original publication. An example analysis with the penncath_lite data is available in vignette('plink_files', package = \"plmmr\"). The colon2 data is an example of delimited-file input data. colon2 is a variation of the colon data included in the biglasso package. colon2 has 62 observations and 2,001 features representing a study of colon disease. 2000 of the features are from the original data, and the ‘sex’ feature is simulated. 
An example analysis with the colon2 data is available in vignette('delim_files', package = \"plmmr\").","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/matrix_data.html","id":"basic-model-fitting","dir":"Articles","previous_headings":"","what":"Basic model fitting","title":"If your data is in a matrix or data frame","text":"The admix dataset is now ready to analyze with a call to plmmr::plmm() (one of the main functions in plmmr): Notice: we are passing admix$X to the design argument of plmm(); internally, plmm() has taken this X input and created a plmm_design object. You could also supply X and y to create_design() to make this step explicit. The returned beta_vals item is a matrix whose rows are the \hat\beta coefficients and whose columns represent values of the penalization parameter \lambda. By default, plmm fits 100 values of \lambda (see the setup_lambda function for details). Note that for all values of \lambda, SNP 8 has \hat \beta = 0. This is because SNP 8 is a constant feature, a feature (i.e., column of \mathbf{X}) whose values do not vary among the members of this population. We can summarize the fit at the nth \lambda value: We can also plot the path of the fit to see how the model coefficients vary with \lambda: Plot of the path of the model fit Suppose we also know the ancestry groups with which each person in the admix data self-identified. We would probably want to include this in the model as an unpenalized covariate (i.e., we want ‘ancestry’ to always be in the model). To specify an unpenalized covariate, we need to use the create_design() function prior to calling plmm(). 
Here is what that looks like: We may compare the results from the model that includes ‘ancestry’ to our first model:","code":"admix_fit <- plmm(admix$X, admix$y) summary(admix_fit, lambda = admix_fit$lambda[50]) #> lasso-penalized regression model with n=197, p=101 at lambda=0.01426 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 88 #> ------------------------------------------------- admix_fit$beta_vals[1:10, 97:100] |> knitr::kable(digits = 3, format = \"html\") # for n = 25 summary(admix_fit, lambda = admix_fit$lambda[25]) #> lasso-penalized regression model with n=197, p=101 at lambda=0.08163 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 46 #> ------------------------------------------------- plot(admix_fit) # add ancestry to design matrix X_plus_ancestry <- cbind(admix$ancestry, admix$X) # adjust column names -- need these for designating 'unpen' argument colnames(X_plus_ancestry) <- c(\"ancestry\", colnames(admix$X)) # create a design admix_design2 <- create_design(X = X_plus_ancestry, y = admix$y, # below, I mark ancestry variable as unpenalized # we want ancestry to always be in the model unpen = \"ancestry\") # now fit a model admix_fit2 <- plmm(design = admix_design2) summary(admix_fit2, idx = 25) #> lasso-penalized regression model with n=197, p=102 at lambda=0.09886 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 14 #> ------------------------------------------------- plot(admix_fit2)"},{"path":"https://pbreheny.github.io/plmmr/articles/matrix_data.html","id":"cross-validation","dir":"Articles","previous_headings":"","what":"Cross validation","title":"If your data is in a matrix or data frame","text":"To select a \lambda value, we often use cross validation. 
Below is an example using cv_plmm to select the \lambda that minimizes cross-validation error: We can also plot the cross-validation error (CVE) versus \lambda (on the log scale): Plot of CVE","code":"admix_cv <- cv_plmm(design = admix_design2, return_fit = T) admix_cv_s <- summary(admix_cv, lambda = \"min\") print(admix_cv_s) #> lasso-penalized model with n=197 and p=102 #> At minimum cross-validation error (lambda=0.1853): #> ------------------------------------------------- #> Nonzero coefficients: 3 #> Cross-validation error (deviance): 1.33 #> Scale estimate (sigma): 1.154 plot(admix_cv)"},{"path":"https://pbreheny.github.io/plmmr/articles/matrix_data.html","id":"predicted-values","dir":"Articles","previous_headings":"","what":"Predicted values","title":"If your data is in a matrix or data frame","text":"Below is an example of the predict() methods for PLMMs: We can compare these predictions with the predictions we would get from an intercept-only model using the mean squared prediction error (MSPE) – lower is better: We see that our model does better at prediction than the null.","code":"# make predictions for select lambda value(s) y_hat <- predict(object = admix_fit, newX = admix$X, type = \"blup\", X = admix$X, y = admix$y) # intercept-only (or 'null') model crossprod(admix$y - mean(admix$y))/length(admix$y) #> [,1] #> [1,] 5.928528 # our model at its best value of lambda apply(y_hat, 2, function(c){crossprod(admix$y - c)/length(c)}) -> mse min(mse) #> [1] 0.6930826 # ^ across all values of lambda, our model has MSPE lower than the null model"},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"math-notation","dir":"Articles","previous_headings":"","what":"Math notation","title":"Notes on notation","text":"Here are the concepts we need notation to denote, in order of their usage in the derivations. 
These are blocked into sections corresponding to the steps of the model fitting process.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"statistical-model-the-overall-framework","dir":"Articles","previous_headings":"Math notation","what":"Statistical model (the overall framework)","title":"Notes on notation","text":"The overall model can be written as \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\boldsymbol{\gamma} + \boldsymbol{\epsilon} or equivalently \mathbf{y} = \dot{\mathbf{X}}\dot{\boldsymbol{\beta}} + \mathbf{u} + \boldsymbol{\epsilon} where: \mathbf{X} and \mathbf{y} are the n \times p design matrix of our data and the n \times 1 vector of outcomes, respectively. Here, n is the number of observations (e.g., number of patients, number of samples, etc.) and p is the number of features (e.g., number of SNPs, number of variables, number of covariates, etc.). \dot{\mathbf{X}} is the column-standardized \mathbf{X}, in which each of the p columns has mean 0 and standard deviation 1. Note: \dot{\mathbf{X}} excludes singular features (columns of constants) of the original \mathbf{X}. \dot{\boldsymbol{\beta}} represents the coefficients on the standardized scale. \mathbf{Z} is an n \times b matrix of indicators corresponding to a grouping structure, and \boldsymbol{\gamma} is a vector of values describing how the groupings are associated with \mathbf{y}. In real data, these values are typically unknown. \boldsymbol{\epsilon} is the n \times 1 vector of noise. 
We define the realized (empirical) relatedness matrix as \mathbf{K} \equiv \frac{1}{p}\dot{\mathbf{X}}\dot{\mathbf{X}}^\top The model assumes: \boldsymbol{\epsilon} \perp \mathbf{u} \boldsymbol{\epsilon} \sim N(0, \sigma^2_{\epsilon}\mathbf{I}) \mathbf{u} \sim N(0, \sigma^2_{s}\mathbf{K}) Under these assumptions, we may write \mathbf{y} \sim N(\dot{\mathbf{X}}\dot{\boldsymbol{\beta}}, \boldsymbol{\Sigma}) Indices: i \in 1,..., n indexes observations j \in 1,..., p indexes features h \in 1,..., b indexes batches (e.g., different family groups, different data collection sites, etc.)","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"decomposition-and-rotation-prep-and-first-part-of-fit","dir":"Articles","previous_headings":"Math notation","what":"Decomposition and rotation (prep and first part of fit)","title":"Notes on notation","text":"Beginning with the eigendecomposition, where \mathbf{U} and \mathbf{s} are the eigenvectors and eigenvalues of \mathbf{K}, one can obtain \text{eigen}(\mathbf{K}) \equiv \mathbf{U}\mathbf{S}\mathbf{U}^\top. The elements of \mathbf{s} are the diagonal values of \mathbf{S}. Note that the random effect \mathbf{u} is distinct from the columns of the matrix \mathbf{U}. k represents the number of nonzero eigenvalues represented in \mathbf{U} and \mathbf{d}, where k \leq \text{min}(n,p). Again, \mathbf{K} \equiv \frac{1}{p}\dot{\mathbf{X}}\dot{\mathbf{X}}^{\top} is often referred to in the literature as the realized relatedness matrix (RRM) or genomic relatedness matrix (GRM). \mathbf{K} has dimension n \times n. \eta is the ratio \frac{\sigma^2_s}{\sigma^2_e + \sigma^2_s}. We estimate \hat{\eta} from the null model (details to come). \mathbf{\Sigma} is the variance of the outcome, \mathbb{V}({\mathbf{y}}) \propto \eta \mathbf{K} + (1 - \eta)\mathbf{I}_n. \mathbf{w} is the vector of weights defined as (\eta\mathbf{s} + (1-\eta))^{-1/2}. The values of \mathbf{w} are the nonzero values of the diagonal matrix \mathbf{W} \equiv (\eta\mathbf{S} + (1 - \eta)\mathbf{I})^{-1/2}. 
The matrix used for rotating (preconditioning) the data is \mathbf{\Sigma}^{-1/2} \equiv \mathbf{W}\mathbf{U}^\top. \tilde{\dot{\mathbf{X}}} \equiv \mathbf{W}\mathbf{U}^\top\dot{\mathbf{X}} is the rotated data, the data on the transformed scale. \tilde{\mathbf{y}} \equiv \mathbf{\Sigma}^{-1/2}\mathbf{y} is the outcome on the rotated scale. \tilde{\ddot{\mathbf{X}}} is the standardized rotated data. Note: this standardization involves scaling, but not centering. The post-rotation standardization impacts the estimated coefficients as well; we define {\ddot{\boldsymbol{\beta}}} as the estimated coefficients on this scale.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"model-fitting-with-penalization","dir":"Articles","previous_headings":"Math notation","what":"Model fitting with penalization","title":"Notes on notation","text":"We fit \tilde{\mathbf{y}} \sim \tilde{\ddot{\mathbf{X}}} using a penalized linear mixed model, and obtain \hat{\ddot{\boldsymbol{\beta}}}, the estimated coefficients. The penalty parameter values (e.g., the values of the lasso tuning parameter) are indexed by \lambda_l, where l \in 1,..., t.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"rescaling-results-format","dir":"Articles","previous_headings":"Math notation","what":"Rescaling results (format)","title":"Notes on notation","text":"To obtain the estimated coefficients on the original scale, the values estimated by the model must be unscaled (‘untransformed’) twice: once to adjust for the post-rotation standardization, and again to adjust for the pre-rotation standardization. 
This process may be written \hat{\ddot{\boldsymbol{\beta}}} \rightarrow \hat{\dot{\boldsymbol{\beta}}} \rightarrow \hat{\boldsymbol{\beta}}.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"object-names-in-source-code","dir":"Articles","previous_headings":"","what":"Object names in source code","title":"Notes on notation","text":"In the code, we denote these objects this way: \mathbf{X} and \mathbf{y} are X and y \dot{\mathbf{X}} is std_X \tilde{\dot{\mathbf{X}}} is rot_X \ddot{\tilde{\mathbf{X}}} is stdrot_X \hat{\boldsymbol{\beta}} is named og_scale_beta in some helper functions (for clarity); in the returned plmm objects it is beta_vals. beta_vals and og_scale_beta are equivalent; both represent the estimated coefficients on the original scale. \hat{\dot{\boldsymbol{\beta}}} is std_scale_beta \hat{\ddot{\boldsymbol{\beta}}} is stdrot_scale_beta \dot{\mathbf{X}}\hat{\dot{\boldsymbol{\beta}}} is Xb \ddot{\tilde{\mathbf{X}}} \hat{\ddot{\boldsymbol{\beta}}} is linear_predictors. Note: in other words, this means that the linear_predictors in the code are on the scale of the rotated and re-standardized data! \hat{\boldsymbol{\Sigma}} \equiv \hat{\eta}\mathbf{K} + (1 - \hat{\eta})\mathbf{I} is estimated_Sigma. Similarly, \hat{\boldsymbol{\Sigma}}_{11} is Sigma_11, etc.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"processing-plink-files","dir":"Articles","previous_headings":"","what":"Processing PLINK files","title":"If your data is in PLINK files","text":"First, unzip the PLINK files if they are zipped. For our example data, the penncath_lite data ships with plmmr as zipped files; on MacOS and Linux, you can run this command to unzip: For GWAS data, we tell plmmr to combine the information across all three PLINK files (the .bed, .bim, and .fam files). This is done with process_plink(). Here, we create the files we want in a temporary directory just for the sake of example. Users can specify the folder of their choice with rds_dir, as shown here: You’ll see a lot of messages printed to the console … the result is the creation of 3 files: imputed_penncath_lite.rds and imputed_penncath_lite.bk contain the data; the third file will show up in the folder with the PLINK data. What is returned is the filepath. 
The .rds object at that filepath contains the processed data, which we now use to create a design. For didactic purposes, let’s examine what’s in imputed_penncath_lite.rds using the readRDS() function (Note: you don’t need to do this in your analysis - this section reads data into memory. It is just an illustration):","code":"temp_dir <- tempdir() # using a temp dir -- change to fit your preference unzip_example_data(outdir = temp_dir) #> Unzipped files are saved in /tmp/Rtmph6hzBv # temp_dir <- tempdir() # using a temporary directory (if you didn't already create one above) plink_data <- process_plink(data_dir = temp_dir, data_prefix = \"penncath_lite\", rds_dir = temp_dir, rds_prefix = \"imputed_penncath_lite\", # imputing the mode to address missing values impute_method = \"mode\", # overwrite existing files in temp_dir # (you can turn this feature off if you need to) overwrite = TRUE, # turning off parallelization - # leaving this on causes problems knitting this vignette parallel = FALSE) #> #> Preprocessing penncath_lite data: #> Creating penncath_lite.rds #> #> There are 1401 observations and 4367 genomic features in the specified data files, representing chromosomes 1 - 22 #> There are a total of 3514 SNPs with missing values #> Of these, 13 are missing in at least 50% of the samples #> #> Imputing the missing (genotype) values using mode method #> #> process_plink() completed #> Processed files now saved as /tmp/Rtmph6hzBv/imputed_penncath_lite.rds pen <- readRDS(plink_data) # notice: this is a `processed_plink` object str(pen) # note: genotype data is *not* in memory #> List of 5 #> $ X :Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"imputed_penncath_lite.bk\" #> .. .. ..$ dirname : chr \"/tmp/Rtmph6hzBv/\" #> .. .. ..$ totalRows : int 1401 #> .. .. ..$ totalCols : int 4367 #> .. .. ..$ rowOffset : num [1:2] 0 1401 #> .. .. ..$ colOffset : num [1:2] 0 4367 #> .. .. ..$ nrow : num 1401 #> .. .. 
..$ ncol : num 4367 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : NULL #> .. .. ..$ type : chr \"double\" #> .. .. ..$ separated : logi FALSE #> $ map:'data.frame': 4367 obs. of 6 variables: #> ..$ chromosome : int [1:4367] 1 1 1 1 1 1 1 1 1 1 ... #> ..$ marker.ID : chr [1:4367] \"rs3107153\" \"rs2455124\" \"rs10915476\" \"rs4592237\" ... #> ..$ genetic.dist: int [1:4367] 0 0 0 0 0 0 0 0 0 0 ... #> ..$ physical.pos: int [1:4367] 2056735 3188505 4275291 4280630 4286036 4302161 4364564 4388885 4606471 4643688 ... #> ..$ allele1 : chr [1:4367] \"C\" \"T\" \"T\" \"G\" ... #> ..$ allele2 : chr [1:4367] \"T\" \"C\" \"C\" \"A\" ... #> $ fam:'data.frame': 1401 obs. of 6 variables: #> ..$ family.ID : int [1:1401] 10002 10004 10005 10007 10008 10009 10010 10011 10012 10013 ... #> ..$ sample.ID : int [1:1401] 1 1 1 1 1 1 1 1 1 1 ... #> ..$ paternal.ID: int [1:1401] 0 0 0 0 0 0 0 0 0 0 ... #> ..$ maternal.ID: int [1:1401] 0 0 0 0 0 0 0 0 0 0 ... #> ..$ sex : int [1:1401] 1 2 1 1 1 1 1 2 1 2 ... #> ..$ affection : int [1:1401] 1 1 2 1 2 2 2 1 2 -9 ... #> $ n : int 1401 #> $ p : int 4367 #> - attr(*, \"class\")= chr \"processed_plink\" # notice: no more missing values in X any(is.na(pen$genotypes[,])) #> [1] FALSE"},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"creating-a-design","dir":"Articles","previous_headings":"","what":"Creating a design","title":"If your data is in PLINK files","text":"Now we are ready to create a plmm_design, an object with the pieces we need for our model: a design matrix \mathbf{X}, an outcome vector \mathbf{y}, and a vector of penalty factor indicators (1 = the feature will be penalized, 0 = the feature will not be penalized). A side note: in GWAS studies, it is typical to include non-genomic factors as unpenalized covariates in part of the model. For instance, we may want to adjust for sex and age – factors we want to ensure are always included in the selected model. The plmmr package allows you to include such additional unpenalized predictors via the ‘add_predictor’ and ‘predictor_id’ options, which are passed from create_design() to the internal function create_design_filebacked(). 
An example with these options is included in the create_design() documentation. A key part of create_design() is standardizing the columns of the genotype matrix. Below is a didactic example showing that the columns of the std_X element of the design have mean = 0 and variance = 1. Note that this is not something you need to do in your analysis – it reads data into memory.","code":"# get outcome data penncath_pheno <- read.csv(find_example_data(path = 'penncath_clinical.csv')) phen <- data.frame(FamID = as.character(penncath_pheno$FamID), CAD = penncath_pheno$CAD) pen_design <- create_design(data_file = plink_data, feature_id = \"FID\", rds_dir = temp_dir, new_file = \"std_penncath_lite\", add_outcome = phen, outcome_id = \"FamID\", outcome_col = \"CAD\", logfile = \"design\", # again, overwrite if needed; use with caution overwrite = TRUE) #> There are 62 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... #> Standardization completed at 2024-12-13 20:58:57 #> Done with standardization. File formatting in progress # examine the design - notice the components of this object pen_design_rds <- readRDS(pen_design) str(pen_design_rds) #> List of 16 #> $ X_colnames : chr [1:4367] \"rs3107153\" \"rs2455124\" \"rs10915476\" \"rs4592237\" ... #> $ X_rownames : chr [1:1401] \"10002\" \"10004\" \"10005\" \"10007\" ... #> $ n : int 1401 #> $ p : int 4367 #> $ is_plink : logi TRUE #> $ outcome_idx : int [1:1401] 1 2 3 4 5 6 7 8 9 10 ... #> $ y : Named int [1:1401] 1 1 1 1 1 1 1 1 1 0 ... #> ..- attr(*, \"names\")= chr [1:1401] \"CAD1\" \"CAD2\" \"CAD3\" \"CAD4\" ... #> $ std_X_rownames: chr [1:1401] \"10002\" \"10004\" \"10005\" \"10007\" ... #> $ ns : int [1:4305] 1 2 3 4 5 6 7 8 9 10 ... #> $ std_X_colnames: chr [1:4305] \"rs3107153\" \"rs2455124\" \"rs10915476\" \"rs4592237\" ... #> $ std_X :Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"std_penncath_lite.bk\" #> .. .. 
..$ dirname : chr \"/tmp/Rtmph6hzBv/\" #> .. .. ..$ totalRows : int 1401 #> .. .. ..$ totalCols : int 4305 #> .. .. ..$ rowOffset : num [1:2] 0 1401 #> .. .. ..$ colOffset : num [1:2] 0 4305 #> .. .. ..$ nrow : num 1401 #> .. .. ..$ ncol : num 4305 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : NULL #> .. .. ..$ type : chr \"double\" #> .. .. ..$ separated : logi FALSE #> $ std_X_n : num 1401 #> $ std_X_p : num 4305 #> $ std_X_center : num [1:4305] 0.00785 0.35974 1.01213 0.06067 0.46253 ... #> $ std_X_scale : num [1:4305] 0.0883 0.7783 0.8636 0.28 1.2791 ... #> $ penalty_factor: num [1:4305] 1 1 1 1 1 1 1 1 1 1 ... #> - attr(*, \"class\")= chr \"plmm_design\" # we can check to see that our data have been standardized std_X <- attach.big.matrix(pen_design_rds$std_X) colMeans(std_X[,]) |> summary() # columns have mean zero... #> Min. 1st Qu. Median Mean 3rd Qu. Max. #> -1.356e-16 -2.334e-17 3.814e-19 9.868e-19 2.520e-17 2.635e-16 apply(std_X[,], 2, var) |> summary() # ... & variance 1 #> Min. 1st Qu. Median Mean 3rd Qu. Max. #> 1.001 1.001 1.001 1.001 1.001 1.001"},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"fitting-a-model","dir":"Articles","previous_headings":"","what":"Fitting a model","title":"If your data is in PLINK files","text":"Now that we have a design object, we are ready to fit a model. By default, the model fitting results are saved as files in the folder specified by the rds_dir argument of plmm(). If you want the model fitting results to be returned, set return_fit = TRUE in plmm(). We examine our model results below:","code":"pen_fit <- plmm(design = pen_design, trace = T, return_fit = T) #> Note: The design matrix is being returned as a file-backed big.matrix object -- see bigmemory::big.matrix() documentation for details. #> Reminder: the X that is returned here is column-standardized #> Input data passed all checks at 2024-12-13 20:58:58 #> Starting decomposition. #> Calculating the eigendecomposition of K #> Eigendecomposition finished at 2024-12-13 20:59:00 #> Beginning rotation ('preconditioning'). 
#> Rotation (preconditiong) finished at 2024-12-13 20:59:00 #> Setting up lambda/preparing for model fitting. #> Beginning model fitting. #> Model fitting finished at 2024-12-13 20:59:03 #> Beta values are estimated -- almost done! #> Formatting results (backtransforming coefs. to original scale). #> Model ready at 2024-12-13 20:59:03 # you can turn off the trace messages by letting trace = F (default) summary(pen_fit, idx = 50) #> lasso-penalized regression model with n=1401, p=4368 at lambda=0.01211 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 537 #> ------------------------------------------------- plot(pen_fit)"},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"cross-validation","dir":"Articles","previous_headings":"","what":"Cross validation","title":"If your data is in PLINK files","text":"choose tuning parameter model, plmmr offers cross validation method: plot summary methods CV models well:","code":"cv_fit <- cv_plmm(design = pen_design, type = \"blup\", return_fit = T, trace = T) #> Note: The design matrix is being returned as a file-backed big.matrix object -- see bigmemory::big.matrix() documentation for details. #> Reminder: the X that is returned here is column-standardized #> Starting decomposition. #> Calculating the eigendecomposition of K #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-13 20:59:05 #> Setting up lambda/preparing for model fitting. #> Beginning model fitting. #> Model fitting finished at 2024-12-13 20:59:08 #> 'Fold' argument is either NULL or missing; assigning folds randomly (by default). #> #> To specify folds for each observation, supply a vector with fold assignments. #> #> Starting cross validation #> | | | 0%Beginning eigendecomposition in fold 1 : #> Starting decomposition. 
#> Calculating the eigendecomposition of K #> Fitting model in fold 1 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-13 20:59:09 #> Beginning model fitting. #> Model fitting finished at 2024-12-13 20:59:11 #> | |============== | 20% #> Beginning eigendecomposition in fold 2 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 2 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-13 20:59:13 #> Beginning model fitting. #> Model fitting finished at 2024-12-13 20:59:15 #> | |============================ | 40%Beginning eigendecomposition in fold 3 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 3 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-13 20:59:16 #> Beginning model fitting. #> Model fitting finished at 2024-12-13 20:59:18 #> | |========================================== | 60%Beginning eigendecomposition in fold 4 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 4 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-13 20:59:20 #> Beginning model fitting. #> Model fitting finished at 2024-12-13 20:59:22 #> | |======================================================== | 80%Beginning eigendecomposition in fold 5 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 5 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-13 20:59:23 #> Beginning model fitting. 
#> Model fitting finished at 2024-12-13 20:59:25 #> | |======================================================================| 100% summary(cv_fit) # summary at lambda value that minimizes CV error #> lasso-penalized model with n=1401 and p=4368 #> At minimum cross-validation error (lambda=0.0406): #> ------------------------------------------------- #> Nonzero coefficients: 6 #> Cross-validation error (deviance): 0.22 #> Scale estimate (sigma): 0.471 plot(cv_fit)"},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"details-create_design-for-plink-data","dir":"Articles","previous_headings":"","what":"Details: create_design() for PLINK data","title":"If your data is in PLINK files","text":"A call to create_design() involves the following steps: (1) Integrate external phenotype information, if supplied. Note: samples in the PLINK data that do not have a phenotype value in the specified additional phenotype file are removed from the analysis. (2) Identify missing values among the samples and SNPs/features. (3) Impute missing values per the user’s specified method. See the R documentation for bigsnpr::snp_fastImputeSimple() for details. Note: the plmmr package cannot fit models on datasets with missing values; all missing values must be imputed or subset out before analysis. (4) Integrate external predictor information, if supplied. This could be a matrix of meta-data (e.g., age, principal components, etc.). Note: samples in the supplied file that are not included in the PLINK data are removed. For example, if more participants in a study were phenotyped than genotyped, plmmr::create_design() will create a matrix of data representing only the genotyped samples that also have data in the supplied external phenotype file. (5) Create a design matrix that represents only the nonsingular features and only the samples with both predictor and phenotype information available (in the case where external data are supplied). (6) Standardize the design matrix so that all columns have mean 0 and variance 1.","code":""},{"path":"https://pbreheny.github.io/plmmr/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Tabitha K. Peter. Author. Anna C. Reisetter. Author. Patrick J. Breheny. Author, maintainer. Yujing Lu. 
Author.","code":""},{"path":"https://pbreheny.github.io/plmmr/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Reisetter A, Breheny P (2021). “Penalized linear mixed models for structured genetic data.” Genetic epidemiology, 45(5), 427–444. https://doi.org/10.1002/gepi.22384.","code":"@Article{, author = {Anna C. Reisetter and Patrick Breheny}, title = {Penalized linear mixed models for structured genetic data}, journal = {Genetic epidemiology}, year = {2021}, volume = {45}, pages = {427--444}, number = {5}, url = {https://doi.org/10.1002/gepi.22384}, }"},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"plmmr-","dir":"","previous_headings":"","what":"plmmr","title":"Penalized Linear Mixed Models for Correlated Data","text":"plmmr (penalized linear mixed models in R) is a package with functions to fit penalized linear mixed models that correct for unobserved confounding effects.","code":""},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"Penalized Linear Mixed Models for Correlated Data","text":"To install the latest version of the package from GitHub, use this: You can also install plmmr from CRAN: For a description of the motivation of the functions in this package (along with examples), refer to the second module of this GWAS data tutorial","code":"devtools::install_github(\"pbreheny/plmmr\") install.packages('plmmr')"},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"minimal-example","dir":"","previous_headings":"","what":"Minimal example","title":"Penalized Linear Mixed Models for Correlated Data","text":"","code":"library(plmmr) X <- rnorm(100*20) |> matrix(100, 20) y <- rnorm(100) fit <- plmm(X, y) plot(fit) cvfit <- cv_plmm(X, y) plot(cvfit) summary(cvfit)"},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"so-how-fast-is-plmmr-and-how-well-does-it-scale","dir":"","previous_headings":"","what":"So how fast is plmmr? 
And how well does it scale?","title":"Penalized Linear Mixed Models for Correlated Data","text":"To illustrate these important questions, we created a separate GitHub repository with scripts that run the plmmr workflow on publicly-available genome-wide association (GWAS) data. The main takeaway: using GWAS data from a study with 1,400 samples and 800,000 SNPs, a full plmmr analysis runs in about half an hour using a single core on a laptop. Three smaller datasets ship with plmmr, and tutorials walking through how to analyze these data sets are documented on the documentation site. While these datasets are useful for didactic purposes, they are not large enough to really highlight the computational scalability of plmmr – this motivated the creation of the separate repository with the GWAS workflow.","code":""},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"note-on-branches","dir":"","previous_headings":"","what":"Note on branches","title":"Penalized Linear Mixed Models for Correlated Data","text":"The branches of this repo are organized in the following way: master is the main (‘head’) branch. gh_pages is for keeping the documentation of plmmr. gwas_scale is an archived branch that contains the development version of the package used to run the dissertation analysis; it will be deleted eventually.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/MCP.html","id":null,"dir":"Reference","previous_headings":"","what":"helper function to implement MCP penalty The helper functions to implement each penalty. — MCP","title":"helper function to implement MCP penalty The helper functions to implement each penalty. — MCP","text":"A helper function to implement the MCP penalty; one of the helper functions that implement each penalty.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/MCP.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"helper function to implement MCP penalty The helper functions to implement each penalty. 
— MCP","text":"","code":"MCP(z, l1, l2, gamma, v)"},{"path":"https://pbreheny.github.io/plmmr/reference/MCP.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"helper function to implement MCP penalty The helper functions to implement each penalty. — MCP","text":"z vector representing solution active set feature l1 upper bound (beta) l2 lower bound (beta) gamma tuning parameter MCP penalty v 'xtx' term","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/MCP.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"helper function to implement MCP penalty The helper functions to implement each penalty. — MCP","text":"numeric vector MCP-penalized coefficient estimates within given bounds","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/SCAD.html","id":null,"dir":"Reference","previous_headings":"","what":"helper function to implement SCAD penalty — SCAD","title":"helper function to implement SCAD penalty — SCAD","text":"helper function implement SCAD penalty","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/SCAD.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"helper function to implement SCAD penalty — SCAD","text":"","code":"SCAD(z, l1, l2, gamma, v)"},{"path":"https://pbreheny.github.io/plmmr/reference/SCAD.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"helper function to implement SCAD penalty — SCAD","text":"z solution active set feature l1 upper bound l2 lower bound gamma tuning parameter SCAD penalty v 'xtx' term","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/SCAD.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"helper function to implement SCAD penalty — SCAD","text":"numeric vector SCAD-penalized coefficient estimates within given 
bounds","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/add_predictors.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to add predictors to a filebacked matrix of data — add_predictors","title":"A helper function to add predictors to a filebacked matrix of data — add_predictors","text":"helper function add predictors filebacked matrix data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/add_predictors.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to add predictors to a filebacked matrix of data — add_predictors","text":"","code":"add_predictors(obj, add_predictor, id_var, rds_dir, quiet)"},{"path":"https://pbreheny.github.io/plmmr/reference/add_predictors.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to add predictors to a filebacked matrix of data — add_predictors","text":"obj bigSNP object add_predictor Optional: add additional covariates/predictors/features external file (.e., PLINK file). id_var String specifying column PLINK .fam file unique sample identifiers. rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir(process_plink() call) quiet Logical: messages printed console? Defaults FALSE (leaves print messages ...)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/add_predictors.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to add predictors to a filebacked matrix of data — add_predictors","text":"list 2 components: 'obj' - bigSNP object added element representing matrix includes additional predictors first columns 'non_gen' - integer vector ranges 1 number added predictors. 
Example: 2 predictors added, unpen= 1:2","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/adjust_beta_dimension.html","id":null,"dir":"Reference","previous_headings":"","what":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","title":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","text":"function designed use BLUP prediction. objective get matrix estimated beta coefficients standardized scale, dimension original/training data. adding rows 0s std_scale_beta matrix corresponding singular features X.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/adjust_beta_dimension.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","text":"","code":"adjust_beta_dimension(std_scale_beta, p, std_X_details, fbm_flag, plink_flag)"},{"path":"https://pbreheny.github.io/plmmr/reference/adjust_beta_dimension.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","text":"std_scale_beta matrix estimated beta coefficients scale standardized original/training data Note: rows matrix represent nonsingular columns design matrix p number columns original/training design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' fbm_flag Logical: model fit filebacked? plink_flag Logical: data come PLINK files? Note: flag matters non-genomic features handled PLINK files – data PLINK files, unpenalized columns counted p argument. delimited files, p include unpenalized columns. 
difference implications untransform() function determines appropriate dimensions estimated coefficient matrix returns.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/adjust_beta_dimension.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","text":"std_scale_b_og_dim: matrix estimated beta coefs. still scale std_X, dimension X","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/admix.html","id":null,"dir":"Reference","previous_headings":"","what":"Admix: Semi-simulated SNP data — admix","title":"Admix: Semi-simulated SNP data — admix","text":"dataset containing 100 SNPs, demographic variable representing race, simulated outcome","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/admix.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Admix: Semi-simulated SNP data — admix","text":"","code":"admix"},{"path":"https://pbreheny.github.io/plmmr/reference/admix.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Admix: Semi-simulated SNP data — admix","text":"list 3 components X SNP matrix (197 observations 100 SNPs) y vector simulated (continuous) outcomes race vector racial group categorization: # 0 = African, 1 = African American, 2 = European, 3 = Japanese","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/admix.html","id":"source","dir":"Reference","previous_headings":"","what":"Source","title":"Admix: Semi-simulated SNP data — admix","text":"https://hastie.su.domains/CASI/","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/align_ids.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to support process_plink() — align_ids","title":"A helper function to support process_plink() — align_ids","text":"helper function support 
process_plink()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/align_ids.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to support process_plink() — align_ids","text":"","code":"align_ids(id_var, quiet, add_predictor, og_ids)"},{"path":"https://pbreheny.github.io/plmmr/reference/align_ids.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to support process_plink() — align_ids","text":"id_var String specifying variable name ID column quiet Logical: message printed? add_predictor External data include design matrix. add_predictors... arg process_plink() og_ids Character vector PLINK ids (FID IID) original data (.e., data subsetting handling missing phenotypes)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/align_ids.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to support process_plink() — align_ids","text":"matrix dimensions add_predictor","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/big_cbind.html","id":null,"dir":"Reference","previous_headings":"","what":"a version of cbind() for file-backed matrices — big_cbind","title":"a version of cbind() for file-backed matrices — big_cbind","text":"version cbind() file-backed matrices","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/big_cbind.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a version of cbind() for file-backed matrices — big_cbind","text":"","code":"big_cbind(A, B, C, quiet)"},{"path":"https://pbreheny.github.io/plmmr/reference/big_cbind.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"a version of cbind() for file-backed matrices — big_cbind","text":"-memory data B file-backed data C file-backed placeholder combined data quiet 
Logical","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/big_cbind.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a version of cbind() for file-backed matrices — big_cbind","text":"C, filled column values B combined","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/check_for_file_extension.html","id":null,"dir":"Reference","previous_headings":"","what":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","title":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","text":"check_for_file_extension: function make package 'smart' enough handle .rds file extensions","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/check_for_file_extension.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","text":"","code":"check_for_file_extension(path)"},{"path":"https://pbreheny.github.io/plmmr/reference/check_for_file_extension.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","text":"path string specifying file path ends file name, e.g. \"~/dir/my_file.rds\"","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/check_for_file_extension.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","text":"string filepath without extension, e.g. 
\"~/dir/my_file\"","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Coef method for ","title":"Coef method for ","text":"Coef method \"cv_plmm\" class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Coef method for ","text":"","code":"# S3 method for class 'cv_plmm' coef(object, lambda, which = object$min, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Coef method for ","text":"object object class \"cv_plmm.\" lambda numeric vector lambda values. Vector lambda indices coefficients return. Defaults lambda index minimum CVE. ... Additional arguments (used).","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Coef method for ","text":"Returns named numeric vector. Values coefficients model specified value either lambda . 
Names values lambda.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Coef method for ","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cv_fit <- cv_plmm(design = admix_design, return_fit = TRUE) head(coef(cv_fit)) #> (Intercept) Snp1 Snp2 Snp3 Snp4 Snp5 #> 4.326474 0.000000 0.000000 0.000000 0.000000 0.000000"},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Coef method for ","title":"Coef method for ","text":"Coef method \"plmm\" class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Coef method for ","text":"","code":"# S3 method for class 'plmm' coef(object, lambda, which = 1:length(object$lambda), drop = TRUE, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Coef method for ","text":"object object class \"plmm.\" lambda numeric vector lambda values. Vector lambda indices coefficients return. drop Logical. ... Additional arguments.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Coef method for ","text":"Either numeric matrix (model fit data stored memory) sparse matrix (model fit data stored filebacked). 
Rownames feature names, columns values lambda.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Coef method for ","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design) coef(fit)[1:10, 41:45] #> 0.02673 0.02493 0.02325 0.02168 0.02022 #> (Intercept) 6.556445885 6.59257224 6.62815211 6.66317769 6.69366816 #> Snp1 -0.768261488 -0.78098090 -0.79310257 -0.80456803 -0.81482505 #> Snp2 0.131945426 0.13991539 0.14735024 0.15387884 0.15929074 #> Snp3 2.826806831 2.83842545 2.84879468 2.85860151 2.86047026 #> Snp4 0.036981534 0.04652885 0.05543821 0.06376126 0.07133592 #> Snp5 0.546784811 0.57461391 0.60049082 0.62402782 0.64291324 #> Snp6 -0.026215632 -0.03072017 -0.03494534 -0.03889146 -0.04256362 #> Snp7 0.009342269 0.01539705 0.02103262 0.02615358 0.03069956 #> Snp8 0.000000000 0.00000000 0.00000000 0.00000000 0.00000000 #> Snp9 0.160794660 0.16217570 0.16337102 0.16464901 0.16638663"},{"path":"https://pbreheny.github.io/plmmr/reference/construct_variance.html","id":null,"dir":"Reference","previous_headings":"","what":"a function to create the estimated variance matrix from a PLMM fit — construct_variance","title":"a function to create the estimated variance matrix from a PLMM fit — construct_variance","text":"function create estimated variance matrix PLMM fit","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/construct_variance.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a function to create the estimated variance matrix from a PLMM fit — construct_variance","text":"","code":"construct_variance(fit, K = NULL, eta = NULL)"},{"path":"https://pbreheny.github.io/plmmr/reference/construct_variance.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"a function to create the estimated variance matrix from a 
PLMM fit — construct_variance","text":"fit object returned plmm() K optional matrix eta optional numeric value 0 1; fit supplied, option must specified.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/construct_variance.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a function to create the estimated variance matrix from a PLMM fit — construct_variance","text":"Sigma_hat, matrix representing estimated variance","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_constant_features.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to count constant features — count_constant_features","title":"A helper function to count constant features — count_constant_features","text":"helper function count constant features","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_constant_features.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to count constant features — count_constant_features","text":"","code":"count_constant_features(fbm, outfile, quiet)"},{"path":"https://pbreheny.github.io/plmmr/reference/count_constant_features.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to count constant features — count_constant_features","text":"fbm filebacked big.matrix outfile String specifying name log file quiet Logical: message printed console","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_constant_features.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to count constant features — count_constant_features","text":"ns numeric vector indices non-singular columns matrix associated counts","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_cores.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to count the number of cores 
available on the current machine — count_cores","title":"A helper function to count the number of cores available on the current machine — count_cores","text":"helper function count number cores available current machine","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_cores.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to count the number of cores available on the current machine — count_cores","text":"","code":"count_cores()"},{"path":"https://pbreheny.github.io/plmmr/reference/count_cores.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to count the number of cores available on the current machine — count_cores","text":"number cores use; parallel installed, parallel::detectCores(). Otherwise, returns 1.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":null,"dir":"Reference","previous_headings":"","what":"a function to create a design for PLMM modeling — create_design","title":"a function to create a design for PLMM modeling — create_design","text":"function create design PLMM modeling","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a function to create a design for PLMM modeling — create_design","text":"","code":"create_design(data_file = NULL, rds_dir = NULL, X = NULL, y = NULL, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"a function to create a design for PLMM modeling — create_design","text":"data_file filebacked data (data process_plink() process_delim()), filepath processed data. Defaults NULL (argument apply -memory data). rds_dir filebacked data, filepath directory/folder want design saved. 
Note: include/append name want --created file – name argument new_file, passed create_design_filebacked(). Defaults NULL (argument apply -memory data). X -memory data (data matrix data frame), design matrix. Defaults NULL (argument apply filebacked data). y -memory data, numeric vector representing outcome. Defaults NULL (argument apply filebacked data). Note: responsibility user ensure rows X corresponding elements y row order, .e., observations must order design matrix outcome vector. ... Additional arguments pass create_design_filebacked() create_design_in_memory(). See documentation helper functions details.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a function to create a design for PLMM modeling — create_design","text":"filepath object class plmm_design, named list design matrix, outcome, penalty factor vector, details needed fitting model. list stored .rds file filebacked data, filebacked case string path file returned. -memory data, list returned.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"a function to create a design for PLMM modeling — create_design","text":"function wrapper create_design...() inner functions; arguments included passed along create_design...() inner function matches type data supplied. Note arguments optional ones . Additional arguments filebacked data: new_file User-specified filename (without .bk/.rds extension) --created .rds/.bk files. Must different existing .rds/.bk files folder. feature_id Optional: string specifying column data X (feature data) row IDs (e.g., identifiers row/sample/participant/, etc.). duplicates allowed. - PLINK data: string specifying ID column PLINK .fam file. 
Options \"IID\" (default) \"FID\" - filebacked data: character vector unique identifiers (IDs) row feature data (.e., data processed process_delim()) - left NULL (default), X assumed row-order add_outcome. Note: assumption made error, calculations downstream incorrect. Pay close attention . add_outcome data frame matrix two columns: ID column column outcome value (used 'y' final design). IDs must characters, outcome must numeric. outcome_id string specifying name ID column 'add_outcome' outcome_col string specifying name phenotype column 'add_outcome' na_outcome_vals Optional: vector numeric values used code NA values outcome. Defaults c(-9, NA_integer) (-9 matches PLINK conventions). overwrite Optional: logical - existing .rds files overwritten? Defaults FALSE. logfile Optional: name '.log' file written – Note: append .log filename; done automatically. quiet Optional: logical - messages printed console silenced? Defaults FALSE Additional arguments specific PLINK data: add_predictor Optional (PLINK data ): matrix data frame used adding additional unpenalized covariates/predictors/features external file (.e., PLINK file). matrix must one column ID column; columns aside ID used covariates design matrix. Columns must named. predictor_id Optional (PLINK data ): string specifying name column 'add_predictor' sample IDs. Required 'add_predictor' supplied. names used subset align external covariate supplied PLINK data. Additional arguments specific delimited file data: unpen Optional: character vector names columns mark unpenalized (.e., features always included model). Note: choose use option, delimited file must column names. Additional arguments -memory data: unpen Optional: character vector names columns mark unpenalized (.e., features always included model). 
Note: choose use option, X must column names.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"a function to create a design for PLMM modeling — create_design","text":"","code":"## Example 1: matrix data in-memory ## admix_design <- create_design(X = admix$X, y = admix$y, unpen = \"Snp1\") ## Example 2: delimited data ## # process delimited data temp_dir <- tempdir() colon_dat <- process_delim(data_file = \"colon2.txt\", data_dir = find_example_data(parent = TRUE), overwrite = TRUE, rds_dir = temp_dir, rds_prefix = \"processed_colon2\", sep = \"\\t\", header = TRUE) #> There are 62 observations and 2001 features in the specified data files. #> At this time, plmmr::process_delim() does not not handle missing values in delimited data. #> Please make sure you have addressed missingness before you proceed. #> #> process_plink() completed #> Processed files now saved as /tmp/Rtmpm57oVf/processed_colon2.rds # prepare outcome data colon_outcome <- read.delim(find_example_data(path = \"colon2_outcome.txt\")) # create a design colon_design <- create_design(data_file = colon_dat, rds_dir = temp_dir, new_file = \"std_colon2\", add_outcome = colon_outcome, outcome_id = \"ID\", outcome_col = \"y\", unpen = \"sex\", overwrite = TRUE, logfile = \"test.log\") #> No feature_id supplied; will assume data X are in same row-order as add_outcome. #> There are 0 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... #> Standardization completed at 2024-12-13 20:58:19 #> Done with standardization. File formatting in progress # look at the results colon_rds <- readRDS(colon_design) str(colon_rds) #> List of 18 #> $ X_colnames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> $ X_rownames : chr [1:62] \"row1\" \"row2\" \"row3\" \"row4\" ... 
#> $ n : num 62 #> $ p : num 2001 #> $ is_plink : logi FALSE #> $ outcome_idx : int [1:62] 1 2 3 4 5 6 7 8 9 10 ... #> $ y : int [1:62] 1 0 1 0 1 0 1 0 1 0 ... #> $ std_X_rownames: chr [1:62] \"row1\" \"row2\" \"row3\" \"row4\" ... #> $ unpen : int 1 #> $ unpen_colnames: chr \"sex\" #> $ ns : int [1:2001] 1 2 3 4 5 6 7 8 9 10 ... #> $ std_X_colnames: chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> $ std_X :Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"std_colon2.bk\" #> .. .. ..$ dirname : chr \"/tmp/Rtmpm57oVf/\" #> .. .. ..$ totalRows : int 62 #> .. .. ..$ totalCols : int 2001 #> .. .. ..$ rowOffset : num [1:2] 0 62 #> .. .. ..$ colOffset : num [1:2] 0 2001 #> .. .. ..$ nrow : num 62 #> .. .. ..$ ncol : num 2001 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> .. .. ..$ type : chr \"double\" #> .. .. ..$ separated : logi FALSE #> $ std_X_n : num 62 #> $ std_X_p : num 2001 #> $ std_X_center : num [1:2001] 1.47 7015.79 4966.96 4094.73 3987.79 ... #> $ std_X_scale : num [1:2001] 0.499 3067.926 2171.166 1803.359 2002.738 ... #> $ penalty_factor: num [1:2001] 0 1 1 1 1 1 1 1 1 1 ... 
#> - attr(*, \"class\")= chr \"plmm_design\" ## Example 3: PLINK data ## # \\donttest{ # process PLINK data temp_dir <- tempdir() unzip_example_data(outdir = temp_dir) #> Unzipped files are saved in /tmp/Rtmpm57oVf plink_data <- process_plink(data_dir = temp_dir, data_prefix = \"penncath_lite\", rds_dir = temp_dir, rds_prefix = \"imputed_penncath_lite\", # imputing the mode to address missing values impute_method = \"mode\", # overwrite existing files in temp_dir # (you can turn this feature off if you need to) overwrite = TRUE, # turning off parallelization - leaving this on causes problems knitting this vignette parallel = FALSE) #> #> Preprocessing penncath_lite data: #> Creating penncath_lite.rds #> #> There are 1401 observations and 4367 genomic features in the specified data files, representing chromosomes 1 - 22 #> There are a total of 3514 SNPs with missing values #> Of these, 13 are missing in at least 50% of the samples #> #> Imputing the missing (genotype) values using mode method #> #> process_plink() completed #> Processed files now saved as /tmp/Rtmpm57oVf/imputed_penncath_lite.rds # get outcome data penncath_pheno <- read.csv(find_example_data(path = 'penncath_clinical.csv')) outcome <- data.frame(FamID = as.character(penncath_pheno$FamID), CAD = penncath_pheno$CAD) unpen_predictors <- data.frame(FamID = as.character(penncath_pheno$FamID), sex = penncath_pheno$sex, age = penncath_pheno$age) # create design where sex and age are always included in the model pen_design <- create_design(data_file = plink_data, feature_id = \"FID\", rds_dir = temp_dir, new_file = \"std_penncath_lite\", add_outcome = outcome, outcome_id = \"FamID\", outcome_col = \"CAD\", add_predictor = unpen_predictors, predictor_id = \"FamID\", logfile = \"design\", # again, overwrite if needed; use with caution overwrite = TRUE) #> #> Aligning external data with the feature data by FamID #> Adding predictors from external data. 
#> Aligning IDs between fam and predictor files #> Column-wise combining data sets #> There are 62 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... 
#> Standardization completed at 2024-12-13 20:58:21 #> Done with standardization. File formatting in progress # examine the design - notice the components of this object pen_design_rds <- readRDS(pen_design) # }"},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_filebacked.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","title":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","text":"function create design matrix, outcome, penalty factor passed model fitting function","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_filebacked.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","text":"","code":"create_design_filebacked( data_file, rds_dir, obj, new_file, feature_id = NULL, add_outcome, outcome_id, outcome_col, na_outcome_vals = c(-9, NA_integer_), add_predictor = NULL, predictor_id = NULL, unpen = NULL, logfile = NULL, overwrite = FALSE, quiet = FALSE )"},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_filebacked.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","text":"data_file filepath rds file processed data (data process_plink() process_delim()) rds_dir path directory want create new '.rds' '.bk' files. obj RDS object read create_design() new_file User-specified filename (without .bk/.rds extension) --created .rds/.bk files. Must different existing .rds/.bk files folder. 
feature_id string specifying column data X (feature data) row IDs (e.g., identifiers row/sample/participant/, etc.). duplicates allowed. - PLINK data: string specifying ID column PLINK .fam file. Options \"IID\" (default) \"FID\" - filebacked data: character vector unique identifiers (IDs) row feature data (i.e., data processed process_delim()) - left NULL (default), X assumed row-order add_outcome. Note: assumption made error, calculations downstream incorrect. Pay close attention. add_outcome data frame matrix two columns: ID column column outcome value (used 'y' final design). IDs must characters, outcome must numeric. outcome_id string specifying name ID column 'add_outcome' outcome_col string specifying name phenotype column 'add_outcome' na_outcome_vals vector numeric values used code NA values outcome. Defaults c(-9, NA_integer_) (-9 matches PLINK conventions). add_predictor Optional (PLINK data ): matrix data frame used adding additional unpenalized covariates/predictors/features external file (i.e., PLINK file). matrix must one column ID column; columns aside ID used covariates design matrix. Columns must named. predictor_id Optional (PLINK data ): string specifying name column 'add_predictor' sample IDs. Required 'add_predictor' supplied. names used subset align external covariate supplied PLINK data. unpen Optional (delimited file data ): optional character vector names columns mark unpenalized (i.e., features always included model). Note: choose use option, X must column names. logfile Optional: name '.log' file written – Note: do not append '.log' to the filename; this is done automatically. overwrite Logical: existing .rds files overwritten? Defaults FALSE. quiet Logical: messages printed console silenced? 
Defaults FALSE","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_filebacked.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","text":"filepath created .rds file containing information model fitting, including standardized X model design information","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_in_memory.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to create a design with an in-memory X matrix — create_design_in_memory","title":"A function to create a design with an in-memory X matrix — create_design_in_memory","text":"function create design in-memory X matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_in_memory.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to create a design with an in-memory X matrix — create_design_in_memory","text":"","code":"create_design_in_memory(X, y, unpen = NULL)"},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_in_memory.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to create a design with an in-memory X matrix — create_design_in_memory","text":"X numeric matrix rows correspond observations (e.g., samples) columns correspond features. y numeric vector representing outcome model. Note: responsibility user ensure outcome_col X row order! unpen optional character vector names columns mark unpenalized (i.e., features always included model). 
Note: choose use option, X must column names.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_in_memory.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to create a design with an in-memory X matrix — create_design_in_memory","text":"list elements including standardized X model design information","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_log.html","id":null,"dir":"Reference","previous_headings":"","what":"create_log_file — create_log","title":"create_log_file — create_log","text":"create_log_file","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_log.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"create_log_file — create_log","text":"","code":"create_log(outfile, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/create_log.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"create_log_file — create_log","text":"outfile String specifying name --created file, without extension ... 
used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_log.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"create_log_file — create_log","text":"Nothing returned, instead text file suffix .log created.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Cross-validation for plmm — cv_plmm","title":"Cross-validation for plmm — cv_plmm","text":"Performs k-fold cross validation lasso-, MCP-, SCAD-penalized linear mixed models grid values regularization parameter lambda.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Cross-validation for plmm — cv_plmm","text":"","code":"cv_plmm( design, y = NULL, K = NULL, diag_K = NULL, eta_star = NULL, penalty = \"lasso\", type = \"blup\", gamma, alpha = 1, lambda_min, nlambda = 100, lambda, eps = 1e-04, max_iter = 10000, convex = TRUE, dfmax = NULL, warn = TRUE, init = NULL, cluster, nfolds = 5, seed, fold = NULL, returnY = FALSE, returnBiasDetails = FALSE, trace = FALSE, save_rds = NULL, save_fold_res = FALSE, return_fit = TRUE, compact_save = FALSE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Cross-validation for plmm — cv_plmm","text":"design first argument must one three things: (1) plmm_design object (created create_design()) (2) string file path design object (file path must end '.rds') (3) matrix data.frame object representing design matrix interest y Optional: case design matrix data.frame, user must also supply numeric outcome vector y argument. case, design y passed internally create_design(X = design, y = y). K Similarity matrix used rotate data. 
either (1) known matrix reflects covariance y, (2) estimate (Default \\(\\frac{1}{p}(XX^T)\\)), (3) list components 'd' 'u', returned choose_k(). diag_K Logical: K diagonal matrix? reflect observations unrelated, can treated unrelated. Defaults FALSE. Note: plmm() check see matrix diagonal. want use diagonal K matrix, must set diag_K = TRUE. eta_star Optional argument input specific eta term rather estimate data. K known covariance matrix full rank, 1. penalty penalty applied model. Either \"lasso\" (default), \"SCAD\", \"MCP\". type character argument indicating returned predict.plmm(). type == 'lp', predictions based linear predictor, X beta. type == 'blup', predictions based sum linear predictor estimated random effect (BLUP). Defaults 'blup', shown superior prediction method many applications. gamma tuning parameter MCP/SCAD penalty (see details). Default 3 MCP 3.7 SCAD. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 is not supported; alpha may be arbitrarily small, but not exactly 0. lambda_min smallest value lambda, fraction lambda.max. Default .001 number observations larger number covariates .05 otherwise. nlambda Length sequence lambda. Default 100. lambda user-specified sequence lambda values. default, sequence values length nlambda computed, equally spaced log scale. eps Convergence threshold. algorithm iterates RMSD change linear predictors coefficient less eps. Default 1e-4. max_iter Maximum number iterations (total across entire path). Default 10000. convex (future idea; yet incorporated) Calculate index objective function ceases locally convex? Default TRUE. dfmax (future idea; yet incorporated) Upper bound number nonzero coefficients. Default upper bound. However, large data sets, computational burden may heavy models large number nonzero coefficients. warn Return warning messages failures converge model saturation? 
Default TRUE. init Initial values coefficients. Default 0 columns X. cluster cv_plmm() can run parallel across cluster using parallel package. cluster must set advance using parallel::makeCluster(). cluster must passed cv_plmm(). nfolds number cross-validation folds. Default 5. seed may set seed random number generator order obtain reproducible results. fold fold observation belongs . default, observations randomly assigned. returnY cv_plmm() return linear predictors cross-validation folds? Default FALSE; TRUE, return matrix element row i, column j fitted value observation fold observation excluded fit, jth value lambda. returnBiasDetails Logical: cross-validation bias (numeric value) loss (n x p matrix) returned? Defaults FALSE. trace set TRUE, inform user progress announcing beginning CV fold. Default FALSE. save_rds Optional: filepath name without '.rds' suffix specified (e.g., save_rds = \"~/dir/my_results\"), model results saved provided location (e.g., \"~/dir/my_results.rds\"). Defaults NULL, which does not save the result. save_fold_res Optional: logical value indicating whether results (loss predicted values) CV fold saved? TRUE, two '.rds' files saved ('loss' 'yhat') created directory 'save_rds'. files updated fold done. Defaults FALSE. return_fit Optional: logical value indicating whether fitted model returned plmm object current (assumed interactive) session. Defaults TRUE. compact_save Optional: TRUE, three separate .rds files saved: one 'beta_vals', one 'K', one everything else (see ). Defaults FALSE. Note: must specify save_rds argument called. ... 
Additional arguments plmm_fit","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Cross-validation for plmm — cv_plmm","text":"list 12 items: type: type prediction used ('lp' 'blup') cve: numeric vector cross validation error (CVE) value lambda cvse: numeric vector estimated standard error associated value cve fold: numeric n length vector integers indicating fold observation assigned lambda: numeric vector lambda values fit: overall fit object, including predictors; list returned plmm() min: index corresponding value lambda minimizes cve lambda_min: lambda value cve minimized min1se: index corresponding value lambda within standard error minimizes cve lambda1se: largest value lambda error within 1 standard error minimum. null.dev: numeric value representing deviance intercept-only model. supplied lambda sequence, quantity may meaningful. estimated_Sigma: n x n matrix representing estimated covariance matrix.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Cross-validation for plmm — cv_plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cv_fit <- cv_plmm(design = admix_design) print(summary(cv_fit)) #> lasso-penalized model with n=197 and p=101 #> At minimum cross-validation error (lambda=0.2493): #> ------------------------------------------------- #> Nonzero coefficients: 5 #> Cross-validation error (deviance): 2.00 #> Scale estimate (sigma): 1.413 plot(cv_fit) # Note: for examples with filebacked data, see the filebacking vignette # https://pbreheny.github.io/plmmr/articles/filebacking.html"},{"path":"https://pbreheny.github.io/plmmr/reference/cvf.html","id":null,"dir":"Reference","previous_headings":"","what":"Cross-validation internal function for cv_plmm — cvf","title":"Cross-validation internal function for cv_plmm — 
cvf","text":"Internal function cv_plmm calls plmm fold subset original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cvf.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Cross-validation internal function for cv_plmm — cvf","text":"","code":"cvf(i, fold, type, cv_args, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/cvf.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Cross-validation internal function for cv_plmm — cvf","text":"Fold number excluded fit. fold n-length vector fold-assignments. type character argument indicating returned predict.plmm. type == 'lp' predictions based linear predictor, $X beta$. type == 'individual' predictions based linear predictor plus estimated random effect (BLUP). cv_args List additional arguments passed plmm. ... Optional arguments predict_within_cv","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cvf.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Cross-validation internal function for cv_plmm — cvf","text":"list three elements: numeric vector loss value lambda numeric value indicating number lambda values used numeric value predicted outcome (y hat) values lambda","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/eigen_K.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — eigen_K","title":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — eigen_K","text":"function take eigendecomposition K Note: faster taking SVD X p >> n","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/eigen_K.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — 
eigen_K","text":"","code":"eigen_K(std_X, fbm_flag)"},{"path":"https://pbreheny.github.io/plmmr/reference/eigen_K.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — eigen_K","text":"std_X standardized design matrix, stored big.matrix object. fbm_flag Logical: std_X FBM object? Passed plmm().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/eigen_K.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — eigen_K","text":"list eigenvectors eigenvalues K","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/estimate_eta.html","id":null,"dir":"Reference","previous_headings":"","what":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","title":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","text":"Estimate eta (used rotating data) function called internally plmm()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/estimate_eta.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","text":"","code":"estimate_eta(n, s, U, y, eta_star)"},{"path":"https://pbreheny.github.io/plmmr/reference/estimate_eta.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","text":"n number observations s singular values K, realized relationship matrix U left-singular vectors standardized design matrix y Continuous outcome 
vector.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/estimate_eta.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","text":"numeric value estimated value eta, variance parameter","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/fbm2bm.html","id":null,"dir":"Reference","previous_headings":"","what":"Functions to convert between FBM and big.matrix type objects — fbm2bm","title":"Functions to convert between FBM and big.matrix type objects — fbm2bm","text":"Functions convert FBM big.matrix type objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/fbm2bm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Functions to convert between FBM and big.matrix type objects — fbm2bm","text":"","code":"fbm2bm(fbm, desc = FALSE)"},{"path":"https://pbreheny.github.io/plmmr/reference/fbm2bm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Functions to convert between FBM and big.matrix type objects — fbm2bm","text":"fbm FBM object; see bigstatsr::FBM() details desc Logical: descriptor file desired (opposed filebacked big matrix)? 
Defaults FALSE.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/fbm2bm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Functions to convert between FBM and big.matrix type objects — fbm2bm","text":"big.matrix - see bigmemory::filebacked.big.matrix() details","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/file_sans_ext.html","id":null,"dir":"Reference","previous_headings":"","what":"helper function to get the file path of a file without the extension — file_sans_ext","title":"helper function to get the file path of a file without the extension — file_sans_ext","text":"helper function get file path file without extension","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/file_sans_ext.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"helper function to get the file path of a file without the extension — file_sans_ext","text":"","code":"file_sans_ext(path)"},{"path":"https://pbreheny.github.io/plmmr/reference/file_sans_ext.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"helper function to get the file path of a file without the extension — file_sans_ext","text":"path path file","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/file_sans_ext.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"helper function to get the file path of a file without the extension — file_sans_ext","text":"path_sans_ext filepath without extension","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to help with accessing example PLINK files — find_example_data","title":"A function to help with accessing example PLINK files — find_example_data","text":"function help accessing example PLINK 
files","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to help with accessing example PLINK files — find_example_data","text":"","code":"find_example_data(path, parent = FALSE)"},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to help with accessing example PLINK files — find_example_data","text":"path Argument (string) specifying path (filename) external data file extdata/ parent path=TRUE user wants name parent directory file located, set parent=TRUE. Defaults FALSE.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to help with accessing example PLINK files — find_example_data","text":"path=NULL, character vector file names returned. path given, character string full file path","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A function to help with accessing example PLINK files — find_example_data","text":"","code":"find_example_data(parent = TRUE) #> [1] \"/home/runner/work/_temp/Library/plmmr/extdata\""},{"path":"https://pbreheny.github.io/plmmr/reference/get_data.html","id":null,"dir":"Reference","previous_headings":"","what":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. — get_data","title":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. 
— get_data","text":"Read processed data function intended called either process_plink() process_delim() called .","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/get_data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. — get_data","text":"","code":"get_data(path, returnX = FALSE, trace = TRUE)"},{"path":"https://pbreheny.github.io/plmmr/reference/get_data.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. — get_data","text":"path file path RDS object containing processed data. add '.rds' extension path. returnX Logical: design matrix returned numeric matrix stored memory. default, FALSE. trace Logical: trace messages shown? Default TRUE.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/get_data.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. — get_data","text":"list components: std_X, column-standardized design matrix either (1) numeric matrix (2) filebacked matrix (FBM). See bigstatsr::FBM() bigsnpr::bigSnp-class documentation details. (PLINK data) fam, data frame containing pedigree information (like .fam file PLINK) (PLINK data) map, data frame containing feature information (like .bim file PLINK) ns: vector indicating columns X contain nonsingular features (.e., features variance != 0).
center: vector values centering column X scale: vector values scaling column X","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/get_hostname.html","id":null,"dir":"Reference","previous_headings":"","what":"a function to return the computer's host name — get_hostname","title":"a function to return the computer's host name — get_hostname","text":"function return computer's host name","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/get_hostname.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a function to return the computer's host name — get_hostname","text":"","code":"get_hostname()"},{"path":"https://pbreheny.github.io/plmmr/reference/get_hostname.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a function to return the computer's host name — get_hostname","text":"String hostname current machine","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/impute_snp_data.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to impute SNP data — impute_snp_data","title":"A function to impute SNP data — impute_snp_data","text":"function impute SNP data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/impute_snp_data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to impute SNP data — impute_snp_data","text":"","code":"impute_snp_data( obj, X, impute, impute_method, parallel, outfile, quiet, seed = as.numeric(Sys.Date()), ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/impute_snp_data.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to impute SNP data — impute_snp_data","text":"obj bigSNP object (created read_plink_files()) X matrix genotype data returned name_and_count_bigsnp impute Logical: data imputed? Default TRUE. impute_method 'impute' = TRUE, argument specify kind imputation desired. 
Options : mode (default): Imputes frequent call. See bigsnpr::snp_fastImputeSimple() details. random: Imputes sampling according allele frequencies. mean0: Imputes rounded mean. mean2: Imputes mean rounded 2 decimal places. xgboost: Imputes using algorithm based local XGBoost models. See bigsnpr::snp_fastImpute() details. Note: can take several minutes, even relatively small data set. parallel Logical: computations within function run parallel? Defaults TRUE. See count_cores() ?bigparallelr::assert_cores details. particular, user aware much parallelization can make computations slower. outfile Optional: name (character string) prefix logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. quiet Logical: messages printed console? Defaults TRUE seed Numeric value passed seed impute_method = 'xgboost'. Defaults .numeric(Sys.Date()) ... Optional: additional arguments bigsnpr::snp_fastImpute() (relevant impute_method = \"xgboost\")","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/impute_snp_data.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to impute SNP data — impute_snp_data","text":"Nothing returned, obj$genotypes overwritten imputed version data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_samples.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to align genotype and phenotype data — index_samples","title":"A function to align genotype and phenotype data — index_samples","text":"function align genotype phenotype data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_samples.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to align genotype and phenotype data — index_samples","text":"","code":"index_samples( obj, rds_dir, indiv_id, add_outcome, outcome_id, outcome_col, na_outcome_vals, outfile, quiet 
)"},{"path":"https://pbreheny.github.io/plmmr/reference/index_samples.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to align genotype and phenotype data — index_samples","text":"obj object created process_plink() rds_dir path directory want create new '.rds' '.bk' files. indiv_id character string indicating ID column name 'fam' element genotype data list. Defaults 'sample.ID', equivalent 'IID' PLINK. option 'family.ID', equivalent 'FID' PLINK. add_outcome data frame least two columns: ID column phenotype column outcome_id string specifying name ID column pheno outcome_col string specifying name phenotype column pheno. column used default y argument 'plmm()'. na_outcome_vals vector numeric values used code NA values outcome. Defaults c(-9, NA_integer) (-9 matches PLINK conventions). outfile string name filepath log file quiet Logical: messages printed console? Defaults FALSE (leaves print messages ...","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_samples.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to align genotype and phenotype data — index_samples","text":"list two items: data.table rows corresponding samples genotype phenotype available. 
numeric vector indices indicating samples 'complete' (.e., samples add_outcome corresponding data PLINK files)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_std_X.html","id":null,"dir":"Reference","previous_headings":"","what":"Helper function to index standardized data — index_std_X","title":"Helper function to index standardized data — index_std_X","text":"Helper function index standardized data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_std_X.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Helper function to index standardized data — index_std_X","text":"","code":"index_std_X(std_X_p, non_genomic)"},{"path":"https://pbreheny.github.io/plmmr/reference/index_std_X.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Helper function to index standardized data — index_std_X","text":"std_X_p number features standardized matrix data (may filebacked) non_genomic Integer vector columns std_X representing non-genomic data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_std_X.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Helper function to index standardized data — index_std_X","text":"list indices","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lam_names.html","id":null,"dir":"Reference","previous_headings":"","what":"Generate nicely formatted lambda vec — lam_names","title":"Generate nicely formatted lambda vec — lam_names","text":"Generate nicely formatted lambda vec","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lam_names.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Generate nicely formatted lambda vec — 
lam_names","text":"","code":"lam_names(l)"},{"path":"https://pbreheny.github.io/plmmr/reference/lam_names.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Generate nicely formatted lambda vec — lam_names","text":"l Vector lambda values.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lam_names.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Generate nicely formatted lambda vec — lam_names","text":"character vector formatted lambda value names","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lasso.html","id":null,"dir":"Reference","previous_headings":"","what":"helper function to implement lasso penalty — lasso","title":"helper function to implement lasso penalty — lasso","text":"helper function implement lasso penalty","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lasso.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"helper function to implement lasso penalty — lasso","text":"","code":"lasso(z, l1, l2, v)"},{"path":"https://pbreheny.github.io/plmmr/reference/lasso.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"helper function to implement lasso penalty — lasso","text":"z solution active set feature l1 upper bound l2 lower bound v 'xtx' term","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lasso.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"helper function to implement lasso penalty — lasso","text":"numeric vector lasso-penalized coefficient estimates within given bounds","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/log_lik.html","id":null,"dir":"Reference","previous_headings":"","what":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","title":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","text":"function 
allows evaluate negative log-likelihood linear mixed model assumption null model order estimate variance parameter, eta.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/log_lik.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","text":"","code":"log_lik(eta, n, s, U, y, rot_y = NULL)"},{"path":"https://pbreheny.github.io/plmmr/reference/log_lik.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","text":"eta proportion variance outcome attributable causal SNP effects. words, signal--noise ratio. n number observations s singular values K, realized relationship matrix U left-singular vectors standardized design matrix y Continuous outcome vector. rot_y Optional: y already rotated, can supplied.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/log_lik.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","text":"value log-likelihood PLMM, evaluated supplied parameters","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/name_and_count_bigsnp.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","title":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","text":"helper function label summarize contents bigSNP","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/name_and_count_bigsnp.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","text":"","code":"name_and_count_bigsnp(obj, id_var, 
quiet, outfile)"},{"path":"https://pbreheny.github.io/plmmr/reference/name_and_count_bigsnp.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","text":"obj bigSNP object, possibly subset add_external_phenotype() id_var String specifying column PLINK .fam file unique sample identifiers. Options \"IID\" (default) \"FID\". quiet Logical: messages printed console? Defaults TRUE outfile string name .log file","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/name_and_count_bigsnp.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","text":"list components: counts: column-wise summary minor allele counts 'genotypes' obj: modified bigSNP list additional components X: obj$genotypes FBM pos: obj$map$physical.pos vector","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","text":"Fit linear mixed model via non-convex penalized maximum likelihood.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","text":"","code":"plmm( design, y = NULL, K = NULL, diag_K = NULL, eta_star = NULL, penalty = \"lasso\", init = NULL, gamma, alpha = 1, dfmax = NULL, lambda_min, nlambda = 100, lambda, eps = 1e-04, max_iter = 10000, convex = TRUE, warn = TRUE, trace = FALSE, save_rds = NULL, compact_save = FALSE, return_fit = NULL, ... 
)"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","text":"design first argument must one three things: (1) plmm_design object (created create_design()) (2) string file path design object (file path must end '.rds') (3) matrix data.frame object representing design matrix interest y Optional: case design matrix data.frame, user must also supply numeric outcome vector y argument. case, design y passed internally create_design(X = design, y = y). K Similarity matrix used rotate data. either : (1) known matrix reflects covariance y, (2) estimate (Default \\(\\frac{1}{p}(XX^T)\\)), (3) list components 'd' 'U', returned previous plmm() model fit data. diag_K Logical: K diagonal matrix? reflect observations unrelated, can treated unrelated. Defaults FALSE. Note: plmm() check see matrix diagonal. want use diagonal K matrix, must set diag_K = TRUE. eta_star Optional argument input specific eta term rather estimate data. K known covariance matrix full rank, 1. penalty penalty applied model. Either \"lasso\" (default), \"SCAD\", \"MCP\". init Initial values coefficients. Default 0 columns X. gamma tuning parameter MCP/SCAD penalty (see details). Default 3 MCP 3.7 SCAD. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. dfmax (Future idea; yet incorporated): Upper bound number nonzero coefficients. Default upper bound. However, large data sets, computational burden may heavy models large number nonzero coefficients. lambda_min smallest value lambda, fraction lambda.max. Default .001 number observations larger number covariates .05 otherwise. nlambda Length sequence lambda. Default 100. 
lambda user-specified sequence lambda values. default, sequence values length nlambda computed, equally spaced log scale. eps Convergence threshold. algorithm iterates RMSD change linear predictors coefficient less eps. Default 1e-4. max_iter Maximum number iterations (total across entire path). Default 10000. convex (Future idea; yet incorporated): Calculate index objective function ceases locally convex? Default TRUE. warn Return warning messages failures converge model saturation? Default TRUE. trace set TRUE, inform user progress announcing beginning step modeling process. Default FALSE. save_rds Optional: filepath name without '.rds' suffix specified (e.g., save_rds = \"~/dir/my_results\"), model results saved provided location (e.g., \"~/dir/my_results.rds\"). Defaults NULL, save result. compact_save Optional: TRUE, four separate .rds files saved: one 'beta_vals', one 'K', one linear predictors, one everything else (see ). Defaults FALSE. Note: must specify save_rds argument called. return_fit Optional: logical value indicating whether fitted model returned plmm object current (assumed interactive) session. Defaults TRUE -memory data, defaults FALSE filebacked data. ... Additional optional arguments plmm_checks()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","text":"list includes 19 items: beta_vals: matrix estimated coefficients original scale. Rows predictors, columns values lambda std_scale_beta: matrix estimated coefficients ~standardized~ scale. returned compact_save = TRUE. std_X_details: list 3 items: center & scale values used center/scale data, vector ('ns') nonsingular columns original data. Nonsingular columns standardized (definition), removed analysis. std_X: standardized design matrix; data filebacked, object filebacked.big.matrix bigmemory package.
Note: std_X saved/returned return_fit = FALSE. y: outcome vector used model fitting. p: total number columns design matrix (including singular columns). plink_flag: logical flag: data come PLINK files? lambda: numeric vector lasso tuning parameter values used model fitting. eta: number (double) 0 1 representing estimated proportion variance outcome attributable population/correlation structure linear_predictors: matrix resulting product stdrot_X estimated coefficients ~rotated~ scale. penalty: character string indicating penalty model fit (e.g., 'MCP') gamma: numeric value indicating tuning parameter used SCAD lasso penalties used. relevant lasso models. alpha: numeric value indicating elastic net tuning parameter. loss: vector numeric values loss value lambda (calculated ~rotated~ scale) penalty_factor: vector indicators corresponding predictor, 1 = predictor penalized. ns_idx: vector indices predictors non-singular features (.e., features variation). iter: numeric vector number iterations needed model fitting value lambda converged: vector logical values indicating whether model fitting converged value lambda K: list 2 elements, s U — s: vector eigenvalues relatedness matrix; see relatedness_mat() details. U: matrix eigenvectors relatedness matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. 
— plmm","text":"","code":"# using admix data admix_design <- create_design(X = admix$X, y = admix$y) fit_admix1 <- plmm(design = admix_design) s1 <- summary(fit_admix1, idx = 50) print(s1) #> lasso-penalized regression model with n=197, p=101 at lambda=0.01426 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 88 #> ------------------------------------------------- plot(fit_admix1) # Note: for examples with large data that are too big to fit in memory, # see the article \"PLINK files/file-backed matrices\" on our website # https://pbreheny.github.io/plmmr/articles/filebacking.html"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_checks.html","id":null,"dir":"Reference","previous_headings":"","what":"plmm_checks — plmm_checks","title":"plmm_checks — plmm_checks","text":"plmm_checks","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_checks.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"plmm_checks — plmm_checks","text":"","code":"plmm_checks( design, K = NULL, diag_K = NULL, eta_star = NULL, penalty = \"lasso\", init = NULL, gamma, alpha = 1, dfmax = NULL, trace = FALSE, save_rds = NULL, return_fit = TRUE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_checks.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"plmm_checks — plmm_checks","text":"design design object, created create_design() K Similarity matrix used rotate data. either (1) known matrix reflects covariance y, (2) estimate (Default \\(\\frac{1}{p}(XX^T)\\)), (3) list components 'd' 'u', returned choose_k(). diag_K Logical: K diagonal matrix? reflect observations unrelated, can treated unrelated. Defaults FALSE. Note: plmm() check see matrix diagonal. want use diagonal K matrix, must set diag_K = TRUE. eta_star Optional argument input specific eta term rather estimate data. 
K known covariance matrix full rank, 1. penalty penalty applied model. Either \"lasso\" (default), \"SCAD\", \"MCP\". init Initial values coefficients. Default 0 columns X. gamma tuning parameter MCP/SCAD penalty (see details). Default 3 MCP 3.7 SCAD. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. dfmax Option added soon: Upper bound number nonzero coefficients. Default upper bound. However, large data sets, computational burden may heavy models large number nonzero coefficients. trace set TRUE, inform user progress announcing beginning step modeling process. Default FALSE. save_rds Optional: filepath name specified (e.g., save_rds = \"~/dir/my_results.rds\"), model results saved provided location. Defaults NULL, save result. return_fit Optional: logical value indicating whether fitted model returned plmm object current (assumed interactive) session. Defaults TRUE. ... Additional arguments get_data()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_checks.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"plmm_checks — plmm_checks","text":"list parameters pass model fitting.
list includes standardized design matrix, outcome, meta-data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_fit.html","id":null,"dir":"Reference","previous_headings":"","what":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","title":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","text":"PLMM fit: function fits PLMM using values returned plmm_prep()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_fit.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","text":"","code":"plmm_fit( prep, y, std_X_details, eta_star, penalty_factor, fbm_flag, penalty, gamma = 3, alpha = 1, lambda_min, nlambda = 100, lambda, eps = 1e-04, max_iter = 10000, convex = TRUE, dfmax = NULL, init = NULL, warn = TRUE, returnX = TRUE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_fit.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","text":"prep list returned plmm_prep y original (centered) outcome vector. Need intercept estimate std_X_details list components 'center' (values used center X), 'scale' (values used scale X), 'ns' (indices nonsignular columns X) eta_star ratio variances (passed plmm()) penalty_factor multiplicative factor penalty applied coefficient. supplied, penalty_factor must numeric vector length equal number columns X. purpose penalty_factor apply differential penalization coefficients thought likely others model. particular, penalty_factor can 0, case coefficient always model without shrinkage. fbm_flag Logical: std_X FBM object? Passed plmm(). penalty penalty applied model. Either \"MCP\" (default), \"SCAD\", \"lasso\". gamma tuning parameter MCP/SCAD penalty (see details). 
Default 3 MCP 3.7 SCAD. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. lambda_min smallest value lambda, fraction lambda.max. Default .001 number observations larger number covariates .05 otherwise. nlambda Length sequence lambda. Default 100. lambda user-specified sequence lambda values. default, sequence values length nlambda computed, equally spaced log scale. eps Convergence threshold. algorithm iterates RMSD change linear predictors coefficient less eps. Default 1e-4. max_iter Maximum number iterations (total across entire path). Default 10000. convex (future idea; yet incorporated) convex Calculate index objective function ceases locally convex? Default TRUE. dfmax (future idea; yet incorporated) Upper bound number nonzero coefficients. Default upper bound. However, large data sets, computational burden may heavy models large number nonzero coefficients. init Initial values coefficients. Default 0 columns X. warn Return warning messages failures converge model saturation? Default TRUE. returnX Return standardized design matrix along fit? default, option turned X 100 MB, turned larger matrices preserve memory. ... 
Additional arguments can passed biglasso::biglasso_simple_path()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_fit.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","text":"list components: std_scale_beta: coefficients estimated scale std_X centered_y: y-values 'centered' mean 0 s U, values vectors eigendecomposition K lambda: vector tuning parameter values linear_predictors: product stdrot_X b (linear predictors transformed restandardized scale) eta: number (double) 0 1 representing estimated proportion variance outcome attributable population/correlation structure. iter: numeric vector number iterations needed model fitting value lambda converged: vector logical values indicating whether model fitting converged value lambda loss: vector numeric values loss value lambda (calculated ~rotated~ scale) penalty: character string indicating penalty model fit (e.g., 'MCP') penalty_factor: vector indicators corresponding predictor, 1 = predictor penalized. gamma: numeric value indicating tuning parameter used SCAD lasso penalties used. relevant lasso models. alpha: numeric value indicating elastic net tuning parameter. ns: indices nonsingular values X feature_names: formatted column names design matrix nlambda: number lambda values used model fitting eps: tolerance ('epsilon') used model fitting max_iter: max. number iterations per model fit warn: logical - warnings given model fit converge? 
init: initial values model fitting trace: logical - messages printed console models fit?","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_format.html","id":null,"dir":"Reference","previous_headings":"","what":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","title":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","text":"PLMM format: function format output model constructed plmm_fit","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_format.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","text":"","code":"plmm_format(fit, p, std_X_details, fbm_flag, plink_flag)"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_format.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","text":"fit list parameters describing output model constructed plmm_fit p number features original data (including constant features) std_X_details list 3 items: * 'center': centering values columns X * 'scale': scaling values non-singular columns X * 'ns': indices nonsingular columns std_X fbm_flag Logical: corresponding design matrix filebacked? Passed plmm(). plink_flag Logical: data come PLINK files? Note: flag matters non-genomic features handled PLINK files – data PLINK files, unpenalized columns counted p argument. delimited files, p include unpenalized columns.
difference implications untransform() function determines appropriate dimensions estimated coefficient matrix returns.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_format.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","text":"list components: beta_vals: matrix estimated coefficients original scale. Rows predictors, columns values lambda lambda: numeric vector lasso tuning parameter values used model fitting. eta: number (double) 0 1 representing estimated proportion variance outcome attributable population/correlation structure. s: vector eigenvalues relatedness matrix K; see relatedness_mat() details. U: matrix eigenvectors relatedness matrix K rot_y: vector outcome values rotated scale. scale model fit. linear_predictors: matrix resulting product stdrot_X estimated coefficients ~rotated~ scale. penalty: character string indicating penalty model fit (e.g., 'MCP') gamma: numeric value indicating tuning parameter used SCAD lasso penalties used. relevant lasso models. alpha: numeric value indicating elastic net tuning parameter. loss: vector numeric values loss value lambda (calculated ~rotated~ scale) penalty_factor: vector indicators corresponding predictor, 1 = predictor penalized. ns_idx: vector indices predictors nonsingular features (.e., variation).
iter: numeric vector number iterations needed model fitting value lambda converged: vector logical values indicating whether model fitting converged value lambda","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":null,"dir":"Reference","previous_headings":"","what":"Loss method for ","title":"Loss method for ","text":"Loss method \"plmm\" class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Loss method for ","text":"","code":"plmm_loss(y, yhat)"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Loss method for ","text":"y Observed outcomes (response) vector yhat Predicted outcomes (response) vector","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Loss method for ","text":"numeric vector squared-error loss values given observed predicted outcomes","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Loss method for ","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design, K = relatedness_mat(admix$X)) yhat <- predict(object = fit, newX = admix$X, type = 'lp', lambda = 0.05) head(plmm_loss(yhat = yhat, y = admix$y)) #> [,1] #> [1,] 0.81638401 #> [2,] 0.09983799 #> [3,] 0.50281622 #> [4,] 0.14234359 #> [5,] 2.03696796 #> [6,] 2.72044268"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_prep.html","id":null,"dir":"Reference","previous_headings":"","what":"PLMM prep: a function to run checks, SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","title":"PLMM prep: a function to run checks, 
SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","text":"PLMM prep: function run checks, SVD, rotation prior fitting PLMM model internal function cv_plmm","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_prep.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"PLMM prep: a function to run checks, SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","text":"","code":"plmm_prep( std_X, std_X_n, std_X_p, genomic = 1:std_X_p, n, p, centered_y, k = NULL, K = NULL, diag_K = NULL, eta_star = NULL, fbm_flag, penalty_factor = rep(1, ncol(std_X)), trace = NULL, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_prep.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"PLMM prep: a function to run checks, SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","text":"std_X Column standardized design matrix. May include clinical covariates non-SNP data. std_X_n number observations std_X (integer) std_X_p number features std_X (integer) genomic numeric vector indices indicating columns standardized X genomic covariates. Defaults columns. n number instances original design matrix X. altered standardization. p number features original design matrix X, including constant features centered_y Continuous outcome vector, centered. k integer specifying number singular values used approximation rotated design matrix. argument passed RSpectra::svds(). Defaults min(n, p) - 1, n p dimensions standardized design matrix. K Similarity matrix used rotate data. either known matrix reflects covariance y, estimate (Default \\(\\frac{1}{p}(XX^T)\\), X standardized). can also list, components d u (returned choose_k) diag_K Logical: K diagonal matrix? reflect observations unrelated, can treated unrelated. Passed plmm(). 
eta_star Optional argument input specific eta term rather estimate data. K known covariance matrix full rank, 1. fbm_flag Logical: std_X FBM type object? set internally plmm(). trace set TRUE, inform user progress announcing beginning step modeling process. Default FALSE. ... used yet","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_prep.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"PLMM prep: a function to run checks, SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","text":"List components: centered_y: vector centered outcomes std_X: standardized design matrix K: list 2 elements. (1) s: vector eigenvalues K, (2) U: eigenvectors K (left singular vectors X). eta: numeric value estimated eta parameter trace: logical.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmmr-package.html","id":null,"dir":"Reference","previous_headings":"","what":"plmmr: Penalized Linear Mixed Models for Correlated Data — plmmr-package","title":"plmmr: Penalized Linear Mixed Models for Correlated Data — plmmr-package","text":"Fits penalized linear mixed models correct unobserved confounding factors. 'plmmr' infers corrects presence unobserved confounding effects population stratification environmental heterogeneity. fits linear model via penalized maximum likelihood. Originally designed multivariate analysis single nucleotide polymorphisms (SNPs) measured genome-wide association study (GWAS), 'plmmr' eliminates need subpopulation-specific analyses post-analysis p-value adjustments. Functions appropriate processing 'PLINK' files also supplied. examples, see package homepage. 
https://pbreheny.github.io/plmmr/.","code":""},{"path":[]},{"path":"https://pbreheny.github.io/plmmr/reference/plmmr-package.html","id":"author","dir":"Reference","previous_headings":"","what":"Author","title":"plmmr: Penalized Linear Mixed Models for Correlated Data — plmmr-package","text":"Maintainer: Patrick J. Breheny patrick-breheny@uiowa.edu (ORCID) Authors: Tabitha K. Peter tabitha-peter@uiowa.edu (ORCID) Anna C. Reisetter anna-reisetter@uiowa.edu (ORCID) Yujing Lu","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot method for cv_plmm class — plot.cv_plmm","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"Plot method cv_plmm class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"","code":"# S3 method for class 'cv_plmm' plot( x, log.l = TRUE, type = c(\"cve\", \"rsq\", \"scale\", \"snr\", \"all\"), selected = TRUE, vertical.line = TRUE, col = \"red\", ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"x object class cv_plmm log.l Logical indicate plot returned natural log scale. Defaults log.l = TRUE. type Type plot return. Defaults \"cve.\" selected Logical indicate variables plotted. Defaults TRUE. vertical.line Logical indicate whether vertical line plotted minimum/maximum value. Defaults TRUE. col Color vertical line, plotted. Defaults \"red.\" ... 
Additional arguments.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"Nothing returned; instead, plot drawn representing relationship tuning parameter 'lambda' value (x-axis) cross validation error (y-axis).","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cvfit <- cv_plmm(design = admix_design) plot(cvfit)"},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot method for plmm class — plot.plmm","title":"Plot method for plmm class — plot.plmm","text":"Plot method plmm class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot method for plmm class — plot.plmm","text":"","code":"# S3 method for class 'plmm' plot(x, alpha = 1, log.l = FALSE, shade = TRUE, col, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot method for plmm class — plot.plmm","text":"x object class plmm alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. log.l Logical indicate plot returned natural log scale. Defaults log.l = FALSE. shade Logical indicate whether local nonconvex region shaded. Defaults TRUE. col Vector colors coefficient lines. ... 
Additional arguments.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot method for plmm class — plot.plmm","text":"Nothing returned; instead, plot coefficient paths drawn value lambda (one 'path' coefficient).","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot method for plmm class — plot.plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design) plot(fit) plot(fit, log.l = TRUE)"},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Predict method for plmm class — predict.plmm","title":"Predict method for plmm class — predict.plmm","text":"Predict method plmm class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Predict method for plmm class — predict.plmm","text":"","code":"# S3 method for class 'plmm' predict( object, newX, type = c(\"blup\", \"coefficients\", \"vars\", \"nvars\", \"lp\"), lambda, idx = 1:length(object$lambda), ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Predict method for plmm class — predict.plmm","text":"object object class plmm. newX Matrix values predictions made (used type=\"coefficients\" type settings predict). can either FBM object 'matrix' object. Note: Columns argument must named! type character argument indicating type prediction returned. Options \"lp,\" \"coefficients,\" \"vars,\" \"nvars,\" \"blup.\" See details. lambda numeric vector regularization parameter lambda values predictions requested. 
idx Vector indices penalty parameter lambda predictions required. default, indices returned. ... Additional optional arguments","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Predict method for plmm class — predict.plmm","text":"Depends type - see Details","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Predict method for plmm class — predict.plmm","text":"Define beta-hat coefficients estimated value lambda minimizes cross-validation error (CVE). options type follows: 'lp' (default): uses product newX beta-hat predict new values outcome. incorporate correlation structure data. stats folks, simply linear predictor. 'blup' (acronym Best Linear Unbiased Predictor): adds 'lp' value represents estimated random effect. addition way incorporating estimated correlation structure data prediction outcome. 'coefficients': returns estimated beta-hat 'vars': returns indices variables (e.g., SNPs) nonzero coefficients value lambda. EXCLUDES intercept. 'nvars': returns number variables (e.g., SNPs) nonzero coefficients value lambda. EXCLUDES intercept.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Predict method for plmm class — predict.plmm","text":"","code":"set.seed(123) train_idx <- sample(1:nrow(admix$X), 100) # Note: ^ shuffling is important here! Keeps test and train groups comparable. 
train <- list(X = admix$X[train_idx,], y = admix$y[train_idx]) train_design <- create_design(X = train$X, y = train$y) test <- list(X = admix$X[-train_idx,], y = admix$y[-train_idx]) fit <- plmm(design = train_design) # make predictions for all lambda values pred1 <- predict(object = fit, newX = test$X, type = \"lp\") pred2 <- predict(object = fit, newX = test$X, type = \"blup\") # look at mean squared prediction error mspe <- apply(pred1, 2, function(c){crossprod(test$y - c)/length(c)}) min(mspe) #> [1] 2.87754 mspe_blup <- apply(pred2, 2, function(c){crossprod(test$y - c)/length(c)}) min(mspe_blup) # BLUP is better #> [1] 2.128471 # compare the MSPE of our model to a null model, for reference # null model = intercept only -> y_hat is always mean(y) crossprod(mean(test$y) - test$y)/length(test$y) #> [,1] #> [1,] 6.381748"},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":null,"dir":"Reference","previous_headings":"","what":"Predict method to use in cross-validation (within cvf) — predict_within_cv","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"Predict method use cross-validation (within cvf)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"","code":"predict_within_cv( fit, trainX, trainY = NULL, testX, std_X_details, type, fbm = FALSE, plink_flag = FALSE, Sigma_11 = NULL, Sigma_21 = NULL, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"fit list components returned plmm_fit. trainX training data, pre-standardization pre-rotation trainY training outcome, centered. 
needed type = 'blup' testX design matrix used computing predicted values (.e, test data). std_X_details list 3 items: 'center': centering values columns X 'scale': scaling values non-singular columns X 'ns': indices nonsingular columns std_X. Note: vector really need ! type character argument indicating type prediction returned. Passed cvf(), Options \"lp,\" \"coefficients,\" \"vars,\" \"nvars,\" \"blup.\" See details. fbm Logical: trainX FBM object? , function expects testX also FBM. two X matrices must stored way. Sigma_11 Variance-covariance matrix training data. Extracted estimated_Sigma generated using observations. Required type == 'blup'. Sigma_21 Covariance matrix training testing data. Extracted estimated_Sigma generated using observations. Required type == 'blup'. ... Additional optional arguments","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"numeric vector predicted values","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"Define beta-hat coefficients estimated value lambda minimizes cross-validation error (CVE). options type follows: 'lp' (default): uses linear predictor (.e., product test data estimated coefficients) predict test values outcome. Note approach incorporate correlation structure data. 'blup' (acronym Best Linear Unbiased Predictor): adds 'lp' value represents estimated random effect. addition way incorporating estimated correlation structure data prediction outcome. Note: main difference function predict.plmm() method CV, predictions made standardized scale (.e., trainX testX data come std_X). 
predict.plmm() method makes predictions scale X (original scale)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/pretty_time.html","id":null,"dir":"Reference","previous_headings":"","what":"a function to format the time — pretty_time","title":"a function to format the time — pretty_time","text":"function format time","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/pretty_time.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a function to format the time — pretty_time","text":"","code":"pretty_time()"},{"path":"https://pbreheny.github.io/plmmr/reference/pretty_time.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a function to format the time — pretty_time","text":"string formatted current date time","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"Print method summary.cv_plmm objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"","code":"# S3 method for class 'summary.cv_plmm' print(x, digits, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"x object class summary.cv_plmm digits number digits use formatting output ... 
used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"Nothing returned; instead, message printed console summarizing results cross-validated model fit.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cv_fit <- cv_plmm(design = admix_design) print(summary(cv_fit)) #> lasso-penalized model with n=197 and p=101 #> At minimum cross-validation error (lambda=0.2168): #> ------------------------------------------------- #> Nonzero coefficients: 10 #> Cross-validation error (deviance): 1.96 #> Scale estimate (sigma): 1.399"},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to print the summary of a plmm model — print.summary.plmm","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"function print summary plmm model","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"","code":"# S3 method for class 'summary.plmm' print(x, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"x summary.plmm object ... 
used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"Nothing returned; instead, message printed console summarizing results model fit.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"","code":"lam <- rev(seq(0.01, 1, length.out=20)) |> round(2) # for sake of example admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design, lambda = lam) fit2 <- plmm(design = admix_design, penalty = \"SCAD\", lambda = lam) print(summary(fit, idx = 18)) #> lasso-penalized regression model with n=197, p=101 at lambda=0.1100 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 27 #> ------------------------------------------------- print(summary(fit2, idx = 18)) #> SCAD-penalized regression model with n=197, p=101 at lambda=0.1100 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 29 #> -------------------------------------------------"},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to read in large data files as an FBM — process_delim","title":"A function to read in large data files as an FBM — process_delim","text":"function read large data files FBM","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to read in large data 
files as an FBM — process_delim","text":"","code":"process_delim( data_dir, data_file, feature_id, rds_dir = data_dir, rds_prefix, logfile = NULL, overwrite = FALSE, quiet = FALSE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to read in large data files as an FBM — process_delim","text":"data_dir directory file. data_file file read , without filepath. file numeric values. Example: use data_file = \"myfile.txt\", data_file = \"~/mydirectory/myfile.txt\" Note: file headers/column names, set 'header = TRUE' – passed bigmemory::read.big.matrix(). feature_id string specifying column data X (feature data) row IDs (e.g., identifiers row/sample/participant/, etc.). duplicates allowed. rds_dir directory user wants create '.rds' '.bk' files Defaults data_dir rds_prefix String specifying user's preferred filename --created .rds file (create inside rds_dir folder) Note: 'rds_prefix' 'data_prefix' logfile Optional: name (character string) prefix logfile written. Defaults 'process_delim', .e. get 'process_delim.log' outfile. overwrite Logical: existing .bk/.rds files exist specified directory/prefix, overwritten? Defaults FALSE. Note: multiple .rds files names start \"std_prefix_...\", error . protect users accidentally deleting files saved results, one .rds file can removed option. quiet Logical: messages printed console silenced? Defaults FALSE. ... Optional: arguments passed bigmemory::read.big.matrix(). 
Note: 'sep' option pass , 'header'.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to read in large data files as an FBM — process_delim","text":"file path newly created '.rds' file","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A function to read in large data files as an FBM — process_delim","text":"","code":"temp_dir <- tempdir() colon_dat <- process_delim(data_file = \"colon2.txt\", data_dir = find_example_data(parent = TRUE), overwrite = TRUE, rds_dir = temp_dir, rds_prefix = \"processed_colon2\", sep = \"\\t\", header = TRUE) #> #> Overwriting existing files:processed_colon2.bk/.rds/.desc #> There are 62 observations and 2001 features in the specified data files. #> At this time, plmmr::process_delim() does not not handle missing values in delimited data. #> Please make sure you have addressed missingness before you proceed. #> #> process_plink() completed #> Processed files now saved as /tmp/Rtmpm57oVf/processed_colon2.rds colon2 <- readRDS(colon_dat) str(colon2) #> List of 3 #> $ X:Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"processed_colon2.bk\" #> .. .. ..$ dirname : chr \"/tmp/Rtmpm57oVf/\" #> .. .. ..$ totalRows : int 62 #> .. .. ..$ totalCols : int 2001 #> .. .. ..$ rowOffset : num [1:2] 0 62 #> .. .. ..$ colOffset : num [1:2] 0 2001 #> .. .. ..$ nrow : num 62 #> .. .. ..$ ncol : num 2001 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> .. .. ..$ type : chr \"double\" #> .. .. 
..$ separated : logi FALSE #> $ n: num 62 #> $ p: num 2001 #> - attr(*, \"class\")= chr \"processed_delim\""},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":null,"dir":"Reference","previous_headings":"","what":"Preprocess PLINK files using the bigsnpr package — process_plink","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"Preprocess PLINK files using bigsnpr package","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"","code":"process_plink( data_dir, data_prefix, rds_dir = data_dir, rds_prefix, logfile = NULL, impute = TRUE, impute_method = \"mode\", id_var = \"IID\", parallel = TRUE, quiet = FALSE, overwrite = FALSE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"data_dir path bed/bim/fam data files, without trailing \"/\" (e.g., use data_dir = '~/my_dir', data_dir = '~/my_dir/') data_prefix prefix (character string) bed/fam data files (e.g., data_prefix = 'mydata') rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir rds_prefix String specifying user's preferred filename --created .rds file (create inside rds_dir folder) Note: 'rds_prefix' 'data_prefix' logfile Optional: name (character string) prefix logfile written 'rds_dir'. Default NULL (log file written). Note: supply file path argument, error \"file found\" error. supply string; e.g., want my_log.log, supply 'my_log', my_log.log file appear rds_dir. impute Logical: data imputed? Default TRUE. impute_method 'impute' = TRUE, argument specify kind imputation desired. Options : * mode (default): Imputes frequent call. 
See bigsnpr::snp_fastImputeSimple() details. * random: Imputes sampling according allele frequencies. * mean0: Imputes rounded mean. * mean2: Imputes mean rounded 2 decimal places. * xgboost: Imputes using algorithm based local XGBoost models. See bigsnpr::snp_fastImpute() details. Note: can take several minutes, even relatively small data set. id_var String specifying column PLINK .fam file unique sample identifiers. Options \"IID\" (default) \"FID\" parallel Logical: computations within function run parallel? Defaults TRUE. See count_cores() ?bigparallelr::assert_cores details. particular, user aware much parallelization can make computations slower. quiet Logical: messages printed console silenced? Defaults FALSE overwrite Logical: existing .bk/.rds files exist specified directory/prefix, overwritten? Defaults FALSE. Set TRUE want change imputation method using, etc. Note: multiple .rds files names start \"std_prefix_...\", error . protect users accidentally deleting files saved results, one .rds file can removed option. ... Optional: additional arguments bigsnpr::snp_fastImpute() (relevant impute_method = \"xgboost\")","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"filepath '.rds' object created; see details explanation.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"Three files created location specified rds_dir: 'rds_prefix.rds': list three items: (1) X: filebacked bigmemory::big.matrix object pointing imputed genotype data. 
matrix type 'double', important downstream operations create_design() (2) map: data.frame PLINK 'bim' data (.e., variant information) (3) fam: data.frame PLINK 'fam' data (.e., pedigree information) 'prefix.bk': backingfile stores numeric data genotype matrix 'rds_prefix.desc': description file, needed Note process_plink() need run given set PLINK files; subsequent data analysis/scripts, get_data() access '.rds' file. example, see vignette processing PLINK files","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_data_files.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","title":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","text":"function read large file numeric file-backed matrix (FBM) Note: function wrapper bigstatsr::big_read()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_data_files.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","text":"","code":"read_data_files( data_file, data_dir, rds_dir, rds_prefix, outfile, overwrite, quiet, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/read_data_files.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","text":"data_file name file read, including directory. Directory specified data_dir data_dir path directory 'file' rds_dir path directory want create new '.rds' '.bk' files. 
Defaults data_dir rds_prefix String specifying user's preferred filename --created .rds/.bk files (create inside rds_dir folder) Note: 'rds_prefix' 'data_file' outfile Optional: name (character string) prefix logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. overwrite Logical: existing .bk/.rds files exist specified directory/prefix, overwritten? Defaults FALSE. Set TRUE want change imputation method using, etc. quiet Logical: messages printed console? Defaults TRUE ... Optional: arguments passed bigmemory::read.big.matrix(). Note: 'sep' option pass .","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_data_files.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","text":"'.rds', '.bk', '.desc' files created data_dir, obj (filebacked bigmemory big.matrix object) returned. 
See bigmemory documentation info big.matrix class.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_plink_files.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to read in PLINK files using bigsnpr methods — read_plink_files","title":"A function to read in PLINK files using bigsnpr methods — read_plink_files","text":"function read PLINK files using bigsnpr methods","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_plink_files.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to read in PLINK files using bigsnpr methods — read_plink_files","text":"","code":"read_plink_files( data_dir, data_prefix, rds_dir, outfile, parallel, overwrite, quiet )"},{"path":"https://pbreheny.github.io/plmmr/reference/read_plink_files.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to read in PLINK files using bigsnpr methods — read_plink_files","text":"data_dir path bed/bim/fam data files, without trailing \"/\" (e.g., use data_dir = '~/my_dir', data_dir = '~/my_dir/') data_prefix prefix (character string) bed/fam data files (e.g., prefix = 'mydata') rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir outfile Optional: name (character string) prefix logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. parallel Logical: computations within function run parallel? Defaults TRUE. See count_cores() ?bigparallelr::assert_cores details. particular, user aware much parallelization can make computations slower. overwrite Logical: existing .bk/.rds files exist specified directory/prefix, overwritten? Defaults FALSE. Set TRUE want change imputation method using, etc. quiet Logical: messages printed console? 
Defaults TRUE","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_plink_files.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to read in PLINK files using bigsnpr methods — read_plink_files","text":"'.rds' '.bk' files created data_dir, obj (bigSNP object) returned. See bigsnpr documentation info bigSNP class.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculate a relatedness matrix — relatedness_mat","title":"Calculate a relatedness matrix — relatedness_mat","text":"Given matrix genotypes, function estimates genetic relatedness matrix (GRM, also known RRM, see Hayes et al. 2009, doi:10.1017/S0016672308009981 ) among subjects: XX'/p, X standardized.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculate a relatedness matrix — relatedness_mat","text":"","code":"relatedness_mat(X, std = TRUE, fbm = FALSE, ns = NULL, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculate a relatedness matrix — relatedness_mat","text":"X n x p numeric matrix genotypes (fully-imputed data). Note: matrix should not include non-genetic features. std Logical: X standardized? set FALSE (can done data stored memory), good reason not to, standardization best practice. fbm Logical: X stored FBM? Defaults FALSE ns Optional vector values indicating indices nonsingular features ... 
optional arguments bigstatsr::big_apply() (like ncores = ...)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculate a relatedness matrix — relatedness_mat","text":"n x n numeric matrix capturing genomic relatedness samples represented X. notation, call matrix K 'kinship'; also known GRM RRM.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculate a relatedness matrix — relatedness_mat","text":"","code":"RRM <- relatedness_mat(X = admix$X) RRM[1:5, 1:5] #> [,1] [,2] [,3] [,4] [,5] #> [1,] 0.81268908 -0.09098097 -0.07888910 0.06770613 0.08311777 #> [2,] -0.09098097 0.81764801 0.20480021 0.02112812 -0.02640295 #> [3,] -0.07888910 0.20480021 0.82177986 -0.02864226 0.18693970 #> [4,] 0.06770613 0.02112812 -0.02864226 0.89327266 -0.03541470 #> [5,] 0.08311777 -0.02640295 0.18693970 -0.03541470 0.79589686"},{"path":"https://pbreheny.github.io/plmmr/reference/rotate_filebacked.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to rotate filebacked data — rotate_filebacked","title":"A function to rotate filebacked data — rotate_filebacked","text":"function rotate filebacked data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/rotate_filebacked.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to rotate filebacked data — rotate_filebacked","text":"","code":"rotate_filebacked(prep, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/rotate_filebacked.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to rotate filebacked data — rotate_filebacked","text":"list 4 items: stdrot_X: X rotated re-standardized scale rot_y: y rotated scale (numeric vector) stdrot_X_center: numeric vector values used 
center rot_X stdrot_X_scale: numeric vector values used scale rot_X","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/setup_lambda.html","id":null,"dir":"Reference","previous_headings":"","what":"Compute sequence of lambda values — setup_lambda","title":"Compute sequence of lambda values — setup_lambda","text":"function allows compute sequence lambda values plmm models.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/setup_lambda.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Compute sequence of lambda values — setup_lambda","text":"","code":"setup_lambda( X, y, alpha, lambda_min, nlambda, penalty_factor, intercept = TRUE )"},{"path":"https://pbreheny.github.io/plmmr/reference/setup_lambda.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Compute sequence of lambda values — setup_lambda","text":"X Rotated standardized design matrix includes intercept column present. May include clinical covariates non-SNP data. can either 'matrix' 'FBM' object. y Continuous outcome vector. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 not supported; alpha may arbitrarily small, not exactly 0. lambda_min smallest value lambda, fraction lambda.max. Default .001 number observations larger number covariates .05 otherwise. value lambda_min = 0 not supported. nlambda desired number lambda values sequence generated. penalty_factor multiplicative factor penalty applied coefficient. supplied, penalty_factor must numeric vector length equal number columns X. purpose penalty_factor apply differential penalization coefficients thought likely others model. particular, penalty_factor can 0, case coefficient always model without shrinkage. intercept Logical: X contain intercept column? 
Defaults TRUE.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/setup_lambda.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Compute sequence of lambda values — setup_lambda","text":"numeric vector lambda values, equally spaced log scale","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_filebacked.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to standardize a filebacked matrix — standardize_filebacked","title":"A helper function to standardize a filebacked matrix — standardize_filebacked","text":"helper function standardize filebacked matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_filebacked.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to standardize a filebacked matrix — standardize_filebacked","text":"","code":"standardize_filebacked( X, new_file, rds_dir, non_gen, complete_outcome, id_var, outfile, quiet, overwrite )"},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_filebacked.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to standardize a filebacked matrix — standardize_filebacked","text":"X list includes: (1) subset_X: big.matrix object subset &/additional predictors appended columns (2) ns: numeric vector indicating indices nonsingular columns subset_X new_file new_file (character string) bed/fam data files (e.g., new_file = 'mydata') rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir outfile Optional: name (character string) new_file logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. quiet Logical: messages printed console? Defaults FALSE (leaves print messages ...) 
overwrite Logical: existing .bk/.rds files exist specified directory/new_file, overwritten?","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_filebacked.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to standardize a filebacked matrix — standardize_filebacked","text":"list new component obj called 'std_X' - FBM column-standardized data. List also includes several indices/meta-data standardized matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to standardize matrices — standardize_in_memory","title":"A helper function to standardize matrices — standardize_in_memory","text":"helper function standardize matrices","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to standardize matrices — standardize_in_memory","text":"","code":"standardize_in_memory(X)"},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to standardize matrices — standardize_in_memory","text":"X matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to standardize matrices — standardize_in_memory","text":"list standardized matrix, vectors centering/scaling values, vector indices nonsingular columns","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"A helper function to standardize matrices — standardize_in_memory","text":"function adapted 
https://github.com/pbreheny/ncvreg/blob/master/R/std.R NOTE: function returns matrix memory. standardizing filebacked data, use big_std() – see src/big_standardize.cpp","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/subset_filebacked.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to subset big.matrix objects — subset_filebacked","title":"A helper function to subset big.matrix objects — subset_filebacked","text":"helper function subset big.matrix objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/subset_filebacked.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to subset big.matrix objects — subset_filebacked","text":"","code":"subset_filebacked(X, new_file, complete_samples, ns, rds_dir, outfile, quiet)"},{"path":"https://pbreheny.github.io/plmmr/reference/subset_filebacked.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to subset big.matrix objects — subset_filebacked","text":"X filebacked big.matrix --standardized design matrix new_file Optional user-specified new_file --created .rds/.bk files. complete_samples Numeric vector indices marking rows original data non-missing entry 6th column .fam file ns Numeric vector indices non-singular columns vector created handle_missingness() rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir outfile Optional: name (character string) new_file logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. quiet Logical: messages printed console? Defaults FALSE (leaves print messages ...)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/subset_filebacked.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to subset big.matrix objects — subset_filebacked","text":"list two components. 
First, big.matrix object, 'subset_X', representing design matrix wherein: rows subset according user's specification handle_missing_phen columns subset no constant features remain – important standardization downstream list also includes integer vector 'ns' marks columns original matrix 'non-singular' (.e. not constant features). 'ns' index plays important role plmm_format() untransform() (helper functions model fitting)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"A summary function for cv_plmm objects — summary.cv_plmm","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"summary function cv_plmm objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"","code":"# S3 method for class 'cv_plmm' summary(object, lambda = \"min\", ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"object cv_plmm object lambda regularization parameter value inference reported. Can choose numeric value, 'min', '1se'. Defaults 'min.' ... used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"return value object S3 class summary.cv_plmm. 
class print method contains following list elements: lambda_min: lambda value minimum cross validation error lambda.1se: maximum lambda value within 1 standard error minimum cross validation error penalty: penalty applied fitted model nvars: number non-zero coefficients selected lambda value cve: cross validation error folds min: minimum cross validation error fit: plmm fit used cross validation returnBiasDetails = TRUE, two items returned: bias: mean bias cross validation loss: loss value lambda","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cv_fit <- cv_plmm(design = admix_design) summary(cv_fit) #> lasso-penalized model with n=197 and p=101 #> At minimum cross-validation error (lambda=0.2168): #> ------------------------------------------------- #> Nonzero coefficients: 10 #> Cross-validation error (deviance): 2.12 #> Scale estimate (sigma): 1.455"},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"A summary method for the plmm objects — summary.plmm","title":"A summary method for the plmm objects — summary.plmm","text":"summary method plmm objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A summary method for the plmm objects — summary.plmm","text":"","code":"# S3 method for class 'plmm' summary(object, lambda, idx, eps = 1e-05, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A summary method for the plmm objects — summary.plmm","text":"object object class plmm lambda regularization parameter 
value inference reported. idx Alternatively, lambda may specified index; idx=10 means: report inference 10th value lambda along regularization path. lambda idx specified, lambda takes precedence. eps lambda given, eps tolerance difference given lambda value lambda value object. Defaults 0.00001 (1e-5) ... used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A summary method for the plmm objects — summary.plmm","text":"return value object S3 class summary.plmm. class print method contains following list elements: penalty: penalty used plmm (e.g. SCAD, MCP, lasso) n: Number instances/observations std_X_n: number observations standardized data; time differ 'n' data PLINK external data include samples p: Number regression coefficients (including intercept) converged: Logical indicator whether model converged lambda: lambda value inference reported lambda_char: formatted character string indicating lambda value nvars: number nonzero coefficients (not including intercept) value lambda nonzero: column names indicating nonzero coefficients model specified value lambda","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A summary method for the plmm objects — summary.plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design) summary(fit, idx = 97) #> lasso-penalized regression model with n=197, p=101 at lambda=0.00054 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 98 #> -------------------------------------------------"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform.html","id":null,"dir":"Reference","previous_headings":"","what":"Untransform coefficient values back 
to the original scale — untransform","title":"Untransform coefficient values back to the original scale — untransform","text":"function unwinds initial standardization data obtain coefficient values original scale. called plmm_format().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Untransform coefficient values back to the original scale — untransform","text":"","code":"untransform( std_scale_beta, p, std_X_details, fbm_flag, plink_flag, use_names = TRUE )"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Untransform coefficient values back to the original scale — untransform","text":"std_scale_beta estimated coefficients standardized scale p number columns original design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' fbm_flag Logical: corresponding design matrix filebacked? plink_flag Logical: data come PLINK files? Note: flag matters non-genomic features handled PLINK files – data PLINK files, unpenalized columns not counted p argument. delimited files, p include unpenalized columns. difference implications untransform() function determines appropriate dimensions estimated coefficient matrix returns. use_names Logical: names added? Defaults TRUE. 
Set FALSE inside cvf() helper, 'ns' vary within CV folds.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Untransform coefficient values back to the original scale — untransform","text":"matrix estimated coefficients, 'beta_vals', scale original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_delim.html","id":null,"dir":"Reference","previous_headings":"","what":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","text":"function unwinds initial standardization data obtain coefficient values original scale. called plmm_format().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_delim.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","text":"","code":"untransform_delim( std_scale_beta, p, std_X_details, plink_flag, use_names = TRUE )"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_delim.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","text":"std_scale_beta estimated coefficients standardized scale p number columns original design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' use_names Logical: names added? Defaults TRUE. 
Set FALSE inside cvf() helper, 'ns' vary within CV folds.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_delim.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","text":"matrix estimated coefficients, 'beta_vals', scale original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_in_memory.html","id":null,"dir":"Reference","previous_headings":"","what":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","title":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","text":"function unwinds initial standardization data obtain coefficient values original scale. called plmm_format().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_in_memory.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","text":"","code":"untransform_in_memory(std_scale_beta, p, std_X_details, use_names = TRUE)"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_in_memory.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","text":"std_scale_beta estimated coefficients standardized scale p number columns original design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' use_names Logical: names added? Defaults TRUE. 
Set FALSE inside cvf() helper, 'ns' vary within CV folds.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_in_memory.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","text":"matrix estimated coefficients, 'beta_vals', scale original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_plink.html","id":null,"dir":"Reference","previous_headings":"","what":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","text":"function unwinds initial standardization data obtain coefficient values original scale. called plmm_format().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_plink.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","text":"","code":"untransform_plink( std_scale_beta, p, std_X_details, plink_flag, use_names = TRUE )"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_plink.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","text":"std_scale_beta estimated coefficients standardized scale p number columns original design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' use_names Logical: names added? Defaults TRUE. 
Set FALSE inside cvf() helper, 'ns' vary within CV folds.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_plink.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","text":"matrix estimated coefficients, 'beta_vals', scale original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":null,"dir":"Reference","previous_headings":"","what":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"Linux/Unix MacOS , companion function unzip .gz files ship plmmr package","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"","code":"unzip_example_data(outdir)"},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"outdir file path directory .gz files written","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"Nothing returned; PLINK files ship plmmr package stored directory 
specified 'outdir'","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"example function, look vignette('plink_files', package = \"plmmr\"). Note: function will not work Windows systems - Linux/Unix MacOS.","code":""},{"path":[]},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"bug-fixes-4-2-0","dir":"Changelog","previous_headings":"","what":"Bug fixes","title":"plmmr 4.2.0 (2024-12-13)","text":"recently caught couple bugs model fitting functions – apologize errors may caused downstream analysis, explain addressed issues : Bug BLUP: caught mathematical error earlier implementation best linear unbiased prediction. issue inconsistency scaling among terms used constructing predictor. issue impacted prediction within cross-validation well predict() method plmm class. recommend users used best linear unbiased prediction (BLUP) previous analysis re-run analysis using corrected version. Bug processing delimited files: noticed bug way models fit data delimited files. previous version not correctly implementing transformation model results standardized scale original scale due inadvertent addition two rows beta_vals object (one row added, intercept). error corrected. recommend users used previous version plmmr analyze data delimited files re-run analyses.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"other-changes-4-2-0","dir":"Changelog","previous_headings":"","what":"Other changes","title":"plmmr 4.2.0 (2024-12-13)","text":"Change default settings prediction: default prediction method predict() cv_plmm() now ‘blup’ (best linear unbiased prediction). 
Change objects returned default plmm(): default, main model fitting function plmm() now returns std_X (copy standardized design matrix) , y (outcome vector used fit model), std_scale_beta (estimated coefficients standardized scale). components used construct best linear unbiased predictor. user can opt not return items using return_fit = FALSE compact_save options. Change arguments passed predict(): tandem change returned plmm() default, predict() method no longer needs separate X y argument supplied type = 'blup'. components needed BLUP returned default plmm. Note predict() still early stages development filebacked data; given complexities particularities filebacked data processed (particularly data constant features), edge cases predict() method cannot handle yet. continue work developing method; now, example predict() filebacked data vignette delimited data. Note particular example delimited data, no constant features design matrix.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-410-2024-10-23","dir":"Changelog","previous_headings":"","what":"plmmr 4.1.0 (2024-10-23)","title":"plmmr 4.1.0 (2024-10-23)","text":"CRAN release: 2024-10-23 Restore plmm(X,y) syntax: version 4.0.0 required create_design() always called prior plmm() cv_plmm(); update restores X,y syntax consistent packages (e.g., glmnet, ncvreg). Note syntax available case design matrix stored -memory matrix data.frame object. create_design() function still required cases design matrix/dataset stored external file. Bug fix: 4.0.0 version create_design() required X column names, errored uninformative message no names supplied (see issue 61). now fixed – column names not required unless user wants specify argument unpen. Argument name change: create_design(), argument specify outcome -memory case renamed y; makes syntax consistent, e.g., create_design(X, y). Note change relevant -memory data . 
Internal: Fixed LTO type mismatch bug.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-400-2024-10-07","dir":"Changelog","previous_headings":"","what":"plmmr 4.0.0 (2024-10-07)","title":"plmmr 4.0.0 (2024-10-07)","text":"CRAN release: 2024-10-11 Major re-structuring preprocessing pipeline: Data external files must now processed process_plink() process_delim(). data (including -memory data) must prepared analysis via create_design(). change ensures data funneled uniform format analysis. Documentation updated: vignettes package now revised include examples complete pipeline new create_design() syntax. article type data input (matrix/data.frame, delimited file, PLINK). CRAN: package CRAN now.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-320-2024-09-02","dir":"Changelog","previous_headings":"","what":"plmmr 3.2.0 (2024-09-02)","title":"plmmr 3.2.0 (2024-09-02)","text":"bigsnpr now Suggests, not Imports: essential filebacking support now done bigmemory bigalgebra. bigsnpr package used processing PLINK files. dev branch gwas_scale version pipeline runs completely file-backed.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-310-2024-07-13","dir":"Changelog","previous_headings":"","what":"plmmr 3.1.0 (2024-07-13)","title":"plmmr 3.1.0 (2024-07-13)","text":"Enhancement: make plmmr better functionality writing scripts, functions process_plink(), plmm(), cv_plmm() now (optionally) write ‘.log’ files, PLINK. Enhancement: cases users working large datasets, may not practical desirable results returned plmm() cv_plmm() saved single ‘.rds’ file. now option model fitting functions called ‘compact_save’, gives users option save output multiple, smaller ‘.rds’ files. 
Argument removed: Argument std_needed no longer available plmm() cv_plmm() functions.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-300-2024-06-27","dir":"Changelog","previous_headings":"","what":"plmmr 3.0.0 (2024-06-27)","title":"plmmr 3.0.0 (2024-06-27)","text":"Bug fix: Cross-validation implementation issues fixed. Previously, full set eigenvalues used inside CV folds, not ideal involves information outside fold. Now, entire modeling process cross-validated: standardization, eigendecomposition relatedness matrix, model fitting, backtransformation onto original scale prediction. Computational speedup: standardization rotation filebacked data now much faster; bigalgebra bigmemory now used computations. Internal: standardized scale, intercept PLMM mean outcome. derivation considerably simplifies handling intercept internally model fitting.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-221-2024-03-16","dir":"Changelog","previous_headings":"","what":"plmmr 2.2.1 (2024-03-16)","title":"plmmr 2.2.1 (2024-03-16)","text":"Name change: Changed package name plmmr; note plmm(), cv_plmm(), functions starting plmm_ changed names.","code":""}]
+[{"path":"https://pbreheny.github.io/plmmr/articles/delim_files.html","id":"process-the-data","dir":"Articles","previous_headings":"","what":"Process the data","title":"If your data is in a delimited file","text":"output messages indicate data processed. call created 2 files, one .rds file corresponding .bk file. .bk file special type binary file can used store large data sets. .rds file contains pointer .bk file, along meta-data. 
Note returned process_delim() character string filepath: .","code":"# I will create the processed data files in a temporary directory; # fill in the `rds_dir` argument with the directory of your choice temp_dir <- tempdir() colon_dat <- process_delim(data_file = \"colon2.txt\", data_dir = find_example_data(parent = TRUE), rds_dir = temp_dir, rds_prefix = \"processed_colon2\", sep = \"\\t\", overwrite = TRUE, header = TRUE) #> There are 62 observations and 2001 features in the specified data files. #> At this time, plmmr::process_delim() does not not handle missing values in delimited data. #> Please make sure you have addressed missingness before you proceed. #> #> process_plink() completed #> Processed files now saved as /tmp/RtmpZ5yqi0/processed_colon2.rds # look at what is created colon <- readRDS(colon_dat)"},{"path":"https://pbreheny.github.io/plmmr/articles/delim_files.html","id":"create-a-design","dir":"Articles","previous_headings":"","what":"Create a design","title":"If your data is in a delimited file","text":"Creating design ensures data uniform format prior analysis. delimited files, two main processes happening create_design(): (1) standardization columns (2) construction penalty factor vector. Standardization columns ensures features evaluated model uniform scale; done transforming column design matrix mean 0 variance 1. penalty factor vector indicator vector 0 represents feature always model – feature unpenalized. specify columns want unpenalized, use ‘unpen’ argument. example, choosing make ‘sex’ unpenalized covariate. side note unpenalized covariates: delimited file data, features want include model – penalized unpenalized features – must included delimited file. differs PLINK file data analyzed; look create_design() documentation details examples. process_delim(), create_design() function returns filepath: . output messages document steps create design procedure, messages saved text file colon_design.log rds_dir folder. 
didactic purposes, can look design:","code":"# prepare outcome data colon_outcome <- read.delim(find_example_data(path = \"colon2_outcome.txt\")) # create a design colon_design <- create_design(data_file = colon_dat, rds_dir = temp_dir, new_file = \"std_colon2\", add_outcome = colon_outcome, outcome_id = \"ID\", outcome_col = \"y\", unpen = \"sex\", # this will keep 'sex' in the final model logfile = \"colon_design\") #> No feature_id supplied; will assume data X are in same row-order as add_outcome. #> There are 0 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... #> Standardization completed at 2024-12-17 19:29:26 #> Done with standardization. File formatting in progress # look at the results colon_rds <- readRDS(colon_design) str(colon_rds) #> List of 18 #> $ X_colnames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> $ X_rownames : chr [1:62] \"row1\" \"row2\" \"row3\" \"row4\" ... #> $ n : num 62 #> $ p : num 2001 #> $ is_plink : logi FALSE #> $ outcome_idx : int [1:62] 1 2 3 4 5 6 7 8 9 10 ... #> $ y : int [1:62] 1 0 1 0 1 0 1 0 1 0 ... #> $ std_X_rownames: chr [1:62] \"row1\" \"row2\" \"row3\" \"row4\" ... #> $ unpen : int 1 #> $ unpen_colnames: chr \"sex\" #> $ ns : int [1:2001] 1 2 3 4 5 6 7 8 9 10 ... #> $ std_X_colnames: chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> $ std_X :Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"std_colon2.bk\" #> .. .. ..$ dirname : chr \"/tmp/RtmpZ5yqi0/\" #> .. .. ..$ totalRows : int 62 #> .. .. ..$ totalCols : int 2001 #> .. .. ..$ rowOffset : num [1:2] 0 62 #> .. .. ..$ colOffset : num [1:2] 0 2001 #> .. .. ..$ nrow : num 62 #> .. .. ..$ ncol : num 2001 #> .. .. ..$ rowNames : NULL #> .. .. 
..$ colNames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> .. .. ..$ type : chr \"double\" #> .. .. ..$ separated : logi FALSE #> $ std_X_n : num 62 #> $ std_X_p : num 2001 #> $ std_X_center : num [1:2001] 1.47 7015.79 4966.96 4094.73 3987.79 ... #> $ std_X_scale : num [1:2001] 0.499 3067.926 2171.166 1803.359 2002.738 ... #> $ penalty_factor: num [1:2001] 0 1 1 1 1 1 1 1 1 1 ... #> - attr(*, \"class\")= chr \"plmm_design\""},{"path":"https://pbreheny.github.io/plmmr/articles/delim_files.html","id":"fit-a-model","dir":"Articles","previous_headings":"","what":"Fit a model","title":"If your data is in a delimited file","text":"fit model using design follows: Notice messages printed – documentation may optionally saved another .log file using logfile argument. can examine results specific \\lambda value: may also plot paths estimated coefficients:","code":"colon_fit <- plmm(design = colon_design, return_fit = TRUE, trace = TRUE) #> Note: The design matrix is being returned as a file-backed big.matrix object -- see bigmemory::big.matrix() documentation for details. #> Reminder: the X that is returned here is column-standardized #> Input data passed all checks at 2024-12-17 19:29:26 #> Starting decomposition. #> Calculating the eigendecomposition of K #> Eigendecomposition finished at 2024-12-17 19:29:26 #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-17 19:29:26 #> Setting up lambda/preparing for model fitting. #> Beginning model fitting. #> Model fitting finished at 2024-12-17 19:29:27 #> Beta values are estimated -- almost done! #> Formatting results (backtransforming coefs. to original scale). 
#> Model ready at 2024-12-17 19:29:27 summary(colon_fit, idx = 50) #> lasso-penalized regression model with n=62, p=2002 at lambda=0.0597 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 30 #> ------------------------------------------------- plot(colon_fit)"},{"path":"https://pbreheny.github.io/plmmr/articles/delim_files.html","id":"prediction-for-filebacked-data","dir":"Articles","previous_headings":"","what":"Prediction for filebacked data","title":"If your data is in a delimited file","text":"example shows experimental option, wherein working add prediction method filebacked outside cross-validation.","code":"# linear predictor yhat_lp <- predict(object = colon_fit, newX = attach.big.matrix(colon$X), type = \"lp\") # best linear unbiased predictor yhat_blup <- predict(object = colon_fit, newX = attach.big.matrix(colon$X), type = \"blup\") # look at mean squared prediction error mspe_lp <- apply(yhat_lp, 2, function(c){crossprod(colon_outcome$y - c)/length(c)}) mspe_blup <- apply(yhat_blup, 2, function(c){crossprod(colon_outcome$y - c)/length(c)}) min(mspe_lp) #> [1] 0.007659158 min(mspe_blup) #> [1] 0.00617254"},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Getting started with plmmr","text":"plmmr package fitting Penalized Linear Mixed Models R. package created purpose fitting penalized regression models high dimensional data observations correlated. instance, kind data arises often context genetics (e.g., GWAS population structure /family grouping). novelties plmmr : Integration: plmmr combines functionality several packages order quality control, model fitting/analysis, data visualization one package. example, GWAS data, plmmr take PLINK files way list SNPs downstream analysis. 
Accessibility: plmmr can run R session typical desktop laptop computer. user need access supercomputer experience command line order fit models plmmr. Handling correlation: plmmr uses transformation (1) measures correlation among samples (2) uses correlation measurement improve predictions (via best linear unbiased predictor, BLUP). means plmm(), ’s need filter data ‘maximum subset unrelated samples.’","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"minimal-example","dir":"Articles","previous_headings":"","what":"Minimal example","title":"Getting started with plmmr","text":"minimal reproducible example plmmr can used:","code":"# library(plmmr) fit <- plmm(admix$X, admix$y) # admix data ships with package plot(fit) cvfit <- cv_plmm(admix$X, admix$y) plot(cvfit) summary(cvfit) #> lasso-penalized model with n=197 and p=101 #> At minimum cross-validation error (lambda=0.2325): #> ------------------------------------------------- #> Nonzero coefficients: 8 #> Cross-validation error (deviance): 2.12 #> Scale estimate (sigma): 1.455"},{"path":[]},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"file-backing","dir":"Articles","previous_headings":"Computational capability","what":"File-backing","title":"Getting started with plmmr","text":"many applications high dimensional data analysis, dataset large read R – session crash lack memory. particularly common analyzing data genome-wide association studies (GWAS). analyze large datasets, plmmr equipped analyze data using filebacking - strategy lets R ‘point’ file disk, rather reading file R session. Many packages use technique - bigstatsr biglasso two examples packages use filebacking technique. package plmmr uses create store filebacked objects bigmemory. filebacked computation relies biglasso package Yaohui Zeng et al. bigalgebra Michael Kane et al. 
processing PLINK files, use methods bigsnpr package Florian Privé.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"numeric-outcomes-only","dir":"Articles","previous_headings":"Computational capability","what":"Numeric outcomes only","title":"Getting started with plmmr","text":"time, package designed linear regression – , considering continuous (numeric) outcomes. maintain treating binary outcomes numeric values appropriate contexts, described Hastie et al. Elements Statistical Learning, chapter 4. future, like extend package handle dichotomous outcomes via logistic regression; theoretical work underlying open problem.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"types-of-penalization","dir":"Articles","previous_headings":"Computational capability","what":"3 types of penalization","title":"Getting started with plmmr","text":"Since focused penalized regression package, plmmr offers 3 choices penalty: minimax concave (MCP), smoothly clipped absolute deviation (SCAD), least absolute shrinkage selection operator (LASSO). implementation penalties built concepts/techniques provided ncvreg package.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"data-size-and-dimensionality","dir":"Articles","previous_headings":"Computational capability","what":"Data size and dimensionality","title":"Getting started with plmmr","text":"distinguish data attributes ‘big’ ‘high dimensional.’ ‘Big’ describes amount space data takes computer, ‘high dimensional’ describes context ratio features (also called ‘variables’ ‘predictors’) observations (e.g., samples) high. instance, data 100 samples 100 variables high dimensional, big. contrast, data 10 million observations 100 variables big, high dimensional. plmmr optimized data high dimensional – methods using estimate relatedness among observations perform best high number features relative number observations. 
plmmr also designed accommodate data large analyze -memory. accommodate data file-backing (described ). current analysis pipeline works well data files 40 Gb size. practice, means plmmr equipped analyze GWAS data, biobank-sized data.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"data-input-types","dir":"Articles","previous_headings":"","what":"Data input types","title":"Getting started with plmmr","text":"plmmr currently works three types data input: Data stored -memory matrix data frame Data stored PLINK files Data stored delimited files","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"example-data-sets","dir":"Articles","previous_headings":"Data input types","what":"Example data sets","title":"Getting started with plmmr","text":"plmmr currently includes three example data sets, one type data input. admix data example matrix input data. admix small data set (197 observations, 100 SNPs) describes individuals different ancestry groups. outcome admix simulated include population structure effects (.e. race/ethnicity impact SNP associations). data set available whenever library(plmmr) called. example analysis admix data available vignette('matrix_data', package = \"plmmr\"). penncath_lite data example PLINK input data. penncath_lite (data coronary artery disease PennCath study) high dimensional data set (1401 observations, 4217 SNPs) several health outcomes well age sex information. features data set represent small subset much larger GWAS data set (original data 800K SNPs). information data set, refer original publication. example analysis penncath_lite data available vignette('plink_files', package = \"plmmr\"). colon2 data example delimited-file input data. colon2 variation colon data included biglasso package. colon2 62 observations 2,001 features representing study colon disease. 2000 features original data, ‘sex’ feature simulated. 
example analysis colon2 data available vignette('delim_files', package = \"plmmr\").","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/matrix_data.html","id":"basic-model-fitting","dir":"Articles","previous_headings":"","what":"Basic model fitting","title":"If your data is in a matrix or data frame","text":"admix dataset now ready analyze call plmmr::plmm() (one main functions plmmr): Notice: passing admix$X design argument plmm(); internally, plmm() taken X input created plmm_design object. also supply X y create_design() make step explicit. returned beta_vals item matrix whose rows \\hat\\beta coefficients whose columns represent values penalization parameter \\lambda. default, plmm fits 100 values \\lambda (see setup_lambda function details). Note values \\lambda, SNP 8 \\hat \\beta = 0. SNP 8 constant feature, feature (.e., column \\mathbf{X}) whose values vary among members population. can summarize fit nth \\lambda value: can also plot path fit see model coefficients vary \\lambda: Plot path model fit Suppose also know ancestry groups person admix data self-identified. probably want include model unpenalized covariate (.e., want ‘ancestry’ always model). specify unpenalized covariate, need use create_design() function prior calling plmm(). 
look: may compare results model includes ‘ancestry’ first model:","code":"admix_fit <- plmm(admix$X, admix$y) summary(admix_fit, lambda = admix_fit$lambda[50]) #> lasso-penalized regression model with n=197, p=101 at lambda=0.01426 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 88 #> ------------------------------------------------- admix_fit$beta_vals[1:10, 97:100] |> knitr::kable(digits = 3, format = \"html\") # for n = 25 summary(admix_fit, lambda = admix_fit$lambda[25]) #> lasso-penalized regression model with n=197, p=101 at lambda=0.08163 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 46 #> ------------------------------------------------- plot(admix_fit) # add ancestry to design matrix X_plus_ancestry <- cbind(admix$ancestry, admix$X) # adjust column names -- need these for designating 'unpen' argument colnames(X_plus_ancestry) <- c(\"ancestry\", colnames(admix$X)) # create a design admix_design2 <- create_design(X = X_plus_ancestry, y = admix$y, # below, I mark ancestry variable as unpenalized # we want ancestry to always be in the model unpen = \"ancestry\") # now fit a model admix_fit2 <- plmm(design = admix_design2) summary(admix_fit2, idx = 25) #> lasso-penalized regression model with n=197, p=102 at lambda=0.09886 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 14 #> ------------------------------------------------- plot(admix_fit2)"},{"path":"https://pbreheny.github.io/plmmr/articles/matrix_data.html","id":"cross-validation","dir":"Articles","previous_headings":"","what":"Cross validation","title":"If your data is in a matrix or data frame","text":"select \\lambda value, often use cross validation. 
example using cv_plmm select \\lambda minimizes cross-validation error: can also plot cross-validation error (CVE) versus \\lambda (log scale): Plot CVE","code":"admix_cv <- cv_plmm(design = admix_design2, return_fit = T) admix_cv_s <- summary(admix_cv, lambda = \"min\") print(admix_cv_s) #> lasso-penalized model with n=197 and p=102 #> At minimum cross-validation error (lambda=0.1853): #> ------------------------------------------------- #> Nonzero coefficients: 3 #> Cross-validation error (deviance): 1.33 #> Scale estimate (sigma): 1.154 plot(admix_cv)"},{"path":"https://pbreheny.github.io/plmmr/articles/matrix_data.html","id":"predicted-values","dir":"Articles","previous_headings":"","what":"Predicted values","title":"If your data is in a matrix or data frame","text":"example predict() methods PLMMs: can compare predictions predictions get intercept-model using mean squared prediction error (MSPE) – lower better: see model better predictions null.","code":"# make predictions for select lambda value(s) y_hat <- predict(object = admix_fit, newX = admix$X, type = \"blup\", X = admix$X, y = admix$y) # intercept-only (or 'null') model crossprod(admix$y - mean(admix$y))/length(admix$y) #> [,1] #> [1,] 5.928528 # our model at its best value of lambda apply(y_hat, 2, function(c){crossprod(admix$y - c)/length(c)}) -> mse min(mse) #> [1] 0.6930826 # ^ across all values of lambda, our model has MSPE lower than the null model"},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"math-notation","dir":"Articles","previous_headings":"","what":"Math notation","title":"Notes on notation","text":"concepts need denote, order usage derivations. 
blocked sections corresponding steps model fitting process.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"statistical-model-the-overall-framework","dir":"Articles","previous_headings":"Math notation","what":"Statistical model (the overall framework)","title":"Notes on notation","text":"overall model can written \\mathbf{y} = \\mathbf{X}\\boldsymbol{\\beta} + \\mathbf{Z}\\boldsymbol{\\gamma} + \\boldsymbol{\\epsilon} equivalently \\mathbf{y} = \\dot{\\mathbf{X}}\\dot{\\boldsymbol{\\beta}} + \\mathbf{u} + \\boldsymbol{\\epsilon} : \\mathbf{X} \\mathbf{y} n \\times p design matrix data n \\times 1 vector outcomes, respectively. , n number observations (e.g., number patients, number samples, etc.) p number features (e.g., number SNPs, number variables, number covariates, etc.). \\dot{\\mathbf{X}} column-standardized \\mathbf{X}, p columns mean 0 standard deviation 1. Note: \\dot{\\mathbf{X}} excludes singular features (columns constants) original \\mathbf{X}. \\dot{\\boldsymbol{\\beta}} represents coefficients standardized scale. \\mathbf{Z} n \\times b matrix indicators corresponding grouping structure, \\boldsymbol{\\gamma} vector values describing grouping associated \\mathbf{y}. real data, values typically unknown. \\boldsymbol{\\epsilon} n \\times 1 vector noise. 
define realized (empirical) relatedness matrix \mathbf{K} \equiv \frac{1}{p}\dot{\mathbf{X}}\dot{\mathbf{X}}^\top model assumes: \boldsymbol{\epsilon} \perp \mathbf{u} \boldsymbol{\epsilon} \sim N(0, \sigma^2_{\epsilon}\mathbf{I}) \mathbf{u} \sim N(0, \sigma^2_{s}\mathbf{K}) assumptions, may write \mathbf{y} \sim N(\dot{\mathbf{X}}\dot{\boldsymbol{\beta}}, \boldsymbol{\Sigma}) Indices: i \in 1,..., n indexes observations, j \in 1,..., p indexes features, h \in 1,..., b indexes batches (e.g., different family groups, different data collection sites, etc.)","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"decomposition-and-rotation-prep-and-first-part-of-fit","dir":"Articles","previous_headings":"Math notation","what":"Decomposition and rotation (prep and first part of fit)","title":"Notes on notation","text":"Beginning eigendecomposition, \mathbf{U} \mathbf{s} eigenvectors eigenvalues \mathbf{K}, one obtain \text{eigen}(\mathbf{K}) \equiv \mathbf{U}\mathbf{S}\mathbf{U}^\top. elements \mathbf{s} diagonal values \mathbf{S}. Note, random effect \mathbf{u} distinct columns matrix \mathbf{U}. k represents number nonzero eigenvalues represented \mathbf{U} \mathbf{s}, k \leq \text{min}(n,p). , \mathbf{K} \equiv \frac{1}{p}\dot{\mathbf{X}}\dot{\mathbf{X}}^{\top} often referred literature realized relatedness matrix (RRM) genomic relatedness matrix (GRM). \mathbf{K} dimension n \times n. \eta ratio \frac{\sigma^2_s}{\sigma^2_e + \sigma^2_s}. estimate \hat{\eta} null model (details come). \mathbf{\Sigma} variance outcome, \mathbb{V}({\mathbf{y}}) \propto \eta \mathbf{K} + (1 - \eta)\mathbf{I}_n. \mathbf{w} vector weights defined (\eta\mathbf{s} + (1-\eta))^{-1/2}. values \mathbf{w} nonzero values diagonal matrix \mathbf{W} \equiv (\eta\mathbf{S} + (1 - \eta)\mathbf{I})^{-1/2}. 
matrix used rotating (preconditioning) data \mathbf{\Sigma}^{-1/2} \equiv \mathbf{W}\mathbf{U}^\top. \tilde{\dot{\mathbf{X}}} \equiv \mathbf{W}\mathbf{U}^\top\dot{\mathbf{X}} rotated data, data transformed scale. \tilde{\mathbf{y}} \equiv \mathbf{\Sigma}^{-1/2}\mathbf{y} outcome rotated scale. \tilde{\ddot{\mathbf{X}}} standardized rotated data. Note: standardization involves scaling, centering. post-rotation standardization impacts estimated coefficients well; define {\ddot{\boldsymbol{\beta}}} estimated coefficients scale.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"model-fitting-with-penalization","dir":"Articles","previous_headings":"Math notation","what":"Model fitting with penalization","title":"Notes on notation","text":"fit \tilde{\mathbf{y}} \sim \tilde{\ddot{\mathbf{X}}} using penalized linear mixed model, obtain \hat{\ddot{\boldsymbol{\beta}}} estimated coefficients. penalty parameter values (e.g., values lasso tuning parameter) indexed \lambda_l, l \in 1,..., t.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"rescaling-results-format","dir":"Articles","previous_headings":"Math notation","what":"Rescaling results (format)","title":"Notes on notation","text":"obtain estimated coefficients original scale, values estimated model must unscaled (‘untransformed’) twice: adjust post-rotation standardization, adjust pre-rotation standardization. 
process written \\hat{\\ddot{\\boldsymbol{\\beta}}} \\rightarrow \\hat{\\dot{\\boldsymbol{\\beta}}} \\rightarrow \\hat{\\boldsymbol{\\beta}}.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"object-names-in-source-code","dir":"Articles","previous_headings":"","what":"Object names in source code","title":"Notes on notation","text":"code, denote objects way: \\mathbf{X} \\mathbf{y} X y \\dot{\\mathbf{X}} std_X \\tilde{\\dot{\\mathbf{X}}} rot_X \\ddot{\\tilde{\\mathbf{X}}} stdrot_X \\hat{\\boldsymbol{\\beta}} named og_scale_beta helper functions (clarity) returned plmm objects beta_vals. beta_vals og_scale_beta equivalent; represent estimated coefficients original scale. \\hat{\\dot{\\boldsymbol{\\beta}}} std_scale_beta \\hat{\\ddot{\\boldsymbol{\\beta}}} stdrot_scale_beta \\dot{\\mathbf{X}}\\hat{\\dot{\\boldsymbol{\\beta}}} Xb \\ddot{\\tilde{\\mathbf{X}}} \\hat{\\ddot{\\boldsymbol{\\beta}}} linear_predictors. Note: words, means linear_predictors code scale rotated re-standardized data! \\hat{\\boldsymbol{\\Sigma}} \\equiv \\hat{\\eta}\\mathbf{K} + (1 - \\hat{\\eta})\\mathbf{} estimated_Sigma. Similarly, \\hat{\\boldsymbol{\\Sigma}}_{11} Sigma_11, etc.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"processing-plink-files","dir":"Articles","previous_headings":"","what":"Processing PLINK files","title":"If your data is in PLINK files","text":"First, unzip PLINK files zipped. example data, penncath_lite data ships plmmr zipped; MacOS Linux, can run command unzip: GWAS data, tell plmmr combine information across three PLINK files (.bed, .bim, .fam files). process_plink(). , create files want temporary directory just sake example. Users can specify folder choice rds_dir, shown : ’ll see lot messages printed console … result creation 3 files: imputed_penncath_lite.rds imputed_penncath_lite.bk contain data. 1 show folder PLINK data . returned filepath. 
.rds object filepath contains processed data, now use create design. didactic purposes, let’s examine ’s imputed_penncath_lite.rds using readRDS() function (Note Don’t analysis - section reads data memory. just illustration):","code":"temp_dir <- tempdir() # using a temp dir -- change to fit your preference unzip_example_data(outdir = temp_dir) #> Unzipped files are saved in /tmp/RtmpzM7QaI # temp_dir <- tempdir() # using a temporary directory (if you didn't already create one above) plink_data <- process_plink(data_dir = temp_dir, data_prefix = \"penncath_lite\", rds_dir = temp_dir, rds_prefix = \"imputed_penncath_lite\", # imputing the mode to address missing values impute_method = \"mode\", # overwrite existing files in temp_dir # (you can turn this feature off if you need to) overwrite = TRUE, # turning off parallelization - # leaving this on causes problems knitting this vignette parallel = FALSE) #> #> Preprocessing penncath_lite data: #> Creating penncath_lite.rds #> #> There are 1401 observations and 4367 genomic features in the specified data files, representing chromosomes 1 - 22 #> There are a total of 3514 SNPs with missing values #> Of these, 13 are missing in at least 50% of the samples #> #> Imputing the missing (genotype) values using mode method #> #> process_plink() completed #> Processed files now saved as /tmp/RtmpzM7QaI/imputed_penncath_lite.rds pen <- readRDS(plink_data) # notice: this is a `processed_plink` object str(pen) # note: genotype data is *not* in memory #> List of 5 #> $ X :Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"imputed_penncath_lite.bk\" #> .. .. ..$ dirname : chr \"/tmp/RtmpzM7QaI/\" #> .. .. ..$ totalRows : int 1401 #> .. .. ..$ totalCols : int 4367 #> .. .. ..$ rowOffset : num [1:2] 0 1401 #> .. .. ..$ colOffset : num [1:2] 0 4367 #> .. .. ..$ nrow : num 1401 #> .. .. 
..$ ncol : num 4367 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : NULL #> .. .. ..$ type : chr \"double\" #> .. .. ..$ separated : logi FALSE #> $ map:'data.frame': 4367 obs. of 6 variables: #> ..$ chromosome : int [1:4367] 1 1 1 1 1 1 1 1 1 1 ... #> ..$ marker.ID : chr [1:4367] \"rs3107153\" \"rs2455124\" \"rs10915476\" \"rs4592237\" ... #> ..$ genetic.dist: int [1:4367] 0 0 0 0 0 0 0 0 0 0 ... #> ..$ physical.pos: int [1:4367] 2056735 3188505 4275291 4280630 4286036 4302161 4364564 4388885 4606471 4643688 ... #> ..$ allele1 : chr [1:4367] \"C\" \"T\" \"T\" \"G\" ... #> ..$ allele2 : chr [1:4367] \"T\" \"C\" \"C\" \"A\" ... #> $ fam:'data.frame': 1401 obs. of 6 variables: #> ..$ family.ID : int [1:1401] 10002 10004 10005 10007 10008 10009 10010 10011 10012 10013 ... #> ..$ sample.ID : int [1:1401] 1 1 1 1 1 1 1 1 1 1 ... #> ..$ paternal.ID: int [1:1401] 0 0 0 0 0 0 0 0 0 0 ... #> ..$ maternal.ID: int [1:1401] 0 0 0 0 0 0 0 0 0 0 ... #> ..$ sex : int [1:1401] 1 2 1 1 1 1 1 2 1 2 ... #> ..$ affection : int [1:1401] 1 1 2 1 2 2 2 1 2 -9 ... #> $ n : int 1401 #> $ p : int 4367 #> - attr(*, \"class\")= chr \"processed_plink\" # notice: no more missing values in X any(is.na(pen$genotypes[,])) #> [1] FALSE"},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"creating-a-design","dir":"Articles","previous_headings":"","what":"Creating a design","title":"If your data is in PLINK files","text":"Now ready create plmm_design, object pieces need model: design matrix \\mathbf{X}, outcome vector \\mathbf{y}, vector penalty factor indicators (1 = feature penalized, 0 = feature penalized). side note: GWAS studies, typical include non-genomic factors unpenalized covariates part model. instance, may want adjust sex age – factors want ensure always included selected model. plmmr package allows include additional unpenalized predictors via ‘add_predictor’ ‘predictor_id’ options, passed create_design() internal function create_design_filebacked(). 
example options included create_design() documentation. key part create_design() standardizing columns genotype matrix. didactic example showing columns std_X element design mean = 0 variance = 1. Note something analysis – reads data memory.","code":"# get outcome data penncath_pheno <- read.csv(find_example_data(path = 'penncath_clinical.csv')) phen <- data.frame(FamID = as.character(penncath_pheno$FamID), CAD = penncath_pheno$CAD) pen_design <- create_design(data_file = plink_data, feature_id = \"FID\", rds_dir = temp_dir, new_file = \"std_penncath_lite\", add_outcome = phen, outcome_id = \"FamID\", outcome_col = \"CAD\", logfile = \"design\", # again, overwrite if needed; use with caution overwrite = TRUE) #> There are 62 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... #> Standardization completed at 2024-12-17 19:29:42 #> Done with standardization. File formatting in progress # examine the design - notice the components of this object pen_design_rds <- readRDS(pen_design) str(pen_design_rds) #> List of 16 #> $ X_colnames : chr [1:4367] \"rs3107153\" \"rs2455124\" \"rs10915476\" \"rs4592237\" ... #> $ X_rownames : chr [1:1401] \"10002\" \"10004\" \"10005\" \"10007\" ... #> $ n : int 1401 #> $ p : int 4367 #> $ is_plink : logi TRUE #> $ outcome_idx : int [1:1401] 1 2 3 4 5 6 7 8 9 10 ... #> $ y : Named int [1:1401] 1 1 1 1 1 1 1 1 1 0 ... #> ..- attr(*, \"names\")= chr [1:1401] \"CAD1\" \"CAD2\" \"CAD3\" \"CAD4\" ... #> $ std_X_rownames: chr [1:1401] \"10002\" \"10004\" \"10005\" \"10007\" ... #> $ ns : int [1:4305] 1 2 3 4 5 6 7 8 9 10 ... #> $ std_X_colnames: chr [1:4305] \"rs3107153\" \"rs2455124\" \"rs10915476\" \"rs4592237\" ... #> $ std_X :Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"std_penncath_lite.bk\" #> .. .. 
..$ dirname : chr \"/tmp/RtmpzM7QaI/\" #> .. .. ..$ totalRows : int 1401 #> .. .. ..$ totalCols : int 4305 #> .. .. ..$ rowOffset : num [1:2] 0 1401 #> .. .. ..$ colOffset : num [1:2] 0 4305 #> .. .. ..$ nrow : num 1401 #> .. .. ..$ ncol : num 4305 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : NULL #> .. .. ..$ type : chr \"double\" #> .. .. ..$ separated : logi FALSE #> $ std_X_n : num 1401 #> $ std_X_p : num 4305 #> $ std_X_center : num [1:4305] 0.00785 0.35974 1.01213 0.06067 0.46253 ... #> $ std_X_scale : num [1:4305] 0.0883 0.7783 0.8636 0.28 1.2791 ... #> $ penalty_factor: num [1:4305] 1 1 1 1 1 1 1 1 1 1 ... #> - attr(*, \"class\")= chr \"plmm_design\" # we can check to see that our data have been standardized std_X <- attach.big.matrix(pen_design_rds$std_X) colMeans(std_X[,]) |> summary() # columns have mean zero... #> Min. 1st Qu. Median Mean 3rd Qu. Max. #> -1.356e-16 -2.334e-17 3.814e-19 9.868e-19 2.520e-17 2.635e-16 apply(std_X[,], 2, var) |> summary() # ... & variance 1 #> Min. 1st Qu. Median Mean 3rd Qu. Max. #> 1.001 1.001 1.001 1.001 1.001 1.001"},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"fitting-a-model","dir":"Articles","previous_headings":"","what":"Fitting a model","title":"If your data is in PLINK files","text":"Now design object, ready fit model. default, model fitting results saved files folder specified rds_dir argument plmmm. want return model fitting results, set return_fit = TRUE plmm(). examine model results :","code":"pen_fit <- plmm(design = pen_design, trace = T, return_fit = T) #> Note: The design matrix is being returned as a file-backed big.matrix object -- see bigmemory::big.matrix() documentation for details. #> Reminder: the X that is returned here is column-standardized #> Input data passed all checks at 2024-12-17 19:29:43 #> Starting decomposition. #> Calculating the eigendecomposition of K #> Eigendecomposition finished at 2024-12-17 19:29:45 #> Beginning rotation ('preconditioning'). 
#> Rotation (preconditiong) finished at 2024-12-17 19:29:45 #> Setting up lambda/preparing for model fitting. #> Beginning model fitting. #> Model fitting finished at 2024-12-17 19:29:48 #> Beta values are estimated -- almost done! #> Formatting results (backtransforming coefs. to original scale). #> Model ready at 2024-12-17 19:29:48 # you can turn off the trace messages by letting trace = F (default) summary(pen_fit, idx = 50) #> lasso-penalized regression model with n=1401, p=4368 at lambda=0.01211 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 537 #> ------------------------------------------------- plot(pen_fit)"},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"cross-validation","dir":"Articles","previous_headings":"","what":"Cross validation","title":"If your data is in PLINK files","text":"choose tuning parameter model, plmmr offers cross validation method: plot summary methods CV models well:","code":"cv_fit <- cv_plmm(design = pen_design, type = \"blup\", return_fit = T, trace = T) #> Note: The design matrix is being returned as a file-backed big.matrix object -- see bigmemory::big.matrix() documentation for details. #> Reminder: the X that is returned here is column-standardized #> Starting decomposition. #> Calculating the eigendecomposition of K #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-17 19:29:50 #> Setting up lambda/preparing for model fitting. #> Beginning model fitting. #> Model fitting finished at 2024-12-17 19:29:53 #> 'Fold' argument is either NULL or missing; assigning folds randomly (by default). #> #> To specify folds for each observation, supply a vector with fold assignments. #> #> Starting cross validation #> | | | 0%Beginning eigendecomposition in fold 1 : #> Starting decomposition. 
#> Calculating the eigendecomposition of K #> Fitting model in fold 1 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-17 19:29:54 #> Beginning model fitting. #> Model fitting finished at 2024-12-17 19:29:56 #> | |============== | 20% #> Beginning eigendecomposition in fold 2 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 2 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-17 19:29:58 #> Beginning model fitting. #> Model fitting finished at 2024-12-17 19:30:00 #> | |============================ | 40%Beginning eigendecomposition in fold 3 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 3 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-17 19:30:01 #> Beginning model fitting. #> Model fitting finished at 2024-12-17 19:30:04 #> | |========================================== | 60%Beginning eigendecomposition in fold 4 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 4 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-17 19:30:05 #> Beginning model fitting. #> Model fitting finished at 2024-12-17 19:30:07 #> | |======================================================== | 80%Beginning eigendecomposition in fold 5 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 5 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-17 19:30:08 #> Beginning model fitting. 
#> Model fitting finished at 2024-12-17 19:30:11 #> | |======================================================================| 100% summary(cv_fit) # summary at lambda value that minimizes CV error #> lasso-penalized model with n=1401 and p=4368 #> At minimum cross-validation error (lambda=0.0406): #> ------------------------------------------------- #> Nonzero coefficients: 6 #> Cross-validation error (deviance): 0.22 #> Scale estimate (sigma): 0.471 plot(cv_fit)"},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"details-create_design-for-plink-data","dir":"Articles","previous_headings":"","what":"Details: create_design() for PLINK data","title":"If your data is in PLINK files","text":"call create_design() involves steps: Integrate external phenotype information, supplied. Note: samples PLINK data phenotype value specified additional phenotype file removed analysis. Identify missing values samples SNPs/features. Impute missing values per user’s specified method. See R documentation bigsnpr::snp_fastImputeSimple() details. Note: plmmr package fit models datasets missing values. missing values must imputed subset analysis. Integrate external predictor information, supplied. matrix meta-data (e.g., age, principal components, etc.). Note: samples supplied file included PLINK data, removed. example, phenotyped participants genotyped participants study, plmmr::create_design() create matrix data representing genotyped samples also data supplied external phenotype file. Create design matrix represents nonsingular features samples predictor phenotype information available (case external data supplied). Standardize design matrix columns mean 0 variance 1.","code":""},{"path":"https://pbreheny.github.io/plmmr/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Tabitha K. Peter. Author. Anna C. Reisetter. Author. Patrick J. Breheny. Author, maintainer. Yujing Lu. 
Author.","code":""},{"path":"https://pbreheny.github.io/plmmr/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Reisetter , Breheny P (2021). “Penalized linear mixed models structured genetic data.” Genetic epidemiology, 45(5), 427–444. https://doi.org/10.1002/gepi.22384.","code":"@Article{, author = {Anna C. Reisetter and Patrick Breheny}, title = {Penalized linear mixed models for structured genetic data}, journal = {Genetic epidemiology}, year = {2021}, volume = {45}, pages = {427--444}, number = {5}, url = {https://doi.org/10.1002/gepi.22384}, }"},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"plmmr-","dir":"","previous_headings":"","what":"plmmr","title":"Penalized Linear Mixed Models for Correlated Data","text":"plmmr (penalized linear mixed models R) package contains functions fit penalized linear mixed models correct unobserved confounding effects.","code":""},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"Penalized Linear Mixed Models for Correlated Data","text":"install latest version package GitHub, use : can also install plmmr CRAN: description motivation functions package (along examples) refer second module GWAS data tutorial","code":"devtools::install_github(\"pbreheny/plmmr\") install.packages('plmmr')"},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"minimal-example","dir":"","previous_headings":"","what":"Minimal example","title":"Penalized Linear Mixed Models for Correlated Data","text":"","code":"library(plmmr) X <- rnorm(100*20) |> matrix(100, 20) y <- rnorm(100) fit <- plmm(X, y) plot(fit) cvfit <- cv_plmm(X, y) plot(cvfit) summary(cvfit)"},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"so-how-fast-is-plmmr-and-how-well-does-it-scale","dir":"","previous_headings":"","what":"So how fast is plmmr? 
And how well does it scale?","title":"Penalized Linear Mixed Models for Correlated Data","text":"illustrate important questions, created separate GitHub repository scripts plmmr workflow using publicly-available genome-wide association (GWAS) data. main takeaway: using GWAS data study 1,400 samples 800,000 SNPs, full plmmr analysis run half hour using single core laptop. Three smaller datasets ship plmmr, tutorials walking analyze data sets documented documentation site. datasets useful didactic purposes, large enough really highlight computational scalability plmmr – motivated creation separate repository GWAS workflow.","code":""},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"note-on-branches","dir":"","previous_headings":"","what":"Note on branches","title":"Penalized Linear Mixed Models for Correlated Data","text":"branches repo organized following way: master main (‘head’) branch. gh_pages keeping documentation plmmr gwas_scale archived branch contains development version package used run dissertation analysis. delete eventually.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/MCP.html","id":null,"dir":"Reference","previous_headings":"","what":"helper function to implement MCP penalty The helper functions to implement each penalty. — MCP","title":"helper function to implement MCP penalty The helper functions to implement each penalty. — MCP","text":"helper function implement MCP penalty helper functions implement penalty.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/MCP.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"helper function to implement MCP penalty The helper functions to implement each penalty. 
— MCP","text":"","code":"MCP(z, l1, l2, gamma, v)"},{"path":"https://pbreheny.github.io/plmmr/reference/MCP.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"helper function to implement MCP penalty The helper functions to implement each penalty. — MCP","text":"z vector representing solution active set feature l1 upper bound (beta) l2 lower bound (beta) gamma tuning parameter MCP penalty v 'xtx' term","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/MCP.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"helper function to implement MCP penalty The helper functions to implement each penalty. — MCP","text":"numeric vector MCP-penalized coefficient estimates within given bounds","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/SCAD.html","id":null,"dir":"Reference","previous_headings":"","what":"helper function to implement SCAD penalty — SCAD","title":"helper function to implement SCAD penalty — SCAD","text":"helper function implement SCAD penalty","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/SCAD.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"helper function to implement SCAD penalty — SCAD","text":"","code":"SCAD(z, l1, l2, gamma, v)"},{"path":"https://pbreheny.github.io/plmmr/reference/SCAD.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"helper function to implement SCAD penalty — SCAD","text":"z solution active set feature l1 upper bound l2 lower bound gamma tuning parameter SCAD penalty v 'xtx' term","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/SCAD.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"helper function to implement SCAD penalty — SCAD","text":"numeric vector SCAD-penalized coefficient estimates within given 
bounds","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/add_predictors.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to add predictors to a filebacked matrix of data — add_predictors","title":"A helper function to add predictors to a filebacked matrix of data — add_predictors","text":"helper function add predictors filebacked matrix data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/add_predictors.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to add predictors to a filebacked matrix of data — add_predictors","text":"","code":"add_predictors(obj, add_predictor, id_var, rds_dir, quiet)"},{"path":"https://pbreheny.github.io/plmmr/reference/add_predictors.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to add predictors to a filebacked matrix of data — add_predictors","text":"obj bigSNP object add_predictor Optional: add additional covariates/predictors/features external file (.e., PLINK file). id_var String specifying column PLINK .fam file unique sample identifiers. rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir(process_plink() call) quiet Logical: messages printed console? Defaults FALSE (leaves print messages ...)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/add_predictors.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to add predictors to a filebacked matrix of data — add_predictors","text":"list 2 components: 'obj' - bigSNP object added element representing matrix includes additional predictors first columns 'non_gen' - integer vector ranges 1 number added predictors. 
Example: 2 predictors added, unpen= 1:2","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/adjust_beta_dimension.html","id":null,"dir":"Reference","previous_headings":"","what":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","title":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","text":"function designed use BLUP prediction. objective get matrix estimated beta coefficients standardized scale, dimension original/training data. adding rows 0s std_scale_beta matrix corresponding singular features X.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/adjust_beta_dimension.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","text":"","code":"adjust_beta_dimension(std_scale_beta, p, std_X_details, fbm_flag, plink_flag)"},{"path":"https://pbreheny.github.io/plmmr/reference/adjust_beta_dimension.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","text":"std_scale_beta matrix estimated beta coefficients scale standardized original/training data Note: rows matrix represent nonsingular columns design matrix p number columns original/training design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' fbm_flag Logical: model fit filebacked? plink_flag Logical: data come PLINK files? Note: flag matters non-genomic features handled PLINK files – data PLINK files, unpenalized columns counted p argument. delimited files, p include unpenalized columns. 
difference implications untransform() function determines appropriate dimensions estimated coefficient matrix returns.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/adjust_beta_dimension.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","text":"std_scale_b_og_dim: matrix estimated beta coefs. still scale std_X, dimension X","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/admix.html","id":null,"dir":"Reference","previous_headings":"","what":"Admix: Semi-simulated SNP data — admix","title":"Admix: Semi-simulated SNP data — admix","text":"dataset containing 100 SNPs, demographic variable representing race, simulated outcome","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/admix.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Admix: Semi-simulated SNP data — admix","text":"","code":"admix"},{"path":"https://pbreheny.github.io/plmmr/reference/admix.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Admix: Semi-simulated SNP data — admix","text":"list 3 components X SNP matrix (197 observations 100 SNPs) y vector simulated (continuous) outcomes race vector racial group categorization: # 0 = African, 1 = African American, 2 = European, 3 = Japanese","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/admix.html","id":"source","dir":"Reference","previous_headings":"","what":"Source","title":"Admix: Semi-simulated SNP data — admix","text":"https://hastie.su.domains/CASI/","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/align_ids.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to support process_plink() — align_ids","title":"A helper function to support process_plink() — align_ids","text":"helper function support 
process_plink()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/align_ids.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to support process_plink() — align_ids","text":"","code":"align_ids(id_var, quiet, add_predictor, og_ids)"},{"path":"https://pbreheny.github.io/plmmr/reference/align_ids.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to support process_plink() — align_ids","text":"id_var String specifying variable name ID column quiet Logical: message printed? add_predictor External data include design matrix. add_predictors... arg process_plink() og_ids Character vector PLINK ids (FID IID) original data (.e., data subsetting handling missing phenotypes)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/align_ids.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to support process_plink() — align_ids","text":"matrix dimensions add_predictor","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/big_cbind.html","id":null,"dir":"Reference","previous_headings":"","what":"a version of cbind() for file-backed matrices — big_cbind","title":"a version of cbind() for file-backed matrices — big_cbind","text":"version cbind() file-backed matrices","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/big_cbind.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a version of cbind() for file-backed matrices — big_cbind","text":"","code":"big_cbind(A, B, C, quiet)"},{"path":"https://pbreheny.github.io/plmmr/reference/big_cbind.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"a version of cbind() for file-backed matrices — big_cbind","text":"-memory data B file-backed data C file-backed placeholder combined data quiet 
Logical","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/big_cbind.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a version of cbind() for file-backed matrices — big_cbind","text":"C, filled column values B combined","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/check_for_file_extension.html","id":null,"dir":"Reference","previous_headings":"","what":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","title":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","text":"check_for_file_extension: function make package 'smart' enough handle .rds file extensions","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/check_for_file_extension.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","text":"","code":"check_for_file_extension(path)"},{"path":"https://pbreheny.github.io/plmmr/reference/check_for_file_extension.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","text":"path string specifying file path ends file name, e.g. \"~/dir/my_file.rds\"","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/check_for_file_extension.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","text":"string filepath without extension, e.g. 
\"~/dir/my_file\"","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Coef method for ","title":"Coef method for ","text":"Coef method \"cv_plmm\" class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Coef method for ","text":"","code":"# S3 method for class 'cv_plmm' coef(object, lambda, which = object$min, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Coef method for ","text":"object object class \"cv_plmm.\" lambda numeric vector lambda values. Vector lambda indices coefficients return. Defaults lambda index minimum CVE. ... Additional arguments (used).","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Coef method for ","text":"Returns named numeric vector. Values coefficients model specified value either lambda . 
Names values lambda.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Coef method for ","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cv_fit <- cv_plmm(design = admix_design, return_fit = TRUE) head(coef(cv_fit)) #> (Intercept) Snp1 Snp2 Snp3 Snp4 Snp5 #> 4.326474 0.000000 0.000000 0.000000 0.000000 0.000000"},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Coef method for ","title":"Coef method for ","text":"Coef method \"plmm\" class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Coef method for ","text":"","code":"# S3 method for class 'plmm' coef(object, lambda, which = 1:length(object$lambda), drop = TRUE, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Coef method for ","text":"object object class \"plmm.\" lambda numeric vector lambda values. Vector lambda indices coefficients return. drop Logical. ... Additional arguments.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Coef method for ","text":"Either numeric matrix (model fit data stored memory) sparse matrix (model fit data stored filebacked). 
Rownames feature names, columns values lambda.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Coef method for ","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design) coef(fit)[1:10, 41:45] #> 0.02673 0.02493 0.02325 0.02168 0.02022 #> (Intercept) 6.556445885 6.59257224 6.62815211 6.66317769 6.69366816 #> Snp1 -0.768261488 -0.78098090 -0.79310257 -0.80456803 -0.81482505 #> Snp2 0.131945426 0.13991539 0.14735024 0.15387884 0.15929074 #> Snp3 2.826806831 2.83842545 2.84879468 2.85860151 2.86047026 #> Snp4 0.036981534 0.04652885 0.05543821 0.06376126 0.07133592 #> Snp5 0.546784811 0.57461391 0.60049082 0.62402782 0.64291324 #> Snp6 -0.026215632 -0.03072017 -0.03494534 -0.03889146 -0.04256362 #> Snp7 0.009342269 0.01539705 0.02103262 0.02615358 0.03069956 #> Snp8 0.000000000 0.00000000 0.00000000 0.00000000 0.00000000 #> Snp9 0.160794660 0.16217570 0.16337102 0.16464901 0.16638663"},{"path":"https://pbreheny.github.io/plmmr/reference/construct_variance.html","id":null,"dir":"Reference","previous_headings":"","what":"a function to create the estimated variance matrix from a PLMM fit — construct_variance","title":"a function to create the estimated variance matrix from a PLMM fit — construct_variance","text":"function create estimated variance matrix PLMM fit","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/construct_variance.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a function to create the estimated variance matrix from a PLMM fit — construct_variance","text":"","code":"construct_variance(fit, K = NULL, eta = NULL)"},{"path":"https://pbreheny.github.io/plmmr/reference/construct_variance.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"a function to create the estimated variance matrix from a 
PLMM fit — construct_variance","text":"fit object returned plmm() K optional matrix eta optional numeric value 0 1; fit supplied, option must specified.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/construct_variance.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a function to create the estimated variance matrix from a PLMM fit — construct_variance","text":"Sigma_hat, matrix representing estimated variance","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_constant_features.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to count constant features — count_constant_features","title":"A helper function to count constant features — count_constant_features","text":"helper function count constant features","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_constant_features.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to count constant features — count_constant_features","text":"","code":"count_constant_features(fbm, outfile, quiet)"},{"path":"https://pbreheny.github.io/plmmr/reference/count_constant_features.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to count constant features — count_constant_features","text":"fbm filebacked big.matrix outfile String specifying name log file quiet Logical: message printed console","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_constant_features.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to count constant features — count_constant_features","text":"ns numeric vector indices non-singular columns matrix associated counts","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_cores.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to count the number of cores 
available on the current machine — count_cores","title":"A helper function to count the number of cores available on the current machine — count_cores","text":"helper function count number cores available current machine","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_cores.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to count the number of cores available on the current machine — count_cores","text":"","code":"count_cores()"},{"path":"https://pbreheny.github.io/plmmr/reference/count_cores.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to count the number of cores available on the current machine — count_cores","text":"number cores use; parallel installed, parallel::detectCores(). Otherwise, returns 1.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":null,"dir":"Reference","previous_headings":"","what":"a function to create a design for PLMM modeling — create_design","title":"a function to create a design for PLMM modeling — create_design","text":"function create design PLMM modeling","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a function to create a design for PLMM modeling — create_design","text":"","code":"create_design(data_file = NULL, rds_dir = NULL, X = NULL, y = NULL, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"a function to create a design for PLMM modeling — create_design","text":"data_file filebacked data (data process_plink() process_delim()), filepath processed data. Defaults NULL (argument apply -memory data). rds_dir filebacked data, filepath directory/folder want design saved. 
Note: include/append name want --created file – name argument new_file, passed create_design_filebacked(). Defaults NULL (argument apply -memory data). X -memory data (data matrix data frame), design matrix. Defaults NULL (argument apply filebacked data). y -memory data, numeric vector representing outcome. Defaults NULL (argument apply filebacked data). Note: responsibility user ensure rows X corresponding elements y row order, .e., observations must order design matrix outcome vector. ... Additional arguments pass create_design_filebacked() create_design_in_memory(). See documentation helper functions details.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a function to create a design for PLMM modeling — create_design","text":"filepath object class plmm_design, named list design matrix, outcome, penalty factor vector, details needed fitting model. list stored .rds file filebacked data, filebacked case string path file returned. -memory data, list returned.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"a function to create a design for PLMM modeling — create_design","text":"function wrapper create_design...() inner functions; arguments included passed along create_design...() inner function matches type data supplied. Note arguments optional ones . Additional arguments filebacked data: new_file User-specified filename (without .bk/.rds extension) --created .rds/.bk files. Must different existing .rds/.bk files folder. feature_id Optional: string specifying column data X (feature data) row IDs (e.g., identifiers row/sample/participant/, etc.). duplicates allowed. - PLINK data: string specifying ID column PLINK .fam file. 
Options \"IID\" (default) \"FID\" - filebacked data: character vector unique identifiers (IDs) row feature data (.e., data processed process_delim()) - left NULL (default), X assumed row-order add_outcome. Note: assumption made error, calculations downstream incorrect. Pay close attention . add_outcome data frame matrix two columns: ID column column outcome value (used 'y' final design). IDs must characters, outcome must numeric. outcome_id string specifying name ID column 'add_outcome' outcome_col string specifying name phenotype column 'add_outcome' na_outcome_vals Optional: vector numeric values used code NA values outcome. Defaults c(-9, NA_integer) (-9 matches PLINK conventions). overwrite Optional: logical - existing .rds files overwritten? Defaults FALSE. logfile Optional: name '.log' file written – Note: append .log filename; done automatically. quiet Optional: logical - messages printed console silenced? Defaults FALSE Additional arguments specific PLINK data: add_predictor Optional (PLINK data ): matrix data frame used adding additional unpenalized covariates/predictors/features external file (.e., PLINK file). matrix must one column ID column; columns aside ID used covariates design matrix. Columns must named. predictor_id Optional (PLINK data ): string specifying name column 'add_predictor' sample IDs. Required 'add_predictor' supplied. names used subset align external covariate supplied PLINK data. Additional arguments specific delimited file data: unpen Optional: character vector names columns mark unpenalized (.e., features always included model). Note: choose use option, delimited file must column names. Additional arguments -memory data: unpen Optional: character vector names columns mark unpenalized (.e., features always included model). 
Note: choose use option, X must column names.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"a function to create a design for PLMM modeling — create_design","text":"","code":"## Example 1: matrix data in-memory ## admix_design <- create_design(X = admix$X, y = admix$y, unpen = \"Snp1\") ## Example 2: delimited data ## # process delimited data temp_dir <- tempdir() colon_dat <- process_delim(data_file = \"colon2.txt\", data_dir = find_example_data(parent = TRUE), overwrite = TRUE, rds_dir = temp_dir, rds_prefix = \"processed_colon2\", sep = \"\\t\", header = TRUE) #> There are 62 observations and 2001 features in the specified data files. #> At this time, plmmr::process_delim() does not handle missing values in delimited data. #> Please make sure you have addressed missingness before you proceed. #> #> process_plink() completed #> Processed files now saved as /tmp/RtmpuyGgXe/processed_colon2.rds # prepare outcome data colon_outcome <- read.delim(find_example_data(path = \"colon2_outcome.txt\")) # create a design colon_design <- create_design(data_file = colon_dat, rds_dir = temp_dir, new_file = \"std_colon2\", add_outcome = colon_outcome, outcome_id = \"ID\", outcome_col = \"y\", unpen = \"sex\", overwrite = TRUE, logfile = \"test.log\") #> No feature_id supplied; will assume data X are in same row-order as add_outcome. #> There are 0 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... #> Standardization completed at 2024-12-17 19:29:03 #> Done with standardization. File formatting in progress # look at the results colon_rds <- readRDS(colon_design) str(colon_rds) #> List of 18 #> $ X_colnames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> $ X_rownames : chr [1:62] \"row1\" \"row2\" \"row3\" \"row4\" ... 
#> $ n : num 62 #> $ p : num 2001 #> $ is_plink : logi FALSE #> $ outcome_idx : int [1:62] 1 2 3 4 5 6 7 8 9 10 ... #> $ y : int [1:62] 1 0 1 0 1 0 1 0 1 0 ... #> $ std_X_rownames: chr [1:62] \"row1\" \"row2\" \"row3\" \"row4\" ... #> $ unpen : int 1 #> $ unpen_colnames: chr \"sex\" #> $ ns : int [1:2001] 1 2 3 4 5 6 7 8 9 10 ... #> $ std_X_colnames: chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> $ std_X :Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"std_colon2.bk\" #> .. .. ..$ dirname : chr \"/tmp/RtmpuyGgXe/\" #> .. .. ..$ totalRows : int 62 #> .. .. ..$ totalCols : int 2001 #> .. .. ..$ rowOffset : num [1:2] 0 62 #> .. .. ..$ colOffset : num [1:2] 0 2001 #> .. .. ..$ nrow : num 62 #> .. .. ..$ ncol : num 2001 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> .. .. ..$ type : chr \"double\" #> .. .. ..$ separated : logi FALSE #> $ std_X_n : num 62 #> $ std_X_p : num 2001 #> $ std_X_center : num [1:2001] 1.47 7015.79 4966.96 4094.73 3987.79 ... #> $ std_X_scale : num [1:2001] 0.499 3067.926 2171.166 1803.359 2002.738 ... #> $ penalty_factor: num [1:2001] 0 1 1 1 1 1 1 1 1 1 ... 
#> - attr(*, \"class\")= chr \"plmm_design\" ## Example 3: PLINK data ## # \\donttest{ # process PLINK data temp_dir <- tempdir() unzip_example_data(outdir = temp_dir) #> Unzipped files are saved in /tmp/RtmpuyGgXe plink_data <- process_plink(data_dir = temp_dir, data_prefix = \"penncath_lite\", rds_dir = temp_dir, rds_prefix = \"imputed_penncath_lite\", # imputing the mode to address missing values impute_method = \"mode\", # overwrite existing files in temp_dir # (you can turn this feature off if you need to) overwrite = TRUE, # turning off parallelization - leaving this on causes problems knitting this vignette parallel = FALSE) #> #> Preprocessing penncath_lite data: #> Creating penncath_lite.rds #> #> There are 1401 observations and 4367 genomic features in the specified data files, representing chromosomes 1 - 22 #> There are a total of 3514 SNPs with missing values #> Of these, 13 are missing in at least 50% of the samples #> #> Imputing the missing (genotype) values using mode method #> #> process_plink() completed #> Processed files now saved as /tmp/RtmpuyGgXe/imputed_penncath_lite.rds # get outcome data penncath_pheno <- read.csv(find_example_data(path = 'penncath_clinical.csv')) outcome <- data.frame(FamID = as.character(penncath_pheno$FamID), CAD = penncath_pheno$CAD) unpen_predictors <- data.frame(FamID = as.character(penncath_pheno$FamID), sex = penncath_pheno$sex, age = penncath_pheno$age) # create design where sex and age are always included in the model pen_design <- create_design(data_file = plink_data, feature_id = \"FID\", rds_dir = temp_dir, new_file = \"std_penncath_lite\", add_outcome = outcome, outcome_id = \"FamID\", outcome_col = \"CAD\", add_predictor = unpen_predictors, predictor_id = \"FamID\", logfile = \"design\", # again, overwrite if needed; use with caution overwrite = TRUE) #> #> Aligning external data with the feature data by FamID #> Adding predictors from external data. 
#> Aligning IDs between fam and predictor files #> Column-wise combining data sets #> | |======================================================================| 100% #> There are 62 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... 
#> Standardization completed at 2024-12-17 19:29:06 #> Done with standardization. File formatting in progress # examine the design - notice the components of this object pen_design_rds <- readRDS(pen_design) # }"},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_filebacked.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","title":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","text":"function create design matrix, outcome, penalty factor passed model fitting function","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_filebacked.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","text":"","code":"create_design_filebacked( data_file, rds_dir, obj, new_file, feature_id = NULL, add_outcome, outcome_id, outcome_col, na_outcome_vals = c(-9, NA_integer_), add_predictor = NULL, predictor_id = NULL, unpen = NULL, logfile = NULL, overwrite = FALSE, quiet = FALSE )"},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_filebacked.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","text":"data_file filepath rds file processed data (data process_plink() process_delim()) rds_dir path directory want create new '.rds' '.bk' files. obj RDS object read create_design() new_file User-specified filename (without .bk/.rds extension) --created .rds/.bk files. Must different existing .rds/.bk files folder. 
feature_id string specifying column data X (feature data) row IDs (e.g., identifiers row/sample/participant/, etc.). duplicates allowed. - PLINK data: string specifying ID column PLINK .fam file. Options \"IID\" (default) \"FID\" - filebacked data: character vector unique identifiers (IDs) row feature data (.e., data processed process_delim()) - left NULL (default), X assumed row-order add_outcome. Note: assumption made error, calculations downstream incorrect. Pay close attention . add_outcome data frame matrix two columns: ID column column outcome value (used 'y' final design). IDs must characters, outcome must numeric. outcome_id string specifying name ID column 'add_outcome' outcome_col string specifying name phenotype column 'add_outcome' na_outcome_vals vector numeric values used code NA values outcome. Defaults c(-9, NA_integer_) (-9 matches PLINK conventions). add_predictor Optional (PLINK data ): matrix data frame used adding additional unpenalized covariates/predictors/features external file (.e., PLINK file). matrix must one column ID column; columns aside ID used covariates design matrix. Columns must named. predictor_id Optional (PLINK data ): string specifying name column 'add_predictor' sample IDs. Required 'add_predictor' supplied. names used subset align external covariate supplied PLINK data. unpen Optional (delimited file data ): optional character vector names columns mark unpenalized (.e., features always included model). Note: choose use option, X must column names. logfile Optional: name '.log' file written – Note: append .log filename; done automatically. overwrite Logical: existing .rds files overwritten? Defaults FALSE. quiet Logical: messages printed console silenced? 
Defaults FALSE","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_filebacked.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","text":"filepath created .rds file containing information model fitting, including standardized X model design information","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_in_memory.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to create a design with an in-memory X matrix — create_design_in_memory","title":"A function to create a design with an in-memory X matrix — create_design_in_memory","text":"function create design -memory X matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_in_memory.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to create a design with an in-memory X matrix — create_design_in_memory","text":"","code":"create_design_in_memory(X, y, unpen = NULL)"},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_in_memory.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to create a design with an in-memory X matrix — create_design_in_memory","text":"X numeric matrix rows correspond observations (e.g., samples) columns correspond features. y numeric vector representing outcome model. Note: responsibility user ensure outcome_col X row order! unpen optional character vector names columns mark unpenalized (.e., features always included model). 
Note: choose use option, X must column names.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_in_memory.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to create a design with an in-memory X matrix — create_design_in_memory","text":"list elements including standardized X model design information","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_log.html","id":null,"dir":"Reference","previous_headings":"","what":"create_log_file — create_log","title":"create_log_file — create_log","text":"create_log_file","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_log.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"create_log_file — create_log","text":"","code":"create_log(outfile, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/create_log.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"create_log_file — create_log","text":"outfile String specifying name --created file, without extension ... 
used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_log.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"create_log_file — create_log","text":"Nothing returned, instead text file suffix .log created.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Cross-validation for plmm — cv_plmm","title":"Cross-validation for plmm — cv_plmm","text":"Performs k-fold cross validation lasso-, MCP-, SCAD-penalized linear mixed models grid values regularization parameter lambda.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Cross-validation for plmm — cv_plmm","text":"","code":"cv_plmm( design, y = NULL, K = NULL, diag_K = NULL, eta_star = NULL, penalty = \"lasso\", type = \"blup\", gamma, alpha = 1, lambda_min, nlambda = 100, lambda, eps = 1e-04, max_iter = 10000, convex = TRUE, dfmax = NULL, warn = TRUE, init = NULL, cluster, nfolds = 5, seed, fold = NULL, returnY = FALSE, returnBiasDetails = FALSE, trace = FALSE, save_rds = NULL, save_fold_res = FALSE, return_fit = TRUE, compact_save = FALSE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Cross-validation for plmm — cv_plmm","text":"design first argument must one three things: (1) plmm_design object (created create_design()) (2) string file path design object (file path must end '.rds') (3) matrix data.frame object representing design matrix interest y Optional: case design matrix data.frame, user must also supply numeric outcome vector y argument. case, design y passed internally create_design(X = design, y = y). K Similarity matrix used rotate data. 
either (1) known matrix reflects covariance y, (2) estimate (Default \\(\\frac{1}{p}(XX^T)\\)), (3) list components 'd' 'u', returned choose_k(). diag_K Logical: K diagonal matrix? reflect observations unrelated, can treated unrelated. Defaults FALSE. Note: plmm() check see matrix diagonal. want use diagonal K matrix, must set diag_K = TRUE. eta_star Optional argument input specific eta term rather estimate data. K known covariance matrix full rank, 1. penalty penalty applied model. Either \"lasso\" (default), \"SCAD\", \"MCP\". type character argument indicating returned predict.plmm(). type == 'lp', predictions based linear predictor, X beta. type == 'blup', predictions based sum linear predictor estimated random effect (BLUP). Defaults 'blup', shown superior prediction method many applications. gamma tuning parameter MCP/SCAD penalty (see details). Default 3 MCP 3.7 SCAD. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. lambda_min smallest value lambda, fraction lambda.max. Default .001 number observations larger number covariates .05 otherwise. nlambda Length sequence lambda. Default 100. lambda user-specified sequence lambda values. default, sequence values length nlambda computed, equally spaced log scale. eps Convergence threshold. algorithm iterates RMSD change linear predictors coefficient less eps. Default 1e-4. max_iter Maximum number iterations (total across entire path). Default 10000. convex (future idea; yet incorporated) Calculate index objective function ceases locally convex? Default TRUE. dfmax (future idea; yet incorporated) Upper bound number nonzero coefficients. Default upper bound. However, large data sets, computational burden may heavy models large number nonzero coefficients. warn Return warning messages failures converge model saturation? 
Default TRUE. init Initial values coefficients. Default 0 columns X. cluster cv_plmm() can run parallel across cluster using parallel package. cluster must set advance using parallel::makeCluster(). cluster must passed cv_plmm(). nfolds number cross-validation folds. Default 5. seed may set seed random number generator order obtain reproducible results. fold fold observation belongs . default, observations randomly assigned. returnY cv_plmm() return linear predictors cross-validation folds? Default FALSE; TRUE, return matrix element row , column j fitted value observation fold observation excluded fit, jth value lambda. returnBiasDetails Logical: cross-validation bias (numeric value) loss (n x p matrix) returned? Defaults FALSE. trace set TRUE, inform user progress announcing beginning CV fold. Default FALSE. save_rds Optional: filepath name without '.rds' suffix specified (e.g., save_rds = \"~/dir/my_results\"), model results saved provided location (e.g., \"~/dir/my_results.rds\"). Defaults NULL, save result. save_fold_res Optional: logical value indicating whether results (loss predicted values) CV fold saved? TRUE, two '.rds' files saved ('loss' 'yhat') created directory 'save_rds'. files updated fold done. Defaults FALSE. return_fit Optional: logical value indicating whether fitted model returned plmm object current (assumed interactive) session. Defaults TRUE. compact_save Optional: TRUE, three separate .rds files saved: one 'beta_vals', one 'K', one everything else (see ). Defaults FALSE. Note: must specify save_rds argument called. ... 
Additional arguments plmm_fit","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Cross-validation for plmm — cv_plmm","text":"list 12 items: type: type prediction used ('lp' 'blup') cve: numeric vector cross validation error (CVE) value lambda cvse: numeric vector estimated standard error associated value cve fold: numeric n length vector integers indicating fold observation assigned lambda: numeric vector lambda values fit: overall fit object, including predictors; list returned plmm() min: index corresponding value lambda minimizes cve lambda_min: lambda value cve minimized min1se: index corresponding value lambda within standard error minimizes cve lambda1se: largest value lambda error within 1 standard error minimum. null.dev: numeric value representing deviance intercept-model. supplied lambda sequence, quantity may meaningful. estimated_Sigma: n x n matrix representing estimated covariance matrix.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Cross-validation for plmm — cv_plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cv_fit <- cv_plmm(design = admix_design) print(summary(cv_fit)) #> lasso-penalized model with n=197 and p=101 #> At minimum cross-validation error (lambda=0.2493): #> ------------------------------------------------- #> Nonzero coefficients: 5 #> Cross-validation error (deviance): 2.00 #> Scale estimate (sigma): 1.413 plot(cv_fit) # Note: for examples with filebacked data, see the filebacking vignette # https://pbreheny.github.io/plmmr/articles/filebacking.html"},{"path":"https://pbreheny.github.io/plmmr/reference/cvf.html","id":null,"dir":"Reference","previous_headings":"","what":"Cross-validation internal function for cv_plmm — cvf","title":"Cross-validation internal function for cv_plmm — 
cvf","text":"Internal function cv_plmm calls plmm fold subset original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cvf.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Cross-validation internal function for cv_plmm — cvf","text":"","code":"cvf(i, fold, type, cv_args, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/cvf.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Cross-validation internal function for cv_plmm — cvf","text":"Fold number excluded fit. fold n-length vector fold-assignments. type character argument indicating returned predict.plmm. type == 'lp' predictions based linear predictor, $X beta$. type == 'individual' predictions based linear predictor plus estimated random effect (BLUP). cv_args List additional arguments passed plmm. ... Optional arguments predict_within_cv","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cvf.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Cross-validation internal function for cv_plmm — cvf","text":"list three elements: numeric vector loss value lambda numeric value indicating number lambda values used numeric value predicted outcome (y hat) values lambda","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/eigen_K.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — eigen_K","title":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — eigen_K","text":"function take eigendecomposition K Note: faster taking SVD X p >> n","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/eigen_K.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — 
eigen_K","text":"","code":"eigen_K(std_X, fbm_flag)"},{"path":"https://pbreheny.github.io/plmmr/reference/eigen_K.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — eigen_K","text":"std_X standardized design matrix, stored big.matrix object. fbm_flag Logical: std_X FBM object? Passed plmm().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/eigen_K.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — eigen_K","text":"list eigenvectors eigenvalues K","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/estimate_eta.html","id":null,"dir":"Reference","previous_headings":"","what":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","title":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","text":"Estimate eta (used rotating data) function called internally plmm()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/estimate_eta.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","text":"","code":"estimate_eta(n, s, U, y, eta_star)"},{"path":"https://pbreheny.github.io/plmmr/reference/estimate_eta.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","text":"n number observations s singular values K, realized relationship matrix U left-singular vectors standardized design matrix y Continuous outcome 
vector.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/estimate_eta.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","text":"numeric value estimated value eta, variance parameter","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/fbm2bm.html","id":null,"dir":"Reference","previous_headings":"","what":"Functions to convert between FBM and big.matrix type objects — fbm2bm","title":"Functions to convert between FBM and big.matrix type objects — fbm2bm","text":"Functions convert FBM big.matrix type objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/fbm2bm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Functions to convert between FBM and big.matrix type objects — fbm2bm","text":"","code":"fbm2bm(fbm, desc = FALSE)"},{"path":"https://pbreheny.github.io/plmmr/reference/fbm2bm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Functions to convert between FBM and big.matrix type objects — fbm2bm","text":"fbm FBM object; see bigstatsr::FBM() details desc Logical: descriptor file desired (opposed filebacked big matrix)? 
Defaults FALSE.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/fbm2bm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Functions to convert between FBM and big.matrix type objects — fbm2bm","text":"big.matrix - see bigmemory::filebacked.big.matrix() details","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/file_sans_ext.html","id":null,"dir":"Reference","previous_headings":"","what":"helper function to get the file path of a file without the extension — file_sans_ext","title":"helper function to get the file path of a file without the extension — file_sans_ext","text":"helper function get file path file without extension","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/file_sans_ext.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"helper function to get the file path of a file without the extension — file_sans_ext","text":"","code":"file_sans_ext(path)"},{"path":"https://pbreheny.github.io/plmmr/reference/file_sans_ext.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"helper function to get the file path of a file without the extension — file_sans_ext","text":"path path file","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/file_sans_ext.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"helper function to get the file path of a file without the extension — file_sans_ext","text":"path_sans_ext filepath without extension","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to help with accessing example PLINK files — find_example_data","title":"A function to help with accessing example PLINK files — find_example_data","text":"function help accessing example PLINK 
files","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to help with accessing example PLINK files — find_example_data","text":"","code":"find_example_data(path, parent = FALSE)"},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to help with accessing example PLINK files — find_example_data","text":"path Argument (string) specifying path (filename) external data file extdata/ parent path=TRUE user wants name parent directory file located, set parent=TRUE. Defaults FALSE.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to help with accessing example PLINK files — find_example_data","text":"path=NULL, character vector file names returned. path given, character string full file path","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A function to help with accessing example PLINK files — find_example_data","text":"","code":"find_example_data(parent = TRUE) #> [1] \"/home/runner/work/_temp/Library/plmmr/extdata\""},{"path":"https://pbreheny.github.io/plmmr/reference/get_data.html","id":null,"dir":"Reference","previous_headings":"","what":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. — get_data","title":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. 
— get_data","text":"Read processed data function intended called either process_plink() process_delim() called .","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/get_data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. — get_data","text":"","code":"get_data(path, returnX = FALSE, trace = TRUE)"},{"path":"https://pbreheny.github.io/plmmr/reference/get_data.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. — get_data","text":"path file path RDS object containing processed data. add '.rds' extension path. returnX Logical: design matrix returned numeric matrix stored memory. default, FALSE. trace Logical: trace messages shown? Default TRUE.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/get_data.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. — get_data","text":"list components: std_X, column-standardized design matrix either (1) numeric matrix (2) filebacked matrix (FBM). See bigstatsr::FBM() bigsnpr::bigSnp-class documentation details. (PLINK data) fam, data frame containing pedigree information (like .fam file PLINK) (PLINK data) map, data frame containing feature information (like .bim file PLINK) ns: vector indicating columns X contain nonsingular features (.e., features variance != 0). 
center: vector values centering column X scale: vector values scaling column X","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/get_hostname.html","id":null,"dir":"Reference","previous_headings":"","what":"a function to return the computer's host name — get_hostname","title":"a function to return the computer's host name — get_hostname","text":"function return computer's host name","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/get_hostname.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a function to return the computer's host name — get_hostname","text":"","code":"get_hostname()"},{"path":"https://pbreheny.github.io/plmmr/reference/get_hostname.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a function to return the computer's host name — get_hostname","text":"String hostname current machine","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/impute_snp_data.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to impute SNP data — impute_snp_data","title":"A function to impute SNP data — impute_snp_data","text":"function impute SNP data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/impute_snp_data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to impute SNP data — impute_snp_data","text":"","code":"impute_snp_data( obj, X, impute, impute_method, parallel, outfile, quiet, seed = as.numeric(Sys.Date()), ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/impute_snp_data.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to impute SNP data — impute_snp_data","text":"obj bigSNP object (created read_plink_files()) X matrix genotype data returned name_and_count_bigsnp impute Logical: data imputed? Default TRUE. impute_method 'impute' = TRUE, argument specify kind imputation desired. 
Options : mode (default): Imputes frequent call. See bigsnpr::snp_fastImputeSimple() details. random: Imputes sampling according allele frequencies. mean0: Imputes rounded mean. mean2: Imputes mean rounded 2 decimal places. xgboost: Imputes using algorithm based local XGBoost models. See bigsnpr::snp_fastImpute() details. Note: can take several minutes, even relatively small data set. parallel Logical: computations within function run parallel? Defaults TRUE. See count_cores() ?bigparallelr::assert_cores details. particular, user aware much parallelization can make computations slower. outfile Optional: name (character string) prefix logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. quiet Logical: messages printed console? Defaults TRUE seed Numeric value passed seed impute_method = 'xgboost'. Defaults .numeric(Sys.Date()) ... Optional: additional arguments bigsnpr::snp_fastImpute() (relevant impute_method = \"xgboost\")","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/impute_snp_data.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to impute SNP data — impute_snp_data","text":"Nothing returned, obj$genotypes overwritten imputed version data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_samples.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to align genotype and phenotype data — index_samples","title":"A function to align genotype and phenotype data — index_samples","text":"function align genotype phenotype data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_samples.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to align genotype and phenotype data — index_samples","text":"","code":"index_samples( obj, rds_dir, indiv_id, add_outcome, outcome_id, outcome_col, na_outcome_vals, outfile, quiet 
)"},{"path":"https://pbreheny.github.io/plmmr/reference/index_samples.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to align genotype and phenotype data — index_samples","text":"obj object created process_plink() rds_dir path directory want create new '.rds' '.bk' files. indiv_id character string indicating ID column name 'fam' element genotype data list. Defaults 'sample.ID', equivalent 'IID' PLINK. option 'family.ID', equivalent 'FID' PLINK. add_outcome data frame least two columns: ID column phenotype column outcome_id string specifying name ID column pheno outcome_col string specifying name phenotype column pheno. column used default y argument 'plmm()'. na_outcome_vals vector numeric values used code NA values outcome. Defaults c(-9, NA_integer) (-9 matches PLINK conventions). outfile string name filepath log file quiet Logical: messages printed console? Defaults FALSE (leaves print messages ...","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_samples.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to align genotype and phenotype data — index_samples","text":"list two items: data.table rows corresponding samples genotype phenotype available. 
numeric vector indices indicating samples 'complete' (.e., samples add_outcome corresponding data PLINK files)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_std_X.html","id":null,"dir":"Reference","previous_headings":"","what":"Helper function to index standardized data — index_std_X","title":"Helper function to index standardized data — index_std_X","text":"Helper function index standardized data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_std_X.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Helper function to index standardized data — index_std_X","text":"","code":"index_std_X(std_X_p, non_genomic)"},{"path":"https://pbreheny.github.io/plmmr/reference/index_std_X.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Helper function to index standardized data — index_std_X","text":"std_X_p number features standardized matrix data (may filebacked) non_genomic Integer vector columns std_X representing non-genomic data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_std_X.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Helper function to index standardized data — index_std_X","text":"list indices","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lam_names.html","id":null,"dir":"Reference","previous_headings":"","what":"Generate nicely formatted lambda vec — lam_names","title":"Generate nicely formatted lambda vec — lam_names","text":"Generate nicely formatted lambda vec","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lam_names.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Generate nicely formatted lambda vec — 
lam_names","text":"","code":"lam_names(l)"},{"path":"https://pbreheny.github.io/plmmr/reference/lam_names.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Generate nicely formatted lambda vec — lam_names","text":"l Vector lambda values.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lam_names.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Generate nicely formatted lambda vec — lam_names","text":"character vector formatted lambda value names","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lasso.html","id":null,"dir":"Reference","previous_headings":"","what":"helper function to implement lasso penalty — lasso","title":"helper function to implement lasso penalty — lasso","text":"helper function implement lasso penalty","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lasso.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"helper function to implement lasso penalty — lasso","text":"","code":"lasso(z, l1, l2, v)"},{"path":"https://pbreheny.github.io/plmmr/reference/lasso.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"helper function to implement lasso penalty — lasso","text":"z solution active set feature l1 upper bound l2 lower bound v 'xtx' term","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lasso.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"helper function to implement lasso penalty — lasso","text":"numeric vector lasso-penalized coefficient estimates within given bounds","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/log_lik.html","id":null,"dir":"Reference","previous_headings":"","what":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","title":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","text":"function 
allows evaluate negative log-likelihood linear mixed model assumption null model order estimate variance parameter, eta.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/log_lik.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","text":"","code":"log_lik(eta, n, s, U, y, rot_y = NULL)"},{"path":"https://pbreheny.github.io/plmmr/reference/log_lik.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","text":"eta proportion variance outcome attributable causal SNP effects. words, signal--noise ratio. n number observations s singular values K, realized relationship matrix U left-singular vectors standardized design matrix y Continuous outcome vector. rot_y Optional: y already rotated, can supplied.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/log_lik.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","text":"value log-likelihood PLMM, evaluated supplied parameters","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/name_and_count_bigsnp.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","title":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","text":"helper function label summarize contents bigSNP","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/name_and_count_bigsnp.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","text":"","code":"name_and_count_bigsnp(obj, id_var, 
quiet, outfile)"},{"path":"https://pbreheny.github.io/plmmr/reference/name_and_count_bigsnp.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","text":"obj bigSNP object, possibly subset add_external_phenotype() id_var String specifying column PLINK .fam file unique sample identifiers. Options \"IID\" (default) \"FID\". quiet Logical: messages printed console? Defaults TRUE outfile string name .log file","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/name_and_count_bigsnp.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","text":"list components: counts: column-wise summary minor allele counts 'genotypes' obj: modified bigSNP list additional components X: obj$genotypes FBM pos: obj$map$physical.pos vector","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","text":"Fit linear mixed model via non-convex penalized maximum likelihood.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","text":"","code":"plmm( design, y = NULL, K = NULL, diag_K = NULL, eta_star = NULL, penalty = \"lasso\", init = NULL, gamma, alpha = 1, dfmax = NULL, lambda_min, nlambda = 100, lambda, eps = 1e-04, max_iter = 10000, convex = TRUE, warn = TRUE, trace = FALSE, save_rds = NULL, compact_save = FALSE, return_fit = NULL, ... 
)"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","text":"design first argument must one three things: (1) plmm_design object (created create_design()) (2) string file path design object (file path must end '.rds') (3) matrix data.frame object representing design matrix interest y Optional: case design matrix data.frame, user must also supply numeric outcome vector y argument. case, design y passed internally create_design(X = design, y = y). K Similarity matrix used rotate data. either : (1) known matrix reflects covariance y, (2) estimate (Default \\(\\frac{1}{p}(XX^T)\\)), (3) list components 'd' 'U', returned previous plmm() model fit data. diag_K Logical: K diagonal matrix? reflect observations unrelated, can treated unrelated. Defaults FALSE. Note: plmm() check see matrix diagonal. want use diagonal K matrix, must set diag_K = TRUE. eta_star Optional argument input specific eta term rather estimate data. K known covariance matrix full rank, 1. penalty penalty applied model. Either \"lasso\" (default), \"SCAD\", \"MCP\". init Initial values coefficients. Default 0 columns X. gamma tuning parameter MCP/SCAD penalty (see details). Default 3 MCP 3.7 SCAD. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. dfmax (Future idea; yet incorporated): Upper bound number nonzero coefficients. Default upper bound. However, large data sets, computational burden may heavy models large number nonzero coefficients. lambda_min smallest value lambda, fraction lambda.max. Default .001 number observations larger number covariates .05 otherwise. nlambda Length sequence lambda. Default 100. 
lambda user-specified sequence lambda values. default, sequence values length nlambda computed, equally spaced log scale. eps Convergence threshold. algorithm iterates RMSD change linear predictors coefficient less eps. Default 1e-4. max_iter Maximum number iterations (total across entire path). Default 10000. convex (Future idea; yet incorporated): Calculate index objective function ceases locally convex? Default TRUE. warn Return warning messages failures converge model saturation? Default TRUE. trace set TRUE, inform user progress announcing beginning step modeling process. Default FALSE. save_rds Optional: filepath name without '.rds' suffix specified (e.g., save_rds = \"~/dir/my_results\"), model results saved provided location (e.g., \"~/dir/my_results.rds\"). Defaults NULL, save result. compact_save Optional: TRUE, four separate .rds files saved: one 'beta_vals', one 'K', one linear predictors, one everything else (see ). Defaults FALSE. Note: must specify save_rds argument called. return_fit Optional: logical value indicating whether fitted model returned plmm object current (assumed interactive) session. Defaults TRUE -memory data, defaults FALSE filebacked data. ... Additional optional arguments plmm_checks()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","text":"list includes 19 items: beta_vals: matrix estimated coefficients original scale. Rows predictors, columns values lambda std_scale_beta: matrix estimated coefficients ~standardized~ scale. returned compact_save = TRUE. std_X_details: list 3 items: center & scale values used center/scale data, vector ('ns') nonsingular columns original data. Nonsingular columns standardized (definition), removed analysis. std_X: standardized design matrix; data filebacked, object filebacked.big.matrix bigmemory package. 
Note: std_X saved/returned return_fit = FALSE. y: outcome vector used model fitting. p: total number columns design matrix (including singular columns). plink_flag: logical flag: data come PLINK files? lambda: numeric vector lasso tuning parameter values used model fitting. eta: number (double) 0 1 representing estimated proportion variance outcome attributable population/correlation structure linear_predictors: matrix resulting product stdrot_X estimated coefficients ~rotated~ scale. penalty: character string indicating penalty model fit (e.g., 'MCP') gamma: numeric value indicating tuning parameter used SCAD lasso penalties used. relevant lasso models. alpha: numeric value indicating elastic net tuning parameter. loss: vector numeric values loss value lambda (calculated ~rotated~ scale) penalty_factor: vector indicators corresponding predictor, 1 = predictor penalized. ns_idx: vector indices predictors non-singular features (.e., features variation). iter: numeric vector number iterations needed model fitting value lambda converged: vector logical values indicating whether model fitting converged value lambda K: list 2 elements, s U — s: vector eigenvalues relatedness matrix; see relatedness_mat() details. U: matrix eigenvectors relatedness matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. 
— plmm","text":"","code":"# using admix data admix_design <- create_design(X = admix$X, y = admix$y) fit_admix1 <- plmm(design = admix_design) s1 <- summary(fit_admix1, idx = 50) print(s1) #> lasso-penalized regression model with n=197, p=101 at lambda=0.01426 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 88 #> ------------------------------------------------- plot(fit_admix1) # Note: for examples with large data that are too big to fit in memory, # see the article \"PLINK files/file-backed matrices\" on our website # https://pbreheny.github.io/plmmr/articles/filebacking.html"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_checks.html","id":null,"dir":"Reference","previous_headings":"","what":"plmm_checks — plmm_checks","title":"plmm_checks — plmm_checks","text":"plmm_checks","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_checks.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"plmm_checks — plmm_checks","text":"","code":"plmm_checks( design, K = NULL, diag_K = NULL, eta_star = NULL, penalty = \"lasso\", init = NULL, gamma, alpha = 1, dfmax = NULL, trace = FALSE, save_rds = NULL, return_fit = TRUE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_checks.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"plmm_checks — plmm_checks","text":"design design object, created create_design() K Similarity matrix used rotate data. either (1) known matrix reflects covariance y, (2) estimate (Default \\(\\frac{1}{p}(XX^T)\\)), (3) list components 'd' 'u', returned choose_k(). diag_K Logical: K diagonal matrix? reflect observations unrelated, can treated unrelated. Defaults FALSE. Note: plmm() check see matrix diagonal. want use diagonal K matrix, must set diag_K = TRUE. eta_star Optional argument input specific eta term rather estimate data. 
K known covariance matrix full rank, 1. penalty penalty applied model. Either \"lasso\" (default), \"SCAD\", \"MCP\". init Initial values coefficients. Default 0 columns X. gamma tuning parameter MCP/SCAD penalty (see details). Default 3 MCP 3.7 SCAD. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. dfmax Option added soon: Upper bound number nonzero coefficients. Default upper bound. However, large data sets, computational burden may heavy models large number nonzero coefficients. trace set TRUE, inform user progress announcing beginning step modeling process. Default FALSE. save_rds Optional: filepath name specified (e.g., save_rds = \"~/dir/my_results.rds\"), model results saved provided location. Defaults NULL, save result. return_fit Optional: logical value indicating whether fitted model returned plmm object current (assumed interactive) session. Defaults TRUE. ... Additional arguments get_data()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_checks.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"plmm_checks — plmm_checks","text":"list parameters pass model fitting. 
list includes standardized design matrix, outcome, meta-data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_fit.html","id":null,"dir":"Reference","previous_headings":"","what":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","title":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","text":"PLMM fit: function fits PLMM using values returned plmm_prep()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_fit.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","text":"","code":"plmm_fit( prep, y, std_X_details, eta_star, penalty_factor, fbm_flag, penalty, gamma = 3, alpha = 1, lambda_min, nlambda = 100, lambda, eps = 1e-04, max_iter = 10000, convex = TRUE, dfmax = NULL, init = NULL, warn = TRUE, returnX = TRUE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_fit.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","text":"prep list returned plmm_prep y original (centered) outcome vector. Need intercept estimate std_X_details list components 'center' (values used center X), 'scale' (values used scale X), 'ns' (indices nonsingular columns X) eta_star ratio variances (passed plmm()) penalty_factor multiplicative factor penalty applied coefficient. supplied, penalty_factor must numeric vector length equal number columns X. purpose penalty_factor apply differential penalization coefficients thought likely others model. particular, penalty_factor can 0, case coefficient always model without shrinkage. fbm_flag Logical: std_X FBM object? Passed plmm(). penalty penalty applied model. Either \"MCP\" (default), \"SCAD\", \"lasso\". gamma tuning parameter MCP/SCAD penalty (see details). 
Default 3 MCP 3.7 SCAD. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. lambda_min smallest value lambda, fraction lambda.max. Default .001 number observations larger number covariates .05 otherwise. nlambda Length sequence lambda. Default 100. lambda user-specified sequence lambda values. default, sequence values length nlambda computed, equally spaced log scale. eps Convergence threshold. algorithm iterates RMSD change linear predictors coefficient less eps. Default 1e-4. max_iter Maximum number iterations (total across entire path). Default 10000. convex (future idea; yet incorporated) Calculate index objective function ceases locally convex? Default TRUE. dfmax (future idea; yet incorporated) Upper bound number nonzero coefficients. Default upper bound. However, large data sets, computational burden may heavy models large number nonzero coefficients. init Initial values coefficients. Default 0 columns X. warn Return warning messages failures converge model saturation? Default TRUE. returnX Return standardized design matrix along fit? default, option turned X 100 MB, turned larger matrices preserve memory. ... 
Additional arguments can passed biglasso::biglasso_simple_path()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_fit.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","text":"list components: std_scale_beta: coefficients estimated scale std_X centered_y: y-values 'centered' mean 0 s U, values vectors eigendecomposition K lambda: vector tuning parameter values linear_predictors: product stdrot_X b (linear predictors transformed restandardized scale) eta: number (double) 0 1 representing estimated proportion variance outcome attributable population/correlation structure. iter: numeric vector number iterations needed model fitting value lambda converged: vector logical values indicating whether model fitting converged value lambda loss: vector numeric values loss value lambda (calculated ~rotated~ scale) penalty: character string indicating penalty model fit (e.g., 'MCP') penalty_factor: vector indicators corresponding predictor, 1 = predictor penalized. gamma: numeric value indicating tuning parameter used SCAD lasso penalties used. relevant lasso models. alpha: numeric value indicating elastic net tuning parameter. ns: indices nonsingular values X feature_names: formatted column names design matrix nlambda: number lambda values used model fitting eps: tolerance ('epsilon') used model fitting max_iter: max. number iterations per model fit warn: logical - warnings given model fit converge? 
init: initial values model fitting trace: logical - messages printed console models fit?","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_format.html","id":null,"dir":"Reference","previous_headings":"","what":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","title":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","text":"PLMM format: function format output model constructed plmm_fit","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_format.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","text":"","code":"plmm_format(fit, p, std_X_details, fbm_flag, plink_flag)"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_format.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","text":"fit list parameters describing output model constructed plmm_fit p number features original data (including constant features) std_X_details list 3 items: * 'center': centering values columns X * 'scale': scaling values non-singular columns X * 'ns': indices nonsingular columns std_X fbm_flag Logical: corresponding design matrix filebacked? Passed plmm(). plink_flag Logical: data come PLINK files? Note: flag matters non-genomic features handled PLINK files – data PLINK files, unpenalized columns counted p argument. delimited files, p include unpenalized columns. 
difference implications untransform() function determines appropriate dimensions estimated coefficient matrix returns.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_format.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","text":"list components: beta_vals: matrix estimated coefficients original scale. Rows predictors, columns values lambda lambda: numeric vector lasso tuning parameter values used model fitting. eta: number (double) 0 1 representing estimated proportion variance outcome attributable population/correlation structure. s: vector eigenvalues relatedness matrix K; see relatedness_mat() details. U: matrix eigenvectors relatedness matrix K rot_y: vector outcome values rotated scale. scale model fit. linear_predictors: matrix resulting product stdrot_X estimated coefficients ~rotated~ scale. penalty: character string indicating penalty model fit (e.g., 'MCP') gamma: numeric value indicating tuning parameter used SCAD lasso penalties used. relevant lasso models. alpha: numeric value indicating elastic net tuning parameter. loss: vector numeric values loss value lambda (calculated ~rotated~ scale) penalty_factor: vector indicators corresponding predictor, 1 = predictor penalized. ns_idx: vector indices predictors nonsingular features (.e., variation). 
iter: numeric vector number iterations needed model fitting value lambda converged: vector logical values indicating whether model fitting converged value lambda","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":null,"dir":"Reference","previous_headings":"","what":"Loss method for ","title":"Loss method for ","text":"Loss method \"plmm\" class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Loss method for ","text":"","code":"plmm_loss(y, yhat)"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Loss method for ","text":"y Observed outcomes (response) vector yhat Predicted outcomes (response) vector","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Loss method for ","text":"numeric vector squared-error loss values given observed predicted outcomes","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Loss method for ","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design, K = relatedness_mat(admix$X)) yhat <- predict(object = fit, newX = admix$X, type = 'lp', lambda = 0.05) head(plmm_loss(yhat = yhat, y = admix$y)) #> [,1] #> [1,] 0.81638401 #> [2,] 0.09983799 #> [3,] 0.50281622 #> [4,] 0.14234359 #> [5,] 2.03696796 #> [6,] 2.72044268"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_prep.html","id":null,"dir":"Reference","previous_headings":"","what":"PLMM prep: a function to run checks, SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","title":"PLMM prep: a function to run checks, 
SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","text":"PLMM prep: function run checks, SVD, rotation prior fitting PLMM model internal function cv_plmm","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_prep.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"PLMM prep: a function to run checks, SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","text":"","code":"plmm_prep( std_X, std_X_n, std_X_p, genomic = 1:std_X_p, n, p, centered_y, k = NULL, K = NULL, diag_K = NULL, eta_star = NULL, fbm_flag, penalty_factor = rep(1, ncol(std_X)), trace = NULL, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_prep.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"PLMM prep: a function to run checks, SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","text":"std_X Column standardized design matrix. May include clinical covariates non-SNP data. std_X_n number observations std_X (integer) std_X_p number features std_X (integer) genomic numeric vector indices indicating columns standardized X genomic covariates. Defaults columns. n number instances original design matrix X. altered standardization. p number features original design matrix X, including constant features centered_y Continuous outcome vector, centered. k integer specifying number singular values used approximation rotated design matrix. argument passed RSpectra::svds(). Defaults min(n, p) - 1, n p dimensions standardized design matrix. K Similarity matrix used rotate data. either known matrix reflects covariance y, estimate (Default \\(\\frac{1}{p}(XX^T)\\), X standardized). can also list, components d u (returned choose_k) diag_K Logical: K diagonal matrix? reflect observations unrelated, can treated unrelated. Passed plmm(). 
eta_star Optional argument input specific eta term rather estimate data. K known covariance matrix full rank, 1. fbm_flag Logical: std_X FBM type object? set internally plmm(). trace set TRUE, inform user progress announcing beginning step modeling process. Default FALSE. ... used yet","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_prep.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"PLMM prep: a function to run checks, SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","text":"List components: centered_y: vector centered outcomes std_X: standardized design matrix K: list 2 elements. (1) s: vector eigenvalues K, (2) U: eigenvectors K (left singular values X). eta: numeric value estimated eta parameter trace: logical.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmmr-package.html","id":null,"dir":"Reference","previous_headings":"","what":"plmmr: Penalized Linear Mixed Models for Correlated Data — plmmr-package","title":"plmmr: Penalized Linear Mixed Models for Correlated Data — plmmr-package","text":"Fits penalized linear mixed models correct unobserved confounding factors. 'plmmr' infers corrects presence unobserved confounding effects population stratification environmental heterogeneity. fits linear model via penalized maximum likelihood. Originally designed multivariate analysis single nucleotide polymorphisms (SNPs) measured genome-wide association study (GWAS), 'plmmr' eliminates need subpopulation-specific analyses post-analysis p-value adjustments. Functions appropriate processing 'PLINK' files also supplied. examples, see package homepage. 
https://pbreheny.github.io/plmmr/.","code":""},{"path":[]},{"path":"https://pbreheny.github.io/plmmr/reference/plmmr-package.html","id":"author","dir":"Reference","previous_headings":"","what":"Author","title":"plmmr: Penalized Linear Mixed Models for Correlated Data — plmmr-package","text":"Maintainer: Patrick J. Breheny patrick-breheny@uiowa.edu (ORCID) Authors: Tabitha K. Peter tabitha-peter@uiowa.edu (ORCID) Anna C. Reisetter anna-reisetter@uiowa.edu (ORCID) Yujing Lu","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot method for cv_plmm class — plot.cv_plmm","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"Plot method cv_plmm class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"","code":"# S3 method for class 'cv_plmm' plot( x, log.l = TRUE, type = c(\"cve\", \"rsq\", \"scale\", \"snr\", \"all\"), selected = TRUE, vertical.line = TRUE, col = \"red\", ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"x object class cv_plmm log.l Logical indicate plot returned natural log scale. Defaults log.l = FALSE. type Type plot return. Defaults \"cve.\" selected Logical indicate variables plotted. Defaults TRUE. vertical.line Logical indicate whether vertical line plotted minimum/maximum value. Defaults TRUE. col Color vertical line, plotted. Defaults \"red.\" ... 
Additional arguments.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"Nothing returned; instead, plot drawn representing relationship tuning parameter 'lambda' value (x-axis) cross validation error (y-axis).","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cvfit <- cv_plmm(design = admix_design) plot(cvfit)"},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot method for plmm class — plot.plmm","title":"Plot method for plmm class — plot.plmm","text":"Plot method plmm class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot method for plmm class — plot.plmm","text":"","code":"# S3 method for class 'plmm' plot(x, alpha = 1, log.l = FALSE, shade = TRUE, col, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot method for plmm class — plot.plmm","text":"x object class plmm alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. log.l Logical indicate plot returned natural log scale. Defaults log.l = FALSE. shade Logical indicate whether local nonconvex region shaded. Defaults TRUE. col Vector colors coefficient lines. ... 
Additional arguments.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot method for plmm class — plot.plmm","text":"Nothing returned; instead, plot coefficient paths drawn value lambda (one 'path' coefficient).","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot method for plmm class — plot.plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design) plot(fit) plot(fit, log.l = TRUE)"},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Predict method for plmm class — predict.plmm","title":"Predict method for plmm class — predict.plmm","text":"Predict method plmm class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Predict method for plmm class — predict.plmm","text":"","code":"# S3 method for class 'plmm' predict( object, newX, type = c(\"blup\", \"coefficients\", \"vars\", \"nvars\", \"lp\"), lambda, idx = 1:length(object$lambda), ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Predict method for plmm class — predict.plmm","text":"object object class plmm. newX Matrix values predictions made (used type=\"coefficients\" type settings predict). can either FBM object 'matrix' object. Note: Columns argument must named! type character argument indicating type prediction returned. Options \"lp,\" \"coefficients,\" \"vars,\" \"nvars,\" \"blup.\" See details. lambda numeric vector regularization parameter lambda values predictions requested. 
idx Vector indices penalty parameter lambda predictions required. default, indices returned. ... Additional optional arguments","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Predict method for plmm class — predict.plmm","text":"Depends type - see Details","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Predict method for plmm class — predict.plmm","text":"Define beta-hat coefficients estimated value lambda minimizes cross-validation error (CVE). options type follows: 'response' (default): uses product newX beta-hat predict new values outcome. incorporate correlation structure data. stats folks , simply linear predictor. 'blup' (acronym Best Linear Unbiased Predictor): adds 'response' value represents estimated random effect. addition way incorporating estimated correlation structure data prediction outcome. 'coefficients': returns estimated beta-hat 'vars': returns indices variables (e.g., SNPs) nonzero coefficients value lambda. EXCLUDES intercept. 'nvars': returns number variables (e.g., SNPs) nonzero coefficients value lambda. EXCLUDES intercept.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Predict method for plmm class — predict.plmm","text":"","code":"set.seed(123) train_idx <- sample(1:nrow(admix$X), 100) # Note: ^ shuffling is important here! Keeps test and train groups comparable. 
train <- list(X = admix$X[train_idx,], y = admix$y[train_idx]) train_design <- create_design(X = train$X, y = train$y) test <- list(X = admix$X[-train_idx,], y = admix$y[-train_idx]) fit <- plmm(design = train_design) # make predictions for all lambda values pred1 <- predict(object = fit, newX = test$X, type = \"lp\") pred2 <- predict(object = fit, newX = test$X, type = \"blup\") # look at mean squared prediction error mspe <- apply(pred1, 2, function(c){crossprod(test$y - c)/length(c)}) min(mspe) #> [1] 2.87754 mspe_blup <- apply(pred2, 2, function(c){crossprod(test$y - c)/length(c)}) min(mspe_blup) # BLUP is better #> [1] 2.128471 # compare the MSPE of our model to a null model, for reference # null model = intercept only -> y_hat is always mean(y) crossprod(mean(test$y) - test$y)/length(test$y) #> [,1] #> [1,] 6.381748"},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":null,"dir":"Reference","previous_headings":"","what":"Predict method to use in cross-validation (within cvf) — predict_within_cv","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"Predict method use cross-validation (within cvf)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"","code":"predict_within_cv( fit, trainX, trainY = NULL, testX, std_X_details, type, fbm = FALSE, plink_flag = FALSE, Sigma_11 = NULL, Sigma_21 = NULL, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"fit list components returned plmm_fit. trainX training data, pre-standardization pre-rotation trainY training outcome, centered. 
needed type = 'blup' testX design matrix used computing predicted values (.e, test data). std_X_details list 3 items: 'center': centering values columns X 'scale': scaling values non-singular columns X 'ns': indices nonsingular columns std_X. Note: vector really need ! type character argument indicating type prediction returned. Passed cvf(), Options \"lp,\" \"coefficients,\" \"vars,\" \"nvars,\" \"blup.\" See details. fbm Logical: trainX FBM object? , function expects testX also FBM. two X matrices must stored way. Sigma_11 Variance-covariance matrix training data. Extracted estimated_Sigma generated using observations. Required type == 'blup'. Sigma_21 Covariance matrix training testing data. Extracted estimated_Sigma generated using observations. Required type == 'blup'. ... Additional optional arguments","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"numeric vector predicted values","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"Define beta-hat coefficients estimated value lambda minimizes cross-validation error (CVE). options type follows: 'lp' (default): uses linear predictor (.e., product test data estimated coefficients) predict test values outcome. Note approach incorporate correlation structure data. 'blup' (acronym Best Linear Unbiased Predictor): adds 'lp' value represents estimated random effect. addition way incorporating estimated correlation structure data prediction outcome. Note: main difference function predict.plmm() method CV, predictions made standardized scale (.e., trainX testX data come std_X). 
predict.plmm() method makes predictions scale X (original scale)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/pretty_time.html","id":null,"dir":"Reference","previous_headings":"","what":"a function to format the time — pretty_time","title":"a function to format the time — pretty_time","text":"function format time","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/pretty_time.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a function to format the time — pretty_time","text":"","code":"pretty_time()"},{"path":"https://pbreheny.github.io/plmmr/reference/pretty_time.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a function to format the time — pretty_time","text":"string formatted current date time","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"Print method summary.cv_plmm objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"","code":"# S3 method for class 'summary.cv_plmm' print(x, digits, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"x object class summary.cv_plmm digits number digits use formatting output ... 
used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"Nothing returned; instead, message printed console summarizing results cross-validated model fit.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cv_fit <- cv_plmm(design = admix_design) print(summary(cv_fit)) #> lasso-penalized model with n=197 and p=101 #> At minimum cross-validation error (lambda=0.2168): #> ------------------------------------------------- #> Nonzero coefficients: 10 #> Cross-validation error (deviance): 1.96 #> Scale estimate (sigma): 1.399"},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to print the summary of a plmm model — print.summary.plmm","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"function print summary plmm model","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"","code":"# S3 method for class 'summary.plmm' print(x, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"x summary.plmm object ... 
used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"Nothing returned; instead, message printed console summarizing results model fit.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"","code":"lam <- rev(seq(0.01, 1, length.out=20)) |> round(2) # for sake of example admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design, lambda = lam) fit2 <- plmm(design = admix_design, penalty = \"SCAD\", lambda = lam) print(summary(fit, idx = 18)) #> lasso-penalized regression model with n=197, p=101 at lambda=0.1100 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 27 #> ------------------------------------------------- print(summary(fit2, idx = 18)) #> SCAD-penalized regression model with n=197, p=101 at lambda=0.1100 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 29 #> -------------------------------------------------"},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to read in large data files as an FBM — process_delim","title":"A function to read in large data files as an FBM — process_delim","text":"function read large data files FBM","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to read in large data 
files as an FBM — process_delim","text":"","code":"process_delim( data_dir, data_file, feature_id, rds_dir = data_dir, rds_prefix, logfile = NULL, overwrite = FALSE, quiet = FALSE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to read in large data files as an FBM — process_delim","text":"data_dir directory file. data_file file read , without filepath. file numeric values. Example: use data_file = \"myfile.txt\", data_file = \"~/mydirectory/myfile.txt\" Note: file headers/column names, set 'header = TRUE' – passed bigmemory::read.big.matrix(). feature_id string specifying column data X (feature data) row IDs (e.g., identifiers row/sample/participant/, etc.). duplicates allowed. rds_dir directory user wants create '.rds' '.bk' files Defaults data_dir rds_prefix String specifying user's preferred filename --created .rds file (create inside rds_dir folder) Note: 'rds_prefix' 'data_prefix' logfile Optional: name (character string) prefix logfile written. Defaults 'process_delim', .e. get 'process_delim.log' outfile. overwrite Logical: existing .bk/.rds files exist specified directory/prefix, overwritten? Defaults FALSE. Note: multiple .rds files names start \"std_prefix_...\", error . protect users accidentally deleting files saved results, one .rds file can removed option. quiet Logical: messages printed console silenced? Defaults FALSE. ... Optional: arguments passed bigmemory::read.big.matrix(). 
Note: 'sep' option pass , 'header'.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to read in large data files as an FBM — process_delim","text":"file path newly created '.rds' file","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A function to read in large data files as an FBM — process_delim","text":"","code":"temp_dir <- tempdir() colon_dat <- process_delim(data_file = \"colon2.txt\", data_dir = find_example_data(parent = TRUE), overwrite = TRUE, rds_dir = temp_dir, rds_prefix = \"processed_colon2\", sep = \"\\t\", header = TRUE) #> #> Overwriting existing files:processed_colon2.bk/.rds/.desc #> There are 62 observations and 2001 features in the specified data files. #> At this time, plmmr::process_delim() does not not handle missing values in delimited data. #> Please make sure you have addressed missingness before you proceed. #> #> process_plink() completed #> Processed files now saved as /tmp/RtmpuyGgXe/processed_colon2.rds colon2 <- readRDS(colon_dat) str(colon2) #> List of 3 #> $ X:Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"processed_colon2.bk\" #> .. .. ..$ dirname : chr \"/tmp/RtmpuyGgXe/\" #> .. .. ..$ totalRows : int 62 #> .. .. ..$ totalCols : int 2001 #> .. .. ..$ rowOffset : num [1:2] 0 62 #> .. .. ..$ colOffset : num [1:2] 0 2001 #> .. .. ..$ nrow : num 62 #> .. .. ..$ ncol : num 2001 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> .. .. ..$ type : chr \"double\" #> .. .. 
..$ separated : logi FALSE #> $ n: num 62 #> $ p: num 2001 #> - attr(*, \"class\")= chr \"processed_delim\""},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":null,"dir":"Reference","previous_headings":"","what":"Preprocess PLINK files using the bigsnpr package — process_plink","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"Preprocess PLINK files using bigsnpr package","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"","code":"process_plink( data_dir, data_prefix, rds_dir = data_dir, rds_prefix, logfile = NULL, impute = TRUE, impute_method = \"mode\", id_var = \"IID\", parallel = TRUE, quiet = FALSE, overwrite = FALSE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"data_dir path bed/bim/fam data files, without trailing \"/\" (e.g., use data_dir = '~/my_dir', data_dir = '~/my_dir/') data_prefix prefix (character string) bed/fam data files (e.g., data_prefix = 'mydata') rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir rds_prefix String specifying user's preferred filename --created .rds file (create inside rds_dir folder) Note: 'rds_prefix' 'data_prefix' logfile Optional: name (character string) prefix logfile written 'rds_dir'. Default NULL (log file written). Note: supply file path argument, error \"file found\" error. supply string; e.g., want my_log.log, supply 'my_log', my_log.log file appear rds_dir. impute Logical: data imputed? Default TRUE. impute_method 'impute' = TRUE, argument specify kind imputation desired. Options : * mode (default): Imputes frequent call. 
See bigsnpr::snp_fastImputeSimple() details. * random: Imputes sampling according allele frequencies. * mean0: Imputes rounded mean. * mean2: Imputes mean rounded 2 decimal places. * xgboost: Imputes using algorithm based local XGBoost models. See bigsnpr::snp_fastImpute() details. Note: can take several minutes, even relatively small data set. id_var String specifying column PLINK .fam file unique sample identifiers. Options \"IID\" (default) \"FID\" parallel Logical: computations within function run parallel? Defaults TRUE. See count_cores() ?bigparallelr::assert_cores details. particular, user aware much parallelization can make computations slower. quiet Logical: messages printed console silenced? Defaults FALSE overwrite Logical: existing .bk/.rds files exist specified directory/prefix, overwritten? Defaults FALSE. Set TRUE want change imputation method using, etc. Note: multiple .rds files names start \"std_prefix_...\", error . protect users accidentally deleting files saved results, one .rds file can removed option. ... Optional: additional arguments bigsnpr::snp_fastImpute() (relevant impute_method = \"xgboost\")","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"filepath '.rds' object created; see details explanation.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"Three files created location specified rds_dir: 'rds_prefix.rds': list three items: (1) X: filebacked bigmemory::big.matrix object pointing imputed genotype data. 
matrix type 'double', important downstream operations create_design() (2) map: data.frame PLINK 'bim' data (.e., variant information) (3) fam: data.frame PLINK 'fam' data (.e., pedigree information) 'prefix.bk': backingfile stores numeric data genotype matrix 'rds_prefix.desc': description file, needed Note process_plink() need run given set PLINK files; subsequent data analysis/scripts, get_data() access '.rds' file. example, see vignette processing PLINK files","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_data_files.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","title":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","text":"function read large file numeric file-backed matrix (FBM) Note: function wrapper bigstatsr::big_read()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_data_files.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","text":"","code":"read_data_files( data_file, data_dir, rds_dir, rds_prefix, outfile, overwrite, quiet, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/read_data_files.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","text":"data_file name file read, including directory. Directory specified data_dir data_dir path directory 'file' rds_dir path directory want create new '.rds' '.bk' files.
Defaults data_dir rds_prefix String specifying user's preferred filename --created .rds/.bk files (create inside rds_dir folder) Note: 'rds_prefix' 'data_file' outfile Optional: name (character string) prefix logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. overwrite Logical: existing .bk/.rds files exist specified directory/prefix, overwritten? Defaults FALSE. Set TRUE want change imputation method using, etc. quiet Logical: messages printed console? Defaults TRUE ... Optional: arguments passed bigmemory::read.big.matrix(). Note: 'sep' option pass .","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_data_files.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","text":"'.rds', '.bk', '.desc' files created data_dir, obj (filebacked bigmemory big.matrix object) returned.
See bigmemory documentation info big.matrix class.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_plink_files.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to read in PLINK files using bigsnpr methods — read_plink_files","title":"A function to read in PLINK files using bigsnpr methods — read_plink_files","text":"function read PLINK files using bigsnpr methods","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_plink_files.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to read in PLINK files using bigsnpr methods — read_plink_files","text":"","code":"read_plink_files( data_dir, data_prefix, rds_dir, outfile, parallel, overwrite, quiet )"},{"path":"https://pbreheny.github.io/plmmr/reference/read_plink_files.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to read in PLINK files using bigsnpr methods — read_plink_files","text":"data_dir path bed/bim/fam data files, without trailing \"/\" (e.g., use data_dir = '~/my_dir', data_dir = '~/my_dir/') data_prefix prefix (character string) bed/fam data files (e.g., prefix = 'mydata') rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir outfile Optional: name (character string) prefix logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. parallel Logical: computations within function run parallel? Defaults TRUE. See count_cores() ?bigparallelr::assert_cores details. particular, user aware much parallelization can make computations slower. overwrite Logical: existing .bk/.rds files exist specified directory/prefix, overwritten? Defaults FALSE. Set TRUE want change imputation method using, etc. quiet Logical: messages printed console? 
Defaults TRUE","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_plink_files.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to read in PLINK files using bigsnpr methods — read_plink_files","text":"'.rds' '.bk' files created data_dir, obj (bigSNP object) returned. See bigsnpr documentation info bigSNP class.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculate a relatedness matrix — relatedness_mat","title":"Calculate a relatedness matrix — relatedness_mat","text":"Given matrix genotypes, function estimates genetic relatedness matrix (GRM, also known RRM, see Hayes et al. 2009, doi:10.1017/S0016672308009981 ) among subjects: XX'/p, X standardized.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculate a relatedness matrix — relatedness_mat","text":"","code":"relatedness_mat(X, std = TRUE, fbm = FALSE, ns = NULL, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculate a relatedness matrix — relatedness_mat","text":"X n x p numeric matrix genotypes (fully-imputed data). Note: matrix include non-genetic features. std Logical: X standardized? set FALSE (can done data stored memory), good reason , standardization best practice. fbm Logical: X stored FBM? Defaults FALSE ns Optional vector values indicating indices nonsingular features ... 
optional arguments bigstatsr::big_apply() (like ncores = ...)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculate a relatedness matrix — relatedness_mat","text":"n x n numeric matrix capturing genomic relatedness samples represented X. notation, call matrix K 'kinship'; also known GRM RRM.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculate a relatedness matrix — relatedness_mat","text":"","code":"RRM <- relatedness_mat(X = admix$X) RRM[1:5, 1:5] #> [,1] [,2] [,3] [,4] [,5] #> [1,] 0.81268908 -0.09098097 -0.07888910 0.06770613 0.08311777 #> [2,] -0.09098097 0.81764801 0.20480021 0.02112812 -0.02640295 #> [3,] -0.07888910 0.20480021 0.82177986 -0.02864226 0.18693970 #> [4,] 0.06770613 0.02112812 -0.02864226 0.89327266 -0.03541470 #> [5,] 0.08311777 -0.02640295 0.18693970 -0.03541470 0.79589686"},{"path":"https://pbreheny.github.io/plmmr/reference/rotate_filebacked.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to rotate filebacked data — rotate_filebacked","title":"A function to rotate filebacked data — rotate_filebacked","text":"function rotate filebacked data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/rotate_filebacked.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to rotate filebacked data — rotate_filebacked","text":"","code":"rotate_filebacked(prep, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/rotate_filebacked.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to rotate filebacked data — rotate_filebacked","text":"list 4 items: stdrot_X: X rotated re-standardized scale rot_y: y rotated scale (numeric vector) stdrot_X_center: numeric vector values used
center rot_X stdrot_X_scale: numeric vector values used scale rot_X","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/setup_lambda.html","id":null,"dir":"Reference","previous_headings":"","what":"Compute sequence of lambda values — setup_lambda","title":"Compute sequence of lambda values — setup_lambda","text":"function allows compute sequence lambda values plmm models.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/setup_lambda.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Compute sequence of lambda values — setup_lambda","text":"","code":"setup_lambda( X, y, alpha, lambda_min, nlambda, penalty_factor, intercept = TRUE )"},{"path":"https://pbreheny.github.io/plmmr/reference/setup_lambda.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Compute sequence of lambda values — setup_lambda","text":"X Rotated standardized design matrix includes intercept column present. May include clinical covariates non-SNP data. can either 'matrix' 'FBM' object. y Continuous outcome vector. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. lambda_min smallest value lambda, fraction lambda.max. Default .001 number observations larger number covariates .05 otherwise. value lambda_min = 0 supported. nlambda desired number lambda values sequence generated. penalty_factor multiplicative factor penalty applied coefficient. supplied, penalty_factor must numeric vector length equal number columns X. purpose penalty_factor apply differential penalization coefficients thought likely others model. particular, penalty_factor can 0, case coefficient always model without shrinkage. intercept Logical: X contain intercept column? 
Defaults TRUE.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/setup_lambda.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Compute sequence of lambda values — setup_lambda","text":"numeric vector lambda values, equally spaced log scale","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_filebacked.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to standardize a filebacked matrix — standardize_filebacked","title":"A helper function to standardize a filebacked matrix — standardize_filebacked","text":"helper function standardize filebacked matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_filebacked.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to standardize a filebacked matrix — standardize_filebacked","text":"","code":"standardize_filebacked( X, new_file, rds_dir, non_gen, complete_outcome, id_var, outfile, quiet, overwrite )"},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_filebacked.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to standardize a filebacked matrix — standardize_filebacked","text":"X list includes: (1) subset_X: big.matrix object subset &/additional predictors appended columns (2) ns: numeric vector indicating indices nonsingular columns subset_X new_file new_file (character string) bed/fam data files (e.g., new_file = 'mydata') rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir outfile Optional: name (character string) new_file logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. quiet Logical: messages printed console? Defaults FALSE (leaves print messages ...) 
overwrite Logical: existing .bk/.rds files exist specified directory/new_file, overwritten?","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_filebacked.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to standardize a filebacked matrix — standardize_filebacked","text":"list new component obj called 'std_X' - FBM column-standardized data. List also includes several indices/meta-data standardized matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to standardize matrices — standardize_in_memory","title":"A helper function to standardize matrices — standardize_in_memory","text":"helper function standardize matrices","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to standardize matrices — standardize_in_memory","text":"","code":"standardize_in_memory(X)"},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to standardize matrices — standardize_in_memory","text":"X matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to standardize matrices — standardize_in_memory","text":"list standardized matrix, vectors centering/scaling values, vector indices nonsingular columns","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"A helper function to standardize matrices — standardize_in_memory","text":"function adapted 
https://github.com/pbreheny/ncvreg/blob/master/R/std.R NOTE: function returns matrix memory. standardizing filebacked data, use big_std() – see src/big_standardize.cpp","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/subset_filebacked.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to subset big.matrix objects — subset_filebacked","title":"A helper function to subset big.matrix objects — subset_filebacked","text":"helper function subset big.matrix objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/subset_filebacked.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to subset big.matrix objects — subset_filebacked","text":"","code":"subset_filebacked(X, new_file, complete_samples, ns, rds_dir, outfile, quiet)"},{"path":"https://pbreheny.github.io/plmmr/reference/subset_filebacked.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to subset big.matrix objects — subset_filebacked","text":"X filebacked big.matrix --standardized design matrix new_file Optional user-specified new_file --created .rds/.bk files. complete_samples Numeric vector indices marking rows original data non-missing entry 6th column .fam file ns Numeric vector indices non-singular columns vector created handle_missingness() rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir outfile Optional: name (character string) new_file logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. quiet Logical: messages printed console? Defaults FALSE (leaves print messages ...)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/subset_filebacked.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to subset big.matrix objects — subset_filebacked","text":"list two components.
First, big.matrix object, 'subset_X', representing design matrix wherein: rows subset according user's specification handle_missing_phen columns subset constant features remain – important standardization downstream list also includes integer vector 'ns' marks columns original matrix 'non-singular' (.e. constant features). 'ns' index plays important role plmm_format() untransform() (helper functions model fitting)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"A summary function for cv_plmm objects — summary.cv_plmm","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"summary function cv_plmm objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"","code":"# S3 method for class 'cv_plmm' summary(object, lambda = \"min\", ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"object cv_plmm object lambda regularization parameter value inference reported. Can choose numeric value, 'min', '1se'. Defaults 'min.' ... used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"return value object S3 class summary.cv_plmm. 
class print method contains following list elements: lambda_min: lambda value minimum cross validation error lambda.1se: maximum lambda value within 1 standard error minimum cross validation error penalty: penalty applied fitted model nvars: number non-zero coefficients selected lambda value cve: cross validation error folds min: minimum cross validation error fit: plmm fit used cross validation returnBiasDetails = TRUE, two items returned: bias: mean bias cross validation loss: loss value lambda","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cv_fit <- cv_plmm(design = admix_design) summary(cv_fit) #> lasso-penalized model with n=197 and p=101 #> At minimum cross-validation error (lambda=0.2168): #> ------------------------------------------------- #> Nonzero coefficients: 10 #> Cross-validation error (deviance): 2.12 #> Scale estimate (sigma): 1.455"},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"A summary method for the plmm objects — summary.plmm","title":"A summary method for the plmm objects — summary.plmm","text":"summary method plmm objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A summary method for the plmm objects — summary.plmm","text":"","code":"# S3 method for class 'plmm' summary(object, lambda, idx, eps = 1e-05, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A summary method for the plmm objects — summary.plmm","text":"object object class plmm lambda regularization parameter 
value inference reported. idx Alternatively, lambda may specified index; idx=10 means: report inference 10th value lambda along regularization path. lambda idx specified, lambda takes precedence. eps lambda given, eps tolerance difference given lambda value lambda value object. Defaults 1e-5 ... used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A summary method for the plmm objects — summary.plmm","text":"return value object S3 class summary.plmm. class print method contains following list elements: penalty: penalty used plmm (e.g. SCAD, MCP, lasso) n: Number instances/observations std_X_n: number observations standardized data; time differ 'n' data PLINK external data include samples p: Number regression coefficients (including intercept) converged: Logical indicator whether model converged lambda: lambda value inference reported lambda_char: formatted character string indicating lambda value nvars: number nonzero coefficients (, including intercept) value lambda nonzero: column names indicating nonzero coefficients model specified value lambda","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A summary method for the plmm objects — summary.plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design) summary(fit, idx = 97) #> lasso-penalized regression model with n=197, p=101 at lambda=0.00054 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 98 #> -------------------------------------------------"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform.html","id":null,"dir":"Reference","previous_headings":"","what":"Untransform coefficient values back
to the original scale — untransform","title":"Untransform coefficient values back to the original scale — untransform","text":"function unwinds initial standardization data obtain coefficient values original scale. called plmm_format().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Untransform coefficient values back to the original scale — untransform","text":"","code":"untransform( std_scale_beta, p, std_X_details, fbm_flag, plink_flag, use_names = TRUE )"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Untransform coefficient values back to the original scale — untransform","text":"std_scale_beta estimated coefficients standardized scale p number columns original design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' fbm_flag Logical: corresponding design matrix filebacked? plink_flag Logical: data come PLINK files? Note: flag matters non-genomic features handled PLINK files – data PLINK files, unpenalized columns counted p argument. delimited files, p include unpenalized columns. difference implications untransform() function determines appropriate dimensions estimated coefficient matrix returns. use_names Logical: names added? Defaults TRUE. 
Set FALSE inside cvf() helper, 'ns' vary within CV folds.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Untransform coefficient values back to the original scale — untransform","text":"matrix estimated coefficients, 'beta_vals', scale original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_delim.html","id":null,"dir":"Reference","previous_headings":"","what":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","text":"function unwinds initial standardization data obtain coefficient values original scale. called plmm_format().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_delim.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","text":"","code":"untransform_delim( std_scale_beta, p, std_X_details, plink_flag, use_names = TRUE )"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_delim.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","text":"std_scale_beta estimated coefficients standardized scale p number columns original design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' use_names Logical: names added? Defaults TRUE.
Set FALSE inside cvf() helper, 'ns' vary within CV folds.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_delim.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","text":"matrix estimated coefficients, 'beta_vals', scale original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_in_memory.html","id":null,"dir":"Reference","previous_headings":"","what":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","title":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","text":"function unwinds initial standardization data obtain coefficient values original scale. called plmm_format().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_in_memory.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","text":"","code":"untransform_in_memory(std_scale_beta, p, std_X_details, use_names = TRUE)"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_in_memory.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","text":"std_scale_beta estimated coefficients standardized scale p number columns original design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' use_names Logical: names added? Defaults TRUE.
Set FALSE inside cvf() helper, 'ns' vary within CV folds.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_in_memory.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","text":"matrix estimated coefficients, 'beta_vals', scale original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_plink.html","id":null,"dir":"Reference","previous_headings":"","what":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","text":"function unwinds initial standardization data obtain coefficient values original scale. called plmm_format().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_plink.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","text":"","code":"untransform_plink( std_scale_beta, p, std_X_details, plink_flag, use_names = TRUE )"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_plink.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","text":"std_scale_beta estimated coefficients standardized scale p number columns original design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' use_names Logical: names added? Defaults TRUE.
Set FALSE inside cvf() helper, 'ns' vary within CV folds.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_plink.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","text":"matrix estimated coefficients, 'beta_vals', scale original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":null,"dir":"Reference","previous_headings":"","what":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"Linux/Unix MacOS , companion function unzip .gz files ship plmmr package","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"","code":"unzip_example_data(outdir)"},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"outdir file path directory .gz files written","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"Nothing returned; PLINK files ship plmmr package stored directory
specified 'outdir'","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"example function, look vignette('plink_files', package = \"plmmr\"). Note : function work Windows systems - Linux/Unix MacOS.","code":""},{"path":[]},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"bug-fixes-4-2-0","dir":"Changelog","previous_headings":"","what":"Bug fixes","title":"plmmr 4.2.0 (2024-12-13)","text":"recently caught couple bugs model fitting functions – apologize errors may caused downstream analysis, explain addressed issues : Bug BLUP: caught mathematical error earlier implementation best linear unbiased prediction. issue inconsistency scaling among terms used constructing predictor. issue impacted prediction within cross-validation well predict() method plmm class. recommend users used best linear unbiased prediction (BLUP) previous analysis re-run analysis using corrected version. Bug processing delimited files: noticed bug way models fit data delimited files. previous version correctly implementing transformation model results standardized scale original scale due inadvertent addition two rows beta_vals object (one row added, intercept). error corrected. recommend users used previous version plmmr analyze data delimited files re-run analyses.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"other-changes-4-2-0","dir":"Changelog","previous_headings":"","what":"Other changes","title":"plmmr 4.2.0 (2024-12-13)","text":"Change default settings prediction: default prediction method predict() cv_plmm() now ‘blup’ (best linear unbiased prediction). 
Change objects returned default plmm(): default, main model fitting function plmm() now returns std_X (copy standardized design matrix) , y (outcome vector used fit model), std_scale_beta (estimated coefficients standardized scale). components used construct best linear unbiased predictor. user can opt return items using return_fit = FALSE compact_save options. Change arguments passed predict(): tandem change returned plmm() default, predict() method longer needs separate X y argument supplied type = 'blup'. components needed BLUP returned default plmm. Note predict() still early stages development filebacked data; given complexities particularities filebacked data processed (particularly data constant features), edge cases predict() method handle yet. continue work developing method; now, example predict() filebacked data vignette delimited data. Note particular example delimited data, constant features design matrix.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-410-2024-10-23","dir":"Changelog","previous_headings":"","what":"plmmr 4.1.0 (2024-10-23)","title":"plmmr 4.1.0 (2024-10-23)","text":"CRAN release: 2024-10-23 Restore plmm(X,y) syntax: version 4.0.0 required create_design() always called prior plmm() cv_plmm(); update restores X,y syntax consistent packages (e.g., glmnet, ncvreg). Note syntax available case design matrix stored -memory matrix data.frame object. create_design() function still required cases design matrix/dataset stored external file. Bug fix: 4.0.0 version create_design() required X column names, errored uninformative message names supplied (see issue 61). now fixed – column names required unless user wants specify argument unpen. Argument name change: create_design(), argument specify outcome -memory case renamed y; makes syntax consistent, e.g., create_design(X, y). Note change relevant -memory data . 
Internal: Fixed LTO type mismatch bug.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-400-2024-10-07","dir":"Changelog","previous_headings":"","what":"plmmr 4.0.0 (2024-10-07)","title":"plmmr 4.0.0 (2024-10-07)","text":"CRAN release: 2024-10-11 Major re-structuring preprocessing pipeline: Data external files must now processed process_plink() process_delim(). data (including -memory data) must prepared analysis via create_design(). change ensures data funneled uniform format analysis. Documentation updated: vignettes package now revised include examples complete pipeline new create_design() syntax. article type data input (matrix/data.frame, delimited file, PLINK). CRAN: package CRAN now.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-320-2024-09-02","dir":"Changelog","previous_headings":"","what":"plmmr 3.2.0 (2024-09-02)","title":"plmmr 3.2.0 (2024-09-02)","text":"bigsnpr now Suggests, Imports: essential filebacking support now done bigmemory bigalgebra. bigsnpr package used processing PLINK files. dev branch gwas_scale version pipeline runs completely file-backed.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-310-2024-07-13","dir":"Changelog","previous_headings":"","what":"plmmr 3.1.0 (2024-07-13)","title":"plmmr 3.1.0 (2024-07-13)","text":"Enhancement: make plmmr better functionality writing scripts, functions process_plink(), plmm(), cv_plmm() now (optionally) write '.log' files, PLINK. Enhancement: cases users working large datasets, may practical desirable results returned plmm() cv_plmm() saved single '.rds' file. now option model fitting functions called 'compact_save', gives users option save output multiple, smaller '.rds' files. 
Argument removed: Argument std_needed longer available plmm() cv_plmm() functions.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-300-2024-06-27","dir":"Changelog","previous_headings":"","what":"plmmr 3.0.0 (2024-06-27)","title":"plmmr 3.0.0 (2024-06-27)","text":"Bug fix: Cross-validation implementation issues fixed. Previously, full set eigenvalues used inside CV folds, ideal involves information outside fold. Now, entire modeling process cross-validated: standardization, eigendecomposition relatedness matrix, model fitting, backtransformation onto original scale prediction. Computational speedup: standardization rotation filebacked data now much faster; bigalgebra bigmemory now used computations. Internal: standardized scale, intercept PLMM mean outcome. derivation considerably simplifies handling intercept internally model fitting.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-221-2024-03-16","dir":"Changelog","previous_headings":"","what":"plmmr 2.2.1 (2024-03-16)","title":"plmmr 2.2.1 (2024-03-16)","text":"Name change: Changed package name plmmr; note plmm(), cv_plmm(), functions starting plmm_ changed names.","code":""}]
As with process_delim(), the create_design() function returns a filepath: . The output @@ -178,7 +178,7 @@
Notice the messages that are printed out – this documentation may be optionally saved to another .log file using the logfile argument.
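As a concrete sketch of that option (hedged: `colon_design` is the design filepath created earlier in this article, and the log-file name here is purely illustrative):

```r
library(plmmr)
# save the fitting messages to "colon_fit.log" in addition to the console output
colon_fit <- plmm(design  = colon_design,
                  trace   = TRUE,
                  logfile = "colon_fit")  # illustrative name; ".log" is appended
```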
temp_dir <- tempdir() # using a temp dir -- change to fit your preference unzip_example_data(outdir = temp_dir) -#> Unzipped files are saved in /tmp/Rtmph6hzBv
For GWAS data, we have to tell plmmr how to combine information across all three PLINK files (the .bed, .bim, and .fam files). We do this with @@ -139,7 +139,7 @@
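That combining step is handled by `process_plink()`. A sketch follows; `data_dir` and `data_prefix` match the package's own example call, while the remaining argument names and values are assumptions patterned on the analogous `process_delim()` example:

```r
# read the .bed/.bim/.fam trio and save one file-backed .rds/.bk pair
plink_data <- process_plink(data_dir    = temp_dir,
                            data_prefix = "penncath_lite",          # shared prefix of the three PLINK files
                            rds_dir     = temp_dir,                 # assumed, as in process_delim()
                            rds_prefix  = "imputed_penncath_lite",  # assumed output name
                            overwrite   = TRUE)
```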
You’ll see a lot of messages printed to the console here … the result of all this is the creation of 3 files: imputed_penncath_lite.rds and @@ -162,7 +162,7 @@
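To inspect what was created, the returned filepath can be read back in (a sketch; `plink_data` stands for the path returned by `process_plink()`):

```r
pen <- readRDS(plink_data)
str(pen)  # a list with a pointer to the file-backed genotype matrix plus metadata
```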
We examine our model results below:
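That examination typically uses the `summary()` and `plot()` methods (a sketch mirroring the delimited-data example elsewhere in these docs; `fit` is assumed to hold the `plmm()` result):

```r
summary(fit, idx = 50)  # model summary at the 50th lambda value
plot(fit)               # estimated coefficient paths across the lambda grid
```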
@@ -342,10 +342,10 @@ Cross validation#> Starting decomposition. #> Calculating the eigendecomposition of K #> Beginning rotation ('preconditioning'). -#> Rotation (preconditiong) finished at 2024-12-13 20:59:05 +#> Rotation (preconditiong) finished at 2024-12-17 19:29:50 #> Setting up lambda/preparing for model fitting. #> Beginning model fitting. -#> Model fitting finished at 2024-12-13 20:59:08 +#> Model fitting finished at 2024-12-17 19:29:53 #> 'Fold' argument is either NULL or missing; assigning folds randomly (by default). #> #> To specify folds for each observation, supply a vector with fold assignments. @@ -356,42 +356,42 @@ Cross validation#> Calculating the eigendecomposition of K #> Fitting model in fold 1 : #> Beginning rotation ('preconditioning'). -#> Rotation (preconditiong) finished at 2024-12-13 20:59:09 +#> Rotation (preconditiong) finished at 2024-12-17 19:29:54 #> Beginning model fitting. -#> Model fitting finished at 2024-12-13 20:59:11 +#> Model fitting finished at 2024-12-17 19:29:56 #> | |============== | 20% #> Beginning eigendecomposition in fold 2 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 2 : #> Beginning rotation ('preconditioning'). -#> Rotation (preconditiong) finished at 2024-12-13 20:59:13 +#> Rotation (preconditiong) finished at 2024-12-17 19:29:58 #> Beginning model fitting. -#> Model fitting finished at 2024-12-13 20:59:15 +#> Model fitting finished at 2024-12-17 19:30:00 #> | |============================ | 40%Beginning eigendecomposition in fold 3 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 3 : #> Beginning rotation ('preconditioning'). -#> Rotation (preconditiong) finished at 2024-12-13 20:59:16 +#> Rotation (preconditiong) finished at 2024-12-17 19:30:01 #> Beginning model fitting. 
-#> Model fitting finished at 2024-12-13 20:59:18 +#> Model fitting finished at 2024-12-17 19:30:04 #> | |========================================== | 60%Beginning eigendecomposition in fold 4 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 4 : #> Beginning rotation ('preconditioning'). -#> Rotation (preconditiong) finished at 2024-12-13 20:59:20 +#> Rotation (preconditiong) finished at 2024-12-17 19:30:05 #> Beginning model fitting. -#> Model fitting finished at 2024-12-13 20:59:22 +#> Model fitting finished at 2024-12-17 19:30:07 #> | |======================================================== | 80%Beginning eigendecomposition in fold 5 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 5 : #> Beginning rotation ('preconditioning'). -#> Rotation (preconditiong) finished at 2024-12-13 20:59:23 +#> Rotation (preconditiong) finished at 2024-12-17 19:30:08 #> Beginning model fitting. -#> Model fitting finished at 2024-12-13 20:59:25 +#> Model fitting finished at 2024-12-17 19:30:11 #> | |======================================================================| 100%
There are plot and summary methods for CV models as well:
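A minimal sketch of those methods (assuming `cv_fit` holds the `cv_plmm()` result from above):

```r
summary(cv_fit)  # reports the lambda value minimizing cross-validation error
plot(cv_fit)     # cross-validation error curve across the lambda path
```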
diff --git a/authors.html b/authors.html index b923f7dc..8e349b2d 100644 --- a/authors.html +++ b/authors.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/index.html b/index.html index 7f38a1f0..3804dc34 100644 --- a/index.html +++ b/index.html @@ -26,7 +26,7 @@ plmmr - 4.2.0 + 4.1.0.1 @@ -142,7 +142,7 @@ Developers Dev status - + diff --git a/news/index.html b/news/index.html index c242b485..c1d216c9 100644 --- a/news/index.html +++ b/news/index.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/pkgdown.yml b/pkgdown.yml index 6ba67070..10c68f32 100644 --- a/pkgdown.yml +++ b/pkgdown.yml @@ -7,7 +7,7 @@ articles: articles/matrix_data: matrix_data.html articles/notation: notation.html articles/plink_files: plink_files.html -last_built: 2024-12-13T20:58Z +last_built: 2024-12-17T19:28Z urls: reference: https://pbreheny.github.io/plmmr/reference article: https://pbreheny.github.io/plmmr/articles diff --git a/reference/MCP.html b/reference/MCP.html index f97ecbe0..5d7253e8 100644 --- a/reference/MCP.html +++ b/reference/MCP.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/SCAD.html b/reference/SCAD.html index 63cfe2aa..97150aab 100644 --- a/reference/SCAD.html +++ b/reference/SCAD.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/add_predictors.html b/reference/add_predictors.html index a67f7750..936c6a8f 100644 --- a/reference/add_predictors.html +++ b/reference/add_predictors.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/adjust_beta_dimension.html b/reference/adjust_beta_dimension.html index 1a4afb70..ad20a312 100644 --- a/reference/adjust_beta_dimension.html +++ b/reference/adjust_beta_dimension.html @@ -15,7 +15,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/admix.html b/reference/admix.html index 5ed67f61..5d921268 100644 --- a/reference/admix.html +++ b/reference/admix.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/align_ids.html b/reference/align_ids.html index 
f72b1b27..6db9b28a 100644 --- a/reference/align_ids.html +++ b/reference/align_ids.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/big_cbind.html b/reference/big_cbind.html index 8fa08a4e..ecb00d6e 100644 --- a/reference/big_cbind.html +++ b/reference/big_cbind.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/check_for_file_extension.html b/reference/check_for_file_extension.html index 49461d0b..419902f9 100644 --- a/reference/check_for_file_extension.html +++ b/reference/check_for_file_extension.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/coef.cv_plmm.html b/reference/coef.cv_plmm.html index 22efb4d2..7963aaac 100644 --- a/reference/coef.cv_plmm.html +++ b/reference/coef.cv_plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/coef.plmm.html b/reference/coef.plmm.html index 115ca66f..c779752a 100644 --- a/reference/coef.plmm.html +++ b/reference/coef.plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/construct_variance.html b/reference/construct_variance.html index d881b812..6af5a0d4 100644 --- a/reference/construct_variance.html +++ b/reference/construct_variance.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/count_constant_features.html b/reference/count_constant_features.html index a8b314b6..3235eac6 100644 --- a/reference/count_constant_features.html +++ b/reference/count_constant_features.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/count_cores.html b/reference/count_cores.html index 55c1fb11..80238e99 100644 --- a/reference/count_cores.html +++ b/reference/count_cores.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/create_design.html b/reference/create_design.html index 301ca7ba..3677a267 100644 --- a/reference/create_design.html +++ b/reference/create_design.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 @@ -132,7 +132,7 @@ Examples#> Please make sure you have addressed missingness before you 
proceed. #> #> process_plink() completed -#> Processed files now saved as /tmp/Rtmpm57oVf/processed_colon2.rds +#> Processed files now saved as /tmp/RtmpuyGgXe/processed_colon2.rds # prepare outcome data colon_outcome <- read.delim(find_example_data(path = "colon2_outcome.txt")) @@ -145,7 +145,7 @@ Examples#> There are 0 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... -#> Standardization completed at 2024-12-13 20:58:19 +#> Standardization completed at 2024-12-17 19:29:03 #> Done with standardization. File formatting in progress # look at the results @@ -168,7 +168,7 @@ Examples#> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr "FileBacked" #> .. .. ..$ filename : chr "std_colon2.bk" -#> .. .. ..$ dirname : chr "/tmp/Rtmpm57oVf/" +#> .. .. ..$ dirname : chr "/tmp/RtmpuyGgXe/" #> .. .. ..$ totalRows : int 62 #> .. .. ..$ totalCols : int 2001 #> .. .. ..$ rowOffset : num [1:2] 0 62 @@ -191,7 +191,7 @@ Examples# process PLINK data temp_dir <- tempdir() unzip_example_data(outdir = temp_dir) -#> Unzipped files are saved in /tmp/Rtmpm57oVf +#> Unzipped files are saved in /tmp/RtmpuyGgXe plink_data <- process_plink(data_dir = temp_dir, data_prefix = "penncath_lite", @@ -215,7 +215,7 @@ Examples#> Imputing the missing (genotype) values using mode method #> #> process_plink() completed -#> Processed files now saved as /tmp/Rtmpm57oVf/imputed_penncath_lite.rds +#> Processed files now saved as /tmp/RtmpuyGgXe/imputed_penncath_lite.rds # get outcome data penncath_pheno <- read.csv(find_example_data(path = 'penncath_clinical.csv')) @@ -250,7 +250,7 @@ Examples#> There are 62 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... -#> Standardization completed at 2024-12-13 20:58:21 +#> Standardization completed at 2024-12-17 19:29:06 #> Done with standardization. 
File formatting in progress # examine the design - notice the components of this object diff --git a/reference/create_design_filebacked.html b/reference/create_design_filebacked.html index 29f33c0f..a37f7c22 100644 --- a/reference/create_design_filebacked.html +++ b/reference/create_design_filebacked.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/create_design_in_memory.html b/reference/create_design_in_memory.html index 85c43c21..c1c141fe 100644 --- a/reference/create_design_in_memory.html +++ b/reference/create_design_in_memory.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/create_log.html b/reference/create_log.html index d6caf862..1e7f4576 100644 --- a/reference/create_log.html +++ b/reference/create_log.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/cv-plmm.log b/reference/cv-plmm.log index 9a0c657c..af69c49a 100644 --- a/reference/cv-plmm.log +++ b/reference/cv-plmm.log @@ -1,21 +1,21 @@ ### plmmr log file ### Logging to ./cv-plmm.log -Host: fv-az797-383 +Host: fv-az1074-827 Current working directory: /home/runner/work/plmmr/plmmr/docs/reference -Start log at: 2024-12-13 20:58:36 +Start log at: 2024-12-17 19:29:20 Call: cv_plmm(design = admix_design) -Input data passed all checks at 2024-12-13 20:58:36 +Input data passed all checks at 2024-12-17 19:29:20 -Eigendecomposition finished at 2024-12-13 20:58:36 +Eigendecomposition finished at 2024-12-17 19:29:20 -Full model fit finished at 2024-12-13 20:58:36 +Full model fit finished at 2024-12-17 19:29:20 -Formatting for full model finished at 2024-12-13 20:58:36 +Formatting for full model finished at 2024-12-17 19:29:20 -Cross validation started at: 2024-12-13 20:58:36 -Started fold 1 at 2024-12-13 20:58:36 -Started fold 2 at 2024-12-13 20:58:36 -Started fold 3 at 2024-12-13 20:58:37 -Started fold 4 at 2024-12-13 20:58:37 -Started fold 5 at 2024-12-13 20:58:37 +Cross validation started at: 2024-12-17 19:29:21 +Started fold 1 at 2024-12-17 19:29:21 
+Started fold 2 at 2024-12-17 19:29:21 +Started fold 3 at 2024-12-17 19:29:21 +Started fold 4 at 2024-12-17 19:29:21 +Started fold 5 at 2024-12-17 19:29:21 diff --git a/reference/cv_plmm.html b/reference/cv_plmm.html index 9ff63694..82f73eab 100644 --- a/reference/cv_plmm.html +++ b/reference/cv_plmm.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/cvf.html b/reference/cvf.html index 339aff41..d36f525a 100644 --- a/reference/cvf.html +++ b/reference/cvf.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/eigen_K.html b/reference/eigen_K.html index bd1b73f2..1d8cea63 100644 --- a/reference/eigen_K.html +++ b/reference/eigen_K.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/estimate_eta.html b/reference/estimate_eta.html index 7b7b3140..e01f5022 100644 --- a/reference/estimate_eta.html +++ b/reference/estimate_eta.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/fbm2bm.html b/reference/fbm2bm.html index 126285e4..8ab3cd5f 100644 --- a/reference/fbm2bm.html +++ b/reference/fbm2bm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/file_sans_ext.html b/reference/file_sans_ext.html index a708d22a..5b94581e 100644 --- a/reference/file_sans_ext.html +++ b/reference/file_sans_ext.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/find_example_data.html b/reference/find_example_data.html index c9771bcb..970fcf14 100644 --- a/reference/find_example_data.html +++ b/reference/find_example_data.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/get_data.html b/reference/get_data.html index bc7173a1..9b0f99ec 100644 --- a/reference/get_data.html +++ b/reference/get_data.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/get_hostname.html b/reference/get_hostname.html index f5889bca..061b0837 100644 --- a/reference/get_hostname.html +++ b/reference/get_hostname.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git 
a/reference/impute_snp_data.html b/reference/impute_snp_data.html index 25baac5a..816e4a82 100644 --- a/reference/impute_snp_data.html +++ b/reference/impute_snp_data.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/index.html b/reference/index.html index 60181768..8361ef20 100644 --- a/reference/index.html +++ b/reference/index.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/index_samples.html b/reference/index_samples.html index 7dd267e2..d1a78b0a 100644 --- a/reference/index_samples.html +++ b/reference/index_samples.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/index_std_X.html b/reference/index_std_X.html index 6b1a0dd5..5e13056f 100644 --- a/reference/index_std_X.html +++ b/reference/index_std_X.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/lam_names.html b/reference/lam_names.html index 8cb3a10c..6a3a25f5 100644 --- a/reference/lam_names.html +++ b/reference/lam_names.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/lasso.html b/reference/lasso.html index 866b21cd..ec2f4f1a 100644 --- a/reference/lasso.html +++ b/reference/lasso.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/log_lik.html b/reference/log_lik.html index 5c683426..5b1f59b5 100644 --- a/reference/log_lik.html +++ b/reference/log_lik.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/name_and_count_bigsnp.html b/reference/name_and_count_bigsnp.html index 05e0e53c..481e37ad 100644 --- a/reference/name_and_count_bigsnp.html +++ b/reference/name_and_count_bigsnp.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plmm.html b/reference/plmm.html index 9ead38bb..8308c76e 100644 --- a/reference/plmm.html +++ b/reference/plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plmm.log b/reference/plmm.log index 450ddcbf..2844046d 100644 --- a/reference/plmm.log +++ b/reference/plmm.log @@ -1,11 +1,11 @@ ### plmmr log file ### Logging 
to ./plmm.log -Host: fv-az797-383 +Host: fv-az1074-827 Current working directory: /home/runner/work/plmmr/plmmr/docs/reference -Start log at: 2024-12-13 20:58:38 +Start log at: 2024-12-17 19:29:22 Call: plmm(design = admix_design) -Input data passed all checks at 2024-12-13 20:58:38 +Input data passed all checks at 2024-12-17 19:29:22 -Eigendecomposition finished at 2024-12-13 20:58:38 +Eigendecomposition finished at 2024-12-17 19:29:22 -Model ready at 2024-12-13 20:58:38 +Model ready at 2024-12-17 19:29:22 diff --git a/reference/plmm_checks.html b/reference/plmm_checks.html index 722bfd7b..3ddf7c82 100644 --- a/reference/plmm_checks.html +++ b/reference/plmm_checks.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plmm_fit.html b/reference/plmm_fit.html index 875dc69f..7c9c0b6d 100644 --- a/reference/plmm_fit.html +++ b/reference/plmm_fit.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plmm_format.html b/reference/plmm_format.html index 00812a22..1498c488 100644 --- a/reference/plmm_format.html +++ b/reference/plmm_format.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plmm_loss.html b/reference/plmm_loss.html index efeaf083..f7c5b1ad 100644 --- a/reference/plmm_loss.html +++ b/reference/plmm_loss.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plmm_prep.html b/reference/plmm_prep.html index 94979bab..67edf2f1 100644 --- a/reference/plmm_prep.html +++ b/reference/plmm_prep.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plmmr-package.html b/reference/plmmr-package.html index 142e9a2a..89d9769b 100644 --- a/reference/plmmr-package.html +++ b/reference/plmmr-package.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plot.cv_plmm.html b/reference/plot.cv_plmm.html index 39b3fa0e..300cda4e 100644 --- a/reference/plot.cv_plmm.html +++ b/reference/plot.cv_plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/plot.plmm.html 
b/reference/plot.plmm.html index 084ac612..cebd5419 100644 --- a/reference/plot.plmm.html +++ b/reference/plot.plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/predict.plmm.html b/reference/predict.plmm.html index c9d1a0b6..4fc374b8 100644 --- a/reference/predict.plmm.html +++ b/reference/predict.plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/predict_within_cv.html b/reference/predict_within_cv.html index cd297447..3356b8dc 100644 --- a/reference/predict_within_cv.html +++ b/reference/predict_within_cv.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/pretty_time.html b/reference/pretty_time.html index 575cad2a..9948ad16 100644 --- a/reference/pretty_time.html +++ b/reference/pretty_time.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/print.summary.cv_plmm.html b/reference/print.summary.cv_plmm.html index ae0229b1..a229d685 100644 --- a/reference/print.summary.cv_plmm.html +++ b/reference/print.summary.cv_plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/print.summary.plmm.html b/reference/print.summary.plmm.html index 3f777aeb..6249b7ea 100644 --- a/reference/print.summary.plmm.html +++ b/reference/print.summary.plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/process_delim.html b/reference/process_delim.html index 591d232e..a9bdb06c 100644 --- a/reference/process_delim.html +++ b/reference/process_delim.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 @@ -125,7 +125,7 @@ Examples#> Please make sure you have addressed missingness before you proceed. #> #> process_plink() completed -#> Processed files now saved as /tmp/Rtmpm57oVf/processed_colon2.rds +#> Processed files now saved as /tmp/RtmpuyGgXe/processed_colon2.rds colon2 <- readRDS(colon_dat) str(colon2) @@ -134,7 +134,7 @@ Examples#> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr "FileBacked" #> .. .. ..$ filename : chr "processed_colon2.bk" -#> .. .. 
..$ dirname : chr "/tmp/Rtmpm57oVf/" +#> .. .. ..$ dirname : chr "/tmp/RtmpuyGgXe/" #> .. .. ..$ totalRows : int 62 #> .. .. ..$ totalCols : int 2001 #> .. .. ..$ rowOffset : num [1:2] 0 62 diff --git a/reference/process_plink.html b/reference/process_plink.html index 12c8f844..194ad4bb 100644 --- a/reference/process_plink.html +++ b/reference/process_plink.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/read_data_files.html b/reference/read_data_files.html index 6cb0b73f..ee5459a5 100644 --- a/reference/read_data_files.html +++ b/reference/read_data_files.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/read_plink_files.html b/reference/read_plink_files.html index 6d9c0e2a..b7c7c783 100644 --- a/reference/read_plink_files.html +++ b/reference/read_plink_files.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/relatedness_mat.html b/reference/relatedness_mat.html index 9993520b..70ff9d07 100644 --- a/reference/relatedness_mat.html +++ b/reference/relatedness_mat.html @@ -13,7 +13,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/rotate_filebacked.html b/reference/rotate_filebacked.html index e4b449cd..e32718ed 100644 --- a/reference/rotate_filebacked.html +++ b/reference/rotate_filebacked.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/setup_lambda.html b/reference/setup_lambda.html index faf74199..aba7b89a 100644 --- a/reference/setup_lambda.html +++ b/reference/setup_lambda.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/standardize_filebacked.html b/reference/standardize_filebacked.html index c8c9fd87..c41de5b6 100644 --- a/reference/standardize_filebacked.html +++ b/reference/standardize_filebacked.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/standardize_in_memory.html b/reference/standardize_in_memory.html index 608fcb53..1caec0d8 100644 --- a/reference/standardize_in_memory.html +++ b/reference/standardize_in_memory.html @@ -7,7 +7,7 @@ 
plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/subset_filebacked.html b/reference/subset_filebacked.html index de66a96f..6b9a0bd3 100644 --- a/reference/subset_filebacked.html +++ b/reference/subset_filebacked.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/summary.cv_plmm.html b/reference/summary.cv_plmm.html index e1126ca9..00859686 100644 --- a/reference/summary.cv_plmm.html +++ b/reference/summary.cv_plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/summary.plmm.html b/reference/summary.plmm.html index 8a9f02ed..3e8425b2 100644 --- a/reference/summary.plmm.html +++ b/reference/summary.plmm.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/untransform.html b/reference/untransform.html index c0d04515..9686b990 100644 --- a/reference/untransform.html +++ b/reference/untransform.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/untransform_delim.html b/reference/untransform_delim.html index 8b0d0f78..5364e87b 100644 --- a/reference/untransform_delim.html +++ b/reference/untransform_delim.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/untransform_in_memory.html b/reference/untransform_in_memory.html index 69739408..667e85e0 100644 --- a/reference/untransform_in_memory.html +++ b/reference/untransform_in_memory.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/untransform_plink.html b/reference/untransform_plink.html index a4aedd2c..227249a7 100644 --- a/reference/untransform_plink.html +++ b/reference/untransform_plink.html @@ -9,7 +9,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/reference/unzip_example_data.html b/reference/unzip_example_data.html index fc874259..97d1ddd2 100644 --- a/reference/unzip_example_data.html +++ b/reference/unzip_example_data.html @@ -7,7 +7,7 @@ plmmr - 4.2.0 + 4.1.0.1 diff --git a/search.json b/search.json index df7d0a58..1ee43d7c 100644 --- a/search.json +++ b/search.json @@ -1 +1 @@ 
-[{"path":"https://pbreheny.github.io/plmmr/articles/delim_files.html","id":"process-the-data","dir":"Articles","previous_headings":"","what":"Process the data","title":"If your data is in a delimited file","text":"output messages indicate data processed. call created 2 files, one .rds file corresponding .bk file. .bk file special type binary file can used store large data sets. .rds file contains pointer .bk file, along meta-data. Note returned process_delim() character string filepath: .","code":"# I will create the processed data files in a temporary directory; # fill in the `rds_dir` argument with the directory of your choice temp_dir <- tempdir() colon_dat <- process_delim(data_file = \"colon2.txt\", data_dir = find_example_data(parent = TRUE), rds_dir = temp_dir, rds_prefix = \"processed_colon2\", sep = \"\\t\", overwrite = TRUE, header = TRUE) #> There are 62 observations and 2001 features in the specified data files. #> At this time, plmmr::process_delim() does not not handle missing values in delimited data. #> Please make sure you have addressed missingness before you proceed. #> #> process_plink() completed #> Processed files now saved as /tmp/RtmpQdavVL/processed_colon2.rds # look at what is created colon <- readRDS(colon_dat)"},{"path":"https://pbreheny.github.io/plmmr/articles/delim_files.html","id":"create-a-design","dir":"Articles","previous_headings":"","what":"Create a design","title":"If your data is in a delimited file","text":"Creating design ensures data uniform format prior analysis. delimited files, two main processes happening create_design(): (1) standardization columns (2) construction penalty factor vector. Standardization columns ensures features evaluated model uniform scale; done transforming column design matrix mean 0 variance 1. penalty factor vector indicator vector 0 represents feature always model – feature unpenalized. specify columns want unpenalized, use ‘unpen’ argument. example, choosing make ‘sex’ unpenalized covariate. 
side note unpenalized covariates: delimited file data, features want include model – penalized unpenalized features – must included delimited file. differs PLINK file data analyzed; look create_design() documentation details examples. process_delim(), create_design() function returns filepath: . output messages document steps create design procedure, messages saved text file colon_design.log rds_dir folder. didactic purposes, can look design:","code":"# prepare outcome data colon_outcome <- read.delim(find_example_data(path = \"colon2_outcome.txt\")) # create a design colon_design <- create_design(data_file = colon_dat, rds_dir = temp_dir, new_file = \"std_colon2\", add_outcome = colon_outcome, outcome_id = \"ID\", outcome_col = \"y\", unpen = \"sex\", # this will keep 'sex' in the final model logfile = \"colon_design\") #> No feature_id supplied; will assume data X are in same row-order as add_outcome. #> There are 0 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... #> Standardization completed at 2024-12-13 20:58:41 #> Done with standardization. File formatting in progress # look at the results colon_rds <- readRDS(colon_design) str(colon_rds) #> List of 18 #> $ X_colnames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> $ X_rownames : chr [1:62] \"row1\" \"row2\" \"row3\" \"row4\" ... #> $ n : num 62 #> $ p : num 2001 #> $ is_plink : logi FALSE #> $ outcome_idx : int [1:62] 1 2 3 4 5 6 7 8 9 10 ... #> $ y : int [1:62] 1 0 1 0 1 0 1 0 1 0 ... #> $ std_X_rownames: chr [1:62] \"row1\" \"row2\" \"row3\" \"row4\" ... #> $ unpen : int 1 #> $ unpen_colnames: chr \"sex\" #> $ ns : int [1:2001] 1 2 3 4 5 6 7 8 9 10 ... #> $ std_X_colnames: chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> $ std_X :Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. 
..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"std_colon2.bk\" #> .. .. ..$ dirname : chr \"/tmp/RtmpQdavVL/\" #> .. .. ..$ totalRows : int 62 #> .. .. ..$ totalCols : int 2001 #> .. .. ..$ rowOffset : num [1:2] 0 62 #> .. .. ..$ colOffset : num [1:2] 0 2001 #> .. .. ..$ nrow : num 62 #> .. .. ..$ ncol : num 2001 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> .. .. ..$ type : chr \"double\" #> .. .. ..$ separated : logi FALSE #> $ std_X_n : num 62 #> $ std_X_p : num 2001 #> $ std_X_center : num [1:2001] 1.47 7015.79 4966.96 4094.73 3987.79 ... #> $ std_X_scale : num [1:2001] 0.499 3067.926 2171.166 1803.359 2002.738 ... #> $ penalty_factor: num [1:2001] 0 1 1 1 1 1 1 1 1 1 ... #> - attr(*, \"class\")= chr \"plmm_design\""},{"path":"https://pbreheny.github.io/plmmr/articles/delim_files.html","id":"fit-a-model","dir":"Articles","previous_headings":"","what":"Fit a model","title":"If your data is in a delimited file","text":"fit model using design follows: Notice messages printed – documentation may optionally saved another .log file using logfile argument. can examine results specific \\lambda value: may also plot paths estimated coefficients:","code":"colon_fit <- plmm(design = colon_design, return_fit = TRUE, trace = TRUE) #> Note: The design matrix is being returned as a file-backed big.matrix object -- see bigmemory::big.matrix() documentation for details. #> Reminder: the X that is returned here is column-standardized #> Input data passed all checks at 2024-12-13 20:58:42 #> Starting decomposition. #> Calculating the eigendecomposition of K #> Eigendecomposition finished at 2024-12-13 20:58:42 #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-13 20:58:42 #> Setting up lambda/preparing for model fitting. #> Beginning model fitting. #> Model fitting finished at 2024-12-13 20:58:42 #> Beta values are estimated -- almost done! 
#> Formatting results (backtransforming coefs. to original scale). #> Model ready at 2024-12-13 20:58:42 summary(colon_fit, idx = 50) #> lasso-penalized regression model with n=62, p=2002 at lambda=0.0597 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 30 #> ------------------------------------------------- plot(colon_fit)"},{"path":"https://pbreheny.github.io/plmmr/articles/delim_files.html","id":"prediction-for-filebacked-data","dir":"Articles","previous_headings":"","what":"Prediction for filebacked data","title":"If your data is in a delimited file","text":"example shows experimental option, wherein working add prediction method filebacked outside cross-validation.","code":"# linear predictor yhat_lp <- predict(object = colon_fit, newX = attach.big.matrix(colon$X), type = \"lp\") # best linear unbiased predictor yhat_blup <- predict(object = colon_fit, newX = attach.big.matrix(colon$X), type = \"blup\") # look at mean squared prediction error mspe_lp <- apply(yhat_lp, 2, function(c){crossprod(colon_outcome$y - c)/length(c)}) mspe_blup <- apply(yhat_blup, 2, function(c){crossprod(colon_outcome$y - c)/length(c)}) min(mspe_lp) #> [1] 0.007659158 min(mspe_blup) #> [1] 0.00617254"},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Getting started with plmmr","text":"plmmr package fitting Penalized Linear Mixed Models R. package created purpose fitting penalized regression models high dimensional data observations correlated. instance, kind data arises often context genetics (e.g., GWAS population structure /family grouping). novelties plmmr : Integration: plmmr combines functionality several packages order quality control, model fitting/analysis, data visualization one package. 
For example, with GWAS data, plmmr will take you from PLINK files all the way to a list of SNPs for downstream analysis. Accessibility: plmmr can be run in an R session on a typical desktop or laptop computer. The user does not need access to a supercomputer or experience with the command line in order to fit models with plmmr. Handling correlation: plmmr uses a transformation that (1) measures the correlation among samples and (2) uses that correlation measurement to improve predictions (via the best linear unbiased predictor, or BLUP). This means that with plmm(), there’s no need to filter data down to a ‘maximum subset of unrelated samples.’","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"minimal-example","dir":"Articles","previous_headings":"","what":"Minimal example","title":"Getting started with plmmr","text":"Here is a minimal reproducible example of how plmmr can be used:","code":"# library(plmmr) fit <- plmm(admix$X, admix$y) # admix data ships with package plot(fit) cvfit <- cv_plmm(admix$X, admix$y) plot(cvfit) summary(cvfit) #> lasso-penalized model with n=197 and p=101 #> At minimum cross-validation error (lambda=0.2325): #> ------------------------------------------------- #> Nonzero coefficients: 8 #> Cross-validation error (deviance): 2.12 #> Scale estimate (sigma): 1.455"},{"path":[]},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"file-backing","dir":"Articles","previous_headings":"Computational capability","what":"File-backing","title":"Getting started with plmmr","text":"In many applications of high dimensional data analysis, the dataset is too large to read into R – the session will crash for lack of memory. This is particularly common when analyzing data from genome-wide association studies (GWAS). To analyze such large datasets, plmmr is equipped to analyze data using filebacking - a strategy that lets R ‘point’ to a file on disk, rather than reading the file into the R session. Many packages use this technique - bigstatsr and biglasso are two examples of packages that use the filebacking technique. The package plmmr uses to create and store filebacked objects is bigmemory. The filebacked computation relies on the biglasso package by Yaohui Zeng et al. and bigalgebra by Michael Kane et al. 
For processing PLINK files, we use methods from the bigsnpr package by Florian Privé.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"numeric-outcomes-only","dir":"Articles","previous_headings":"Computational capability","what":"Numeric outcomes only","title":"Getting started with plmmr","text":"At this time, the package is designed for linear regression – that is, we are considering only continuous (numeric) outcomes. We maintain that treating binary outcomes as numeric values is appropriate in some contexts, as described by Hastie et al. in The Elements of Statistical Learning, chapter 4. In the future, we would like to extend the package to handle dichotomous outcomes via logistic regression; the theoretical work underlying this is an open problem.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"types-of-penalization","dir":"Articles","previous_headings":"Computational capability","what":"3 types of penalization","title":"Getting started with plmmr","text":"Since it is focused on being a penalized regression package, plmmr offers 3 choices of penalty: the minimax concave (MCP), the smoothly clipped absolute deviation (SCAD), and the least absolute shrinkage and selection operator (LASSO). The implementation of these penalties is built on concepts/techniques provided in the ncvreg package.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"data-size-and-dimensionality","dir":"Articles","previous_headings":"Computational capability","what":"Data size and dimensionality","title":"Getting started with plmmr","text":"We distinguish between two data attributes: ‘big’ and ‘high dimensional.’ ‘Big’ describes the amount of space the data takes up on a computer, while ‘high dimensional’ describes a context in which the ratio of features (also called ‘variables’ or ‘predictors’) to observations (e.g., samples) is high. For instance, data with 100 samples and 100 variables is high dimensional, but not big. By contrast, data with 10 million observations and 100 variables is big, but not high dimensional. plmmr is optimized for data that is high dimensional – the methods we are using to estimate relatedness among observations perform best when there is a high number of features relative to the number of observations. 
plmmr is also designed to accommodate data that is too large to analyze in-memory. We accommodate such data with file-backing (described above). The current analysis pipeline works well for data files up to about 40 Gb in size. In practice, this means that plmmr is equipped to analyze GWAS data, but not biobank-sized data.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"data-input-types","dir":"Articles","previous_headings":"","what":"Data input types","title":"Getting started with plmmr","text":"plmmr currently works with three types of data input: Data stored in an in-memory matrix or data frame Data stored in PLINK files Data stored in delimited files","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"example-data-sets","dir":"Articles","previous_headings":"Data input types","what":"Example data sets","title":"Getting started with plmmr","text":"plmmr currently includes three example data sets, one for each type of data input. The admix data is an example of matrix input data. admix is a small data set (197 observations, 100 SNPs) that describes individuals of different ancestry groups. The outcome of admix is simulated to include population structure effects (i.e. race/ethnicity have an impact on SNP associations). This data set is available whenever library(plmmr) is called. An example analysis of the admix data is available in vignette('matrix_data', package = \"plmmr\"). The penncath_lite data is an example of PLINK input data. penncath_lite (data on coronary artery disease from the PennCath study) is a high dimensional data set (1401 observations, 4217 SNPs) with several health outcomes as well as age and sex information. The features in this data set represent a small subset of a much larger GWAS data set (the original data has 800K SNPs). For more information on this data set, refer to the original publication. An example analysis of the penncath_lite data is available in vignette('plink_files', package = \"plmmr\"). The colon2 data is an example of delimited-file input data. colon2 is a variation of the colon data included in the biglasso package. colon2 has 62 observations and 2,001 features representing a study of colon disease. 2000 of the features are from the original data, and the ‘sex’ feature is simulated. 
An example analysis of the colon2 data is available in vignette('delim_files', package = \"plmmr\").","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/matrix_data.html","id":"basic-model-fitting","dir":"Articles","previous_headings":"","what":"Basic model fitting","title":"If your data is in a matrix or data frame","text":"The admix dataset is now ready to analyze with a call to plmmr::plmm() (one of the main functions in plmmr): Notice: we are passing admix$X to the design argument of plmm(); internally, plmm() has taken this X input and created a plmm_design object. You could also supply X and y to create_design() to make this step explicit. The returned beta_vals item is a matrix whose rows are the \\hat\\beta coefficients and whose columns represent values of the penalization parameter \\lambda. By default, plmm fits 100 values of \\lambda (see the setup_lambda function for details). Note that for all values of \\lambda, SNP 8 has \\hat \\beta = 0. This is because SNP 8 is a constant feature, a feature (i.e., a column of \\mathbf{X}) whose values do not vary among the members of this population. We can summarize our fit at the nth \\lambda value: We can also plot the path of the fit to see how the model coefficients vary with \\lambda: Plot of the path of the model fit Suppose we also know the ancestry groups with which each person in the admix data self-identified. We would probably want to include this in the model as an unpenalized covariate (i.e., we want ‘ancestry’ to always be in the model). To specify an unpenalized covariate, we need to use the create_design() function prior to calling plmm(). Let’s take a 
look: may compare results model includes ‘ancestry’ first model:","code":"admix_fit <- plmm(admix$X, admix$y) summary(admix_fit, lambda = admix_fit$lambda[50]) #> lasso-penalized regression model with n=197, p=101 at lambda=0.01426 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 88 #> ------------------------------------------------- admix_fit$beta_vals[1:10, 97:100] |> knitr::kable(digits = 3, format = \"html\") # for n = 25 summary(admix_fit, lambda = admix_fit$lambda[25]) #> lasso-penalized regression model with n=197, p=101 at lambda=0.08163 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 46 #> ------------------------------------------------- plot(admix_fit) # add ancestry to design matrix X_plus_ancestry <- cbind(admix$ancestry, admix$X) # adjust column names -- need these for designating 'unpen' argument colnames(X_plus_ancestry) <- c(\"ancestry\", colnames(admix$X)) # create a design admix_design2 <- create_design(X = X_plus_ancestry, y = admix$y, # below, I mark ancestry variable as unpenalized # we want ancestry to always be in the model unpen = \"ancestry\") # now fit a model admix_fit2 <- plmm(design = admix_design2) summary(admix_fit2, idx = 25) #> lasso-penalized regression model with n=197, p=102 at lambda=0.09886 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 14 #> ------------------------------------------------- plot(admix_fit2)"},{"path":"https://pbreheny.github.io/plmmr/articles/matrix_data.html","id":"cross-validation","dir":"Articles","previous_headings":"","what":"Cross validation","title":"If your data is in a matrix or data frame","text":"select \\lambda value, often use cross validation. 
example using cv_plmm select \\lambda minimizes cross-validation error: can also plot cross-validation error (CVE) versus \\lambda (log scale): Plot CVE","code":"admix_cv <- cv_plmm(design = admix_design2, return_fit = T) admix_cv_s <- summary(admix_cv, lambda = \"min\") print(admix_cv_s) #> lasso-penalized model with n=197 and p=102 #> At minimum cross-validation error (lambda=0.1853): #> ------------------------------------------------- #> Nonzero coefficients: 3 #> Cross-validation error (deviance): 1.33 #> Scale estimate (sigma): 1.154 plot(admix_cv)"},{"path":"https://pbreheny.github.io/plmmr/articles/matrix_data.html","id":"predicted-values","dir":"Articles","previous_headings":"","what":"Predicted values","title":"If your data is in a matrix or data frame","text":"example predict() methods PLMMs: can compare predictions predictions get intercept-model using mean squared prediction error (MSPE) – lower better: see model better predictions null.","code":"# make predictions for select lambda value(s) y_hat <- predict(object = admix_fit, newX = admix$X, type = \"blup\", X = admix$X, y = admix$y) # intercept-only (or 'null') model crossprod(admix$y - mean(admix$y))/length(admix$y) #> [,1] #> [1,] 5.928528 # our model at its best value of lambda apply(y_hat, 2, function(c){crossprod(admix$y - c)/length(c)}) -> mse min(mse) #> [1] 0.6930826 # ^ across all values of lambda, our model has MSPE lower than the null model"},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"math-notation","dir":"Articles","previous_headings":"","what":"Math notation","title":"Notes on notation","text":"concepts need denote, order usage derivations. 
These are blocked into sections corresponding to the steps of the model fitting process.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"statistical-model-the-overall-framework","dir":"Articles","previous_headings":"Math notation","what":"Statistical model (the overall framework)","title":"Notes on notation","text":"The overall model can be written as \\mathbf{y} = \\mathbf{X}\\boldsymbol{\\beta} + \\mathbf{Z}\\boldsymbol{\\gamma} + \\boldsymbol{\\epsilon} or equivalently \\mathbf{y} = \\dot{\\mathbf{X}}\\dot{\\boldsymbol{\\beta}} + \\mathbf{u} + \\boldsymbol{\\epsilon} where: \\mathbf{X} and \\mathbf{y} are the n \\times p design matrix of data and the n \\times 1 vector of outcomes, respectively. Here, n is the number of observations (e.g., number of patients, number of samples, etc.) and p is the number of features (e.g., number of SNPs, number of variables, number of covariates, etc.). \\dot{\\mathbf{X}} is the column-standardized \\mathbf{X}, in which all p columns have mean 0 and standard deviation 1. Note: \\dot{\\mathbf{X}} excludes singular features (columns that are constants) of the original \\mathbf{X}. \\dot{\\boldsymbol{\\beta}} represents the coefficients on the standardized scale. \\mathbf{Z} is an n \\times b matrix of indicators corresponding to a grouping structure, and \\boldsymbol{\\gamma} is the vector of values describing how each grouping is associated with \\mathbf{y}. In real data, these values are typically unknown. \\boldsymbol{\\epsilon} is the n \\times 1 vector of noise. 
We define the realized (empirical) relatedness matrix as \\mathbf{K} \\equiv \\frac{1}{p}\\dot{\\mathbf{X}}\\dot{\\mathbf{X}}^\\top The model assumes: \\boldsymbol{\\epsilon} \\perp \\mathbf{u} \\boldsymbol{\\epsilon} \\sim N(0, \\sigma^2_{\\epsilon}\\mathbf{I}) \\mathbf{u} \\sim N(0, \\sigma^2_{s}\\mathbf{K}) Under these assumptions, we may write \\mathbf{y} \\sim N(\\dot{\\mathbf{X}}\\dot{\\boldsymbol{\\beta}}, \\boldsymbol{\\Sigma}) Indices: i \\in 1,..., n indexes observations j \\in 1,..., p indexes features h \\in 1,..., b indexes batches (e.g., different family groups, different data collection sites, etc.)","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"decomposition-and-rotation-prep-and-first-part-of-fit","dir":"Articles","previous_headings":"Math notation","what":"Decomposition and rotation (prep and first part of fit)","title":"Notes on notation","text":"Beginning with the eigendecomposition, where \\mathbf{U} and \\mathbf{s} are the eigenvectors and eigenvalues of \\mathbf{K}, one can obtain \\text{eigen}(\\mathbf{K}) \\equiv \\mathbf{U}\\mathbf{S}\\mathbf{U}^\\top. The elements of \\mathbf{s} are the diagonal values of \\mathbf{S}. Note, the random effect \\mathbf{u} is distinct from the columns of the matrix \\mathbf{U}. k represents the number of nonzero eigenvalues represented in \\mathbf{U} and \\mathbf{s}, where k \\leq \\text{min}(n,p). Again, \\mathbf{K} \\equiv \\frac{1}{p}\\dot{\\mathbf{X}}\\dot{\\mathbf{X}}^{\\top} is often referred to in the literature as the realized relatedness matrix (RRM) or genomic relatedness matrix (GRM). \\mathbf{K} has dimension n \\times n. \\eta is the ratio \\frac{\\sigma^2_s}{\\sigma^2_e + \\sigma^2_s}. We estimate \\hat{\\eta} with a null model (details to come). \\mathbf{\\Sigma} is the variance of the outcome, \\mathbb{V}({\\mathbf{y}}) \\propto \\eta \\mathbf{K} + (1 - \\eta)\\mathbf{I}_n. \\mathbf{w} is the vector of weights defined as (\\eta\\mathbf{s} + (1-\\eta))^{-1/2}. The values of \\mathbf{w} are the nonzero values of the diagonal matrix \\mathbf{W} \\equiv (\\eta\\mathbf{S} + (1 - \\eta)\\mathbf{I})^{-1/2}. 
The matrix used for rotating (preconditioning) the data is \\mathbf{\\Sigma}^{-1/2} \\equiv \\mathbf{W}\\mathbf{U}^\\top. \\tilde{\\dot{\\mathbf{X}}} \\equiv \\mathbf{W}\\mathbf{U}^\\top\\dot{\\mathbf{X}} is the rotated data, i.e. the data transformed onto a new scale. \\tilde{\\mathbf{y}} \\equiv \\mathbf{\\Sigma}^{-1/2}\\mathbf{y} is the outcome on the rotated scale. \\tilde{\\ddot{\\mathbf{X}}} is the standardized rotated data. Note: this standardization involves scaling, but not centering. The post-rotation standardization impacts the estimated coefficients as well; we define {\\ddot{\\boldsymbol{\\beta}}} as the estimated coefficients on this scale.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"model-fitting-with-penalization","dir":"Articles","previous_headings":"Math notation","what":"Model fitting with penalization","title":"Notes on notation","text":"We fit \\tilde{\\mathbf{y}} \\sim \\tilde{\\ddot{\\mathbf{X}}} using a penalized linear mixed model, and obtain \\hat{\\ddot{\\boldsymbol{\\beta}}} as the estimated coefficients. The penalty parameter values (e.g., the values of the lasso tuning parameter) are indexed by \\lambda_l with l \\in 1,..., t.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"rescaling-results-format","dir":"Articles","previous_headings":"Math notation","what":"Rescaling results (format)","title":"Notes on notation","text":"To obtain the estimated coefficients on the original scale, the values estimated by the model must be unscaled (‘untransformed’) twice: once to adjust for the post-rotation standardization, and once to adjust for the pre-rotation standardization. 
This process is written as \\hat{\\ddot{\\boldsymbol{\\beta}}} \\rightarrow \\hat{\\dot{\\boldsymbol{\\beta}}} \\rightarrow \\hat{\\boldsymbol{\\beta}}.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"object-names-in-source-code","dir":"Articles","previous_headings":"","what":"Object names in source code","title":"Notes on notation","text":"In the code, we denote objects this way: \\mathbf{X} and \\mathbf{y} are X and y \\dot{\\mathbf{X}} is std_X \\tilde{\\dot{\\mathbf{X}}} is rot_X \\ddot{\\tilde{\\mathbf{X}}} is stdrot_X \\hat{\\boldsymbol{\\beta}} is named og_scale_beta in helper functions (for clarity) and is returned in plmm objects as beta_vals. beta_vals and og_scale_beta are equivalent; both represent the estimated coefficients on the original scale. \\hat{\\dot{\\boldsymbol{\\beta}}} is std_scale_beta \\hat{\\ddot{\\boldsymbol{\\beta}}} is stdrot_scale_beta \\dot{\\mathbf{X}}\\hat{\\dot{\\boldsymbol{\\beta}}} is Xb \\ddot{\\tilde{\\mathbf{X}}} \\hat{\\ddot{\\boldsymbol{\\beta}}} is linear_predictors. Note: in other words, this means that linear_predictors in the code is on the scale of the rotated and re-standardized data! \\hat{\\boldsymbol{\\Sigma}} \\equiv \\hat{\\eta}\\mathbf{K} + (1 - \\hat{\\eta})\\mathbf{I} is estimated_Sigma. Similarly, \\hat{\\boldsymbol{\\Sigma}}_{11} is Sigma_11, etc.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"processing-plink-files","dir":"Articles","previous_headings":"","what":"Processing PLINK files","title":"If your data is in PLINK files","text":"First, unzip the PLINK files if they are zipped. For the example data, the penncath_lite data ships with plmmr in a zipped format; on MacOS or Linux, you can run this command to unzip: For GWAS data, we tell plmmr to combine the information across all three PLINK files (the .bed, .bim, and .fam files). This is done with process_plink(). Here, we create the files we want in a temporary directory just for the sake of example. Users can specify the folder of their choice with rds_dir, as shown below: You’ll see a lot of messages printed to the console … the result is the creation of 3 files: imputed_penncath_lite.rds and imputed_penncath_lite.bk contain the data. 1 show folder PLINK data . The result is returned as a filepath. 
.rds object filepath contains processed data, now use create design. didactic purposes, let’s examine ’s imputed_penncath_lite.rds using readRDS() function (Note Don’t analysis - section reads data memory. just illustration):","code":"temp_dir <- tempdir() # using a temp dir -- change to fit your preference unzip_example_data(outdir = temp_dir) #> Unzipped files are saved in /tmp/Rtmph6hzBv # temp_dir <- tempdir() # using a temporary directory (if you didn't already create one above) plink_data <- process_plink(data_dir = temp_dir, data_prefix = \"penncath_lite\", rds_dir = temp_dir, rds_prefix = \"imputed_penncath_lite\", # imputing the mode to address missing values impute_method = \"mode\", # overwrite existing files in temp_dir # (you can turn this feature off if you need to) overwrite = TRUE, # turning off parallelization - # leaving this on causes problems knitting this vignette parallel = FALSE) #> #> Preprocessing penncath_lite data: #> Creating penncath_lite.rds #> #> There are 1401 observations and 4367 genomic features in the specified data files, representing chromosomes 1 - 22 #> There are a total of 3514 SNPs with missing values #> Of these, 13 are missing in at least 50% of the samples #> #> Imputing the missing (genotype) values using mode method #> #> process_plink() completed #> Processed files now saved as /tmp/Rtmph6hzBv/imputed_penncath_lite.rds pen <- readRDS(plink_data) # notice: this is a `processed_plink` object str(pen) # note: genotype data is *not* in memory #> List of 5 #> $ X :Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"imputed_penncath_lite.bk\" #> .. .. ..$ dirname : chr \"/tmp/Rtmph6hzBv/\" #> .. .. ..$ totalRows : int 1401 #> .. .. ..$ totalCols : int 4367 #> .. .. ..$ rowOffset : num [1:2] 0 1401 #> .. .. ..$ colOffset : num [1:2] 0 4367 #> .. .. ..$ nrow : num 1401 #> .. .. 
..$ ncol : num 4367 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : NULL #> .. .. ..$ type : chr \"double\" #> .. .. ..$ separated : logi FALSE #> $ map:'data.frame': 4367 obs. of 6 variables: #> ..$ chromosome : int [1:4367] 1 1 1 1 1 1 1 1 1 1 ... #> ..$ marker.ID : chr [1:4367] \"rs3107153\" \"rs2455124\" \"rs10915476\" \"rs4592237\" ... #> ..$ genetic.dist: int [1:4367] 0 0 0 0 0 0 0 0 0 0 ... #> ..$ physical.pos: int [1:4367] 2056735 3188505 4275291 4280630 4286036 4302161 4364564 4388885 4606471 4643688 ... #> ..$ allele1 : chr [1:4367] \"C\" \"T\" \"T\" \"G\" ... #> ..$ allele2 : chr [1:4367] \"T\" \"C\" \"C\" \"A\" ... #> $ fam:'data.frame': 1401 obs. of 6 variables: #> ..$ family.ID : int [1:1401] 10002 10004 10005 10007 10008 10009 10010 10011 10012 10013 ... #> ..$ sample.ID : int [1:1401] 1 1 1 1 1 1 1 1 1 1 ... #> ..$ paternal.ID: int [1:1401] 0 0 0 0 0 0 0 0 0 0 ... #> ..$ maternal.ID: int [1:1401] 0 0 0 0 0 0 0 0 0 0 ... #> ..$ sex : int [1:1401] 1 2 1 1 1 1 1 2 1 2 ... #> ..$ affection : int [1:1401] 1 1 2 1 2 2 2 1 2 -9 ... #> $ n : int 1401 #> $ p : int 4367 #> - attr(*, \"class\")= chr \"processed_plink\" # notice: no more missing values in X any(is.na(pen$genotypes[,])) #> [1] FALSE"},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"creating-a-design","dir":"Articles","previous_headings":"","what":"Creating a design","title":"If your data is in PLINK files","text":"Now ready create plmm_design, object pieces need model: design matrix \\mathbf{X}, outcome vector \\mathbf{y}, vector penalty factor indicators (1 = feature penalized, 0 = feature penalized). side note: GWAS studies, typical include non-genomic factors unpenalized covariates part model. instance, may want adjust sex age – factors want ensure always included selected model. plmmr package allows include additional unpenalized predictors via ‘add_predictor’ ‘predictor_id’ options, passed create_design() internal function create_design_filebacked(). 
example options included create_design() documentation. key part create_design() standardizing columns genotype matrix. didactic example showing columns std_X element design mean = 0 variance = 1. Note something analysis – reads data memory.","code":"# get outcome data penncath_pheno <- read.csv(find_example_data(path = 'penncath_clinical.csv')) phen <- data.frame(FamID = as.character(penncath_pheno$FamID), CAD = penncath_pheno$CAD) pen_design <- create_design(data_file = plink_data, feature_id = \"FID\", rds_dir = temp_dir, new_file = \"std_penncath_lite\", add_outcome = phen, outcome_id = \"FamID\", outcome_col = \"CAD\", logfile = \"design\", # again, overwrite if needed; use with caution overwrite = TRUE) #> There are 62 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... #> Standardization completed at 2024-12-13 20:58:57 #> Done with standardization. File formatting in progress # examine the design - notice the components of this object pen_design_rds <- readRDS(pen_design) str(pen_design_rds) #> List of 16 #> $ X_colnames : chr [1:4367] \"rs3107153\" \"rs2455124\" \"rs10915476\" \"rs4592237\" ... #> $ X_rownames : chr [1:1401] \"10002\" \"10004\" \"10005\" \"10007\" ... #> $ n : int 1401 #> $ p : int 4367 #> $ is_plink : logi TRUE #> $ outcome_idx : int [1:1401] 1 2 3 4 5 6 7 8 9 10 ... #> $ y : Named int [1:1401] 1 1 1 1 1 1 1 1 1 0 ... #> ..- attr(*, \"names\")= chr [1:1401] \"CAD1\" \"CAD2\" \"CAD3\" \"CAD4\" ... #> $ std_X_rownames: chr [1:1401] \"10002\" \"10004\" \"10005\" \"10007\" ... #> $ ns : int [1:4305] 1 2 3 4 5 6 7 8 9 10 ... #> $ std_X_colnames: chr [1:4305] \"rs3107153\" \"rs2455124\" \"rs10915476\" \"rs4592237\" ... #> $ std_X :Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"std_penncath_lite.bk\" #> .. .. 
..$ dirname : chr \"/tmp/Rtmph6hzBv/\" #> .. .. ..$ totalRows : int 1401 #> .. .. ..$ totalCols : int 4305 #> .. .. ..$ rowOffset : num [1:2] 0 1401 #> .. .. ..$ colOffset : num [1:2] 0 4305 #> .. .. ..$ nrow : num 1401 #> .. .. ..$ ncol : num 4305 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : NULL #> .. .. ..$ type : chr \"double\" #> .. .. ..$ separated : logi FALSE #> $ std_X_n : num 1401 #> $ std_X_p : num 4305 #> $ std_X_center : num [1:4305] 0.00785 0.35974 1.01213 0.06067 0.46253 ... #> $ std_X_scale : num [1:4305] 0.0883 0.7783 0.8636 0.28 1.2791 ... #> $ penalty_factor: num [1:4305] 1 1 1 1 1 1 1 1 1 1 ... #> - attr(*, \"class\")= chr \"plmm_design\" # we can check to see that our data have been standardized std_X <- attach.big.matrix(pen_design_rds$std_X) colMeans(std_X[,]) |> summary() # columns have mean zero... #> Min. 1st Qu. Median Mean 3rd Qu. Max. #> -1.356e-16 -2.334e-17 3.814e-19 9.868e-19 2.520e-17 2.635e-16 apply(std_X[,], 2, var) |> summary() # ... & variance 1 #> Min. 1st Qu. Median Mean 3rd Qu. Max. #> 1.001 1.001 1.001 1.001 1.001 1.001"},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"fitting-a-model","dir":"Articles","previous_headings":"","what":"Fitting a model","title":"If your data is in PLINK files","text":"Now design object, ready fit model. default, model fitting results saved files folder specified rds_dir argument plmmm. want return model fitting results, set return_fit = TRUE plmm(). examine model results :","code":"pen_fit <- plmm(design = pen_design, trace = T, return_fit = T) #> Note: The design matrix is being returned as a file-backed big.matrix object -- see bigmemory::big.matrix() documentation for details. #> Reminder: the X that is returned here is column-standardized #> Input data passed all checks at 2024-12-13 20:58:58 #> Starting decomposition. #> Calculating the eigendecomposition of K #> Eigendecomposition finished at 2024-12-13 20:59:00 #> Beginning rotation ('preconditioning'). 
#> Rotation (preconditiong) finished at 2024-12-13 20:59:00 #> Setting up lambda/preparing for model fitting. #> Beginning model fitting. #> Model fitting finished at 2024-12-13 20:59:03 #> Beta values are estimated -- almost done! #> Formatting results (backtransforming coefs. to original scale). #> Model ready at 2024-12-13 20:59:03 # you can turn off the trace messages by letting trace = F (default) summary(pen_fit, idx = 50) #> lasso-penalized regression model with n=1401, p=4368 at lambda=0.01211 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 537 #> ------------------------------------------------- plot(pen_fit)"},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"cross-validation","dir":"Articles","previous_headings":"","what":"Cross validation","title":"If your data is in PLINK files","text":"choose tuning parameter model, plmmr offers cross validation method: plot summary methods CV models well:","code":"cv_fit <- cv_plmm(design = pen_design, type = \"blup\", return_fit = T, trace = T) #> Note: The design matrix is being returned as a file-backed big.matrix object -- see bigmemory::big.matrix() documentation for details. #> Reminder: the X that is returned here is column-standardized #> Starting decomposition. #> Calculating the eigendecomposition of K #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-13 20:59:05 #> Setting up lambda/preparing for model fitting. #> Beginning model fitting. #> Model fitting finished at 2024-12-13 20:59:08 #> 'Fold' argument is either NULL or missing; assigning folds randomly (by default). #> #> To specify folds for each observation, supply a vector with fold assignments. #> #> Starting cross validation #> | | | 0%Beginning eigendecomposition in fold 1 : #> Starting decomposition. 
#> Calculating the eigendecomposition of K #> Fitting model in fold 1 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-13 20:59:09 #> Beginning model fitting. #> Model fitting finished at 2024-12-13 20:59:11 #> | |============== | 20% #> Beginning eigendecomposition in fold 2 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 2 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-13 20:59:13 #> Beginning model fitting. #> Model fitting finished at 2024-12-13 20:59:15 #> | |============================ | 40%Beginning eigendecomposition in fold 3 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 3 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-13 20:59:16 #> Beginning model fitting. #> Model fitting finished at 2024-12-13 20:59:18 #> | |========================================== | 60%Beginning eigendecomposition in fold 4 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 4 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-13 20:59:20 #> Beginning model fitting. #> Model fitting finished at 2024-12-13 20:59:22 #> | |======================================================== | 80%Beginning eigendecomposition in fold 5 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 5 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-13 20:59:23 #> Beginning model fitting. 
#> Model fitting finished at 2024-12-13 20:59:25 #> | |======================================================================| 100% summary(cv_fit) # summary at lambda value that minimizes CV error #> lasso-penalized model with n=1401 and p=4368 #> At minimum cross-validation error (lambda=0.0406): #> ------------------------------------------------- #> Nonzero coefficients: 6 #> Cross-validation error (deviance): 0.22 #> Scale estimate (sigma): 0.471 plot(cv_fit)"},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"details-create_design-for-plink-data","dir":"Articles","previous_headings":"","what":"Details: create_design() for PLINK data","title":"If your data is in PLINK files","text":"call create_design() involves steps: Integrate external phenotype information, supplied. Note: samples PLINK data phenotype value specified additional phenotype file removed analysis. Identify missing values samples SNPs/features. Impute missing values per user’s specified method. See R documentation bigsnpr::snp_fastImputeSimple() details. Note: plmmr package fit models datasets missing values. missing values must imputed subset analysis. Integrate external predictor information, supplied. matrix meta-data (e.g., age, principal components, etc.). Note: samples supplied file included PLINK data, removed. example, phenotyped participants genotyped participants study, plmmr::create_design() create matrix data representing genotyped samples also data supplied external phenotype file. Create design matrix represents nonsingular features samples predictor phenotype information available (case external data supplied). Standardize design matrix columns mean 0 variance 1.","code":""},{"path":"https://pbreheny.github.io/plmmr/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Tabitha K. Peter. Author. Anna C. Reisetter. Author. Patrick J. Breheny. Author, maintainer. Yujing Lu. 
Author.","code":""},{"path":"https://pbreheny.github.io/plmmr/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Reisetter A, Breheny P (2021). “Penalized linear mixed models structured genetic data.” Genetic epidemiology, 45(5), 427–444. https://doi.org/10.1002/gepi.22384.","code":"@Article{, author = {Anna C. Reisetter and Patrick Breheny}, title = {Penalized linear mixed models for structured genetic data}, journal = {Genetic epidemiology}, year = {2021}, volume = {45}, pages = {427--444}, number = {5}, url = {https://doi.org/10.1002/gepi.22384}, }"},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"plmmr-","dir":"","previous_headings":"","what":"plmmr","title":"Penalized Linear Mixed Models for Correlated Data","text":"plmmr (penalized linear mixed models R) package contains functions fit penalized linear mixed models correct unobserved confounding effects.","code":""},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"Penalized Linear Mixed Models for Correlated Data","text":"install latest version package GitHub, use : can also install plmmr CRAN: description motivation functions package (along examples) refer second module GWAS data tutorial","code":"devtools::install_github(\"pbreheny/plmmr\") install.packages('plmmr')"},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"minimal-example","dir":"","previous_headings":"","what":"Minimal example","title":"Penalized Linear Mixed Models for Correlated Data","text":"","code":"library(plmmr) X <- rnorm(100*20) |> matrix(100, 20) y <- rnorm(100) fit <- plmm(X, y) plot(fit) cvfit <- cv_plmm(X, y) plot(cvfit) summary(cvfit)"},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"so-how-fast-is-plmmr-and-how-well-does-it-scale","dir":"","previous_headings":"","what":"So how fast is plmmr? 
And how well does it scale?","title":"Penalized Linear Mixed Models for Correlated Data","text":"illustrate important questions, created separate GitHub repository scripts plmmr workflow using publicly-available genome-wide association (GWAS) data. main takeaway: using GWAS data study 1,400 samples 800,000 SNPs, full plmmr analysis run half hour using single core laptop. Three smaller datasets ship plmmr, tutorials walking analyze data sets documented documentation site. datasets useful didactic purposes, large enough really highlight computational scalability plmmr – motivated creation separate repository GWAS workflow.","code":""},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"note-on-branches","dir":"","previous_headings":"","what":"Note on branches","title":"Penalized Linear Mixed Models for Correlated Data","text":"branches repo organized following way: master main (‘head’) branch. gh_pages keeping documentation plmmr gwas_scale archived branch contains development version package used run dissertation analysis. delete eventually.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/MCP.html","id":null,"dir":"Reference","previous_headings":"","what":"helper function to implement MCP penalty The helper functions to implement each penalty. — MCP","title":"helper function to implement MCP penalty The helper functions to implement each penalty. — MCP","text":"helper function implement MCP penalty helper functions implement penalty.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/MCP.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"helper function to implement MCP penalty The helper functions to implement each penalty. 
— MCP","text":"","code":"MCP(z, l1, l2, gamma, v)"},{"path":"https://pbreheny.github.io/plmmr/reference/MCP.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"helper function to implement MCP penalty The helper functions to implement each penalty. — MCP","text":"z vector representing solution active set feature l1 upper bound (beta) l2 lower bound (beta) gamma tuning parameter MCP penalty v 'xtx' term","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/MCP.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"helper function to implement MCP penalty The helper functions to implement each penalty. — MCP","text":"numeric vector MCP-penalized coefficient estimates within given bounds","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/SCAD.html","id":null,"dir":"Reference","previous_headings":"","what":"helper function to implement SCAD penalty — SCAD","title":"helper function to implement SCAD penalty — SCAD","text":"helper function implement SCAD penalty","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/SCAD.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"helper function to implement SCAD penalty — SCAD","text":"","code":"SCAD(z, l1, l2, gamma, v)"},{"path":"https://pbreheny.github.io/plmmr/reference/SCAD.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"helper function to implement SCAD penalty — SCAD","text":"z solution active set feature l1 upper bound l2 lower bound gamma tuning parameter SCAD penalty v 'xtx' term","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/SCAD.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"helper function to implement SCAD penalty — SCAD","text":"numeric vector SCAD-penalized coefficient estimates within given 
bounds","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/add_predictors.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to add predictors to a filebacked matrix of data — add_predictors","title":"A helper function to add predictors to a filebacked matrix of data — add_predictors","text":"helper function add predictors filebacked matrix data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/add_predictors.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to add predictors to a filebacked matrix of data — add_predictors","text":"","code":"add_predictors(obj, add_predictor, id_var, rds_dir, quiet)"},{"path":"https://pbreheny.github.io/plmmr/reference/add_predictors.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to add predictors to a filebacked matrix of data — add_predictors","text":"obj bigSNP object add_predictor Optional: add additional covariates/predictors/features external file (.e., PLINK file). id_var String specifying column PLINK .fam file unique sample identifiers. rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir(process_plink() call) quiet Logical: messages printed console? Defaults FALSE (leaves print messages ...)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/add_predictors.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to add predictors to a filebacked matrix of data — add_predictors","text":"list 2 components: 'obj' - bigSNP object added element representing matrix includes additional predictors first columns 'non_gen' - integer vector ranges 1 number added predictors. 
Example: 2 predictors added, unpen= 1:2","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/adjust_beta_dimension.html","id":null,"dir":"Reference","previous_headings":"","what":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","title":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","text":"function designed use BLUP prediction. objective get matrix estimated beta coefficients standardized scale, dimension original/training data. adding rows 0s std_scale_beta matrix corresponding singular features X.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/adjust_beta_dimension.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","text":"","code":"adjust_beta_dimension(std_scale_beta, p, std_X_details, fbm_flag, plink_flag)"},{"path":"https://pbreheny.github.io/plmmr/reference/adjust_beta_dimension.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","text":"std_scale_beta matrix estimated beta coefficients scale standardized original/training data Note: rows matrix represent nonsingular columns design matrix p number columns original/training design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' fbm_flag Logical: model fit filebacked? plink_flag Logical: data come PLINK files? Note: flag matters non-genomic features handled PLINK files – data PLINK files, unpenalized columns counted p argument. delimited files, p include unpenalized columns. 
difference implications untransform() function determines appropriate dimensions estimated coefficient matrix returns.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/adjust_beta_dimension.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","text":"std_scale_b_og_dim: matrix estimated beta coefs. still scale std_X, dimension X","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/admix.html","id":null,"dir":"Reference","previous_headings":"","what":"Admix: Semi-simulated SNP data — admix","title":"Admix: Semi-simulated SNP data — admix","text":"dataset containing 100 SNPs, demographic variable representing race, simulated outcome","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/admix.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Admix: Semi-simulated SNP data — admix","text":"","code":"admix"},{"path":"https://pbreheny.github.io/plmmr/reference/admix.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Admix: Semi-simulated SNP data — admix","text":"list 3 components X SNP matrix (197 observations 100 SNPs) y vector simulated (continuous) outcomes race vector racial group categorization: # 0 = African, 1 = African American, 2 = European, 3 = Japanese","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/admix.html","id":"source","dir":"Reference","previous_headings":"","what":"Source","title":"Admix: Semi-simulated SNP data — admix","text":"https://hastie.su.domains/CASI/","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/align_ids.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to support process_plink() — align_ids","title":"A helper function to support process_plink() — align_ids","text":"helper function support 
process_plink()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/align_ids.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to support process_plink() — align_ids","text":"","code":"align_ids(id_var, quiet, add_predictor, og_ids)"},{"path":"https://pbreheny.github.io/plmmr/reference/align_ids.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to support process_plink() — align_ids","text":"id_var String specifying variable name ID column quiet Logical: message printed? add_predictor External data include design matrix. add_predictors... arg process_plink() og_ids Character vector PLINK ids (FID IID) original data (.e., data subsetting handling missing phenotypes)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/align_ids.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to support process_plink() — align_ids","text":"matrix dimensions add_predictor","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/big_cbind.html","id":null,"dir":"Reference","previous_headings":"","what":"a version of cbind() for file-backed matrices — big_cbind","title":"a version of cbind() for file-backed matrices — big_cbind","text":"version cbind() file-backed matrices","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/big_cbind.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a version of cbind() for file-backed matrices — big_cbind","text":"","code":"big_cbind(A, B, C, quiet)"},{"path":"https://pbreheny.github.io/plmmr/reference/big_cbind.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"a version of cbind() for file-backed matrices — big_cbind","text":"-memory data B file-backed data C file-backed placeholder combined data quiet 
Logical","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/big_cbind.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a version of cbind() for file-backed matrices — big_cbind","text":"C, filled column values B combined","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/check_for_file_extension.html","id":null,"dir":"Reference","previous_headings":"","what":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","title":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","text":"check_for_file_extension: function make package 'smart' enough handle .rds file extensions","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/check_for_file_extension.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","text":"","code":"check_for_file_extension(path)"},{"path":"https://pbreheny.github.io/plmmr/reference/check_for_file_extension.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","text":"path string specifying file path ends file name, e.g. \"~/dir/my_file.rds\"","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/check_for_file_extension.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","text":"string filepath without extension, e.g. 
\"~/dir/my_file\"","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Coef method for ","title":"Coef method for ","text":"Coef method \"cv_plmm\" class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Coef method for ","text":"","code":"# S3 method for class 'cv_plmm' coef(object, lambda, which = object$min, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Coef method for ","text":"object object class \"cv_plmm.\" lambda numeric vector lambda values. Vector lambda indices coefficients return. Defaults lambda index minimum CVE. ... Additional arguments (used).","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Coef method for ","text":"Returns named numeric vector. Values coefficients model specified value either lambda . 
Names values lambda.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Coef method for ","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cv_fit <- cv_plmm(design = admix_design, return_fit = TRUE) head(coef(cv_fit)) #> (Intercept) Snp1 Snp2 Snp3 Snp4 Snp5 #> 4.326474 0.000000 0.000000 0.000000 0.000000 0.000000"},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Coef method for ","title":"Coef method for ","text":"Coef method \"plmm\" class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Coef method for ","text":"","code":"# S3 method for class 'plmm' coef(object, lambda, which = 1:length(object$lambda), drop = TRUE, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Coef method for ","text":"object object class \"plmm.\" lambda numeric vector lambda values. Vector lambda indices coefficients return. drop Logical. ... Additional arguments.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Coef method for ","text":"Either numeric matrix (model fit data stored memory) sparse matrix (model fit data stored filebacked). 
Rownames feature names, columns values lambda.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Coef method for ","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design) coef(fit)[1:10, 41:45] #> 0.02673 0.02493 0.02325 0.02168 0.02022 #> (Intercept) 6.556445885 6.59257224 6.62815211 6.66317769 6.69366816 #> Snp1 -0.768261488 -0.78098090 -0.79310257 -0.80456803 -0.81482505 #> Snp2 0.131945426 0.13991539 0.14735024 0.15387884 0.15929074 #> Snp3 2.826806831 2.83842545 2.84879468 2.85860151 2.86047026 #> Snp4 0.036981534 0.04652885 0.05543821 0.06376126 0.07133592 #> Snp5 0.546784811 0.57461391 0.60049082 0.62402782 0.64291324 #> Snp6 -0.026215632 -0.03072017 -0.03494534 -0.03889146 -0.04256362 #> Snp7 0.009342269 0.01539705 0.02103262 0.02615358 0.03069956 #> Snp8 0.000000000 0.00000000 0.00000000 0.00000000 0.00000000 #> Snp9 0.160794660 0.16217570 0.16337102 0.16464901 0.16638663"},{"path":"https://pbreheny.github.io/plmmr/reference/construct_variance.html","id":null,"dir":"Reference","previous_headings":"","what":"a function to create the estimated variance matrix from a PLMM fit — construct_variance","title":"a function to create the estimated variance matrix from a PLMM fit — construct_variance","text":"function create estimated variance matrix PLMM fit","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/construct_variance.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a function to create the estimated variance matrix from a PLMM fit — construct_variance","text":"","code":"construct_variance(fit, K = NULL, eta = NULL)"},{"path":"https://pbreheny.github.io/plmmr/reference/construct_variance.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"a function to create the estimated variance matrix from a 
PLMM fit — construct_variance","text":"fit object returned plmm() K optional matrix eta optional numeric value 0 1; fit supplied, option must specified.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/construct_variance.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a function to create the estimated variance matrix from a PLMM fit — construct_variance","text":"Sigma_hat, matrix representing estimated variance","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_constant_features.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to count constant features — count_constant_features","title":"A helper function to count constant features — count_constant_features","text":"helper function count constant features","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_constant_features.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to count constant features — count_constant_features","text":"","code":"count_constant_features(fbm, outfile, quiet)"},{"path":"https://pbreheny.github.io/plmmr/reference/count_constant_features.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to count constant features — count_constant_features","text":"fbm filebacked big.matrix outfile String specifying name log file quiet Logical: message printed console","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_constant_features.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to count constant features — count_constant_features","text":"ns numeric vector indices non-singular columns matrix associated counts","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_cores.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to count the number of cores 
available on the current machine — count_cores","title":"A helper function to count the number of cores available on the current machine — count_cores","text":"helper function count number cores available current machine","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_cores.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to count the number of cores available on the current machine — count_cores","text":"","code":"count_cores()"},{"path":"https://pbreheny.github.io/plmmr/reference/count_cores.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to count the number of cores available on the current machine — count_cores","text":"number cores use; parallel installed, parallel::detectCores(). Otherwise, returns 1.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":null,"dir":"Reference","previous_headings":"","what":"a function to create a design for PLMM modeling — create_design","title":"a function to create a design for PLMM modeling — create_design","text":"function create design PLMM modeling","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a function to create a design for PLMM modeling — create_design","text":"","code":"create_design(data_file = NULL, rds_dir = NULL, X = NULL, y = NULL, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"a function to create a design for PLMM modeling — create_design","text":"data_file filebacked data (data process_plink() process_delim()), filepath processed data. Defaults NULL (argument apply -memory data). rds_dir filebacked data, filepath directory/folder want design saved. 
Note: include/append name want --created file – name argument new_file, passed create_design_filebacked(). Defaults NULL (argument apply -memory data). X -memory data (data matrix data frame), design matrix. Defaults NULL (argument apply filebacked data). y -memory data, numeric vector representing outcome. Defaults NULL (argument apply filebacked data). Note: responsibility user ensure rows X corresponding elements y row order, .e., observations must order design matrix outcome vector. ... Additional arguments pass create_design_filebacked() create_design_in_memory(). See documentation helper functions details.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a function to create a design for PLMM modeling — create_design","text":"filepath object class plmm_design, named list design matrix, outcome, penalty factor vector, details needed fitting model. list stored .rds file filebacked data, filebacked case string path file returned. -memory data, list returned.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"a function to create a design for PLMM modeling — create_design","text":"function wrapper create_design...() inner functions; arguments included passed along create_design...() inner function matches type data supplied. Note arguments optional ones . Additional arguments filebacked data: new_file User-specified filename (without .bk/.rds extension) --created .rds/.bk files. Must different existing .rds/.bk files folder. feature_id Optional: string specifying column data X (feature data) row IDs (e.g., identifiers row/sample/participant/, etc.). duplicates allowed. - PLINK data: string specifying ID column PLINK .fam file. 
Options \"IID\" (default) \"FID\" - filebacked data: character vector unique identifiers (IDs) row feature data (.e., data processed process_delim()) - left NULL (default), X assumed row-order add_outcome. Note: assumption made error, calculations downstream incorrect. Pay close attention . add_outcome data frame matrix two columns: ID column column outcome value (used 'y' final design). IDs must characters, outcome must numeric. outcome_id string specifying name ID column 'add_outcome' outcome_col string specifying name phenotype column 'add_outcome' na_outcome_vals Optional: vector numeric values used code NA values outcome. Defaults c(-9, NA_integer) (-9 matches PLINK conventions). overwrite Optional: logical - existing .rds files overwritten? Defaults FALSE. logfile Optional: name '.log' file written – Note: append .log filename; done automatically. quiet Optional: logical - messages printed console silenced? Defaults FALSE Additional arguments specific PLINK data: add_predictor Optional (PLINK data ): matrix data frame used adding additional unpenalized covariates/predictors/features external file (.e., PLINK file). matrix must one column ID column; columns aside ID used covariates design matrix. Columns must named. predictor_id Optional (PLINK data ): string specifying name column 'add_predictor' sample IDs. Required 'add_predictor' supplied. names used subset align external covariate supplied PLINK data. Additional arguments specific delimited file data: unpen Optional: character vector names columns mark unpenalized (.e., features always included model). Note: choose use option, delimited file must column names. Additional arguments -memory data: unpen Optional: character vector names columns mark unpenalized (.e., features always included model). 
Note: choose use option, X must column names.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"a function to create a design for PLMM modeling — create_design","text":"","code":"## Example 1: matrix data in-memory ## admix_design <- create_design(X = admix$X, y = admix$y, unpen = \"Snp1\") ## Example 2: delimited data ## # process delimited data temp_dir <- tempdir() colon_dat <- process_delim(data_file = \"colon2.txt\", data_dir = find_example_data(parent = TRUE), overwrite = TRUE, rds_dir = temp_dir, rds_prefix = \"processed_colon2\", sep = \"\\t\", header = TRUE) #> There are 62 observations and 2001 features in the specified data files. #> At this time, plmmr::process_delim() does not handle missing values in delimited data. #> Please make sure you have addressed missingness before you proceed. #> #> process_plink() completed #> Processed files now saved as /tmp/Rtmpm57oVf/processed_colon2.rds # prepare outcome data colon_outcome <- read.delim(find_example_data(path = \"colon2_outcome.txt\")) # create a design colon_design <- create_design(data_file = colon_dat, rds_dir = temp_dir, new_file = \"std_colon2\", add_outcome = colon_outcome, outcome_id = \"ID\", outcome_col = \"y\", unpen = \"sex\", overwrite = TRUE, logfile = \"test.log\") #> No feature_id supplied; will assume data X are in same row-order as add_outcome. #> There are 0 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... #> Standardization completed at 2024-12-13 20:58:19 #> Done with standardization. File formatting in progress # look at the results colon_rds <- readRDS(colon_design) str(colon_rds) #> List of 18 #> $ X_colnames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> $ X_rownames : chr [1:62] \"row1\" \"row2\" \"row3\" \"row4\" ... 
#> $ n : num 62 #> $ p : num 2001 #> $ is_plink : logi FALSE #> $ outcome_idx : int [1:62] 1 2 3 4 5 6 7 8 9 10 ... #> $ y : int [1:62] 1 0 1 0 1 0 1 0 1 0 ... #> $ std_X_rownames: chr [1:62] \"row1\" \"row2\" \"row3\" \"row4\" ... #> $ unpen : int 1 #> $ unpen_colnames: chr \"sex\" #> $ ns : int [1:2001] 1 2 3 4 5 6 7 8 9 10 ... #> $ std_X_colnames: chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> $ std_X :Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"std_colon2.bk\" #> .. .. ..$ dirname : chr \"/tmp/Rtmpm57oVf/\" #> .. .. ..$ totalRows : int 62 #> .. .. ..$ totalCols : int 2001 #> .. .. ..$ rowOffset : num [1:2] 0 62 #> .. .. ..$ colOffset : num [1:2] 0 2001 #> .. .. ..$ nrow : num 62 #> .. .. ..$ ncol : num 2001 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> .. .. ..$ type : chr \"double\" #> .. .. ..$ separated : logi FALSE #> $ std_X_n : num 62 #> $ std_X_p : num 2001 #> $ std_X_center : num [1:2001] 1.47 7015.79 4966.96 4094.73 3987.79 ... #> $ std_X_scale : num [1:2001] 0.499 3067.926 2171.166 1803.359 2002.738 ... #> $ penalty_factor: num [1:2001] 0 1 1 1 1 1 1 1 1 1 ... 
#> - attr(*, \"class\")= chr \"plmm_design\" ## Example 3: PLINK data ## # \\donttest{ # process PLINK data temp_dir <- tempdir() unzip_example_data(outdir = temp_dir) #> Unzipped files are saved in /tmp/Rtmpm57oVf plink_data <- process_plink(data_dir = temp_dir, data_prefix = \"penncath_lite\", rds_dir = temp_dir, rds_prefix = \"imputed_penncath_lite\", # imputing the mode to address missing values impute_method = \"mode\", # overwrite existing files in temp_dir # (you can turn this feature off if you need to) overwrite = TRUE, # turning off parallelization - leaving this on causes problems knitting this vignette parallel = FALSE) #> #> Preprocessing penncath_lite data: #> Creating penncath_lite.rds #> #> There are 1401 observations and 4367 genomic features in the specified data files, representing chromosomes 1 - 22 #> There are a total of 3514 SNPs with missing values #> Of these, 13 are missing in at least 50% of the samples #> #> Imputing the missing (genotype) values using mode method #> #> process_plink() completed #> Processed files now saved as /tmp/Rtmpm57oVf/imputed_penncath_lite.rds # get outcome data penncath_pheno <- read.csv(find_example_data(path = 'penncath_clinical.csv')) outcome <- data.frame(FamID = as.character(penncath_pheno$FamID), CAD = penncath_pheno$CAD) unpen_predictors <- data.frame(FamID = as.character(penncath_pheno$FamID), sex = penncath_pheno$sex, age = penncath_pheno$age) # create design where sex and age are always included in the model pen_design <- create_design(data_file = plink_data, feature_id = \"FID\", rds_dir = temp_dir, new_file = \"std_penncath_lite\", add_outcome = outcome, outcome_id = \"FamID\", outcome_col = \"CAD\", add_predictor = unpen_predictors, predictor_id = \"FamID\", logfile = \"design\", # again, overwrite if needed; use with caution overwrite = TRUE) #> #> Aligning external data with the feature data by FamID #> Adding predictors from external data. 
#> Aligning IDs between fam and predictor files #> Column-wise combining data sets #> | |======================================================================| 100% #> There are 62 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... 
#> Standardization completed at 2024-12-13 20:58:21 #> Done with standardization. File formatting in progress # examine the design - notice the components of this object pen_design_rds <- readRDS(pen_design) # }"},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_filebacked.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","title":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","text":"function create design matrix, outcome, penalty factor passed model fitting function","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_filebacked.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","text":"","code":"create_design_filebacked( data_file, rds_dir, obj, new_file, feature_id = NULL, add_outcome, outcome_id, outcome_col, na_outcome_vals = c(-9, NA_integer_), add_predictor = NULL, predictor_id = NULL, unpen = NULL, logfile = NULL, overwrite = FALSE, quiet = FALSE )"},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_filebacked.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","text":"data_file filepath rds file processed data (data process_plink() process_delim()) rds_dir path directory want create new '.rds' '.bk' files. obj RDS object read create_design() new_file User-specified filename (without .bk/.rds extension) --created .rds/.bk files. Must different existing .rds/.bk files folder. 
feature_id string specifying column data X (feature data) row IDs (e.g., identifiers row/sample/participant/, etc.). duplicates allowed. - PLINK data: string specifying ID column PLINK .fam file. Options \"IID\" (default) \"FID\" - filebacked data: character vector unique identifiers (IDs) row feature data (.e., data processed process_delim()) - left NULL (default), X assumed row-order add_outcome. Note: assumption made error, calculations downstream incorrect. Pay close attention . add_outcome data frame matrix two columns: ID column column outcome value (used 'y' final design). IDs must characters, outcome must numeric. outcome_id string specifying name ID column 'add_outcome' outcome_col string specifying name phenotype column 'add_outcome' na_outcome_vals vector numeric values used code NA values outcome. Defaults c(-9, NA_integer) (-9 matches PLINK conventions). add_predictor Optional (PLINK data ): matrix data frame used adding additional unpenalized covariates/predictors/features external file (.e., PLINK file). matrix must one column ID column; columns aside ID used covariates design matrix. Columns must named. predictor_id Optional (PLINK data ): string specifying name column 'add_predictor' sample IDs. Required 'add_predictor' supplied. names used subset align external covariate supplied PLINK data. unpen Optional (delimited file data ): optional character vector names columns mark unpenalized (.e., features always included model). Note: choose use option, X must column names. logfile Optional: name '.log' file written – Note: append .log filename; done automatically. overwrite Logical: existing .rds files overwritten? Defaults FALSE. quiet Logical: messages printed console silenced? 
Defaults FALSE","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_filebacked.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","text":"filepath created .rds file containing information model fitting, including standardized X model design information","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_in_memory.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to create a design with an in-memory X matrix — create_design_in_memory","title":"A function to create a design with an in-memory X matrix — create_design_in_memory","text":"function create design -memory X matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_in_memory.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to create a design with an in-memory X matrix — create_design_in_memory","text":"","code":"create_design_in_memory(X, y, unpen = NULL)"},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_in_memory.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to create a design with an in-memory X matrix — create_design_in_memory","text":"X numeric matrix rows correspond observations (e.g., samples) columns correspond features. y numeric vector representing outcome model. Note: responsibility user ensure outcome_col X row order! unpen optional character vector names columns mark unpenalized (.e., features always included model). 
Note: choose use option, X must column names.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_in_memory.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to create a design with an in-memory X matrix — create_design_in_memory","text":"list elements including standardized X model design information","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_log.html","id":null,"dir":"Reference","previous_headings":"","what":"create_log_file — create_log","title":"create_log_file — create_log","text":"create_log_file","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_log.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"create_log_file — create_log","text":"","code":"create_log(outfile, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/create_log.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"create_log_file — create_log","text":"outfile String specifying name --created file, without extension ... 
used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_log.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"create_log_file — create_log","text":"Nothing returned, instead text file suffix .log created.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Cross-validation for plmm — cv_plmm","title":"Cross-validation for plmm — cv_plmm","text":"Performs k-fold cross validation lasso-, MCP-, SCAD-penalized linear mixed models grid values regularization parameter lambda.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Cross-validation for plmm — cv_plmm","text":"","code":"cv_plmm( design, y = NULL, K = NULL, diag_K = NULL, eta_star = NULL, penalty = \"lasso\", type = \"blup\", gamma, alpha = 1, lambda_min, nlambda = 100, lambda, eps = 1e-04, max_iter = 10000, convex = TRUE, dfmax = NULL, warn = TRUE, init = NULL, cluster, nfolds = 5, seed, fold = NULL, returnY = FALSE, returnBiasDetails = FALSE, trace = FALSE, save_rds = NULL, save_fold_res = FALSE, return_fit = TRUE, compact_save = FALSE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Cross-validation for plmm — cv_plmm","text":"design first argument must one three things: (1) plmm_design object (created create_design()) (2) string file path design object (file path must end '.rds') (3) matrix data.frame object representing design matrix interest y Optional: case design matrix data.frame, user must also supply numeric outcome vector y argument. case, design y passed internally create_design(X = design, y = y). K Similarity matrix used rotate data. 
either (1) known matrix reflects covariance y, (2) estimate (Default \\(\\frac{1}{p}(XX^T)\\)), (3) list components 'd' 'u', returned choose_k(). diag_K Logical: K diagonal matrix? reflect observations unrelated, can treated unrelated. Defaults FALSE. Note: plmm() check see matrix diagonal. want use diagonal K matrix, must set diag_K = TRUE. eta_star Optional argument input specific eta term rather estimate data. K known covariance matrix full rank, 1. penalty penalty applied model. Either \"lasso\" (default), \"SCAD\", \"MCP\". type character argument indicating returned predict.plmm(). type == 'lp', predictions based linear predictor, X beta. type == 'blup', predictions based sum linear predictor estimated random effect (BLUP). Defaults 'blup', shown superior prediction method many applications. gamma tuning parameter MCP/SCAD penalty (see details). Default 3 MCP 3.7 SCAD. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. lambda_min smallest value lambda, fraction lambda.max. Default .001 number observations larger number covariates .05 otherwise. nlambda Length sequence lambda. Default 100. lambda user-specified sequence lambda values. default, sequence values length nlambda computed, equally spaced log scale. eps Convergence threshold. algorithm iterates RMSD change linear predictors coefficient less eps. Default 1e-4. max_iter Maximum number iterations (total across entire path). Default 10000. convex (future idea; yet incorporated) Calculate index objective function ceases locally convex? Default TRUE. dfmax (future idea; yet incorporated) Upper bound number nonzero coefficients. Default upper bound. However, large data sets, computational burden may heavy models large number nonzero coefficients. warn Return warning messages failures converge model saturation? 
Default TRUE. init Initial values coefficients. Default 0 columns X. cluster cv_plmm() can run parallel across cluster using parallel package. cluster must set advance using parallel::makeCluster(). cluster must passed cv_plmm(). nfolds number cross-validation folds. Default 5. seed may set seed random number generator order obtain reproducible results. fold fold observation belongs . default, observations randomly assigned. returnY cv_plmm() return linear predictors cross-validation folds? Default FALSE; TRUE, return matrix element row , column j fitted value observation fold observation excluded fit, jth value lambda. returnBiasDetails Logical: cross-validation bias (numeric value) loss (n x p matrix) returned? Defaults FALSE. trace set TRUE, inform user progress announcing beginning CV fold. Default FALSE. save_rds Optional: filepath name without '.rds' suffix specified (e.g., save_rds = \"~/dir/my_results\"), model results saved provided location (e.g., \"~/dir/my_results.rds\"). Defaults NULL, save result. save_fold_res Optional: logical value indicating whether results (loss predicted values) CV fold saved? TRUE, two '.rds' files saved ('loss' 'yhat') created directory 'save_rds'. files updated fold done. Defaults FALSE. return_fit Optional: logical value indicating whether fitted model returned plmm object current (assumed interactive) session. Defaults TRUE. compact_save Optional: TRUE, three separate .rds files saved: one 'beta_vals', one 'K', one everything else (see ). Defaults FALSE. Note: must specify save_rds argument called. ... 
Additional arguments plmm_fit","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Cross-validation for plmm — cv_plmm","text":"list 12 items: type: type prediction used ('lp' 'blup') cve: numeric vector cross validation error (CVE) value lambda cvse: numeric vector estimated standard error associated value cve fold: numeric n length vector integers indicating fold observation assigned lambda: numeric vector lambda values fit: overall fit object, including predictors; list returned plmm() min: index corresponding value lambda minimizes cve lambda_min: lambda value cve minimized min1se: index corresponding value lambda within standard error minimizes cve lambda1se: largest value lambda error within 1 standard error minimum. null.dev: numeric value representing deviance intercept-model. supplied lambda sequence, quantity may meaningful. estimated_Sigma: n x n matrix representing estimated covariance matrix.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Cross-validation for plmm — cv_plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cv_fit <- cv_plmm(design = admix_design) print(summary(cv_fit)) #> lasso-penalized model with n=197 and p=101 #> At minimum cross-validation error (lambda=0.2493): #> ------------------------------------------------- #> Nonzero coefficients: 5 #> Cross-validation error (deviance): 2.00 #> Scale estimate (sigma): 1.413 plot(cv_fit) # Note: for examples with filebacked data, see the filebacking vignette # https://pbreheny.github.io/plmmr/articles/filebacking.html"},{"path":"https://pbreheny.github.io/plmmr/reference/cvf.html","id":null,"dir":"Reference","previous_headings":"","what":"Cross-validation internal function for cv_plmm — cvf","title":"Cross-validation internal function for cv_plmm — 
cvf","text":"Internal function cv_plmm calls plmm fold subset original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cvf.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Cross-validation internal function for cv_plmm — cvf","text":"","code":"cvf(i, fold, type, cv_args, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/cvf.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Cross-validation internal function for cv_plmm — cvf","text":"Fold number excluded fit. fold n-length vector fold-assignments. type character argument indicating returned predict.plmm. type == 'lp' predictions based linear predictor, $X beta$. type == 'individual' predictions based linear predictor plus estimated random effect (BLUP). cv_args List additional arguments passed plmm. ... Optional arguments predict_within_cv","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cvf.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Cross-validation internal function for cv_plmm — cvf","text":"list three elements: numeric vector loss value lambda numeric value indicating number lambda values used numeric value predicted outcome (y hat) values lambda","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/eigen_K.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — eigen_K","title":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — eigen_K","text":"function take eigendecomposition K Note: faster taking SVD X p >> n","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/eigen_K.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — 
eigen_K","text":"","code":"eigen_K(std_X, fbm_flag)"},{"path":"https://pbreheny.github.io/plmmr/reference/eigen_K.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — eigen_K","text":"std_X standardized design matrix, stored big.matrix object. fbm_flag Logical: std_X FBM object? Passed plmm().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/eigen_K.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — eigen_K","text":"list eigenvectors eigenvalues K","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/estimate_eta.html","id":null,"dir":"Reference","previous_headings":"","what":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","title":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","text":"Estimate eta (used rotating data) function called internally plmm()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/estimate_eta.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","text":"","code":"estimate_eta(n, s, U, y, eta_star)"},{"path":"https://pbreheny.github.io/plmmr/reference/estimate_eta.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","text":"n number observations s singular values K, realized relationship matrix U left-singular vectors standardized design matrix y Continuous outcome 
vector.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/estimate_eta.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","text":"numeric value estimated value eta, variance parameter","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/fbm2bm.html","id":null,"dir":"Reference","previous_headings":"","what":"Functions to convert between FBM and big.matrix type objects — fbm2bm","title":"Functions to convert between FBM and big.matrix type objects — fbm2bm","text":"Functions convert FBM big.matrix type objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/fbm2bm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Functions to convert between FBM and big.matrix type objects — fbm2bm","text":"","code":"fbm2bm(fbm, desc = FALSE)"},{"path":"https://pbreheny.github.io/plmmr/reference/fbm2bm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Functions to convert between FBM and big.matrix type objects — fbm2bm","text":"fbm FBM object; see bigstatsr::FBM() details desc Logical: descriptor file desired (opposed filebacked big matrix)? 
Defaults FALSE.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/fbm2bm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Functions to convert between FBM and big.matrix type objects — fbm2bm","text":"big.matrix - see bigmemory::filebacked.big.matrix() details","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/file_sans_ext.html","id":null,"dir":"Reference","previous_headings":"","what":"helper function to get the file path of a file without the extension — file_sans_ext","title":"helper function to get the file path of a file without the extension — file_sans_ext","text":"helper function get file path file without extension","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/file_sans_ext.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"helper function to get the file path of a file without the extension — file_sans_ext","text":"","code":"file_sans_ext(path)"},{"path":"https://pbreheny.github.io/plmmr/reference/file_sans_ext.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"helper function to get the file path of a file without the extension — file_sans_ext","text":"path path file","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/file_sans_ext.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"helper function to get the file path of a file without the extension — file_sans_ext","text":"path_sans_ext filepath without extension","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to help with accessing example PLINK files — find_example_data","title":"A function to help with accessing example PLINK files — find_example_data","text":"function help accessing example PLINK 
files","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to help with accessing example PLINK files — find_example_data","text":"","code":"find_example_data(path, parent = FALSE)"},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to help with accessing example PLINK files — find_example_data","text":"path Argument (string) specifying path (filename) external data file extdata/ parent path=TRUE user wants name parent directory file located, set parent=TRUE. Defaults FALSE.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to help with accessing example PLINK files — find_example_data","text":"path=NULL, character vector file names returned. path given, character string full file path","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A function to help with accessing example PLINK files — find_example_data","text":"","code":"find_example_data(parent = TRUE) #> [1] \"/home/runner/work/_temp/Library/plmmr/extdata\""},{"path":"https://pbreheny.github.io/plmmr/reference/get_data.html","id":null,"dir":"Reference","previous_headings":"","what":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. — get_data","title":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. 
— get_data","text":"Read processed data function intended called either process_plink() process_delim() called .","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/get_data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. — get_data","text":"","code":"get_data(path, returnX = FALSE, trace = TRUE)"},{"path":"https://pbreheny.github.io/plmmr/reference/get_data.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. — get_data","text":"path file path RDS object containing processed data. add '.rds' extension path. returnX Logical: design matrix returned numeric matrix stored memory. default, FALSE. trace Logical: trace messages shown? Default TRUE.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/get_data.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. — get_data","text":"list components: std_X, column-standardized design matrix either (1) numeric matrix (2) filebacked matrix (FBM). See bigstatsr::FBM() bigsnpr::bigSnp-class documentation details. (PLINK data) fam, data frame containing pedigree information (like .fam file PLINK) (PLINK data) map, data frame containing feature information (like .bim file PLINK) ns: vector indicating columns X contain nonsingular features (.e., features variance != 0). 
center: vector values centering column X scale: vector values scaling column X","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/get_hostname.html","id":null,"dir":"Reference","previous_headings":"","what":"a function to return the computer's host name — get_hostname","title":"a function to return the computer's host name — get_hostname","text":"function return computer's host name","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/get_hostname.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a function to return the computer's host name — get_hostname","text":"","code":"get_hostname()"},{"path":"https://pbreheny.github.io/plmmr/reference/get_hostname.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a function to return the computer's host name — get_hostname","text":"String hostname current machine","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/impute_snp_data.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to impute SNP data — impute_snp_data","title":"A function to impute SNP data — impute_snp_data","text":"function impute SNP data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/impute_snp_data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to impute SNP data — impute_snp_data","text":"","code":"impute_snp_data( obj, X, impute, impute_method, parallel, outfile, quiet, seed = as.numeric(Sys.Date()), ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/impute_snp_data.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to impute SNP data — impute_snp_data","text":"obj bigSNP object (created read_plink_files()) X matrix genotype data returned name_and_count_bigsnp impute Logical: data imputed? Default TRUE. impute_method 'impute' = TRUE, argument specify kind imputation desired. 
Options : mode (default): Imputes frequent call. See bigsnpr::snp_fastImputeSimple() details. random: Imputes sampling according allele frequencies. mean0: Imputes rounded mean. mean2: Imputes mean rounded 2 decimal places. xgboost: Imputes using algorithm based local XGBoost models. See bigsnpr::snp_fastImpute() details. Note: can take several minutes, even relatively small data set. parallel Logical: computations within function run parallel? Defaults TRUE. See count_cores() ?bigparallelr::assert_cores details. particular, user aware much parallelization can make computations slower. outfile Optional: name (character string) prefix logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. quiet Logical: messages printed console? Defaults TRUE seed Numeric value passed seed impute_method = 'xgboost'. Defaults .numeric(Sys.Date()) ... Optional: additional arguments bigsnpr::snp_fastImpute() (relevant impute_method = \"xgboost\")","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/impute_snp_data.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to impute SNP data — impute_snp_data","text":"Nothing returned, obj$genotypes overwritten imputed version data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_samples.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to align genotype and phenotype data — index_samples","title":"A function to align genotype and phenotype data — index_samples","text":"function align genotype phenotype data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_samples.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to align genotype and phenotype data — index_samples","text":"","code":"index_samples( obj, rds_dir, indiv_id, add_outcome, outcome_id, outcome_col, na_outcome_vals, outfile, quiet 
)"},{"path":"https://pbreheny.github.io/plmmr/reference/index_samples.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to align genotype and phenotype data — index_samples","text":"obj object created process_plink() rds_dir path directory want create new '.rds' '.bk' files. indiv_id character string indicating ID column name 'fam' element genotype data list. Defaults 'sample.ID', equivalent 'IID' PLINK. option 'family.ID', equivalent 'FID' PLINK. add_outcome data frame least two columns: ID column phenotype column outcome_id string specifying name ID column pheno outcome_col string specifying name phenotype column pheno. column used default y argument 'plmm()'. na_outcome_vals vector numeric values used code NA values outcome. Defaults c(-9, NA_integer_) (-9 matches PLINK conventions). outfile string name filepath log file quiet Logical: messages printed console? Defaults FALSE (leaves print messages ...","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_samples.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to align genotype and phenotype data — index_samples","text":"list two items: data.table rows corresponding samples genotype phenotype available. 
numeric vector indices indicating samples 'complete' (.e., samples add_outcome corresponding data PLINK files)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_std_X.html","id":null,"dir":"Reference","previous_headings":"","what":"Helper function to index standardized data — index_std_X","title":"Helper function to index standardized data — index_std_X","text":"Helper function index standardized data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_std_X.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Helper function to index standardized data — index_std_X","text":"","code":"index_std_X(std_X_p, non_genomic)"},{"path":"https://pbreheny.github.io/plmmr/reference/index_std_X.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Helper function to index standardized data — index_std_X","text":"std_X_p number features standardized matrix data (may filebacked) non_genomic Integer vector columns std_X representing non-genomic data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_std_X.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Helper function to index standardized data — index_std_X","text":"list indices","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lam_names.html","id":null,"dir":"Reference","previous_headings":"","what":"Generate nicely formatted lambda vec — lam_names","title":"Generate nicely formatted lambda vec — lam_names","text":"Generate nicely formatted lambda vec","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lam_names.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Generate nicely formatted lambda vec — 
lam_names","text":"","code":"lam_names(l)"},{"path":"https://pbreheny.github.io/plmmr/reference/lam_names.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Generate nicely formatted lambda vec — lam_names","text":"l Vector lambda values.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lam_names.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Generate nicely formatted lambda vec — lam_names","text":"character vector formatted lambda value names","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lasso.html","id":null,"dir":"Reference","previous_headings":"","what":"helper function to implement lasso penalty — lasso","title":"helper function to implement lasso penalty — lasso","text":"helper function implement lasso penalty","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lasso.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"helper function to implement lasso penalty — lasso","text":"","code":"lasso(z, l1, l2, v)"},{"path":"https://pbreheny.github.io/plmmr/reference/lasso.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"helper function to implement lasso penalty — lasso","text":"z solution active set feature l1 upper bound l2 lower bound v 'xtx' term","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lasso.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"helper function to implement lasso penalty — lasso","text":"numeric vector lasso-penalized coefficient estimates within given bounds","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/log_lik.html","id":null,"dir":"Reference","previous_headings":"","what":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","title":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","text":"function 
allows evaluate negative log-likelihood linear mixed model assumption null model order estimate variance parameter, eta.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/log_lik.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","text":"","code":"log_lik(eta, n, s, U, y, rot_y = NULL)"},{"path":"https://pbreheny.github.io/plmmr/reference/log_lik.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","text":"eta proportion variance outcome attributable causal SNP effects. words, signal--noise ratio. n number observations s singular values K, realized relationship matrix U left-singular vectors standardized design matrix y Continuous outcome vector. rot_y Optional: y already rotated, can supplied.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/log_lik.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","text":"value log-likelihood PLMM, evaluated supplied parameters","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/name_and_count_bigsnp.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","title":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","text":"helper function label summarize contents bigSNP","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/name_and_count_bigsnp.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","text":"","code":"name_and_count_bigsnp(obj, id_var, 
quiet, outfile)"},{"path":"https://pbreheny.github.io/plmmr/reference/name_and_count_bigsnp.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","text":"obj bigSNP object, possibly subset add_external_phenotype() id_var String specifying column PLINK .fam file unique sample identifiers. Options \"IID\" (default) \"FID\". quiet Logical: messages printed console? Defaults TRUE outfile string name .log file","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/name_and_count_bigsnp.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","text":"list components: counts: column-wise summary minor allele counts 'genotypes' obj: modified bigSNP list additional components X: obj$genotypes FBM pos: obj$map$physical.pos vector","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","text":"Fit linear mixed model via non-convex penalized maximum likelihood.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","text":"","code":"plmm( design, y = NULL, K = NULL, diag_K = NULL, eta_star = NULL, penalty = \"lasso\", init = NULL, gamma, alpha = 1, dfmax = NULL, lambda_min, nlambda = 100, lambda, eps = 1e-04, max_iter = 10000, convex = TRUE, warn = TRUE, trace = FALSE, save_rds = NULL, compact_save = FALSE, return_fit = NULL, ... 
)"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","text":"design first argument must one three things: (1) plmm_design object (created create_design()) (2) string file path design object (file path must end '.rds') (3) matrix data.frame object representing design matrix interest y Optional: case design matrix data.frame, user must also supply numeric outcome vector y argument. case, design y passed internally create_design(X = design, y = y). K Similarity matrix used rotate data. either : (1) known matrix reflects covariance y, (2) estimate (Default \\(\\frac{1}{p}(XX^T)\\)), (3) list components 'd' 'U', returned previous plmm() model fit data. diag_K Logical: K diagonal matrix? reflect observations unrelated, can treated unrelated. Defaults FALSE. Note: plmm() check see matrix diagonal. want use diagonal K matrix, must set diag_K = TRUE. eta_star Optional argument input specific eta term rather estimate data. K known covariance matrix full rank, 1. penalty penalty applied model. Either \"lasso\" (default), \"SCAD\", \"MCP\". init Initial values coefficients. Default 0 columns X. gamma tuning parameter MCP/SCAD penalty (see details). Default 3 MCP 3.7 SCAD. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. dfmax (Future idea; yet incorporated): Upper bound number nonzero coefficients. Default upper bound. However, large data sets, computational burden may heavy models large number nonzero coefficients. lambda_min smallest value lambda, fraction lambda.max. Default .001 number observations larger number covariates .05 otherwise. nlambda Length sequence lambda. Default 100. 
lambda user-specified sequence lambda values. default, sequence values length nlambda computed, equally spaced log scale. eps Convergence threshold. algorithm iterates RMSD change linear predictors coefficient less eps. Default 1e-4. max_iter Maximum number iterations (total across entire path). Default 10000. convex (Future idea; yet incorporated): Calculate index objective function ceases locally convex? Default TRUE. warn Return warning messages failures converge model saturation? Default TRUE. trace set TRUE, inform user progress announcing beginning step modeling process. Default FALSE. save_rds Optional: filepath name without '.rds' suffix specified (e.g., save_rds = \"~/dir/my_results\"), model results saved provided location (e.g., \"~/dir/my_results.rds\"). Defaults NULL, save result. compact_save Optional: TRUE, three separate .rds files saved: one 'beta_vals', one 'K', one linear predictors, one everything else (see ). Defaults FALSE. Note: must specify save_rds argument called. return_fit Optional: logical value indicating whether fitted model returned plmm object current (assumed interactive) session. Defaults TRUE -memory data, defaults FALSE filebacked data. ... Additional optional arguments plmm_checks()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","text":"list includes 19 items: beta_vals: matrix estimated coefficients original scale. Rows predictors, columns values lambda std_scale_beta: matrix estimated coefficients ~standardized~ scale. returned compact_save = TRUE. std_X_details: list 3 items: center & scale values used center/scale data, vector ('ns') nonsingular columns original data. Nonsingular columns standardized (definition), removed analysis. std_X: standardized design matrix; data filebacked, object filebacked.big.matrix bigmemory package. 
Note: std_X saved/returned return_fit = FALSE. y: outcome vector used model fitting. p: total number columns design matrix (including singular columns). plink_flag: logical flag: data come PLINK files? lambda: numeric vector lasso tuning parameter values used model fitting. eta: number (double) 0 1 representing estimated proportion variance outcome attributable population/correlation structure linear_predictors: matrix resulting product stdrot_X estimated coefficients ~rotated~ scale. penalty: character string indicating penalty model fit (e.g., 'MCP') gamma: numeric value indicating tuning parameter used SCAD lasso penalties used. relevant lasso models. alpha: numeric value indicating elastic net tuning parameter. loss: vector numeric values loss value lambda (calculated ~rotated~ scale) penalty_factor: vector indicators corresponding predictor, 1 = predictor penalized. ns_idx: vector indices predictors non-singular features (.e., features variation). iter: numeric vector number iterations needed model fitting value lambda converged: vector logical values indicating whether model fitting converged value lambda K: list 2 elements, s U — s: vector eigenvalues relatedness matrix; see relatedness_mat() details. U: matrix eigenvectors relatedness matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. 
— plmm","text":"","code":"# using admix data admix_design <- create_design(X = admix$X, y = admix$y) fit_admix1 <- plmm(design = admix_design) s1 <- summary(fit_admix1, idx = 50) print(s1) #> lasso-penalized regression model with n=197, p=101 at lambda=0.01426 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 88 #> ------------------------------------------------- plot(fit_admix1) # Note: for examples with large data that are too big to fit in memory, # see the article \"PLINK files/file-backed matrices\" on our website # https://pbreheny.github.io/plmmr/articles/filebacking.html"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_checks.html","id":null,"dir":"Reference","previous_headings":"","what":"plmm_checks — plmm_checks","title":"plmm_checks — plmm_checks","text":"plmm_checks","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_checks.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"plmm_checks — plmm_checks","text":"","code":"plmm_checks( design, K = NULL, diag_K = NULL, eta_star = NULL, penalty = \"lasso\", init = NULL, gamma, alpha = 1, dfmax = NULL, trace = FALSE, save_rds = NULL, return_fit = TRUE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_checks.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"plmm_checks — plmm_checks","text":"design design object, created create_design() K Similarity matrix used rotate data. either (1) known matrix reflects covariance y, (2) estimate (Default \\(\\frac{1}{p}(XX^T)\\)), (3) list components 'd' 'u', returned choose_k(). diag_K Logical: K diagonal matrix? reflect observations unrelated, can treated unrelated. Defaults FALSE. Note: plmm() check see matrix diagonal. want use diagonal K matrix, must set diag_K = TRUE. eta_star Optional argument input specific eta term rather estimate data. 
K known covariance matrix full rank, 1. penalty penalty applied model. Either \"lasso\" (default), \"SCAD\", \"MCP\". init Initial values coefficients. Default 0 columns X. gamma tuning parameter MCP/SCAD penalty (see details). Default 3 MCP 3.7 SCAD. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. dfmax Option added soon: Upper bound number nonzero coefficients. Default upper bound. However, large data sets, computational burden may heavy models large number nonzero coefficients. trace set TRUE, inform user progress announcing beginning step modeling process. Default FALSE. save_rds Optional: filepath name specified (e.g., save_rds = \"~/dir/my_results.rds\"), model results saved provided location. Defaults NULL, save result. return_fit Optional: logical value indicating whether fitted model returned plmm object current (assumed interactive) session. Defaults TRUE. ... Additional arguments get_data()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_checks.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"plmm_checks — plmm_checks","text":"list parameters pass model fitting. 
list includes standardized design matrix, outcome, meta-data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_fit.html","id":null,"dir":"Reference","previous_headings":"","what":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","title":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","text":"PLMM fit: function fits PLMM using values returned plmm_prep()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_fit.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","text":"","code":"plmm_fit( prep, y, std_X_details, eta_star, penalty_factor, fbm_flag, penalty, gamma = 3, alpha = 1, lambda_min, nlambda = 100, lambda, eps = 1e-04, max_iter = 10000, convex = TRUE, dfmax = NULL, init = NULL, warn = TRUE, returnX = TRUE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_fit.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","text":"prep list returned plmm_prep y original (centered) outcome vector. Need intercept estimate std_X_details list components 'center' (values used center X), 'scale' (values used scale X), 'ns' (indices nonsingular columns X) eta_star ratio variances (passed plmm()) penalty_factor multiplicative factor penalty applied coefficient. supplied, penalty_factor must numeric vector length equal number columns X. purpose penalty_factor apply differential penalization coefficients thought likely others model. particular, penalty_factor can 0, case coefficient always model without shrinkage. fbm_flag Logical: std_X FBM object? Passed plmm(). penalty penalty applied model. Either \"MCP\" (default), \"SCAD\", \"lasso\". gamma tuning parameter MCP/SCAD penalty (see details). 
Default 3 MCP 3.7 SCAD. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. lambda_min smallest value lambda, fraction lambda.max. Default .001 number observations larger number covariates .05 otherwise. nlambda Length sequence lambda. Default 100. lambda user-specified sequence lambda values. default, sequence values length nlambda computed, equally spaced log scale. eps Convergence threshold. algorithm iterates RMSD change linear predictors coefficient less eps. Default 1e-4. max_iter Maximum number iterations (total across entire path). Default 10000. convex (future idea; yet incorporated) convex Calculate index objective function ceases locally convex? Default TRUE. dfmax (future idea; yet incorporated) Upper bound number nonzero coefficients. Default upper bound. However, large data sets, computational burden may heavy models large number nonzero coefficients. init Initial values coefficients. Default 0 columns X. warn Return warning messages failures converge model saturation? Default TRUE. returnX Return standardized design matrix along fit? default, option turned X 100 MB, turned larger matrices preserve memory. ... 
Additional arguments can passed biglasso::biglasso_simple_path()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_fit.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","text":"list components: std_scale_beta: coefficients estimated scale std_X centered_y: y-values 'centered' mean 0 s U, values vectors eigendecomposition K lambda: vector tuning parameter values linear_predictors: product stdrot_X b (linear predictors transformed restandardized scale) eta: number (double) 0 1 representing estimated proportion variance outcome attributable population/correlation structure. iter: numeric vector number iterations needed model fitting value lambda converged: vector logical values indicating whether model fitting converged value lambda loss: vector numeric values loss value lambda (calculated ~rotated~ scale) penalty: character string indicating penalty model fit (e.g., 'MCP') penalty_factor: vector indicators corresponding predictor, 1 = predictor penalized. gamma: numeric value indicating tuning parameter used SCAD lasso penalties used. relevant lasso models. alpha: numeric value indicating elastic net tuning parameter. ns: indices nonsingular values X feature_names: formatted column names design matrix nlambda: number lambda values used model fitting eps: tolerance ('epsilon') used model fitting max_iter: max. number iterations per model fit warn: logical - warnings given model fit converge? 
init: initial values model fitting trace: logical - messages printed console models fit?","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_format.html","id":null,"dir":"Reference","previous_headings":"","what":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","title":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","text":"PLMM format: function format output model constructed plmm_fit","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_format.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","text":"","code":"plmm_format(fit, p, std_X_details, fbm_flag, plink_flag)"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_format.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","text":"fit list parameters describing output model constructed plmm_fit p number features original data (including constant features) std_X_details list 3 items: * 'center': centering values columns X * 'scale': scaling values non-singular columns X * 'ns': indices nonsingular columns std_X fbm_flag Logical: corresponding design matrix filebacked? Passed plmm(). plink_flag Logical: data come PLINK files? Note: flag matters non-genomic features handled PLINK files – data PLINK files, unpenalized columns counted p argument. delimited files, p include unpenalized columns. 
difference implications untransform() function determines appropriate dimensions estimated coefficient matrix returns.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_format.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","text":"list components: beta_vals: matrix estimated coefficients original scale. Rows predictors, columns values lambda lambda: numeric vector lasso tuning parameter values used model fitting. eta: number (double) 0 1 representing estimated proportion variance outcome attributable population/correlation structure. s: vector eigenvalues relatedness matrix K; see relatedness_mat() details. U: matrix eigenvectors relatedness matrix K rot_y: vector outcome values rotated scale. scale model fit. linear_predictors: matrix resulting product stdrot_X estimated coefficients ~rotated~ scale. penalty: character string indicating penalty model fit (e.g., 'MCP') gamma: numeric value indicating tuning parameter used SCAD lasso penalties used. relevant lasso models. alpha: numeric value indicating elastic net tuning parameter. loss: vector numeric values loss value lambda (calculated ~rotated~ scale) penalty_factor: vector indicators corresponding predictor, 1 = predictor penalized. ns_idx: vector indices predictors nonsingular features (.e., variation). 
iter: numeric vector number iterations needed model fitting value lambda converged: vector logical values indicating whether model fitting converged value lambda","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":null,"dir":"Reference","previous_headings":"","what":"Loss method for ","title":"Loss method for ","text":"Loss method \"plmm\" class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Loss method for ","text":"","code":"plmm_loss(y, yhat)"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Loss method for ","text":"y Observed outcomes (response) vector yhat Predicted outcomes (response) vector","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Loss method for ","text":"numeric vector squared-error loss values given observed predicted outcomes","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Loss method for ","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design, K = relatedness_mat(admix$X)) yhat <- predict(object = fit, newX = admix$X, type = 'lp', lambda = 0.05) head(plmm_loss(yhat = yhat, y = admix$y)) #> [,1] #> [1,] 0.81638401 #> [2,] 0.09983799 #> [3,] 0.50281622 #> [4,] 0.14234359 #> [5,] 2.03696796 #> [6,] 2.72044268"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_prep.html","id":null,"dir":"Reference","previous_headings":"","what":"PLMM prep: a function to run checks, SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","title":"PLMM prep: a function to run checks, 
SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","text":"PLMM prep: function run checks, SVD, rotation prior fitting PLMM model internal function cv_plmm","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_prep.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"PLMM prep: a function to run checks, SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","text":"","code":"plmm_prep( std_X, std_X_n, std_X_p, genomic = 1:std_X_p, n, p, centered_y, k = NULL, K = NULL, diag_K = NULL, eta_star = NULL, fbm_flag, penalty_factor = rep(1, ncol(std_X)), trace = NULL, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_prep.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"PLMM prep: a function to run checks, SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","text":"std_X Column standardized design matrix. May include clinical covariates non-SNP data. std_X_n number observations std_X (integer) std_X_p number features std_X (integer) genomic numeric vector indices indicating columns standardized X genomic covariates. Defaults columns. n number instances original design matrix X. altered standardization. p number features original design matrix X, including constant features centered_y Continuous outcome vector, centered. k integer specifying number singular values used approximation rotated design matrix. argument passed RSpectra::svds(). Defaults min(n, p) - 1, n p dimensions standardized design matrix. K Similarity matrix used rotate data. either known matrix reflects covariance y, estimate (Default \\(\\frac{1}{p}(XX^T)\\), X standardized). can also list, components d u (returned choose_k) diag_K Logical: K diagonal matrix? reflect observations unrelated, can treated unrelated. Passed plmm(). 
eta_star Optional argument input specific eta term rather estimate data. K known covariance matrix full rank, 1. fbm_flag Logical: std_X FBM type object? set internally plmm(). trace set TRUE, inform user progress announcing beginning step modeling process. Default FALSE. ... used yet","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_prep.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"PLMM prep: a function to run checks, SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","text":"List components: centered_y: vector centered outcomes std_X: standardized design matrix K: list 2 elements. (1) s: vector eigenvalues K, (2) U: eigenvectors K (left singular values X). eta: numeric value estimated eta parameter trace: logical.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmmr-package.html","id":null,"dir":"Reference","previous_headings":"","what":"plmmr: Penalized Linear Mixed Models for Correlated Data — plmmr-package","title":"plmmr: Penalized Linear Mixed Models for Correlated Data — plmmr-package","text":"Fits penalized linear mixed models correct unobserved confounding factors. 'plmmr' infers corrects presence unobserved confounding effects population stratification environmental heterogeneity. fits linear model via penalized maximum likelihood. Originally designed multivariate analysis single nucleotide polymorphisms (SNPs) measured genome-wide association study (GWAS), 'plmmr' eliminates need subpopulation-specific analyses post-analysis p-value adjustments. Functions appropriate processing 'PLINK' files also supplied. examples, see package homepage. 
https://pbreheny.github.io/plmmr/.","code":""},{"path":[]},{"path":"https://pbreheny.github.io/plmmr/reference/plmmr-package.html","id":"author","dir":"Reference","previous_headings":"","what":"Author","title":"plmmr: Penalized Linear Mixed Models for Correlated Data — plmmr-package","text":"Maintainer: Patrick J. Breheny patrick-breheny@uiowa.edu (ORCID) Authors: Tabitha K. Peter tabitha-peter@uiowa.edu (ORCID) Anna C. Reisetter anna-reisetter@uiowa.edu (ORCID) Yujing Lu","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot method for cv_plmm class — plot.cv_plmm","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"Plot method cv_plmm class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"","code":"# S3 method for class 'cv_plmm' plot( x, log.l = TRUE, type = c(\"cve\", \"rsq\", \"scale\", \"snr\", \"all\"), selected = TRUE, vertical.line = TRUE, col = \"red\", ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"x object class cv_plmm log.l Logical indicate plot returned natural log scale. Defaults log.l = TRUE. type Type plot return. Defaults \"cve.\" selected Logical indicate variables plotted. Defaults TRUE. vertical.line Logical indicate whether vertical line plotted minimum/maximum value. Defaults TRUE. col Color vertical line, plotted. Defaults \"red.\" ... 
Additional arguments.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"Nothing returned; instead, plot drawn representing relationship tuning parameter 'lambda' value (x-axis) cross validation error (y-axis).","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cvfit <- cv_plmm(design = admix_design) plot(cvfit)"},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot method for plmm class — plot.plmm","title":"Plot method for plmm class — plot.plmm","text":"Plot method plmm class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot method for plmm class — plot.plmm","text":"","code":"# S3 method for class 'plmm' plot(x, alpha = 1, log.l = FALSE, shade = TRUE, col, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot method for plmm class — plot.plmm","text":"x object class plmm alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 is not supported; alpha may be arbitrarily small, but not exactly 0. log.l Logical indicate plot returned natural log scale. Defaults log.l = FALSE. shade Logical indicate whether local nonconvex region shaded. Defaults TRUE. col Vector colors coefficient lines. ... 
Additional arguments.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot method for plmm class — plot.plmm","text":"Nothing returned; instead, plot coefficient paths drawn value lambda (one 'path' coefficient).","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot method for plmm class — plot.plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design) plot(fit) plot(fit, log.l = TRUE)"},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Predict method for plmm class — predict.plmm","title":"Predict method for plmm class — predict.plmm","text":"Predict method plmm class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Predict method for plmm class — predict.plmm","text":"","code":"# S3 method for class 'plmm' predict( object, newX, type = c(\"blup\", \"coefficients\", \"vars\", \"nvars\", \"lp\"), lambda, idx = 1:length(object$lambda), ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Predict method for plmm class — predict.plmm","text":"object object class plmm. newX Matrix values predictions made (used type=\"coefficients\" type settings predict). can either FBM object 'matrix' object. Note: Columns argument must named! type character argument indicating type prediction returned. Options \"lp,\" \"coefficients,\" \"vars,\" \"nvars,\" \"blup.\" See details. lambda numeric vector regularization parameter lambda values predictions requested. 
idx Vector indices penalty parameter lambda predictions required. default, indices returned. ... Additional optional arguments","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Predict method for plmm class — predict.plmm","text":"Depends type - see Details","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Predict method for plmm class — predict.plmm","text":"Define beta-hat coefficients estimated value lambda minimizes cross-validation error (CVE). options type follows: 'response' (default): uses product newX beta-hat predict new values outcome. incorporate correlation structure data. stats folks , simply linear predictor. 'blup' (acronym Best Linear Unbiased Predictor): adds 'response' value represents estimated random effect. addition way incorporating estimated correlation structure data prediction outcome. 'coefficients': returns estimated beta-hat 'vars': returns indices variables (e.g., SNPs) nonzero coefficients value lambda. EXCLUDES intercept. 'nvars': returns number variables (e.g., SNPs) nonzero coefficients value lambda. EXCLUDES intercept.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Predict method for plmm class — predict.plmm","text":"","code":"set.seed(123) train_idx <- sample(1:nrow(admix$X), 100) # Note: ^ shuffling is important here! Keeps test and train groups comparable. 
train <- list(X = admix$X[train_idx,], y = admix$y[train_idx]) train_design <- create_design(X = train$X, y = train$y) test <- list(X = admix$X[-train_idx,], y = admix$y[-train_idx]) fit <- plmm(design = train_design) # make predictions for all lambda values pred1 <- predict(object = fit, newX = test$X, type = \"lp\") pred2 <- predict(object = fit, newX = test$X, type = \"blup\") # look at mean squared prediction error mspe <- apply(pred1, 2, function(c){crossprod(test$y - c)/length(c)}) min(mspe) #> [1] 2.87754 mspe_blup <- apply(pred2, 2, function(c){crossprod(test$y - c)/length(c)}) min(mspe_blup) # BLUP is better #> [1] 2.128471 # compare the MSPE of our model to a null model, for reference # null model = intercept only -> y_hat is always mean(y) crossprod(mean(test$y) - test$y)/length(test$y) #> [,1] #> [1,] 6.381748"},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":null,"dir":"Reference","previous_headings":"","what":"Predict method to use in cross-validation (within cvf) — predict_within_cv","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"Predict method use cross-validation (within cvf)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"","code":"predict_within_cv( fit, trainX, trainY = NULL, testX, std_X_details, type, fbm = FALSE, plink_flag = FALSE, Sigma_11 = NULL, Sigma_21 = NULL, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"fit list components returned plmm_fit. trainX training data, pre-standardization pre-rotation trainY training outcome, centered. 
needed type = 'blup' testX design matrix used computing predicted values (i.e., test data). std_X_details list 3 items: 'center': centering values columns X 'scale': scaling values non-singular columns X 'ns': indices nonsingular columns std_X. Note: vector really need! type character argument indicating type prediction returned. Passed cvf(), Options \"lp,\" \"coefficients,\" \"vars,\" \"nvars,\" \"blup.\" See details. fbm Logical: trainX FBM object? , function expects testX also FBM. two X matrices must stored way. Sigma_11 Variance-covariance matrix training data. Extracted estimated_Sigma generated using observations. Required type == 'blup'. Sigma_21 Covariance matrix training testing data. Extracted estimated_Sigma generated using observations. Required type == 'blup'. ... Additional optional arguments","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"numeric vector predicted values","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"Define beta-hat coefficients estimated value lambda minimizes cross-validation error (CVE). options type follows: 'lp' (default): uses linear predictor (i.e., product test data estimated coefficients) predict test values outcome. Note approach incorporate correlation structure data. 'blup' (acronym Best Linear Unbiased Predictor): adds 'lp' value represents estimated random effect. addition way incorporating estimated correlation structure data prediction outcome. Note: main difference function predict.plmm() method CV, predictions made standardized scale (i.e., trainX testX data come std_X). 
predict.plmm() method makes predictions scale X (original scale)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/pretty_time.html","id":null,"dir":"Reference","previous_headings":"","what":"a function to format the time — pretty_time","title":"a function to format the time — pretty_time","text":"function format time","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/pretty_time.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a function to format the time — pretty_time","text":"","code":"pretty_time()"},{"path":"https://pbreheny.github.io/plmmr/reference/pretty_time.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a function to format the time — pretty_time","text":"string formatted current date time","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"Print method summary.cv_plmm objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"","code":"# S3 method for class 'summary.cv_plmm' print(x, digits, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"x object class summary.cv_plmm digits number digits use formatting output ... 
used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"Nothing returned; instead, message printed console summarizing results cross-validated model fit.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cv_fit <- cv_plmm(design = admix_design) print(summary(cv_fit)) #> lasso-penalized model with n=197 and p=101 #> At minimum cross-validation error (lambda=0.2168): #> ------------------------------------------------- #> Nonzero coefficients: 10 #> Cross-validation error (deviance): 1.96 #> Scale estimate (sigma): 1.399"},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to print the summary of a plmm model — print.summary.plmm","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"function print summary plmm model","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"","code":"# S3 method for class 'summary.plmm' print(x, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"x summary.plmm object ... 
used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"Nothing returned; instead, message printed console summarizing results model fit.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"","code":"lam <- rev(seq(0.01, 1, length.out=20)) |> round(2) # for sake of example admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design, lambda = lam) fit2 <- plmm(design = admix_design, penalty = \"SCAD\", lambda = lam) print(summary(fit, idx = 18)) #> lasso-penalized regression model with n=197, p=101 at lambda=0.1100 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 27 #> ------------------------------------------------- print(summary(fit2, idx = 18)) #> SCAD-penalized regression model with n=197, p=101 at lambda=0.1100 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 29 #> -------------------------------------------------"},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to read in large data files as an FBM — process_delim","title":"A function to read in large data files as an FBM — process_delim","text":"function read large data files FBM","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to read in large data 
files as an FBM — process_delim","text":"","code":"process_delim( data_dir, data_file, feature_id, rds_dir = data_dir, rds_prefix, logfile = NULL, overwrite = FALSE, quiet = FALSE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to read in large data files as an FBM — process_delim","text":"data_dir directory file. data_file file read , without filepath. file numeric values. Example: use data_file = \"myfile.txt\", data_file = \"~/mydirectory/myfile.txt\" Note: file headers/column names, set 'header = TRUE' – passed bigmemory::read.big.matrix(). feature_id string specifying column data X (feature data) row IDs (e.g., identifiers row/sample/participant/, etc.). duplicates allowed. rds_dir directory user wants create '.rds' '.bk' files Defaults data_dir rds_prefix String specifying user's preferred filename --created .rds file (create inside rds_dir folder) Note: 'rds_prefix' 'data_prefix' logfile Optional: name (character string) prefix logfile written. Defaults 'process_delim', i.e. get 'process_delim.log' outfile. overwrite Logical: existing .bk/.rds files exist specified directory/prefix, overwritten? Defaults FALSE. Note: multiple .rds files names start \"std_prefix_...\", error . protect users accidentally deleting files saved results, one .rds file can removed option. quiet Logical: messages printed console silenced? Defaults FALSE. ... Optional: arguments passed bigmemory::read.big.matrix(). 
Note: 'sep' option pass, 'header'.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to read in large data files as an FBM — process_delim","text":"file path newly created '.rds' file","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A function to read in large data files as an FBM — process_delim","text":"","code":"temp_dir <- tempdir() colon_dat <- process_delim(data_file = \"colon2.txt\", data_dir = find_example_data(parent = TRUE), overwrite = TRUE, rds_dir = temp_dir, rds_prefix = \"processed_colon2\", sep = \"\\t\", header = TRUE) #> #> Overwriting existing files:processed_colon2.bk/.rds/.desc #> There are 62 observations and 2001 features in the specified data files. #> At this time, plmmr::process_delim() does not handle missing values in delimited data. #> Please make sure you have addressed missingness before you proceed. #> #> process_plink() completed #> Processed files now saved as /tmp/Rtmpm57oVf/processed_colon2.rds colon2 <- readRDS(colon_dat) str(colon2) #> List of 3 #> $ X:Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"processed_colon2.bk\" #> .. .. ..$ dirname : chr \"/tmp/Rtmpm57oVf/\" #> .. .. ..$ totalRows : int 62 #> .. .. ..$ totalCols : int 2001 #> .. .. ..$ rowOffset : num [1:2] 0 62 #> .. .. ..$ colOffset : num [1:2] 0 2001 #> .. .. ..$ nrow : num 62 #> .. .. ..$ ncol : num 2001 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> .. .. 
..$ separated : logi FALSE #> $ n: num 62 #> $ p: num 2001 #> - attr(*, \"class\")= chr \"processed_delim\""},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":null,"dir":"Reference","previous_headings":"","what":"Preprocess PLINK files using the bigsnpr package — process_plink","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"Preprocess PLINK files using bigsnpr package","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"","code":"process_plink( data_dir, data_prefix, rds_dir = data_dir, rds_prefix, logfile = NULL, impute = TRUE, impute_method = \"mode\", id_var = \"IID\", parallel = TRUE, quiet = FALSE, overwrite = FALSE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"data_dir path bed/bim/fam data files, without trailing \"/\" (e.g., use data_dir = '~/my_dir', data_dir = '~/my_dir/') data_prefix prefix (character string) bed/fam data files (e.g., data_prefix = 'mydata') rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir rds_prefix String specifying user's preferred filename --created .rds file (create inside rds_dir folder) Note: 'rds_prefix' 'data_prefix' logfile Optional: name (character string) prefix logfile written 'rds_dir'. Default NULL (log file written). Note: supply file path argument, error \"file found\" error. supply string; e.g., want my_log.log, supply 'my_log', my_log.log file appear rds_dir. impute Logical: data imputed? Default TRUE. impute_method 'impute' = TRUE, argument specify kind imputation desired. Options : * mode (default): Imputes frequent call. 
See bigsnpr::snp_fastImputeSimple() details. * random: Imputes sampling according allele frequencies. * mean0: Imputes rounded mean. * mean2: Imputes mean rounded 2 decimal places. * xgboost: Imputes using algorithm based local XGBoost models. See bigsnpr::snp_fastImpute() details. Note: can take several minutes, even relatively small data set. id_var String specifying column PLINK .fam file unique sample identifiers. Options \"IID\" (default) \"FID\" parallel Logical: computations within function run parallel? Defaults TRUE. See count_cores() ?bigparallelr::assert_cores details. particular, user aware much parallelization can make computations slower. quiet Logical: messages printed console silenced? Defaults FALSE overwrite Logical: existing .bk/.rds files exist specified directory/prefix, overwritten? Defaults FALSE. Set TRUE want change imputation method using, etc. Note: multiple .rds files names start \"std_prefix_...\", error . protect users accidentally deleting files saved results, one .rds file can removed option. ... Optional: additional arguments bigsnpr::snp_fastImpute() (relevant impute_method = \"xgboost\")","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"filepath '.rds' object created; see details explanation.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"Three files created location specified rds_dir: 'rds_prefix.rds': list three items: (1) X: filebacked bigmemory::big.matrix object pointing imputed genotype data. 
matrix type 'double', important downstream operations create_design() (2) map: data.frame PLINK 'bim' data (.e., variant information) (3) fam: data.frame PLINK 'fam' data (.e., pedigree information) 'prefix.bk': backingfile stores numeric data genotype matrix 'rds_prefix.desc'\" description file, needed Note process_plink() need run given set PLINK files; subsequent data analysis/scripts, get_data() access '.rds' file. example, see vignette processing PLINK files","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_data_files.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","title":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","text":"function read large file numeric file-backed matrix (FBM) Note: function wrapper bigstatsr::big_read()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_data_files.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","text":"","code":"read_data_files( data_file, data_dir, rds_dir, rds_prefix, outfile, overwrite, quiet, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/read_data_files.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","text":"data_file name file read, including directory. Directory specified data_dir data_dir path directory 'file' rds_dir path directory want create new '.rds' '.bk' files. 
Defaults data_dir rds_prefix String specifying user's preferred filename --created .rds/.bk files (create inside rds_dir folder) Note: 'rds_prefix' 'data_file' outfile Optional: name (character string) prefix logfile written. Defaults 'process_plink', i.e. get 'process_plink.log' outfile. overwrite Logical: existing .bk/.rds files exist specified directory/prefix, overwritten? Defaults FALSE. Set TRUE want change imputation method using, etc. quiet Logical: messages printed console? Defaults TRUE ... Optional: arguments passed bigmemory::read.big.matrix(). Note: 'sep' option pass.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_data_files.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","text":"'.rds', '.bk', '.desc' files created data_dir, obj (filebacked bigmemory big.matrix object) returned. 
See bigmemory documentation info big.matrix class.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_plink_files.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to read in PLINK files using bigsnpr methods — read_plink_files","title":"A function to read in PLINK files using bigsnpr methods — read_plink_files","text":"function read PLINK files using bigsnpr methods","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_plink_files.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to read in PLINK files using bigsnpr methods — read_plink_files","text":"","code":"read_plink_files( data_dir, data_prefix, rds_dir, outfile, parallel, overwrite, quiet )"},{"path":"https://pbreheny.github.io/plmmr/reference/read_plink_files.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to read in PLINK files using bigsnpr methods — read_plink_files","text":"data_dir path bed/bim/fam data files, without trailing \"/\" (e.g., use data_dir = '~/my_dir', data_dir = '~/my_dir/') data_prefix prefix (character string) bed/fam data files (e.g., prefix = 'mydata') rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir outfile Optional: name (character string) prefix logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. parallel Logical: computations within function run parallel? Defaults TRUE. See count_cores() ?bigparallelr::assert_cores details. particular, user aware much parallelization can make computations slower. overwrite Logical: existing .bk/.rds files exist specified directory/prefix, overwritten? Defaults FALSE. Set TRUE want change imputation method using, etc. quiet Logical: messages printed console? 
Defaults TRUE","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_plink_files.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to read in PLINK files using bigsnpr methods — read_plink_files","text":"'.rds' '.bk' files created data_dir, obj (bigSNP object) returned. See bigsnpr documentation info bigSNP class.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculate a relatedness matrix — relatedness_mat","title":"Calculate a relatedness matrix — relatedness_mat","text":"Given matrix genotypes, function estimates genetic relatedness matrix (GRM, also known RRM, see Hayes et al. 2009, doi:10.1017/S0016672308009981 ) among subjects: XX'/p, X standardized.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculate a relatedness matrix — relatedness_mat","text":"","code":"relatedness_mat(X, std = TRUE, fbm = FALSE, ns = NULL, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculate a relatedness matrix — relatedness_mat","text":"X n x p numeric matrix genotypes (fully-imputed data). Note: matrix include non-genetic features. std Logical: X standardized? set FALSE (can done data stored memory), good reason , standardization best practice. fbm Logical: X stored FBM? Defaults FALSE ns Optional vector values indicating indices nonsingular features ... 
optional arguments bigstatsr::big_apply() (like ncores = ...)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculate a relatedness matrix — relatedness_mat","text":"n x n numeric matrix capturing genomic relatedness samples represented X. notation, call matrix K 'kinship'; also known GRM RRM.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculate a relatedness matrix — relatedness_mat","text":"","code":"RRM <- relatedness_mat(X = admix$X) RRM[1:5, 1:5] #> [,1] [,2] [,3] [,4] [,5] #> [1,] 0.81268908 -0.09098097 -0.07888910 0.06770613 0.08311777 #> [2,] -0.09098097 0.81764801 0.20480021 0.02112812 -0.02640295 #> [3,] -0.07888910 0.20480021 0.82177986 -0.02864226 0.18693970 #> [4,] 0.06770613 0.02112812 -0.02864226 0.89327266 -0.03541470 #> [5,] 0.08311777 -0.02640295 0.18693970 -0.03541470 0.79589686"},{"path":"https://pbreheny.github.io/plmmr/reference/rotate_filebacked.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to rotate filebacked data — rotate_filebacked","title":"A function to rotate filebacked data — rotate_filebacked","text":"function rotate filebacked data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/rotate_filebacked.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to rotate filebacked data — rotate_filebacked","text":"","code":"rotate_filebacked(prep, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/rotate_filebacked.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to rotate filebacked data — rotate_filebacked","text":"list 4 items: stdrot_X: X rotated re-standardized scale rot_y: y rotated scale (numeric vector) stdrot_X_center: numeric vector values used 
center rot_X stdrot_X_scale: numeric vector values used scale rot_X","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/setup_lambda.html","id":null,"dir":"Reference","previous_headings":"","what":"Compute sequence of lambda values — setup_lambda","title":"Compute sequence of lambda values — setup_lambda","text":"function allows compute sequence lambda values plmm models.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/setup_lambda.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Compute sequence of lambda values — setup_lambda","text":"","code":"setup_lambda( X, y, alpha, lambda_min, nlambda, penalty_factor, intercept = TRUE )"},{"path":"https://pbreheny.github.io/plmmr/reference/setup_lambda.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Compute sequence of lambda values — setup_lambda","text":"X Rotated standardized design matrix includes intercept column present. May include clinical covariates non-SNP data. can either 'matrix' 'FBM' object. y Continuous outcome vector. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 not supported; alpha may arbitrarily small, not exactly 0. lambda_min smallest value lambda, fraction lambda.max. Default .001 number observations larger number covariates .05 otherwise. value lambda_min = 0 not supported. nlambda desired number lambda values sequence generated. penalty_factor multiplicative factor penalty applied coefficient. supplied, penalty_factor must numeric vector length equal number columns X. purpose penalty_factor apply differential penalization coefficients thought likely others model. particular, penalty_factor can 0, case coefficient always model without shrinkage. intercept Logical: X contain intercept column? 
Defaults TRUE.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/setup_lambda.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Compute sequence of lambda values — setup_lambda","text":"numeric vector lambda values, equally spaced log scale","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_filebacked.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to standardize a filebacked matrix — standardize_filebacked","title":"A helper function to standardize a filebacked matrix — standardize_filebacked","text":"helper function standardize filebacked matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_filebacked.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to standardize a filebacked matrix — standardize_filebacked","text":"","code":"standardize_filebacked( X, new_file, rds_dir, non_gen, complete_outcome, id_var, outfile, quiet, overwrite )"},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_filebacked.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to standardize a filebacked matrix — standardize_filebacked","text":"X list includes: (1) subset_X: big.matrix object subset &/additional predictors appended columns (2) ns: numeric vector indicating indices nonsingular columns subset_X new_file new_file (character string) bed/fam data files (e.g., new_file = 'mydata') rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir outfile Optional: name (character string) new_file logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. quiet Logical: messages printed console? Defaults FALSE (leaves print messages ...) 
overwrite Logical: existing .bk/.rds files exist specified directory/new_file, overwritten?","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_filebacked.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to standardize a filebacked matrix — standardize_filebacked","text":"list new component obj called 'std_X' - FBM column-standardized data. List also includes several indices/meta-data standardized matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to standardize matrices — standardize_in_memory","title":"A helper function to standardize matrices — standardize_in_memory","text":"helper function standardize matrices","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to standardize matrices — standardize_in_memory","text":"","code":"standardize_in_memory(X)"},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to standardize matrices — standardize_in_memory","text":"X matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to standardize matrices — standardize_in_memory","text":"list standardized matrix, vectors centering/scaling values, vector indices nonsingular columns","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"A helper function to standardize matrices — standardize_in_memory","text":"function adapted 
https://github.com/pbreheny/ncvreg/blob/master/R/std.R NOTE: function returns matrix memory. standardizing filebacked data, use big_std() – see src/big_standardize.cpp","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/subset_filebacked.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to subset big.matrix objects — subset_filebacked","title":"A helper function to subset big.matrix objects — subset_filebacked","text":"helper function subset big.matrix objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/subset_filebacked.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to subset big.matrix objects — subset_filebacked","text":"","code":"subset_filebacked(X, new_file, complete_samples, ns, rds_dir, outfile, quiet)"},{"path":"https://pbreheny.github.io/plmmr/reference/subset_filebacked.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to subset big.matrix objects — subset_filebacked","text":"X filebacked big.matrix --standardized design matrix new_file Optional user-specified new_file --created .rds/.bk files. complete_samples Numeric vector indices marking rows original data non-missing entry 6th column .fam file ns Numeric vector indices non-singular columns vector created handle_missingness() rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir outfile Optional: name (character string) new_file logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. quiet Logical: messages printed console? Defaults FALSE (leaves print messages ...)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/subset_filebacked.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to subset big.matrix objects — subset_filebacked","text":"list two components. 
First, big.matrix object, 'subset_X', representing design matrix wherein: rows subset according user's specification handle_missing_phen columns subset constant features remain – important standardization downstream list also includes integer vector 'ns' marks columns original matrix 'non-singular' (.e. constant features). 'ns' index plays important role plmm_format() untransform() (helper functions model fitting)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"A summary function for cv_plmm objects — summary.cv_plmm","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"summary function cv_plmm objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"","code":"# S3 method for class 'cv_plmm' summary(object, lambda = \"min\", ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"object cv_plmm object lambda regularization parameter value inference reported. Can choose numeric value, 'min', '1se'. Defaults 'min.' ... used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"return value object S3 class summary.cv_plmm. 
class print method contains following list elements: lambda_min: lambda value minimum cross validation error lambda.1se: maximum lambda value within 1 standard error minimum cross validation error penalty: penalty applied fitted model nvars: number non-zero coefficients selected lambda value cve: cross validation error folds min: minimum cross validation error fit: plmm fit used cross validation returnBiasDetails = TRUE, two items returned: bias: mean bias cross validation loss: loss value lambda","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cv_fit <- cv_plmm(design = admix_design) summary(cv_fit) #> lasso-penalized model with n=197 and p=101 #> At minimum cross-validation error (lambda=0.2168): #> ------------------------------------------------- #> Nonzero coefficients: 10 #> Cross-validation error (deviance): 2.12 #> Scale estimate (sigma): 1.455"},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"A summary method for the plmm objects — summary.plmm","title":"A summary method for the plmm objects — summary.plmm","text":"summary method plmm objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A summary method for the plmm objects — summary.plmm","text":"","code":"# S3 method for class 'plmm' summary(object, lambda, idx, eps = 1e-05, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A summary method for the plmm objects — summary.plmm","text":"object object class plmm lambda regularization parameter 
value inference reported. idx Alternatively, lambda may specified index; idx=10 means: report inference 10th value lambda along regularization path. lambda idx specified, lambda takes precedence. eps lambda given, eps tolerance difference given lambda value lambda value object. Defaults 1e-5 ... used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A summary method for the plmm objects — summary.plmm","text":"return value object S3 class summary.plmm. class print method contains following list elements: penalty: penalty used plmm (e.g. SCAD, MCP, lasso) n: Number instances/observations std_X_n: number observations standardized data; time differ 'n' data PLINK external data include samples p: Number regression coefficients (including intercept) converged: Logical indicator whether model converged lambda: lambda value inference reported lambda_char: formatted character string indicating lambda value nvars: number nonzero coefficients (, including intercept) value lambda nonzero: column names indicating nonzero coefficients model specified value lambda","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A summary method for the plmm objects — summary.plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design) summary(fit, idx = 97) #> lasso-penalized regression model with n=197, p=101 at lambda=0.00054 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 98 #> -------------------------------------------------"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform.html","id":null,"dir":"Reference","previous_headings":"","what":"Untransform coefficient values back 
to the original scale — untransform","title":"Untransform coefficient values back to the original scale — untransform","text":"function unwinds initial standardization data obtain coefficient values original scale. called plmm_format().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Untransform coefficient values back to the original scale — untransform","text":"","code":"untransform( std_scale_beta, p, std_X_details, fbm_flag, plink_flag, use_names = TRUE )"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Untransform coefficient values back to the original scale — untransform","text":"std_scale_beta estimated coefficients standardized scale p number columns original design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' fbm_flag Logical: corresponding design matrix filebacked? plink_flag Logical: data come PLINK files? Note: flag matters non-genomic features handled PLINK files – data PLINK files, unpenalized columns counted p argument. delimited files, p include unpenalized columns. difference implications untransform() function determines appropriate dimensions estimated coefficient matrix returns. use_names Logical: names added? Defaults TRUE. 
Set FALSE inside cvf() helper, 'ns' vary within CV folds.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Untransform coefficient values back to the original scale — untransform","text":"matrix estimated coefficients, 'beta_vals', scale original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_delim.html","id":null,"dir":"Reference","previous_headings":"","what":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","text":"function unwinds initial standardization data obtain coefficient values original scale. called plmm_format().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_delim.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","text":"","code":"untransform_delim( std_scale_beta, p, std_X_details, plink_flag, use_names = TRUE )"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_delim.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","text":"std_scale_beta estimated coefficients standardized scale p number columns original design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' use_names Logical: names added? Defaults TRUE. 
Set FALSE inside cvf() helper, 'ns' vary within CV folds.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_delim.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","text":"matrix estimated coefficients, 'beta_vals', scale original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_in_memory.html","id":null,"dir":"Reference","previous_headings":"","what":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","title":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","text":"function unwinds initial standardization data obtain coefficient values original scale. called plmm_format().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_in_memory.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","text":"","code":"untransform_in_memory(std_scale_beta, p, std_X_details, use_names = TRUE)"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_in_memory.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","text":"std_scale_beta estimated coefficients standardized scale p number columns original design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' use_names Logical: names added? Defaults TRUE. 
Set FALSE inside cvf() helper, 'ns' vary within CV folds.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_in_memory.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","text":"matrix estimated coefficients, 'beta_vals', scale original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_plink.html","id":null,"dir":"Reference","previous_headings":"","what":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","text":"function unwinds initial standardization data obtain coefficient values original scale. called plmm_format().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_plink.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","text":"","code":"untransform_plink( std_scale_beta, p, std_X_details, plink_flag, use_names = TRUE )"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_plink.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","text":"std_scale_beta estimated coefficients standardized scale p number columns original design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' use_names Logical: names added? Defaults TRUE. 
Set FALSE inside cvf() helper, 'ns' vary within CV folds.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_plink.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","text":"matrix estimated coefficients, 'beta_vals', scale original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":null,"dir":"Reference","previous_headings":"","what":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"Linux/Unix MacOS , companion function unzip .gz files ship plmmr package","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"","code":"unzip_example_data(outdir)"},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"outdir file path directory .gz files written","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"Nothing returned; PLINK files ship plmmr package stored directory 
specified 'outdir'","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"example function, look vignette('plink_files', package = \"plmmr\"). Note: function will not work Windows systems - Linux/Unix MacOS only.","code":""},{"path":[]},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"bug-fixes-4-2-0","dir":"Changelog","previous_headings":"","what":"Bug fixes","title":"plmmr 4.2.0 (2024-12-13)","text":"recently caught couple bugs model fitting functions – apologize errors may caused downstream analysis, explain addressed issues : Bug BLUP: caught mathematical error earlier implementation best linear unbiased prediction. issue inconsistency scaling among terms used constructing predictor. issue impacted prediction within cross-validation well predict() method plmm class. recommend users used best linear unbiased prediction (BLUP) previous analysis re-run analysis using corrected version. Bug processing delimited files: noticed bug way models fit data delimited files. previous version not correctly implementing transformation model results standardized scale original scale due inadvertent addition two rows beta_vals object (one row added, intercept). error corrected. recommend users used previous version plmmr analyze data delimited files re-run analyses.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"other-changes-4-2-0","dir":"Changelog","previous_headings":"","what":"Other changes","title":"plmmr 4.2.0 (2024-12-13)","text":"Change default settings prediction: default prediction method predict() cv_plmm() now ‘blup’ (best linear unbiased prediction). 
Change objects returned default plmm(): default, main model fitting function plmm() now returns std_X (copy standardized design matrix) , y (outcome vector used fit model), std_scale_beta (estimated coefficients standardized scale). components used construct best linear unbiased predictor. user can opt return items using return_fit = FALSE compact_save options. Change arguments passed predict(): tandem change returned plmm() default, predict() method longer needs separate X y argument supplied type = 'blup'. components needed BLUP returned default plmm. Note predict() still early stages development filebacked data; given complexities particularities filebacked data processed (particularly data constant features), edge cases predict() method handle yet. continue work developing method; now, example predict() filebacked data vignette delimited data. Note particular example delimited data, constant features design matrix.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-410-2024-10-23","dir":"Changelog","previous_headings":"","what":"plmmr 4.1.0 (2024-10-23)","title":"plmmr 4.1.0 (2024-10-23)","text":"CRAN release: 2024-10-23 Restore plmm(X,y) syntax: version 4.0.0 required create_design() always called prior plmm() cv_plmm(); update restores X,y syntax consistent packages (e.g., glmnet, ncvreg). Note syntax available case design matrix stored -memory matrix data.frame object. create_design() function still required cases design matrix/dataset stored external file. Bug fix: 4.0.0 version create_design() required X column names, errored uninformative message names supplied (see issue 61). now fixed – column names required unless user wants specify argument unpen. Argument name change: create_design(), argument specify outcome -memory case renamed y; makes syntax consistent, e.g., create_design(X, y). Note change relevant -memory data . 
Internal: Fixed LTO type mismatch bug.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-400-2024-10-07","dir":"Changelog","previous_headings":"","what":"plmmr 4.0.0 (2024-10-07)","title":"plmmr 4.0.0 (2024-10-07)","text":"CRAN release: 2024-10-11 Major re-structuring preprocessing pipeline: Data external files must now processed process_plink() process_delim(). data (including -memory data) must prepared analysis via create_design(). change ensures data funneled uniform format analysis. Documentation updated: vignettes package now revised include examples complete pipeline new create_design() syntax. article type data input (matrix/data.frame, delimited file, PLINK). CRAN: package CRAN now.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-320-2024-09-02","dir":"Changelog","previous_headings":"","what":"plmmr 3.2.0 (2024-09-02)","title":"plmmr 3.2.0 (2024-09-02)","text":"bigsnpr now Suggests, Imports: essential filebacking support now done bigmemory bigalgebra. bigsnpr package used processing PLINK files. dev branch gwas_scale version pipeline runs completely file-backed.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-310-2024-07-13","dir":"Changelog","previous_headings":"","what":"plmmr 3.1.0 (2024-07-13)","title":"plmmr 3.1.0 (2024-07-13)","text":"Enhancement: make plmmr better functionality writing scripts, functions process_plink(), plmmm(), cv_plmm() now (optionally) write ‘.log’ files, PLINK. Enhancement: cases users working large datasets, may practical desirable results returned plmmm() cv_plmm() saved single ‘.rds’ file. now option model fitting functions called ‘compact_save’, gives users option save output multiple, smaller ‘.rds’ files. 
Argument removed: Argument std_needed longer available plmm() cv_plmm() functions.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-300-2024-06-27","dir":"Changelog","previous_headings":"","what":"plmmr 3.0.0 (2024-06-27)","title":"plmmr 3.0.0 (2024-06-27)","text":"Bug fix: Cross-validation implementation issues fixed. Previously, full set eigenvalues used inside CV folds, ideal involves information outside fold. Now, entire modeling process cross-validated: standardization, eigendecomposition relatedness matrix, model fitting, backtransformation onto original scale prediction. Computational speedup: standardization rotation filebacked data now much faster; bigalgebra bigmemory now used computations. Internal: standardized scale, intercept PLMM mean outcome. derivation considerably simplifies handling intercept internally model fitting.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-221-2024-03-16","dir":"Changelog","previous_headings":"","what":"plmmr 2.2.1 (2024-03-16)","title":"plmmr 2.2.1 (2024-03-16)","text":"Name change: Changed package name plmmr; note plmm(), cv_plmm(), functions starting plmm_ changed names.","code":""}] +[{"path":"https://pbreheny.github.io/plmmr/articles/delim_files.html","id":"process-the-data","dir":"Articles","previous_headings":"","what":"Process the data","title":"If your data is in a delimited file","text":"output messages indicate data processed. call created 2 files, one .rds file corresponding .bk file. .bk file special type binary file can used store large data sets. .rds file contains pointer .bk file, along meta-data. 
Note that what is returned by process_delim() is a character string filepath: .","code":"# I will create the processed data files in a temporary directory; # fill in the `rds_dir` argument with the directory of your choice temp_dir <- tempdir() colon_dat <- process_delim(data_file = \"colon2.txt\", data_dir = find_example_data(parent = TRUE), rds_dir = temp_dir, rds_prefix = \"processed_colon2\", sep = \"\\t\", overwrite = TRUE, header = TRUE) #> There are 62 observations and 2001 features in the specified data files. #> At this time, plmmr::process_delim() does not not handle missing values in delimited data. #> Please make sure you have addressed missingness before you proceed. #> #> process_plink() completed #> Processed files now saved as /tmp/RtmpZ5yqi0/processed_colon2.rds # look at what is created colon <- readRDS(colon_dat)"},{"path":"https://pbreheny.github.io/plmmr/articles/delim_files.html","id":"create-a-design","dir":"Articles","previous_headings":"","what":"Create a design","title":"If your data is in a delimited file","text":"Creating a design ensures that the data are in a uniform format prior to analysis. For delimited files, two main processes happen in create_design(): (1) standardization of the columns and (2) construction of a penalty factor vector. Standardizing the columns ensures that all features are evaluated by the model on a uniform scale; this is done by transforming each column of the design matrix to have mean 0 and variance 1. The penalty factor vector is an indicator vector in which a 0 represents a feature that is always in the model – that feature is unpenalized. To specify which columns you want to be unpenalized, use the ‘unpen’ argument. In this example, we are choosing to make ‘sex’ an unpenalized covariate. A side note on unpenalized covariates: for delimited file data, all features you want to include in the model – both penalized and unpenalized features – must be included in the delimited file. This differs from how PLINK file data are analyzed; look at the create_design() documentation for details and examples. As with process_delim(), the create_design() function returns a filepath: . The output messages document the steps of the create design procedure, and these messages are saved to a text file colon_design.log in the rds_dir folder. 
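As a rough in-memory analogue of those two steps (plmmr does this filebacked; `X` here is an arbitrary numeric matrix and 'sex' an illustrative column name):

```r
# (1) column standardization: each column to mean 0, variance 1
std_X <- scale(X)

# (2) penalty factor: 0 = always in the model (unpenalized), 1 = penalized
penalty_factor <- as.numeric(colnames(X) != "sex")
```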
didactic purposes, can look design:","code":"# prepare outcome data colon_outcome <- read.delim(find_example_data(path = \"colon2_outcome.txt\")) # create a design colon_design <- create_design(data_file = colon_dat, rds_dir = temp_dir, new_file = \"std_colon2\", add_outcome = colon_outcome, outcome_id = \"ID\", outcome_col = \"y\", unpen = \"sex\", # this will keep 'sex' in the final model logfile = \"colon_design\") #> No feature_id supplied; will assume data X are in same row-order as add_outcome. #> There are 0 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... #> Standardization completed at 2024-12-17 19:29:26 #> Done with standardization. File formatting in progress # look at the results colon_rds <- readRDS(colon_design) str(colon_rds) #> List of 18 #> $ X_colnames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> $ X_rownames : chr [1:62] \"row1\" \"row2\" \"row3\" \"row4\" ... #> $ n : num 62 #> $ p : num 2001 #> $ is_plink : logi FALSE #> $ outcome_idx : int [1:62] 1 2 3 4 5 6 7 8 9 10 ... #> $ y : int [1:62] 1 0 1 0 1 0 1 0 1 0 ... #> $ std_X_rownames: chr [1:62] \"row1\" \"row2\" \"row3\" \"row4\" ... #> $ unpen : int 1 #> $ unpen_colnames: chr \"sex\" #> $ ns : int [1:2001] 1 2 3 4 5 6 7 8 9 10 ... #> $ std_X_colnames: chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> $ std_X :Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"std_colon2.bk\" #> .. .. ..$ dirname : chr \"/tmp/RtmpZ5yqi0/\" #> .. .. ..$ totalRows : int 62 #> .. .. ..$ totalCols : int 2001 #> .. .. ..$ rowOffset : num [1:2] 0 62 #> .. .. ..$ colOffset : num [1:2] 0 2001 #> .. .. ..$ nrow : num 62 #> .. .. ..$ ncol : num 2001 #> .. .. ..$ rowNames : NULL #> .. .. 
..$ colNames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> .. .. ..$ type : chr \"double\" #> .. .. ..$ separated : logi FALSE #> $ std_X_n : num 62 #> $ std_X_p : num 2001 #> $ std_X_center : num [1:2001] 1.47 7015.79 4966.96 4094.73 3987.79 ... #> $ std_X_scale : num [1:2001] 0.499 3067.926 2171.166 1803.359 2002.738 ... #> $ penalty_factor: num [1:2001] 0 1 1 1 1 1 1 1 1 1 ... #> - attr(*, \"class\")= chr \"plmm_design\""},{"path":"https://pbreheny.github.io/plmmr/articles/delim_files.html","id":"fit-a-model","dir":"Articles","previous_headings":"","what":"Fit a model","title":"If your data is in a delimited file","text":"fit model using design follows: Notice messages printed – documentation may optionally saved another .log file using logfile argument. can examine results specific \\lambda value: may also plot paths estimated coefficients:","code":"colon_fit <- plmm(design = colon_design, return_fit = TRUE, trace = TRUE) #> Note: The design matrix is being returned as a file-backed big.matrix object -- see bigmemory::big.matrix() documentation for details. #> Reminder: the X that is returned here is column-standardized #> Input data passed all checks at 2024-12-17 19:29:26 #> Starting decomposition. #> Calculating the eigendecomposition of K #> Eigendecomposition finished at 2024-12-17 19:29:26 #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-17 19:29:26 #> Setting up lambda/preparing for model fitting. #> Beginning model fitting. #> Model fitting finished at 2024-12-17 19:29:27 #> Beta values are estimated -- almost done! #> Formatting results (backtransforming coefs. to original scale). 
#> Model ready at 2024-12-17 19:29:27 summary(colon_fit, idx = 50) #> lasso-penalized regression model with n=62, p=2002 at lambda=0.0597 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 30 #> ------------------------------------------------- plot(colon_fit)"},{"path":"https://pbreheny.github.io/plmmr/articles/delim_files.html","id":"prediction-for-filebacked-data","dir":"Articles","previous_headings":"","what":"Prediction for filebacked data","title":"If your data is in a delimited file","text":"This example shows an experimental option, wherein we are working to add a prediction method for filebacked data outside of cross-validation.","code":"# linear predictor yhat_lp <- predict(object = colon_fit, newX = attach.big.matrix(colon$X), type = \"lp\") # best linear unbiased predictor yhat_blup <- predict(object = colon_fit, newX = attach.big.matrix(colon$X), type = \"blup\") # look at mean squared prediction error mspe_lp <- apply(yhat_lp, 2, function(c){crossprod(colon_outcome$y - c)/length(c)}) mspe_blup <- apply(yhat_blup, 2, function(c){crossprod(colon_outcome$y - c)/length(c)}) min(mspe_lp) #> [1] 0.007659158 min(mspe_blup) #> [1] 0.00617254"},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Getting started with plmmr","text":"plmmr is a package for fitting Penalized Linear Mixed Models in R. The package was created for the purpose of fitting penalized regression models to high dimensional data in which the observations are correlated. For instance, this kind of data arises often in the context of genetics (e.g., GWAS with population structure and/or family grouping). The novelties of plmmr are: Integration: plmmr combines the functionality of several packages in order to do quality control, model fitting/analysis, and data visualization in one package. For example, with GWAS data, plmmr takes you from PLINK files all the way to a list of SNPs for downstream analysis. 
Accessibility: plmmr can be run in an R session on a typical desktop or laptop computer. The user does not need access to a supercomputer or experience with the command line in order to fit models with plmmr. Handling correlation: plmmr uses a transformation that (1) measures the correlation among samples and (2) uses that correlation measurement to improve predictions (via the best linear unbiased predictor, BLUP). This means that with plmm(), there’s no need to filter data down to a ‘maximum subset of unrelated samples.’","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"minimal-example","dir":"Articles","previous_headings":"","what":"Minimal example","title":"Getting started with plmmr","text":"Here is a minimal reproducible example of how plmmr can be used:","code":"# library(plmmr) fit <- plmm(admix$X, admix$y) # admix data ships with package plot(fit) cvfit <- cv_plmm(admix$X, admix$y) plot(cvfit) summary(cvfit) #> lasso-penalized model with n=197 and p=101 #> At minimum cross-validation error (lambda=0.2325): #> ------------------------------------------------- #> Nonzero coefficients: 8 #> Cross-validation error (deviance): 2.12 #> Scale estimate (sigma): 1.455"},{"path":[]},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"file-backing","dir":"Articles","previous_headings":"Computational capability","what":"File-backing","title":"Getting started with plmmr","text":"In many applications of high dimensional data analysis, the dataset is too large to be read into R – the session will crash for lack of memory. This is particularly common when analyzing data from genome-wide association studies (GWAS). To analyze such large datasets, plmmr is equipped to analyze data using filebacking - a strategy that lets R ‘point’ to a file on disk, rather than reading that file into the R session. Many packages use this technique - bigstatsr and biglasso are two examples of packages that use the filebacking technique. The package plmmr uses to create and store filebacked objects is bigmemory. Our filebacked computation relies on the biglasso package by Yaohui Zeng et al. and bigalgebra by Michael Kane et al. 
For processing PLINK files, we use methods from the bigsnpr package by Florian Privé.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"numeric-outcomes-only","dir":"Articles","previous_headings":"Computational capability","what":"Numeric outcomes only","title":"Getting started with plmmr","text":"At this time, the package is designed for linear regression – that is, we are considering only continuous (numeric) outcomes. We maintain that treating binary outcomes as numeric values is appropriate in some contexts, as described by Hastie et al. in Elements of Statistical Learning, chapter 4. In the future, we would like to extend the package to handle dichotomous outcomes via logistic regression; the theoretical work underlying this is an open problem.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"types-of-penalization","dir":"Articles","previous_headings":"Computational capability","what":"3 types of penalization","title":"Getting started with plmmr","text":"Since this is focused on being a penalized regression package, plmmr offers 3 choices of penalty: the minimax concave penalty (MCP), the smoothly clipped absolute deviation (SCAD) penalty, and the least absolute shrinkage and selection operator (LASSO). Our implementation of these penalties is built on concepts/techniques provided by the ncvreg package.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"data-size-and-dimensionality","dir":"Articles","previous_headings":"Computational capability","what":"Data size and dimensionality","title":"Getting started with plmmr","text":"We distinguish between the data attributes ‘big’ and ‘high dimensional.’ ‘Big’ describes the amount of space a dataset takes up on a computer, while ‘high dimensional’ describes a context in which the ratio of features (also called ‘variables’ or ‘predictors’) to observations (e.g., samples) is high. For instance, data with 100 samples and 100 variables is high dimensional, but not big. By contrast, data with 10 million observations and 100 variables is big, but not high dimensional. plmmr is optimized for data that is high dimensional – the methods we use to estimate relatedness among observations perform best when there is a high number of features relative to the number of observations. 
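Assuming plmm() follows the ncvreg-style ‘penalty’ argument convention (an assumption; check the function documentation), choosing among the three penalties described above might look like:

```r
fit_mcp   <- plmm(admix$X, admix$y, penalty = "MCP")
fit_scad  <- plmm(admix$X, admix$y, penalty = "SCAD")
fit_lasso <- plmm(admix$X, admix$y, penalty = "lasso")
```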
plmmr is also designed to accommodate data that is too large to analyze in-memory. We accommodate such data through file-backing (described above). The current analysis pipeline works well for data files up to about 40 Gb in size. In practice, this means that plmmr is equipped to analyze GWAS data, but not biobank-sized data.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"data-input-types","dir":"Articles","previous_headings":"","what":"Data input types","title":"Getting started with plmmr","text":"plmmr currently works with three types of data input: Data stored in an in-memory matrix or data frame Data stored in PLINK files Data stored in delimited files","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/getting-started.html","id":"example-data-sets","dir":"Articles","previous_headings":"Data input types","what":"Example data sets","title":"Getting started with plmmr","text":"plmmr currently includes three example data sets, one for each type of data input. The admix data is an example of matrix input data. admix is a small data set (197 observations, 100 SNPs) that describes individuals of different ancestry groups. The outcome of admix is simulated to include population structure effects (i.e. race/ethnicity have an impact on SNP associations). This data set is available whenever library(plmmr) is called. An example analysis of the admix data is available in vignette('matrix_data', package = \"plmmr\"). The penncath_lite data is an example of PLINK input data. penncath_lite (data on coronary artery disease from the PennCath study) is a high dimensional data set (1401 observations, 4217 SNPs) with several health outcomes as well as age and sex information. The features in this data set represent a small subset of a much larger GWAS data set (the original data has 800K SNPs). For more information on this data set, refer to the original publication. An example analysis of the penncath_lite data is available in vignette('plink_files', package = \"plmmr\"). The colon2 data is an example of delimited-file input data. colon2 is a variation of the colon data included in the biglasso package. colon2 has 62 observations and 2,001 features representing a study of colon disease. 2000 of the features come from the original data, and the ‘sex’ feature is simulated. 
An example analysis of the colon2 data is available in vignette('delim_files', package = \"plmmr\").","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/matrix_data.html","id":"basic-model-fitting","dir":"Articles","previous_headings":"","what":"Basic model fitting","title":"If your data is in a matrix or data frame","text":"The admix dataset is now ready to analyze with a call to plmmr::plmm() (one of the main functions in plmmr): Notice: we are passing admix$X to the design argument of plmm(); internally, plmm() has taken the X input and created a plmm_design object. You could also supply X and y to create_design() to make this step explicit. The returned beta_vals item is a matrix whose rows are the \hat\beta coefficients and whose columns represent values of the penalization parameter \lambda. By default, plmm fits 100 values of \lambda (see the setup_lambda function for details). Note that for all values of \lambda, SNP 8 has \hat \beta = 0. This is because SNP 8 is a constant feature, a feature (i.e., a column of \mathbf{X}) whose values do not vary among the members of this population. We can summarize the fit at the nth \lambda value: We can also plot the path of the fit to see how the model coefficients vary with \lambda: Plot of the path of the model fit Suppose we also know the ancestry groups with which each person in the admix data self-identified. We would probably want to include this in the model as an unpenalized covariate (i.e., we want ‘ancestry’ to always be in the model). To specify an unpenalized covariate, we need to use the create_design() function prior to calling plmm(). 
look: may compare results model includes ‘ancestry’ first model:","code":"admix_fit <- plmm(admix$X, admix$y) summary(admix_fit, lambda = admix_fit$lambda[50]) #> lasso-penalized regression model with n=197, p=101 at lambda=0.01426 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 88 #> ------------------------------------------------- admix_fit$beta_vals[1:10, 97:100] |> knitr::kable(digits = 3, format = \"html\") # for n = 25 summary(admix_fit, lambda = admix_fit$lambda[25]) #> lasso-penalized regression model with n=197, p=101 at lambda=0.08163 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 46 #> ------------------------------------------------- plot(admix_fit) # add ancestry to design matrix X_plus_ancestry <- cbind(admix$ancestry, admix$X) # adjust column names -- need these for designating 'unpen' argument colnames(X_plus_ancestry) <- c(\"ancestry\", colnames(admix$X)) # create a design admix_design2 <- create_design(X = X_plus_ancestry, y = admix$y, # below, I mark ancestry variable as unpenalized # we want ancestry to always be in the model unpen = \"ancestry\") # now fit a model admix_fit2 <- plmm(design = admix_design2) summary(admix_fit2, idx = 25) #> lasso-penalized regression model with n=197, p=102 at lambda=0.09886 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 14 #> ------------------------------------------------- plot(admix_fit2)"},{"path":"https://pbreheny.github.io/plmmr/articles/matrix_data.html","id":"cross-validation","dir":"Articles","previous_headings":"","what":"Cross validation","title":"If your data is in a matrix or data frame","text":"select \\lambda value, often use cross validation. 
example using cv_plmm select \\lambda minimizes cross-validation error: can also plot cross-validation error (CVE) versus \\lambda (log scale): Plot CVE","code":"admix_cv <- cv_plmm(design = admix_design2, return_fit = T) admix_cv_s <- summary(admix_cv, lambda = \"min\") print(admix_cv_s) #> lasso-penalized model with n=197 and p=102 #> At minimum cross-validation error (lambda=0.1853): #> ------------------------------------------------- #> Nonzero coefficients: 3 #> Cross-validation error (deviance): 1.33 #> Scale estimate (sigma): 1.154 plot(admix_cv)"},{"path":"https://pbreheny.github.io/plmmr/articles/matrix_data.html","id":"predicted-values","dir":"Articles","previous_headings":"","what":"Predicted values","title":"If your data is in a matrix or data frame","text":"example predict() methods PLMMs: can compare predictions predictions get intercept-model using mean squared prediction error (MSPE) – lower better: see model better predictions null.","code":"# make predictions for select lambda value(s) y_hat <- predict(object = admix_fit, newX = admix$X, type = \"blup\", X = admix$X, y = admix$y) # intercept-only (or 'null') model crossprod(admix$y - mean(admix$y))/length(admix$y) #> [,1] #> [1,] 5.928528 # our model at its best value of lambda apply(y_hat, 2, function(c){crossprod(admix$y - c)/length(c)}) -> mse min(mse) #> [1] 0.6930826 # ^ across all values of lambda, our model has MSPE lower than the null model"},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"math-notation","dir":"Articles","previous_headings":"","what":"Math notation","title":"Notes on notation","text":"concepts need denote, order usage derivations. 
These are blocked into sections corresponding to the steps of the model fitting process.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"statistical-model-the-overall-framework","dir":"Articles","previous_headings":"Math notation","what":"Statistical model (the overall framework)","title":"Notes on notation","text":"The overall model can be written as \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\boldsymbol{\gamma} + \boldsymbol{\epsilon} or equivalently as \mathbf{y} = \dot{\mathbf{X}}\dot{\boldsymbol{\beta}} + \mathbf{u} + \boldsymbol{\epsilon} where: \mathbf{X} and \mathbf{y} are the n \times p design matrix of data and the n \times 1 vector of outcomes, respectively. Here, n is the number of observations (e.g., the number of patients, number of samples, etc.) and p is the number of features (e.g., the number of SNPs, number of variables, number of covariates, etc.). \dot{\mathbf{X}} is the column-standardized \mathbf{X}, in which all p columns have mean 0 and standard deviation 1. Note: \dot{\mathbf{X}} excludes singular features (columns that are constants) of the original \mathbf{X}. \dot{\boldsymbol{\beta}} represents the coefficients on the standardized scale. \mathbf{Z} is an n \times b matrix of indicators corresponding to a grouping structure, and \boldsymbol{\gamma} is the vector of values describing how that grouping is associated with \mathbf{y}. In real data, these values are typically unknown. \boldsymbol{\epsilon} is the n \times 1 vector of noise. 
We define the realized (empirical) relatedness matrix as \mathbf{K} \equiv \frac{1}{p}\dot{\mathbf{X}}\dot{\mathbf{X}}^\top The model assumes: \boldsymbol{\epsilon} \perp \mathbf{u} \boldsymbol{\epsilon} \sim N(0, \sigma^2_{\epsilon}\mathbf{I}) \mathbf{u} \sim N(0, \sigma^2_{s}\mathbf{K}) Under these assumptions, we may write \mathbf{y} \sim N(\dot{\mathbf{X}}\dot{\boldsymbol{\beta}}, \boldsymbol{\Sigma}) Indices: i \in 1,..., n indexes observations j \in 1,..., p indexes features h \in 1,..., b indexes batches (e.g., different family groups, different data collection sites, etc.)","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"decomposition-and-rotation-prep-and-first-part-of-fit","dir":"Articles","previous_headings":"Math notation","what":"Decomposition and rotation (prep and first part of fit)","title":"Notes on notation","text":"Beginning with the eigendecomposition, where \mathbf{U} and \mathbf{s} are the eigenvectors and eigenvalues of \mathbf{K}, one can obtain \text{eigen}(\mathbf{K}) \equiv \mathbf{U}\mathbf{S}\mathbf{U}^\top. The elements of \mathbf{s} are the diagonal values of \mathbf{S}. Note, the random effect \mathbf{u} is distinct from the columns of the matrix \mathbf{U}. k represents the number of nonzero eigenvalues represented in \mathbf{U} and \mathbf{d}, where k \leq \text{min}(n,p). Again, \mathbf{K} \equiv \frac{1}{p}\dot{\mathbf{X}}\dot{\mathbf{X}}^{\top} is often referred to in the literature as the realized relatedness matrix (RRM) or genomic relatedness matrix (GRM). \mathbf{K} has dimension n \times n. \eta is the ratio \frac{\sigma^2_s}{\sigma^2_e + \sigma^2_s}. We estimate \hat{\eta} from the null model (details to come). \mathbf{\Sigma} is the variance of the outcome, \mathbb{V}({\mathbf{y}}) \propto \eta \mathbf{K} + (1 - \eta)\mathbf{I}_n. \mathbf{w} is the vector of weights defined as (\eta\mathbf{\mathbf{s}} + (1-\eta))^{-1/2}. The values of \mathbf{w} are the nonzero values of the diagonal matrix \mathbf{W} \equiv (\eta\mathbf{S} + (1 - \eta)\mathbf{I})^{-1/2}. 
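Putting the definitions above together, the proportionality for \mathbf{\Sigma} follows directly from the two variance components:

```latex
\mathbb{V}(\mathbf{y}) = \sigma^2_s \mathbf{K} + \sigma^2_{\epsilon}\mathbf{I}_n
  = (\sigma^2_s + \sigma^2_{\epsilon})\bigl(\eta\mathbf{K} + (1-\eta)\mathbf{I}_n\bigr),
\qquad \eta \equiv \frac{\sigma^2_s}{\sigma^2_s + \sigma^2_{\epsilon}}
```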
The matrix used for rotating (preconditioning) the data is \mathbf{\Sigma}^{-1/2} \equiv \mathbf{W}\mathbf{U}^\top. \tilde{\dot{\mathbf{X}}} \equiv \mathbf{W}\mathbf{U}^\top\dot{\mathbf{X}} is the rotated data, i.e., the data transformed onto the rotated scale. \tilde{\mathbf{y}} \equiv \mathbf{\Sigma}^{-1/2}\mathbf{y} is the outcome on the rotated scale. \tilde{\ddot{\mathbf{X}}} is the standardized rotated data. Note: this standardization involves scaling, not centering. The post-rotation standardization impacts the estimated coefficients as well; we define {\ddot{\boldsymbol{\beta}}} as the estimated coefficients on this scale.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"model-fitting-with-penalization","dir":"Articles","previous_headings":"Math notation","what":"Model fitting with penalization","title":"Notes on notation","text":"We fit \tilde{\mathbf{y}} \sim \tilde{\ddot{\mathbf{X}}} using a penalized linear mixed model, and obtain \hat{\ddot{\boldsymbol{\beta}}}, the estimated coefficients. The penalty parameter values (e.g., the values of the lasso tuning parameter) are indexed by \lambda_l where l \in 1,..., t.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"rescaling-results-format","dir":"Articles","previous_headings":"Math notation","what":"Rescaling results (format)","title":"Notes on notation","text":"To obtain the estimated coefficients on the original scale, the values estimated by the model must be unscaled (‘untransformed’) twice: once to adjust for the post-rotation standardization, and once to adjust for the pre-rotation standardization. 
This process can be written \hat{\ddot{\boldsymbol{\beta}}} \rightarrow \hat{\dot{\boldsymbol{\beta}}} \rightarrow \hat{\boldsymbol{\beta}}.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/notation.html","id":"object-names-in-source-code","dir":"Articles","previous_headings":"","what":"Object names in source code","title":"Notes on notation","text":"In the code, we denote these objects this way: \mathbf{X} and \mathbf{y} are X and y \dot{\mathbf{X}} is std_X \tilde{\dot{\mathbf{X}}} is rot_X \ddot{\tilde{\mathbf{X}}} is stdrot_X \hat{\boldsymbol{\beta}} is named og_scale_beta in helper functions (for clarity) and is returned in plmm objects as beta_vals. beta_vals and og_scale_beta are equivalent; both represent the estimated coefficients on the original scale. \hat{\dot{\boldsymbol{\beta}}} is std_scale_beta \hat{\ddot{\boldsymbol{\beta}}} is stdrot_scale_beta \dot{\mathbf{X}}\hat{\dot{\boldsymbol{\beta}}} is Xb \ddot{\tilde{\mathbf{X}}} \hat{\ddot{\boldsymbol{\beta}}} is linear_predictors. Note: in other words, this means that linear_predictors in the code is on the scale of the rotated and re-standardized data! \hat{\boldsymbol{\Sigma}} \equiv \hat{\eta}\mathbf{K} + (1 - \hat{\eta})\mathbf{I} is estimated_Sigma. Similarly, \hat{\boldsymbol{\Sigma}}_{11} is Sigma_11, etc.","code":""},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"processing-plink-files","dir":"Articles","previous_headings":"","what":"Processing PLINK files","title":"If your data is in PLINK files","text":"First, unzip the PLINK files if they are zipped. For the example data, the penncath_lite data ships with plmmr zipped; on MacOS and Linux, you can run this command to unzip: For GWAS data, we tell plmmr to combine the information across all three PLINK files (the .bed, .bim, and .fam files). This is done with process_plink(). Below, we create the files we want in a temporary directory, just for the sake of this example. Users can specify the folder of their choice with rds_dir, as shown: You’ll see a lot of messages printed to the console … the result is the creation of 3 files: imputed_penncath_lite.rds and imputed_penncath_lite.bk contain the data. 1 shows the folder with the PLINK data . What is returned is a filepath. 
.rds object filepath contains processed data, now use create design. didactic purposes, let’s examine ’s imputed_penncath_lite.rds using readRDS() function (Note Don’t analysis - section reads data memory. just illustration):","code":"temp_dir <- tempdir() # using a temp dir -- change to fit your preference unzip_example_data(outdir = temp_dir) #> Unzipped files are saved in /tmp/RtmpzM7QaI # temp_dir <- tempdir() # using a temporary directory (if you didn't already create one above) plink_data <- process_plink(data_dir = temp_dir, data_prefix = \"penncath_lite\", rds_dir = temp_dir, rds_prefix = \"imputed_penncath_lite\", # imputing the mode to address missing values impute_method = \"mode\", # overwrite existing files in temp_dir # (you can turn this feature off if you need to) overwrite = TRUE, # turning off parallelization - # leaving this on causes problems knitting this vignette parallel = FALSE) #> #> Preprocessing penncath_lite data: #> Creating penncath_lite.rds #> #> There are 1401 observations and 4367 genomic features in the specified data files, representing chromosomes 1 - 22 #> There are a total of 3514 SNPs with missing values #> Of these, 13 are missing in at least 50% of the samples #> #> Imputing the missing (genotype) values using mode method #> #> process_plink() completed #> Processed files now saved as /tmp/RtmpzM7QaI/imputed_penncath_lite.rds pen <- readRDS(plink_data) # notice: this is a `processed_plink` object str(pen) # note: genotype data is *not* in memory #> List of 5 #> $ X :Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"imputed_penncath_lite.bk\" #> .. .. ..$ dirname : chr \"/tmp/RtmpzM7QaI/\" #> .. .. ..$ totalRows : int 1401 #> .. .. ..$ totalCols : int 4367 #> .. .. ..$ rowOffset : num [1:2] 0 1401 #> .. .. ..$ colOffset : num [1:2] 0 4367 #> .. .. ..$ nrow : num 1401 #> .. .. 
..$ ncol : num 4367 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : NULL #> .. .. ..$ type : chr \"double\" #> .. .. ..$ separated : logi FALSE #> $ map:'data.frame': 4367 obs. of 6 variables: #> ..$ chromosome : int [1:4367] 1 1 1 1 1 1 1 1 1 1 ... #> ..$ marker.ID : chr [1:4367] \"rs3107153\" \"rs2455124\" \"rs10915476\" \"rs4592237\" ... #> ..$ genetic.dist: int [1:4367] 0 0 0 0 0 0 0 0 0 0 ... #> ..$ physical.pos: int [1:4367] 2056735 3188505 4275291 4280630 4286036 4302161 4364564 4388885 4606471 4643688 ... #> ..$ allele1 : chr [1:4367] \"C\" \"T\" \"T\" \"G\" ... #> ..$ allele2 : chr [1:4367] \"T\" \"C\" \"C\" \"A\" ... #> $ fam:'data.frame': 1401 obs. of 6 variables: #> ..$ family.ID : int [1:1401] 10002 10004 10005 10007 10008 10009 10010 10011 10012 10013 ... #> ..$ sample.ID : int [1:1401] 1 1 1 1 1 1 1 1 1 1 ... #> ..$ paternal.ID: int [1:1401] 0 0 0 0 0 0 0 0 0 0 ... #> ..$ maternal.ID: int [1:1401] 0 0 0 0 0 0 0 0 0 0 ... #> ..$ sex : int [1:1401] 1 2 1 1 1 1 1 2 1 2 ... #> ..$ affection : int [1:1401] 1 1 2 1 2 2 2 1 2 -9 ... #> $ n : int 1401 #> $ p : int 4367 #> - attr(*, \"class\")= chr \"processed_plink\" # notice: no more missing values in X any(is.na(pen$genotypes[,])) #> [1] FALSE"},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"creating-a-design","dir":"Articles","previous_headings":"","what":"Creating a design","title":"If your data is in PLINK files","text":"Now we are ready to create a plmm_design, an object with all the pieces we need for the model: a design matrix \mathbf{X}, an outcome vector \mathbf{y}, and a vector of penalty factor indicators (1 = the feature is penalized, 0 = the feature is not penalized). A side note: in GWAS studies, it is typical to include non-genomic factors as unpenalized covariates as part of the model. For instance, we may want to adjust for sex and age – factors we want to ensure are always included in the selected model. The plmmr package allows you to include additional unpenalized predictors via the ‘add_predictor’ and ‘predictor_id’ options, which are passed from create_design() to the internal function create_design_filebacked(). 
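A hedged sketch of that usage (the covariate object `extra_covs` and its columns are hypothetical; see the create_design() documentation for the supported form):

```r
# hypothetical: adjoin sex and age as unpenalized, non-genomic covariates
pen_design <- create_design(
  data_file     = plink_data,
  rds_dir       = temp_dir,
  new_file      = "std_penncath_lite",
  add_outcome   = phen,
  outcome_id    = "FamID",
  outcome_col   = "CAD",
  add_predictor = extra_covs,  # unpenalized covariates, keyed by sample ID
  predictor_id  = "FamID"      # ID column linking covariates to samples
)
```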
example options included create_design() documentation. key part create_design() standardizing columns genotype matrix. didactic example showing columns std_X element design mean = 0 variance = 1. Note something analysis – reads data memory.","code":"# get outcome data penncath_pheno <- read.csv(find_example_data(path = 'penncath_clinical.csv')) phen <- data.frame(FamID = as.character(penncath_pheno$FamID), CAD = penncath_pheno$CAD) pen_design <- create_design(data_file = plink_data, feature_id = \"FID\", rds_dir = temp_dir, new_file = \"std_penncath_lite\", add_outcome = phen, outcome_id = \"FamID\", outcome_col = \"CAD\", logfile = \"design\", # again, overwrite if needed; use with caution overwrite = TRUE) #> There are 62 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... #> Standardization completed at 2024-12-17 19:29:42 #> Done with standardization. File formatting in progress # examine the design - notice the components of this object pen_design_rds <- readRDS(pen_design) str(pen_design_rds) #> List of 16 #> $ X_colnames : chr [1:4367] \"rs3107153\" \"rs2455124\" \"rs10915476\" \"rs4592237\" ... #> $ X_rownames : chr [1:1401] \"10002\" \"10004\" \"10005\" \"10007\" ... #> $ n : int 1401 #> $ p : int 4367 #> $ is_plink : logi TRUE #> $ outcome_idx : int [1:1401] 1 2 3 4 5 6 7 8 9 10 ... #> $ y : Named int [1:1401] 1 1 1 1 1 1 1 1 1 0 ... #> ..- attr(*, \"names\")= chr [1:1401] \"CAD1\" \"CAD2\" \"CAD3\" \"CAD4\" ... #> $ std_X_rownames: chr [1:1401] \"10002\" \"10004\" \"10005\" \"10007\" ... #> $ ns : int [1:4305] 1 2 3 4 5 6 7 8 9 10 ... #> $ std_X_colnames: chr [1:4305] \"rs3107153\" \"rs2455124\" \"rs10915476\" \"rs4592237\" ... #> $ std_X :Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"std_penncath_lite.bk\" #> .. .. 
..$ dirname : chr \"/tmp/RtmpzM7QaI/\" #> .. .. ..$ totalRows : int 1401 #> .. .. ..$ totalCols : int 4305 #> .. .. ..$ rowOffset : num [1:2] 0 1401 #> .. .. ..$ colOffset : num [1:2] 0 4305 #> .. .. ..$ nrow : num 1401 #> .. .. ..$ ncol : num 4305 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : NULL #> .. .. ..$ type : chr \"double\" #> .. .. ..$ separated : logi FALSE #> $ std_X_n : num 1401 #> $ std_X_p : num 4305 #> $ std_X_center : num [1:4305] 0.00785 0.35974 1.01213 0.06067 0.46253 ... #> $ std_X_scale : num [1:4305] 0.0883 0.7783 0.8636 0.28 1.2791 ... #> $ penalty_factor: num [1:4305] 1 1 1 1 1 1 1 1 1 1 ... #> - attr(*, \"class\")= chr \"plmm_design\" # we can check to see that our data have been standardized std_X <- attach.big.matrix(pen_design_rds$std_X) colMeans(std_X[,]) |> summary() # columns have mean zero... #> Min. 1st Qu. Median Mean 3rd Qu. Max. #> -1.356e-16 -2.334e-17 3.814e-19 9.868e-19 2.520e-17 2.635e-16 apply(std_X[,], 2, var) |> summary() # ... & variance 1 #> Min. 1st Qu. Median Mean 3rd Qu. Max. #> 1.001 1.001 1.001 1.001 1.001 1.001"},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"fitting-a-model","dir":"Articles","previous_headings":"","what":"Fitting a model","title":"If your data is in PLINK files","text":"Now design object, ready fit model. default, model fitting results saved files folder specified rds_dir argument plmmm. want return model fitting results, set return_fit = TRUE plmm(). examine model results :","code":"pen_fit <- plmm(design = pen_design, trace = T, return_fit = T) #> Note: The design matrix is being returned as a file-backed big.matrix object -- see bigmemory::big.matrix() documentation for details. #> Reminder: the X that is returned here is column-standardized #> Input data passed all checks at 2024-12-17 19:29:43 #> Starting decomposition. #> Calculating the eigendecomposition of K #> Eigendecomposition finished at 2024-12-17 19:29:45 #> Beginning rotation ('preconditioning'). 
#> Rotation (preconditiong) finished at 2024-12-17 19:29:45 #> Setting up lambda/preparing for model fitting. #> Beginning model fitting. #> Model fitting finished at 2024-12-17 19:29:48 #> Beta values are estimated -- almost done! #> Formatting results (backtransforming coefs. to original scale). #> Model ready at 2024-12-17 19:29:48 # you can turn off the trace messages by letting trace = F (default) summary(pen_fit, idx = 50) #> lasso-penalized regression model with n=1401, p=4368 at lambda=0.01211 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 537 #> ------------------------------------------------- plot(pen_fit)"},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"cross-validation","dir":"Articles","previous_headings":"","what":"Cross validation","title":"If your data is in PLINK files","text":"choose tuning parameter model, plmmr offers cross validation method: plot summary methods CV models well:","code":"cv_fit <- cv_plmm(design = pen_design, type = \"blup\", return_fit = T, trace = T) #> Note: The design matrix is being returned as a file-backed big.matrix object -- see bigmemory::big.matrix() documentation for details. #> Reminder: the X that is returned here is column-standardized #> Starting decomposition. #> Calculating the eigendecomposition of K #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-17 19:29:50 #> Setting up lambda/preparing for model fitting. #> Beginning model fitting. #> Model fitting finished at 2024-12-17 19:29:53 #> 'Fold' argument is either NULL or missing; assigning folds randomly (by default). #> #> To specify folds for each observation, supply a vector with fold assignments. #> #> Starting cross validation #> | | | 0%Beginning eigendecomposition in fold 1 : #> Starting decomposition. 
#> Calculating the eigendecomposition of K #> Fitting model in fold 1 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-17 19:29:54 #> Beginning model fitting. #> Model fitting finished at 2024-12-17 19:29:56 #> | |============== | 20% #> Beginning eigendecomposition in fold 2 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 2 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-17 19:29:58 #> Beginning model fitting. #> Model fitting finished at 2024-12-17 19:30:00 #> | |============================ | 40%Beginning eigendecomposition in fold 3 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 3 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-17 19:30:01 #> Beginning model fitting. #> Model fitting finished at 2024-12-17 19:30:04 #> | |========================================== | 60%Beginning eigendecomposition in fold 4 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 4 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-17 19:30:05 #> Beginning model fitting. #> Model fitting finished at 2024-12-17 19:30:07 #> | |======================================================== | 80%Beginning eigendecomposition in fold 5 : #> Starting decomposition. #> Calculating the eigendecomposition of K #> Fitting model in fold 5 : #> Beginning rotation ('preconditioning'). #> Rotation (preconditiong) finished at 2024-12-17 19:30:08 #> Beginning model fitting. 
#> Model fitting finished at 2024-12-17 19:30:11 #> | |======================================================================| 100% summary(cv_fit) # summary at lambda value that minimizes CV error #> lasso-penalized model with n=1401 and p=4368 #> At minimum cross-validation error (lambda=0.0406): #> ------------------------------------------------- #> Nonzero coefficients: 6 #> Cross-validation error (deviance): 0.22 #> Scale estimate (sigma): 0.471 plot(cv_fit)"},{"path":"https://pbreheny.github.io/plmmr/articles/plink_files.html","id":"details-create_design-for-plink-data","dir":"Articles","previous_headings":"","what":"Details: create_design() for PLINK data","title":"If your data is in PLINK files","text":"call create_design() involves steps: Integrate external phenotype information, supplied. Note: samples PLINK data phenotype value specified additional phenotype file removed analysis. Identify missing values samples SNPs/features. Impute missing values per user’s specified method. See R documentation bigsnpr::snp_fastImputeSimple() details. Note: plmmr package fit models datasets missing values. missing values must imputed subset analysis. Integrate external predictor information, supplied. matrix meta-data (e.g., age, principal components, etc.). Note: samples supplied file included PLINK data, removed. example, phenotyped participants genotyped participants study, plmmr::create_design() create matrix data representing genotyped samples also data supplied external phenotype file. Create design matrix represents nonsingular features samples predictor phenotype information available (case external data supplied). Standardize design matrix columns mean 0 variance 1.","code":""},{"path":"https://pbreheny.github.io/plmmr/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Tabitha K. Peter. Author. Anna C. Reisetter. Author. Patrick J. Breheny. Author, maintainer. Yujing Lu. 
Author.","code":""},{"path":"https://pbreheny.github.io/plmmr/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Reisetter A, Breheny P (2021). “Penalized linear mixed models structured genetic data.” Genetic epidemiology, 45(5), 427–444. https://doi.org/10.1002/gepi.22384.","code":"@Article{, author = {Anna C. Reisetter and Patrick Breheny}, title = {Penalized linear mixed models for structured genetic data}, journal = {Genetic epidemiology}, year = {2021}, volume = {45}, pages = {427--444}, number = {5}, url = {https://doi.org/10.1002/gepi.22384}, }"},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"plmmr-","dir":"","previous_headings":"","what":"plmmr","title":"Penalized Linear Mixed Models for Correlated Data","text":"plmmr (penalized linear mixed models R) package contains functions fit penalized linear mixed models correct unobserved confounding effects.","code":""},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"Penalized Linear Mixed Models for Correlated Data","text":"install latest version package GitHub, use : can also install plmmr CRAN: description motivation functions package (along examples) refer second module GWAS data tutorial","code":"devtools::install_github(\"pbreheny/plmmr\") install.packages('plmmr')"},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"minimal-example","dir":"","previous_headings":"","what":"Minimal example","title":"Penalized Linear Mixed Models for Correlated Data","text":"","code":"library(plmmr) X <- rnorm(100*20) |> matrix(100, 20) y <- rnorm(100) fit <- plmm(X, y) plot(fit) cvfit <- cv_plmm(X, y) plot(cvfit) summary(cvfit)"},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"so-how-fast-is-plmmr-and-how-well-does-it-scale","dir":"","previous_headings":"","what":"So how fast is plmmr? 
And how well does it scale?","title":"Penalized Linear Mixed Models for Correlated Data","text":"illustrate important questions, created separate GitHub repository scripts plmmr workflow using publicly-available genome-wide association (GWAS) data. main takeaway: using GWAS data study 1,400 samples 800,000 SNPs, full plmmr analysis run half hour using single core laptop. Three smaller datasets ship plmmr, tutorials walking analyze data sets documented documentation site. datasets useful didactic purposes, large enough really highlight computational scalability plmmr – motivated creation separate repository GWAS workflow.","code":""},{"path":"https://pbreheny.github.io/plmmr/index.html","id":"note-on-branches","dir":"","previous_headings":"","what":"Note on branches","title":"Penalized Linear Mixed Models for Correlated Data","text":"branches repo organized following way: master main (‘head’) branch. gh_pages keeping documentation plmmr gwas_scale archived branch contains development version package used run dissertation analysis. delete eventually.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/MCP.html","id":null,"dir":"Reference","previous_headings":"","what":"helper function to implement MCP penalty The helper functions to implement each penalty. — MCP","title":"helper function to implement MCP penalty The helper functions to implement each penalty. — MCP","text":"helper function implement MCP penalty helper functions implement penalty.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/MCP.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"helper function to implement MCP penalty The helper functions to implement each penalty. 
— MCP","text":"","code":"MCP(z, l1, l2, gamma, v)"},{"path":"https://pbreheny.github.io/plmmr/reference/MCP.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"helper function to implement MCP penalty The helper functions to implement each penalty. — MCP","text":"z vector representing solution active set feature l1 upper bound (beta) l2 lower bound (beta) gamma tuning parameter MCP penalty v 'xtx' term","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/MCP.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"helper function to implement MCP penalty The helper functions to implement each penalty. — MCP","text":"numeric vector MCP-penalized coefficient estimates within given bounds","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/SCAD.html","id":null,"dir":"Reference","previous_headings":"","what":"helper function to implement SCAD penalty — SCAD","title":"helper function to implement SCAD penalty — SCAD","text":"helper function implement SCAD penalty","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/SCAD.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"helper function to implement SCAD penalty — SCAD","text":"","code":"SCAD(z, l1, l2, gamma, v)"},{"path":"https://pbreheny.github.io/plmmr/reference/SCAD.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"helper function to implement SCAD penalty — SCAD","text":"z solution active set feature l1 upper bound l2 lower bound gamma tuning parameter SCAD penalty v 'xtx' term","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/SCAD.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"helper function to implement SCAD penalty — SCAD","text":"numeric vector SCAD-penalized coefficient estimates within given 
bounds","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/add_predictors.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to add predictors to a filebacked matrix of data — add_predictors","title":"A helper function to add predictors to a filebacked matrix of data — add_predictors","text":"helper function add predictors filebacked matrix data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/add_predictors.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to add predictors to a filebacked matrix of data — add_predictors","text":"","code":"add_predictors(obj, add_predictor, id_var, rds_dir, quiet)"},{"path":"https://pbreheny.github.io/plmmr/reference/add_predictors.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to add predictors to a filebacked matrix of data — add_predictors","text":"obj bigSNP object add_predictor Optional: add additional covariates/predictors/features external file (.e., PLINK file). id_var String specifying column PLINK .fam file unique sample identifiers. rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir(process_plink() call) quiet Logical: messages printed console? Defaults FALSE (leaves print messages ...)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/add_predictors.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to add predictors to a filebacked matrix of data — add_predictors","text":"list 2 components: 'obj' - bigSNP object added element representing matrix includes additional predictors first columns 'non_gen' - integer vector ranges 1 number added predictors. 
Example: 2 predictors added, unpen= 1:2","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/adjust_beta_dimension.html","id":null,"dir":"Reference","previous_headings":"","what":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","title":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","text":"function designed use BLUP prediction. objective get matrix estimated beta coefficients standardized scale, dimension original/training data. adding rows 0s std_scale_beta matrix corresponding singular features X.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/adjust_beta_dimension.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","text":"","code":"adjust_beta_dimension(std_scale_beta, p, std_X_details, fbm_flag, plink_flag)"},{"path":"https://pbreheny.github.io/plmmr/reference/adjust_beta_dimension.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","text":"std_scale_beta matrix estimated beta coefficients scale standardized original/training data Note: rows matrix represent nonsingular columns design matrix p number columns original/training design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' fbm_flag Logical: model fit filebacked? plink_flag Logical: data come PLINK files? Note: flag matters non-genomic features handled PLINK files – data PLINK files, unpenalized columns counted p argument. delimited files, p include unpenalized columns. 
difference implications untransform() function determines appropriate dimensions estimated coefficient matrix returns.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/adjust_beta_dimension.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"An internal function to adjust the dimensions of a matrix of estimated coefficients returned by plmm_fit(). — adjust_beta_dimension","text":"std_scale_b_og_dim: matrix estimated beta coefs. still scale std_X, dimension X","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/admix.html","id":null,"dir":"Reference","previous_headings":"","what":"Admix: Semi-simulated SNP data — admix","title":"Admix: Semi-simulated SNP data — admix","text":"dataset containing 100 SNPs, demographic variable representing race, simulated outcome","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/admix.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Admix: Semi-simulated SNP data — admix","text":"","code":"admix"},{"path":"https://pbreheny.github.io/plmmr/reference/admix.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Admix: Semi-simulated SNP data — admix","text":"list 3 components X SNP matrix (197 observations 100 SNPs) y vector simulated (continuous) outcomes race vector racial group categorization: # 0 = African, 1 = African American, 2 = European, 3 = Japanese","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/admix.html","id":"source","dir":"Reference","previous_headings":"","what":"Source","title":"Admix: Semi-simulated SNP data — admix","text":"https://hastie.su.domains/CASI/","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/align_ids.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to support process_plink() — align_ids","title":"A helper function to support process_plink() — align_ids","text":"helper function support 
process_plink()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/align_ids.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to support process_plink() — align_ids","text":"","code":"align_ids(id_var, quiet, add_predictor, og_ids)"},{"path":"https://pbreheny.github.io/plmmr/reference/align_ids.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to support process_plink() — align_ids","text":"id_var String specifying variable name ID column quiet Logical: message printed? add_predictor External data include design matrix. add_predictors... arg process_plink() og_ids Character vector PLINK ids (FID IID) original data (.e., data subsetting handling missing phenotypes)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/align_ids.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to support process_plink() — align_ids","text":"matrix dimensions add_predictor","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/big_cbind.html","id":null,"dir":"Reference","previous_headings":"","what":"a version of cbind() for file-backed matrices — big_cbind","title":"a version of cbind() for file-backed matrices — big_cbind","text":"version cbind() file-backed matrices","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/big_cbind.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a version of cbind() for file-backed matrices — big_cbind","text":"","code":"big_cbind(A, B, C, quiet)"},{"path":"https://pbreheny.github.io/plmmr/reference/big_cbind.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"a version of cbind() for file-backed matrices — big_cbind","text":"-memory data B file-backed data C file-backed placeholder combined data quiet 
Logical","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/big_cbind.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a version of cbind() for file-backed matrices — big_cbind","text":"C, filled column values B combined","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/check_for_file_extension.html","id":null,"dir":"Reference","previous_headings":"","what":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","title":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","text":"check_for_file_extension: function make package 'smart' enough handle .rds file extensions","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/check_for_file_extension.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","text":"","code":"check_for_file_extension(path)"},{"path":"https://pbreheny.github.io/plmmr/reference/check_for_file_extension.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","text":"path string specifying file path ends file name, e.g. \"~/dir/my_file.rds\"","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/check_for_file_extension.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"check_for_file_extension: a function to make our package 'smart' enough to handle .rds file extensions — check_for_file_extension","text":"string filepath without extension, e.g. 
\"~/dir/my_file\"","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Coef method for ","title":"Coef method for ","text":"Coef method \"cv_plmm\" class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Coef method for ","text":"","code":"# S3 method for class 'cv_plmm' coef(object, lambda, which = object$min, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Coef method for ","text":"object object class \"cv_plmm.\" lambda numeric vector lambda values. Vector lambda indices coefficients return. Defaults lambda index minimum CVE. ... Additional arguments (used).","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Coef method for ","text":"Returns named numeric vector. Values coefficients model specified value either lambda . 
Names values lambda.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Coef method for ","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cv_fit <- cv_plmm(design = admix_design, return_fit = TRUE) head(coef(cv_fit)) #> (Intercept) Snp1 Snp2 Snp3 Snp4 Snp5 #> 4.326474 0.000000 0.000000 0.000000 0.000000 0.000000"},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Coef method for ","title":"Coef method for ","text":"Coef method \"plmm\" class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Coef method for ","text":"","code":"# S3 method for class 'plmm' coef(object, lambda, which = 1:length(object$lambda), drop = TRUE, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Coef method for ","text":"object object class \"plmm.\" lambda numeric vector lambda values. Vector lambda indices coefficients return. drop Logical. ... Additional arguments.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Coef method for ","text":"Either numeric matrix (model fit data stored memory) sparse matrix (model fit data stored filebacked). 
Rownames feature names, columns values lambda.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/coef.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Coef method for ","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design) coef(fit)[1:10, 41:45] #> 0.02673 0.02493 0.02325 0.02168 0.02022 #> (Intercept) 6.556445885 6.59257224 6.62815211 6.66317769 6.69366816 #> Snp1 -0.768261488 -0.78098090 -0.79310257 -0.80456803 -0.81482505 #> Snp2 0.131945426 0.13991539 0.14735024 0.15387884 0.15929074 #> Snp3 2.826806831 2.83842545 2.84879468 2.85860151 2.86047026 #> Snp4 0.036981534 0.04652885 0.05543821 0.06376126 0.07133592 #> Snp5 0.546784811 0.57461391 0.60049082 0.62402782 0.64291324 #> Snp6 -0.026215632 -0.03072017 -0.03494534 -0.03889146 -0.04256362 #> Snp7 0.009342269 0.01539705 0.02103262 0.02615358 0.03069956 #> Snp8 0.000000000 0.00000000 0.00000000 0.00000000 0.00000000 #> Snp9 0.160794660 0.16217570 0.16337102 0.16464901 0.16638663"},{"path":"https://pbreheny.github.io/plmmr/reference/construct_variance.html","id":null,"dir":"Reference","previous_headings":"","what":"a function to create the estimated variance matrix from a PLMM fit — construct_variance","title":"a function to create the estimated variance matrix from a PLMM fit — construct_variance","text":"function create estimated variance matrix PLMM fit","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/construct_variance.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a function to create the estimated variance matrix from a PLMM fit — construct_variance","text":"","code":"construct_variance(fit, K = NULL, eta = NULL)"},{"path":"https://pbreheny.github.io/plmmr/reference/construct_variance.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"a function to create the estimated variance matrix from a 
PLMM fit — construct_variance","text":"fit object returned plmm() K optional matrix eta optional numeric value 0 1; fit supplied, option must specified.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/construct_variance.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a function to create the estimated variance matrix from a PLMM fit — construct_variance","text":"Sigma_hat, matrix representing estimated variance","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_constant_features.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to count constant features — count_constant_features","title":"A helper function to count constant features — count_constant_features","text":"helper function count constant features","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_constant_features.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to count constant features — count_constant_features","text":"","code":"count_constant_features(fbm, outfile, quiet)"},{"path":"https://pbreheny.github.io/plmmr/reference/count_constant_features.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to count constant features — count_constant_features","text":"fbm filebacked big.matrix outfile String specifying name log file quiet Logical: message printed console","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_constant_features.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to count constant features — count_constant_features","text":"ns numeric vector indices non-singular columns matrix associated counts","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_cores.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to count the number of cores 
available on the current machine — count_cores","title":"A helper function to count the number of cores available on the current machine — count_cores","text":"helper function count number cores available current machine","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/count_cores.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to count the number of cores available on the current machine — count_cores","text":"","code":"count_cores()"},{"path":"https://pbreheny.github.io/plmmr/reference/count_cores.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to count the number of cores available on the current machine — count_cores","text":"number cores use; parallel installed, parallel::detectCores(). Otherwise, returns 1.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":null,"dir":"Reference","previous_headings":"","what":"a function to create a design for PLMM modeling — create_design","title":"a function to create a design for PLMM modeling — create_design","text":"function create design PLMM modeling","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a function to create a design for PLMM modeling — create_design","text":"","code":"create_design(data_file = NULL, rds_dir = NULL, X = NULL, y = NULL, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"a function to create a design for PLMM modeling — create_design","text":"data_file filebacked data (data process_plink() process_delim()), filepath processed data. Defaults NULL (argument apply -memory data). rds_dir filebacked data, filepath directory/folder want design saved. 
Note: include/append name want --created file – name argument new_file, passed create_design_filebacked(). Defaults NULL (argument apply -memory data). X -memory data (data matrix data frame), design matrix. Defaults NULL (argument apply filebacked data). y -memory data, numeric vector representing outcome. Defaults NULL (argument apply filebacked data). Note: responsibility user ensure rows X corresponding elements y row order, .e., observations must order design matrix outcome vector. ... Additional arguments pass create_design_filebacked() create_design_in_memory(). See documentation helper functions details.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a function to create a design for PLMM modeling — create_design","text":"filepath object class plmm_design, named list design matrix, outcome, penalty factor vector, details needed fitting model. list stored .rds file filebacked data, filebacked case string path file returned. -memory data, list returned.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"a function to create a design for PLMM modeling — create_design","text":"function wrapper create_design...() inner functions; arguments included passed along create_design...() inner function matches type data supplied. Note arguments optional ones . Additional arguments filebacked data: new_file User-specified filename (without .bk/.rds extension) --created .rds/.bk files. Must different existing .rds/.bk files folder. feature_id Optional: string specifying column data X (feature data) row IDs (e.g., identifiers row/sample/participant/, etc.). duplicates allowed. - PLINK data: string specifying ID column PLINK .fam file. 
Options \"IID\" (default) \"FID\" - filebacked data: character vector unique identifiers (IDs) row feature data (.e., data processed process_delim()) - left NULL (default), X assumed row-order add_outcome. Note: assumption made error, calculations downstream incorrect. Pay close attention . add_outcome data frame matrix two columns: ID column column outcome value (used 'y' final design). IDs must characters, outcome must numeric. outcome_id string specifying name ID column 'add_outcome' outcome_col string specifying name phenotype column 'add_outcome' na_outcome_vals Optional: vector numeric values used code NA values outcome. Defaults c(-9, NA_integer_) (-9 matches PLINK conventions). overwrite Optional: logical - existing .rds files overwritten? Defaults FALSE. logfile Optional: name '.log' file written – Note: append .log filename; done automatically. quiet Optional: logical - messages printed console silenced? Defaults FALSE Additional arguments specific PLINK data: add_predictor Optional (PLINK data ): matrix data frame used adding additional unpenalized covariates/predictors/features external file (.e., PLINK file). matrix must one column ID column; columns aside ID used covariates design matrix. Columns must named. predictor_id Optional (PLINK data ): string specifying name column 'add_predictor' sample IDs. Required 'add_predictor' supplied. names used subset align external covariate supplied PLINK data. Additional arguments specific delimited file data: unpen Optional: character vector names columns mark unpenalized (.e., features always included model). Note: choose use option, delimited file must column names. Additional arguments -memory data: unpen Optional: character vector names columns mark unpenalized (.e., features always included model). 
Note: if you choose to use this option, X must have column names.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"a function to create a design for PLMM modeling — create_design","text":"","code":"## Example 1: matrix data in-memory ## admix_design <- create_design(X = admix$X, y = admix$y, unpen = \"Snp1\") ## Example 2: delimited data ## # process delimited data temp_dir <- tempdir() colon_dat <- process_delim(data_file = \"colon2.txt\", data_dir = find_example_data(parent = TRUE), overwrite = TRUE, rds_dir = temp_dir, rds_prefix = \"processed_colon2\", sep = \"\\t\", header = TRUE) #> There are 62 observations and 2001 features in the specified data files. #> At this time, plmmr::process_delim() does not not handle missing values in delimited data. #> Please make sure you have addressed missingness before you proceed. #> #> process_plink() completed #> Processed files now saved as /tmp/RtmpuyGgXe/processed_colon2.rds # prepare outcome data colon_outcome <- read.delim(find_example_data(path = \"colon2_outcome.txt\")) # create a design colon_design <- create_design(data_file = colon_dat, rds_dir = temp_dir, new_file = \"std_colon2\", add_outcome = colon_outcome, outcome_id = \"ID\", outcome_col = \"y\", unpen = \"sex\", overwrite = TRUE, logfile = \"test.log\") #> No feature_id supplied; will assume data X are in same row-order as add_outcome. #> There are 0 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... #> Standardization completed at 2024-12-17 19:29:03 #> Done with standardization. File formatting in progress # look at the results colon_rds <- readRDS(colon_design) str(colon_rds) #> List of 18 #> $ X_colnames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> $ X_rownames : chr [1:62] \"row1\" \"row2\" \"row3\" \"row4\" ... 
#> $ n : num 62 #> $ p : num 2001 #> $ is_plink : logi FALSE #> $ outcome_idx : int [1:62] 1 2 3 4 5 6 7 8 9 10 ... #> $ y : int [1:62] 1 0 1 0 1 0 1 0 1 0 ... #> $ std_X_rownames: chr [1:62] \"row1\" \"row2\" \"row3\" \"row4\" ... #> $ unpen : int 1 #> $ unpen_colnames: chr \"sex\" #> $ ns : int [1:2001] 1 2 3 4 5 6 7 8 9 10 ... #> $ std_X_colnames: chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> $ std_X :Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"std_colon2.bk\" #> .. .. ..$ dirname : chr \"/tmp/RtmpuyGgXe/\" #> .. .. ..$ totalRows : int 62 #> .. .. ..$ totalCols : int 2001 #> .. .. ..$ rowOffset : num [1:2] 0 62 #> .. .. ..$ colOffset : num [1:2] 0 2001 #> .. .. ..$ nrow : num 62 #> .. .. ..$ ncol : num 2001 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> .. .. ..$ type : chr \"double\" #> .. .. ..$ separated : logi FALSE #> $ std_X_n : num 62 #> $ std_X_p : num 2001 #> $ std_X_center : num [1:2001] 1.47 7015.79 4966.96 4094.73 3987.79 ... #> $ std_X_scale : num [1:2001] 0.499 3067.926 2171.166 1803.359 2002.738 ... #> $ penalty_factor: num [1:2001] 0 1 1 1 1 1 1 1 1 1 ... 
#> - attr(*, \"class\")= chr \"plmm_design\" ## Example 3: PLINK data ## # \\donttest{ # process PLINK data temp_dir <- tempdir() unzip_example_data(outdir = temp_dir) #> Unzipped files are saved in /tmp/RtmpuyGgXe plink_data <- process_plink(data_dir = temp_dir, data_prefix = \"penncath_lite\", rds_dir = temp_dir, rds_prefix = \"imputed_penncath_lite\", # imputing the mode to address missing values impute_method = \"mode\", # overwrite existing files in temp_dir # (you can turn this feature off if you need to) overwrite = TRUE, # turning off parallelization - leaving this on causes problems knitting this vignette parallel = FALSE) #> #> Preprocessing penncath_lite data: #> Creating penncath_lite.rds #> #> There are 1401 observations and 4367 genomic features in the specified data files, representing chromosomes 1 - 22 #> There are a total of 3514 SNPs with missing values #> Of these, 13 are missing in at least 50% of the samples #> #> Imputing the missing (genotype) values using mode method #> #> process_plink() completed #> Processed files now saved as /tmp/RtmpuyGgXe/imputed_penncath_lite.rds # get outcome data penncath_pheno <- read.csv(find_example_data(path = 'penncath_clinical.csv')) outcome <- data.frame(FamID = as.character(penncath_pheno$FamID), CAD = penncath_pheno$CAD) unpen_predictors <- data.frame(FamID = as.character(penncath_pheno$FamID), sex = penncath_pheno$sex, age = penncath_pheno$age) # create design where sex and age are always included in the model pen_design <- create_design(data_file = plink_data, feature_id = \"FID\", rds_dir = temp_dir, new_file = \"std_penncath_lite\", add_outcome = outcome, outcome_id = \"FamID\", outcome_col = \"CAD\", add_predictor = unpen_predictors, predictor_id = \"FamID\", logfile = \"design\", # again, overwrite if needed; use with caution overwrite = TRUE) #> #> Aligning external data with the feature data by FamID #> Adding predictors from external data. 
#> Aligning IDs between fam and predictor files #> Column-wise combining data sets #> |======================================================================| 100% #> There are 62 constant features in the data #> Subsetting data to exclude constant features (e.g., monomorphic SNPs) #> Column-standardizing the design matrix... 
#> Standardization completed at 2024-12-17 19:29:06 #> Done with standardization. File formatting in progress # examine the design - notice the components of this object pen_design_rds <- readRDS(pen_design) # }"},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_filebacked.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","title":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","text":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_filebacked.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","text":"","code":"create_design_filebacked( data_file, rds_dir, obj, new_file, feature_id = NULL, add_outcome, outcome_id, outcome_col, na_outcome_vals = c(-9, NA_integer_), add_predictor = NULL, predictor_id = NULL, unpen = NULL, logfile = NULL, overwrite = FALSE, quiet = FALSE )"},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_filebacked.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","text":"data_file The filepath to the rds file of processed data (the data resulting from process_plink() or process_delim()). rds_dir The path to the directory in which you want to create the new '.rds' and '.bk' files. obj The RDS object read in by create_design(). new_file User-specified filename (without .bk/.rds extension) for the to-be-created .rds/.bk files. Must be different from any existing .rds/.bk files in the same folder. 
feature_id A string specifying the column of the data X (the feature data) with the row IDs (e.g., identifiers for each row/sample/participant, etc.). No duplicates allowed. - For PLINK data: a string specifying an ID column of the PLINK .fam file. Options are \"IID\" (default) and \"FID\". - For filebacked data: a character vector of unique identifiers (IDs) for each row of the feature data (i.e., the data processed with process_delim()). - If left NULL (default), X is assumed to be in the same row-order as add_outcome. Note: if this assumption is made in error, downstream calculations will be incorrect. Pay close attention here. add_outcome A data frame or matrix with two columns: an ID column and a column with the outcome value (to be used as 'y' in the final design). IDs must be characters, and the outcome must be numeric. outcome_id A string specifying the name of the ID column in 'add_outcome'. outcome_col A string specifying the name of the phenotype column in 'add_outcome'. na_outcome_vals A vector of numeric values used to code NA values in the outcome. Defaults to c(-9, NA_integer_) (the -9 matches PLINK conventions). add_predictor Optional (PLINK data only): a matrix or data frame to be used for adding additional unpenalized covariates/predictors/features from an external file (i.e., not a PLINK file). This matrix must have one column that is an ID column; all other columns aside from the ID will be used as covariates in the design matrix. Columns must be named. predictor_id Optional (PLINK data only): a string specifying the name of the column in 'add_predictor' with the sample IDs. Required if 'add_predictor' is supplied. These names are used to subset and align the external covariates with the supplied PLINK data. unpen Optional (delimited file data only): an optional character vector with the names of columns to mark as unpenalized (i.e., features that will always be included in the model). Note: if you choose to use this option, X must have column names. logfile Optional: the name of the '.log' file to be written – Note: do not append '.log' to the filename; this is done automatically. overwrite Logical: should existing .rds files be overwritten? Defaults to FALSE. quiet Logical: should messages printed to the console be silenced? 
Defaults to FALSE.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_filebacked.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to create a design matrix, outcome, and penalty factor to be passed to a model fitting function — create_design_filebacked","text":"The filepath to the created .rds file containing the information needed for model fitting, including the standardized X and the model design information.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_in_memory.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to create a design with an in-memory X matrix — create_design_in_memory","title":"A function to create a design with an in-memory X matrix — create_design_in_memory","text":"A function to create a design with an in-memory X matrix.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_in_memory.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to create a design with an in-memory X matrix — create_design_in_memory","text":"","code":"create_design_in_memory(X, y, unpen = NULL)"},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_in_memory.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to create a design with an in-memory X matrix — create_design_in_memory","text":"X A numeric matrix in which rows correspond to observations (e.g., samples) and columns correspond to features. y A numeric vector representing the outcome for the model. Note: it is the responsibility of the user to ensure that the outcome and the rows of X are in the same order! unpen An optional character vector with the names of columns to mark as unpenalized (i.e., features that will always be included in the model). 
Note: if you choose to use this option, X must have column names.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_design_in_memory.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to create a design with an in-memory X matrix — create_design_in_memory","text":"A list with elements including the standardized X and the model design information.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_log.html","id":null,"dir":"Reference","previous_headings":"","what":"create_log_file — create_log","title":"create_log_file — create_log","text":"create_log_file","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_log.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"create_log_file — create_log","text":"","code":"create_log(outfile, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/create_log.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"create_log_file — create_log","text":"outfile A string specifying the name of the to-be-created file, without the extension. ... Not used.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/create_log.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"create_log_file — create_log","text":"Nothing is returned; instead, a text file with the suffix '.log' is created.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Cross-validation for plmm — cv_plmm","title":"Cross-validation for plmm — cv_plmm","text":"Performs k-fold cross-validation for lasso-, MCP-, or SCAD-penalized linear mixed models over a grid of values for the regularization parameter lambda.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Cross-validation for plmm — cv_plmm","text":"","code":"cv_plmm( design, y = NULL, K = NULL, diag_K = NULL, eta_star = NULL, penalty = \"lasso\", type = \"blup\", gamma, alpha = 1, lambda_min, nlambda = 100, lambda, eps = 1e-04, max_iter = 10000, convex = TRUE, dfmax = NULL, warn = TRUE, init = NULL, cluster, nfolds = 5, seed, fold = NULL, returnY = FALSE, returnBiasDetails = FALSE, trace = FALSE, save_rds = NULL, save_fold_res = FALSE, return_fit = TRUE, compact_save = FALSE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Cross-validation for plmm — cv_plmm","text":"design The first argument must be one of three things: (1) a plmm_design object (as created by create_design()), (2) a string with the file path to a design object (the file path must end in '.rds'), or (3) a matrix or data.frame object representing the design matrix of interest. y Optional: in the case where design is a matrix or data.frame, the user must also supply a numeric outcome vector as the y argument. In this case, design and y are passed internally to create_design(X = design, y = y). K A similarity matrix used to rotate the data. 
This can be either (1) a known matrix that reflects the covariance of y, (2) an estimate (Default is \(\frac{1}{p}(XX^T)\)), or (3) a list with components 'd' and 'u', as returned by choose_k(). diag_K Logical: should K be a diagonal matrix? This would reflect observations that are unrelated, or that can be treated as unrelated. Defaults to FALSE. Note: plmm() does not check to see if a matrix is diagonal. If you want to use a diagonal K matrix, you must set diag_K = TRUE. eta_star Optional argument to input a specific eta term rather than estimating it from the data. If K is a known covariance matrix that is full rank, this should be 1. penalty The penalty to be applied to the model. Either \"lasso\" (the default), \"SCAD\", or \"MCP\". type A character argument indicating what should be returned by predict.plmm(). If type == 'lp', predictions are based on the linear predictor, X beta. If type == 'blup', predictions are based on the sum of the linear predictor and the estimated random effect (BLUP). Defaults to 'blup', as this has been shown to be a superior prediction method in many applications. gamma The tuning parameter of the MCP/SCAD penalty (see details). Default is 3 for MCP and 3.7 for SCAD. alpha Tuning parameter for the Mnet estimator which controls the relative contributions from the MCP/SCAD penalty and the ridge, or L2, penalty. alpha=1 is equivalent to the MCP/SCAD penalty, while alpha=0 would be equivalent to ridge regression. However, alpha=0 is not supported; alpha may be arbitrarily small, but not exactly 0. lambda_min The smallest value for lambda, as a fraction of lambda.max. Default is .001 if the number of observations is larger than the number of covariates and .05 otherwise. nlambda Length of the sequence of lambda. Default is 100. lambda A user-specified sequence of lambda values. By default, a sequence of values of length nlambda is computed, equally spaced on the log scale. eps Convergence threshold. The algorithm iterates until the RMSD for the change in linear predictors for each coefficient is less than eps. Default is 1e-4. max_iter Maximum number of iterations (total across the entire path). Default is 10000. convex (future idea; not yet incorporated) Calculate the index at which the objective function ceases to be locally convex? Default is TRUE. dfmax (future idea; not yet incorporated) Upper bound for the number of nonzero coefficients. Default is no upper bound. However, for large data sets, the computational burden may be heavy for models with a large number of nonzero coefficients. warn Return warning messages for failures to converge and model saturation? 
Default is TRUE. init Initial values for the coefficients. Default is 0 for all columns of X. cluster cv_plmm() can be run in parallel across a cluster using the parallel package. The cluster must be set up in advance using parallel::makeCluster(). The cluster must then be passed to cv_plmm(). nfolds The number of cross-validation folds. Default is 5. seed You may set the seed of the random number generator in order to obtain reproducible results. fold Which fold each observation belongs to. By default, observations are randomly assigned. returnY Should cv_plmm() return the linear predictors from the cross-validation folds? Default is FALSE; if TRUE, this will return a matrix in which the element for row i, column j is the fitted value for observation i from the fold in which observation i was excluded from the fit, at the jth value of lambda. returnBiasDetails Logical: should the cross-validation bias (numeric value) and loss (n x p matrix) be returned? Defaults to FALSE. trace If set to TRUE, inform the user of progress by announcing the beginning of each CV fold. Default is FALSE. save_rds Optional: if a filepath and name without the '.rds' suffix is specified (e.g., save_rds = \"~/dir/my_results\"), the model results are saved to the provided location (e.g., \"~/dir/my_results.rds\"). Defaults to NULL, which does not save the result. save_fold_res Optional: a logical value indicating whether the results (loss and predicted values) from each CV fold should be saved. If TRUE, two '.rds' files ('loss' and 'yhat') will be created in the directory of 'save_rds'. These files are updated as each fold is done. Defaults to FALSE. return_fit Optional: a logical value indicating whether the fitted model should be returned as a plmm object in the current (assumed interactive) session. Defaults to TRUE. compact_save Optional: if TRUE, three separate .rds files are saved: one with 'beta_vals', one with 'K', and one with everything else (see below). Defaults to FALSE. Note: you must specify the save_rds argument for this to be called. ... Additional arguments to plmm_fit.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Cross-validation for plmm — cv_plmm","text":"A list with 12 items: type: the type of prediction used ('lp' or 'blup'); cve: a numeric vector with the cross-validation error (CVE) at each value of lambda; cvse: a numeric vector with the estimated standard error associated with each value of cve; fold: a numeric n-length vector of integers indicating the fold to which each observation was assigned; lambda: a numeric vector of lambda values; fit: the overall fit object, including all predictors; a list as returned by plmm(); min: the index corresponding to the value of lambda that minimizes cve; lambda_min: the lambda value at which cve is minimized; min1se: the index corresponding to the value of lambda within one standard error of that which minimizes cve; lambda1se: the largest value of lambda for which the error is within 1 standard error of the minimum; null.dev: a numeric value representing the deviance for the intercept-only model. If you have supplied your own lambda sequence, this quantity may not be meaningful; estimated_Sigma: an n x n matrix representing the estimated covariance matrix.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Cross-validation for plmm — cv_plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cv_fit <- cv_plmm(design = admix_design) print(summary(cv_fit)) #> lasso-penalized model with n=197 and p=101 #> At minimum cross-validation error (lambda=0.2493): #> ------------------------------------------------- #> Nonzero coefficients: 5 #> Cross-validation error (deviance): 2.00 #> Scale estimate (sigma): 1.413 plot(cv_fit) # Note: for examples with filebacked data, see the filebacking vignette # https://pbreheny.github.io/plmmr/articles/filebacking.html"},{"path":"https://pbreheny.github.io/plmmr/reference/cvf.html","id":null,"dir":"Reference","previous_headings":"","what":"Cross-validation internal function for cv_plmm — cvf","title":"Cross-validation internal function for cv_plmm — 
cvf","text":"Internal function for cv_plmm which calls plmm on each fold using a subset of the original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cvf.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Cross-validation internal function for cv_plmm — cvf","text":"","code":"cvf(i, fold, type, cv_args, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/cvf.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Cross-validation internal function for cv_plmm — cvf","text":"i The fold number to be excluded from the fit. fold An n-length vector of fold assignments. type A character argument indicating what should be returned by predict.plmm. If type == 'lp', predictions are based on the linear predictor, $X beta$. If type == 'individual', predictions are based on the linear predictor plus the estimated random effect (BLUP). cv_args A list of additional arguments to be passed to plmm. ... Optional arguments to predict_within_cv.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/cvf.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Cross-validation internal function for cv_plmm — cvf","text":"A list with three elements: a numeric vector with the loss at each value of lambda, a numeric value indicating the number of lambda values used, and the predicted outcome (y hat) values at each lambda.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/eigen_K.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — eigen_K","title":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — eigen_K","text":"A function to take the eigendecomposition of K. Note: this is faster than taking the SVD of X when p >> n.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/eigen_K.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — 
eigen_K","text":"","code":"eigen_K(std_X, fbm_flag)"},{"path":"https://pbreheny.github.io/plmmr/reference/eigen_K.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — eigen_K","text":"std_X The standardized design matrix, stored as a big.matrix object. fbm_flag Logical: is std_X an FBM object? Passed from plmm().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/eigen_K.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to take the eigendecomposition of K Note: This is faster than taking SVD of X when p >> n — eigen_K","text":"A list with the eigenvectors and eigenvalues of K.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/estimate_eta.html","id":null,"dir":"Reference","previous_headings":"","what":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","title":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","text":"Estimate eta (to be used in rotating the data). This function is called internally by plmm().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/estimate_eta.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","text":"","code":"estimate_eta(n, s, U, y, eta_star)"},{"path":"https://pbreheny.github.io/plmmr/reference/estimate_eta.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","text":"n The number of observations. s The singular values of K, the realized relationship matrix. U The left-singular vectors of the standardized design matrix. y The continuous outcome vector.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/estimate_eta.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Estimate eta (to be used in rotating the data) This function is called internally by plmm() — estimate_eta","text":"A numeric value with the estimated value of eta, the variance parameter.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/fbm2bm.html","id":null,"dir":"Reference","previous_headings":"","what":"Functions to convert between FBM and big.matrix type objects — fbm2bm","title":"Functions to convert between FBM and big.matrix type objects — fbm2bm","text":"Functions to convert between FBM and big.matrix type objects.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/fbm2bm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Functions to convert between FBM and big.matrix type objects — fbm2bm","text":"","code":"fbm2bm(fbm, desc = FALSE)"},{"path":"https://pbreheny.github.io/plmmr/reference/fbm2bm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Functions to convert between FBM and big.matrix type objects — fbm2bm","text":"fbm An FBM object; see bigstatsr::FBM() for details. desc Logical: is a descriptor file desired (as opposed to a filebacked big matrix)? 
Defaults FALSE.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/fbm2bm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Functions to convert between FBM and big.matrix type objects — fbm2bm","text":"big.matrix - see bigmemory::filebacked.big.matrix() details","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/file_sans_ext.html","id":null,"dir":"Reference","previous_headings":"","what":"helper function to get the file path of a file without the extension — file_sans_ext","title":"helper function to get the file path of a file without the extension — file_sans_ext","text":"helper function get file path file without extension","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/file_sans_ext.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"helper function to get the file path of a file without the extension — file_sans_ext","text":"","code":"file_sans_ext(path)"},{"path":"https://pbreheny.github.io/plmmr/reference/file_sans_ext.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"helper function to get the file path of a file without the extension — file_sans_ext","text":"path path file","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/file_sans_ext.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"helper function to get the file path of a file without the extension — file_sans_ext","text":"path_sans_ext filepath without extension","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to help with accessing example PLINK files — find_example_data","title":"A function to help with accessing example PLINK files — find_example_data","text":"function help accessing example PLINK 
files","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to help with accessing example PLINK files — find_example_data","text":"","code":"find_example_data(path, parent = FALSE)"},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to help with accessing example PLINK files — find_example_data","text":"path Argument (string) specifying path (filename) external data file extdata/ parent path=TRUE user wants name parent directory file located, set parent=TRUE. Defaults FALSE.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to help with accessing example PLINK files — find_example_data","text":"path=NULL, character vector file names returned. path given, character string full file path","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/find_example_data.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A function to help with accessing example PLINK files — find_example_data","text":"","code":"find_example_data(parent = TRUE) #> [1] \"/home/runner/work/_temp/Library/plmmr/extdata\""},{"path":"https://pbreheny.github.io/plmmr/reference/get_data.html","id":null,"dir":"Reference","previous_headings":"","what":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. — get_data","title":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. 
— get_data","text":"Read processed data function intended called either process_plink() process_delim() called .","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/get_data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. — get_data","text":"","code":"get_data(path, returnX = FALSE, trace = TRUE)"},{"path":"https://pbreheny.github.io/plmmr/reference/get_data.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. — get_data","text":"path file path RDS object containing processed data. add '.rds' extension path. returnX Logical: design matrix returned numeric matrix stored memory. default, FALSE. trace Logical: trace messages shown? Default TRUE.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/get_data.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Read in processed data This function is intended to be called after either process_plink() or process_delim() has been called once. — get_data","text":"list components: std_X, column-standardized design matrix either (1) numeric matrix (2) filebacked matrix (FBM). See bigstatsr::FBM() bigsnpr::bigSnp-class documentation details. (PLINK data) fam, data frame containing pedigree information (like .fam file PLINK) (PLINK data) map, data frame containing feature information (like .bim file PLINK) ns: vector indicating columns X contain nonsingular features (.e., features variance != 0. 
center: vector values centering column X scale: vector values scaling column X","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/get_hostname.html","id":null,"dir":"Reference","previous_headings":"","what":"a function to return the computer's host name — get_hostname","title":"a function to return the computer's host name — get_hostname","text":"function return computer's host name","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/get_hostname.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a function to return the computer's host name — get_hostname","text":"","code":"get_hostname()"},{"path":"https://pbreheny.github.io/plmmr/reference/get_hostname.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a function to return the computer's host name — get_hostname","text":"String hostname current machine","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/impute_snp_data.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to impute SNP data — impute_snp_data","title":"A function to impute SNP data — impute_snp_data","text":"function impute SNP data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/impute_snp_data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to impute SNP data — impute_snp_data","text":"","code":"impute_snp_data( obj, X, impute, impute_method, parallel, outfile, quiet, seed = as.numeric(Sys.Date()), ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/impute_snp_data.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to impute SNP data — impute_snp_data","text":"obj bigSNP object (created read_plink_files()) X matrix genotype data returned name_and_count_bigsnp impute Logical: data imputed? Default TRUE. impute_method 'impute' = TRUE, argument specify kind imputation desired. 
Options : mode (default): Imputes frequent call. See bigsnpr::snp_fastImputeSimple() details. random: Imputes sampling according allele frequencies. mean0: Imputes rounded mean. mean2: Imputes mean rounded 2 decimal places. xgboost: Imputes using algorithm based local XGBoost models. See bigsnpr::snp_fastImpute() details. Note: can take several minutes, even relatively small data set. parallel Logical: computations within function run parallel? Defaults TRUE. See count_cores() ?bigparallelr::assert_cores details. particular, user aware much parallelization can make computations slower. outfile Optional: name (character string) prefix logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. quiet Logical: messages printed console? Defaults TRUE seed Numeric value passed seed impute_method = 'xgboost'. Defaults .numeric(Sys.Date()) ... Optional: additional arguments bigsnpr::snp_fastImpute() (relevant impute_method = \"xgboost\")","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/impute_snp_data.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to impute SNP data — impute_snp_data","text":"Nothing returned, obj$genotypes overwritten imputed version data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_samples.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to align genotype and phenotype data — index_samples","title":"A function to align genotype and phenotype data — index_samples","text":"function align genotype phenotype data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_samples.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to align genotype and phenotype data — index_samples","text":"","code":"index_samples( obj, rds_dir, indiv_id, add_outcome, outcome_id, outcome_col, na_outcome_vals, outfile, quiet 
)"},{"path":"https://pbreheny.github.io/plmmr/reference/index_samples.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to align genotype and phenotype data — index_samples","text":"obj object created process_plink() rds_dir path directory want create new '.rds' '.bk' files. indiv_id character string indicating ID column name 'fam' element genotype data list. Defaults 'sample.ID', equivalent 'IID' PLINK. option 'family.ID', equivalent 'FID' PLINK. add_outcome data frame least two columns: ID column phenotype column outcome_id string specifying name ID column pheno outcome_col string specifying name phenotype column pheno. column used default y argument 'plmm()'. na_outcome_vals vector numeric values used code NA values outcome. Defaults c(-9, NA_integer) (-9 matches PLINK conventions). outfile string name filepath log file quiet Logical: messages printed console? Defaults FALSE (leaves print messages ...","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_samples.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to align genotype and phenotype data — index_samples","text":"list two items: data.table rows corresponding samples genotype phenotype available. 
numeric vector indices indicating samples 'complete' (.e., samples add_outcome corresponding data PLINK files)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_std_X.html","id":null,"dir":"Reference","previous_headings":"","what":"Helper function to index standardized data — index_std_X","title":"Helper function to index standardized data — index_std_X","text":"Helper function index standardized data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_std_X.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Helper function to index standardized data — index_std_X","text":"","code":"index_std_X(std_X_p, non_genomic)"},{"path":"https://pbreheny.github.io/plmmr/reference/index_std_X.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Helper function to index standardized data — index_std_X","text":"std_X_p number features standardized matrix data (may filebacked) non_genomic Integer vector columns std_X representing non-genomic data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/index_std_X.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Helper function to index standardized data — index_std_X","text":"list indices","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lam_names.html","id":null,"dir":"Reference","previous_headings":"","what":"Generate nicely formatted lambda vec — lam_names","title":"Generate nicely formatted lambda vec — lam_names","text":"Generate nicely formatted lambda vec","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lam_names.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Generate nicely formatted lambda vec — 
lam_names","text":"","code":"lam_names(l)"},{"path":"https://pbreheny.github.io/plmmr/reference/lam_names.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Generate nicely formatted lambda vec — lam_names","text":"l Vector lambda values.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lam_names.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Generate nicely formatted lambda vec — lam_names","text":"character vector formatted lambda value names","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lasso.html","id":null,"dir":"Reference","previous_headings":"","what":"helper function to implement lasso penalty — lasso","title":"helper function to implement lasso penalty — lasso","text":"helper function implement lasso penalty","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lasso.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"helper function to implement lasso penalty — lasso","text":"","code":"lasso(z, l1, l2, v)"},{"path":"https://pbreheny.github.io/plmmr/reference/lasso.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"helper function to implement lasso penalty — lasso","text":"z solution active set feature l1 upper bound l2 lower bound v 'xtx' term","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/lasso.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"helper function to implement lasso penalty — lasso","text":"numeric vector lasso-penalized coefficient estimates within given bounds","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/log_lik.html","id":null,"dir":"Reference","previous_headings":"","what":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","title":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","text":"function 
allows evaluate negative log-likelihood linear mixed model assumption null model order estimate variance parameter, eta.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/log_lik.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","text":"","code":"log_lik(eta, n, s, U, y, rot_y = NULL)"},{"path":"https://pbreheny.github.io/plmmr/reference/log_lik.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","text":"eta proportion variance outcome attributable causal SNP effects. words, signal--noise ratio. n number observations s singular values K, realized relationship matrix U left-singular vectors standardized design matrix y Continuous outcome vector. rot_y Optional: y already rotated, can supplied.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/log_lik.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Evaluate the negative log-likelihood of an intercept-only Gaussian plmm model — log_lik","text":"value log-likelihood PLMM, evaluated supplied parameters","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/name_and_count_bigsnp.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","title":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","text":"helper function label summarize contents bigSNP","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/name_and_count_bigsnp.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","text":"","code":"name_and_count_bigsnp(obj, id_var, 
quiet, outfile)"},{"path":"https://pbreheny.github.io/plmmr/reference/name_and_count_bigsnp.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","text":"obj bigSNP object, possibly subset add_external_phenotype() id_var String specifying column PLINK .fam file unique sample identifiers. Options \"IID\" (default) \"FID\". quiet Logical: messages printed console? Defaults TRUE outfile string name .log file","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/name_and_count_bigsnp.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to label and summarize the contents of a bigSNP — name_and_count_bigsnp","text":"list components: counts: column-wise summary minor allele counts 'genotypes' obj: modified bigSNP list additional components X: obj$genotypes FBM pos: obj$map$physical.pos vector","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","text":"Fit linear mixed model via non-convex penalized maximum likelihood.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","text":"","code":"plmm( design, y = NULL, K = NULL, diag_K = NULL, eta_star = NULL, penalty = \"lasso\", init = NULL, gamma, alpha = 1, dfmax = NULL, lambda_min, nlambda = 100, lambda, eps = 1e-04, max_iter = 10000, convex = TRUE, warn = TRUE, trace = FALSE, save_rds = NULL, compact_save = FALSE, return_fit = NULL, ... 
)"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","text":"design first argument must one three things: (1) plmm_design object (created create_design()) (2) string file path design object (file path must end '.rds') (3) matrix data.frame object representing design matrix interest y Optional: case design matrix data.frame, user must also supply numeric outcome vector y argument. case, design y passed internally create_design(X = design, y = y). K Similarity matrix used rotate data. either : (1) known matrix reflects covariance y, (2) estimate (Default \\(\\frac{1}{p}(XX^T)\\)), (3) list components 'd' 'U', returned previous plmm() model fit data. diag_K Logical: K diagonal matrix? reflect observations unrelated, can treated unrelated. Defaults FALSE. Note: plmm() check see matrix diagonal. want use diagonal K matrix, must set diag_K = TRUE. eta_star Optional argument input specific eta term rather estimate data. K known covariance matrix full rank, 1. penalty penalty applied model. Either \"lasso\" (default), \"SCAD\", \"MCP\". init Initial values coefficients. Default 0 columns X. gamma tuning parameter MCP/SCAD penalty (see details). Default 3 MCP 3.7 SCAD. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. dfmax (Future idea; yet incorporated): Upper bound number nonzero coefficients. Default upper bound. However, large data sets, computational burden may heavy models large number nonzero coefficients. lambda_min smallest value lambda, fraction lambda.max. Default .001 number observations larger number covariates .05 otherwise. nlambda Length sequence lambda. Default 100. 
lambda user-specified sequence lambda values. default, sequence values length nlambda computed, equally spaced log scale. eps Convergence threshold. algorithm iterates RMSD change linear predictors coefficient less eps. Default 1e-4. max_iter Maximum number iterations (total across entire path). Default 10000. convex (Future idea; yet incorporated): Calculate index objective function ceases locally convex? Default TRUE. warn Return warning messages failures converge model saturation? Default TRUE. trace set TRUE, inform user progress announcing beginning step modeling process. Default FALSE. save_rds Optional: filepath name without '.rds' suffix specified (e.g., save_rds = \"~/dir/my_results\"), model results saved provided location (e.g., \"~/dir/my_results.rds\"). Defaults NULL, save result. compact_save Optional: TRUE, four separate .rds files saved: one 'beta_vals', one 'K', one linear predictors, one everything else (see ). Defaults FALSE. Note: must specify save_rds argument called. return_fit Optional: logical value indicating whether fitted model returned plmm object current (assumed interactive) session. Defaults TRUE -memory data, defaults FALSE filebacked data. ... Additional optional arguments plmm_checks()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. — plmm","text":"list includes 19 items: beta_vals: matrix estimated coefficients original scale. Rows predictors, columns values lambda std_scale_beta: matrix estimated coefficients ~standardized~ scale. returned compact_save = TRUE. std_X_details: list 3 items: center & scale values used center/scale data, vector ('ns') nonsingular columns original data. Nonsingular columns standardized (definition), removed analysis. std_X: standardized design matrix; data filebacked, object filebacked.big.matrix bigmemory package.
Note: std_X saved/returned return_fit = FALSE. y: outcome vector used model fitting. p: total number columns design matrix (including singular columns). plink_flag: logical flag: data come PLINK files? lambda: numeric vector lasso tuning parameter values used model fitting. eta: number (double) 0 1 representing estimated proportion variance outcome attributable population/correlation structure linear_predictors: matrix resulting product stdrot_X estimated coefficients ~rotated~ scale. penalty: character string indicating penalty model fit (e.g., 'MCP') gamma: numeric value indicating tuning parameter used SCAD lasso penalties used. relevant lasso models. alpha: numeric value indicating elastic net tuning parameter. loss: vector numeric values loss value lambda (calculated ~rotated~ scale) penalty_factor: vector indicators corresponding predictor, 1 = predictor penalized. ns_idx: vector indices predictors non-singular features (.e., features variation). iter: numeric vector number iterations needed model fitting value lambda converged: vector logical values indicating whether model fitting converged value lambda K: list 2 elements, s U — s: vector eigenvalues relatedness matrix; see relatedness_mat() details. U: matrix eigenvectors relatedness matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Fit a linear mixed model via non-convex penalized maximum likelihood. 
— plmm","text":"","code":"# using admix data admix_design <- create_design(X = admix$X, y = admix$y) fit_admix1 <- plmm(design = admix_design) s1 <- summary(fit_admix1, idx = 50) print(s1) #> lasso-penalized regression model with n=197, p=101 at lambda=0.01426 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 88 #> ------------------------------------------------- plot(fit_admix1) # Note: for examples with large data that are too big to fit in memory, # see the article \"PLINK files/file-backed matrices\" on our website # https://pbreheny.github.io/plmmr/articles/filebacking.html"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_checks.html","id":null,"dir":"Reference","previous_headings":"","what":"plmm_checks — plmm_checks","title":"plmm_checks — plmm_checks","text":"plmm_checks","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_checks.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"plmm_checks — plmm_checks","text":"","code":"plmm_checks( design, K = NULL, diag_K = NULL, eta_star = NULL, penalty = \"lasso\", init = NULL, gamma, alpha = 1, dfmax = NULL, trace = FALSE, save_rds = NULL, return_fit = TRUE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_checks.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"plmm_checks — plmm_checks","text":"design design object, created create_design() K Similarity matrix used rotate data. either (1) known matrix reflects covariance y, (2) estimate (Default \\(\\frac{1}{p}(XX^T)\\)), (3) list components 'd' 'u', returned choose_k(). diag_K Logical: K diagonal matrix? reflect observations unrelated, can treated unrelated. Defaults FALSE. Note: plmm() check see matrix diagonal. want use diagonal K matrix, must set diag_K = TRUE. eta_star Optional argument input specific eta term rather estimate data. 
K known covariance matrix full rank, 1. penalty penalty applied model. Either \"lasso\" (default), \"SCAD\", \"MCP\". init Initial values coefficients. Default 0 columns X. gamma tuning parameter MCP/SCAD penalty (see details). Default 3 MCP 3.7 SCAD. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. dfmax Option added soon: Upper bound number nonzero coefficients. Default upper bound. However, large data sets, computational burden may heavy models large number nonzero coefficients. trace set TRUE, inform user progress announcing beginning step modeling process. Default FALSE. save_rds Optional: filepath name specified (e.g., save_rds = \"~/dir/my_results.rds\"), model results saved provided location. Defaults NULL, save result. return_fit Optional: logical value indicating whether fitted model returned plmm object current (assumed interactive) session. Defaults TRUE. ... Additional arguments get_data()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_checks.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"plmm_checks — plmm_checks","text":"list parameters pass model fitting.
list includes standardized design matrix, outcome, meta-data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_fit.html","id":null,"dir":"Reference","previous_headings":"","what":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","title":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","text":"PLMM fit: function fits PLMM using values returned plmm_prep()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_fit.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","text":"","code":"plmm_fit( prep, y, std_X_details, eta_star, penalty_factor, fbm_flag, penalty, gamma = 3, alpha = 1, lambda_min, nlambda = 100, lambda, eps = 1e-04, max_iter = 10000, convex = TRUE, dfmax = NULL, init = NULL, warn = TRUE, returnX = TRUE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_fit.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","text":"prep list returned plmm_prep y original (centered) outcome vector. Need intercept estimate std_X_details list components 'center' (values used center X), 'scale' (values used scale X), 'ns' (indices nonsignular columns X) eta_star ratio variances (passed plmm()) penalty_factor multiplicative factor penalty applied coefficient. supplied, penalty_factor must numeric vector length equal number columns X. purpose penalty_factor apply differential penalization coefficients thought likely others model. particular, penalty_factor can 0, case coefficient always model without shrinkage. fbm_flag Logical: std_X FBM object? Passed plmm(). penalty penalty applied model. Either \"MCP\" (default), \"SCAD\", \"lasso\". gamma tuning parameter MCP/SCAD penalty (see details). 
Default 3 MCP 3.7 SCAD. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. lambda_min smallest value lambda, fraction lambda.max. Default .001 number observations larger number covariates .05 otherwise. nlambda Length sequence lambda. Default 100. lambda user-specified sequence lambda values. default, sequence values length nlambda computed, equally spaced log scale. eps Convergence threshold. algorithm iterates RMSD change linear predictors coefficient less eps. Default 1e-4. max_iter Maximum number iterations (total across entire path). Default 10000. convex (future idea; yet incorporated) convex Calculate index objective function ceases locally convex? Default TRUE. dfmax (future idea; yet incorporated) Upper bound number nonzero coefficients. Default upper bound. However, large data sets, computational burden may heavy models large number nonzero coefficients. init Initial values coefficients. Default 0 columns X. warn Return warning messages failures converge model saturation? Default TRUE. returnX Return standardized design matrix along fit? default, option turned X 100 MB, turned larger matrices preserve memory. ... 
Additional arguments can passed biglasso::biglasso_simple_path()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_fit.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"PLMM fit: a function that fits a PLMM using the values returned by plmm_prep() — plmm_fit","text":"list components: std_scale_beta: coefficients estimated scale std_X centered_y: y-values 'centered' mean 0 s U, values vectors eigendecomposition K lambda: vector tuning parameter values linear_predictors: product stdrot_X b (linear predictors transformed restandardized scale) eta: number (double) 0 1 representing estimated proportion variance outcome attributable population/correlation structure. iter: numeric vector number iterations needed model fitting value lambda converged: vector logical values indicating whether model fitting converged value lambda loss: vector numeric values loss value lambda (calculated ~rotated~ scale) penalty: character string indicating penalty model fit (e.g., 'MCP') penalty_factor: vector indicators corresponding predictor, 1 = predictor penalized. gamma: numeric value indicating tuning parameter used SCAD lasso penalties used. relevant lasso models. alpha: numeric value indicating elastic net tuning parameter. ns: indices nonsingular values X feature_names: formatted column names design matrix nlambda: number lambda values used model fitting eps: tolerance ('epsilon') used model fitting max_iter: max. number iterations per model fit warn: logical - warnings given model fit converge? 
init: initial values model fitting trace: logical - messages printed console models fit?","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_format.html","id":null,"dir":"Reference","previous_headings":"","what":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","title":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","text":"PLMM format: function format output model constructed plmm_fit","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_format.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","text":"","code":"plmm_format(fit, p, std_X_details, fbm_flag, plink_flag)"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_format.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","text":"fit list parameters describing output model constructed plmm_fit p number features original data (including constant features) std_X_details list 3 items: * 'center': centering values columns X * 'scale': scaling values non-singular columns X * 'ns': indices of nonsingular columns std_X fbm_flag Logical: corresponding design matrix filebacked? Passed plmm(). plink_flag Logical: data come PLINK files? Note: flag matters non-genomic features handled PLINK files – data PLINK files, unpenalized columns counted p argument. delimited files, p include unpenalized columns.
difference implications untransform() function determines appropriate dimensions estimated coefficient matrix returns.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_format.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"PLMM format: a function to format the output of a model constructed with plmm_fit — plmm_format","text":"list components: beta_vals: matrix estimated coefficients original scale. Rows predictors, columns values lambda lambda: numeric vector lasso tuning parameter values used model fitting. eta: number (double) 0 1 representing estimated proportion variance outcome attributable population/correlation structure. s: vector of eigenvalues relatedness matrix K; see relatedness_mat() details. U: matrix eigenvectors relatedness matrix K rot_y: vector outcome values rotated scale. scale model fit. linear_predictors: matrix resulting product stdrot_X estimated coefficients ~rotated~ scale. penalty: character string indicating penalty model fit (e.g., 'MCP') gamma: numeric value indicating tuning parameter used SCAD lasso penalties used. relevant lasso models. alpha: numeric value indicating elastic net tuning parameter. loss: vector numeric values loss value lambda (calculated ~rotated~ scale) penalty_factor: vector indicators corresponding predictor, 1 = predictor penalized. ns_idx: vector indices predictors nonsingular features (.e., variation).
iter: numeric vector number iterations needed model fitting value lambda converged: vector logical values indicating whether model fitting converged value lambda","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":null,"dir":"Reference","previous_headings":"","what":"Loss method for ","title":"Loss method for ","text":"Loss method \"plmm\" class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Loss method for ","text":"","code":"plmm_loss(y, yhat)"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Loss method for ","text":"y Observed outcomes (response) vector yhat Predicted outcomes (response) vector","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Loss method for ","text":"numeric vector squared-error loss values given observed predicted outcomes","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_loss.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Loss method for ","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design, K = relatedness_mat(admix$X)) yhat <- predict(object = fit, newX = admix$X, type = 'lp', lambda = 0.05) head(plmm_loss(yhat = yhat, y = admix$y)) #> [,1] #> [1,] 0.81638401 #> [2,] 0.09983799 #> [3,] 0.50281622 #> [4,] 0.14234359 #> [5,] 2.03696796 #> [6,] 2.72044268"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_prep.html","id":null,"dir":"Reference","previous_headings":"","what":"PLMM prep: a function to run checks, SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","title":"PLMM prep: a function to run checks, 
SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","text":"PLMM prep: function run checks, SVD, rotation prior fitting PLMM model internal function cv_plmm","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_prep.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"PLMM prep: a function to run checks, SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","text":"","code":"plmm_prep( std_X, std_X_n, std_X_p, genomic = 1:std_X_p, n, p, centered_y, k = NULL, K = NULL, diag_K = NULL, eta_star = NULL, fbm_flag, penalty_factor = rep(1, ncol(std_X)), trace = NULL, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_prep.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"PLMM prep: a function to run checks, SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","text":"std_X Column standardized design matrix. May include clinical covariates non-SNP data. std_X_n number observations std_X (integer) std_X_p number features std_X (integer) genomic numeric vector indices indicating columns standardized X genomic covariates. Defaults columns. n number instances original design matrix X. altered standardization. p number features original design matrix X, including constant features centered_y Continuous outcome vector, centered. k integer specifying number singular values used approximation rotated design matrix. argument passed RSpectra::svds(). Defaults min(n, p) - 1, n p dimensions standardized design matrix. K Similarity matrix used rotate data. either known matrix reflects covariance y, estimate (Default \\(\\frac{1}{p}(XX^T)\\), X standardized). can also list, components d u (returned choose_k) diag_K Logical: K diagonal matrix? reflect observations unrelated, can treated unrelated. Passed plmm(). 
eta_star Optional argument input specific eta term rather estimate data. K known covariance matrix full rank, 1. fbm_flag Logical: std_X FBM type object? set internally plmm(). trace set TRUE, inform user progress announcing beginning step modeling process. Default FALSE. ... used yet","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmm_prep.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"PLMM prep: a function to run checks, SVD, and rotation prior to fitting a PLMM model This is an internal function for cv_plmm — plmm_prep","text":"List components: centered_y: vector centered outcomes std_X: standardized design matrix K: list 2 elements. (1) s: vector eigenvalues K, (2) U: eigenvectors K (left singular values X). eta: numeric value estimated eta parameter trace: logical.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plmmr-package.html","id":null,"dir":"Reference","previous_headings":"","what":"plmmr: Penalized Linear Mixed Models for Correlated Data — plmmr-package","title":"plmmr: Penalized Linear Mixed Models for Correlated Data — plmmr-package","text":"Fits penalized linear mixed models correct unobserved confounding factors. 'plmmr' infers corrects presence unobserved confounding effects population stratification environmental heterogeneity. fits linear model via penalized maximum likelihood. Originally designed multivariate analysis single nucleotide polymorphisms (SNPs) measured genome-wide association study (GWAS), 'plmmr' eliminates need subpopulation-specific analyses post-analysis p-value adjustments. Functions appropriate processing 'PLINK' files also supplied. examples, see package homepage. 
https://pbreheny.github.io/plmmr/.","code":""},{"path":[]},{"path":"https://pbreheny.github.io/plmmr/reference/plmmr-package.html","id":"author","dir":"Reference","previous_headings":"","what":"Author","title":"plmmr: Penalized Linear Mixed Models for Correlated Data — plmmr-package","text":"Maintainer: Patrick J. Breheny patrick-breheny@uiowa.edu (ORCID) Authors: Tabitha K. Peter tabitha-peter@uiowa.edu (ORCID) Anna C. Reisetter anna-reisetter@uiowa.edu (ORCID) Yujing Lu","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot method for cv_plmm class — plot.cv_plmm","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"Plot method cv_plmm class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"","code":"# S3 method for class 'cv_plmm' plot( x, log.l = TRUE, type = c(\"cve\", \"rsq\", \"scale\", \"snr\", \"all\"), selected = TRUE, vertical.line = TRUE, col = \"red\", ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"x object class cv_plmm log.l Logical indicate plot returned natural log scale. Defaults log.l = FALSE. type Type plot return. Defaults \"cve.\" selected Logical indicate variables plotted. Defaults TRUE. vertical.line Logical indicate whether vertical line plotted minimum/maximum value. Defaults TRUE. col Color vertical line, plotted. Defaults \"red.\" ... 
Additional arguments.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"Nothing returned; instead, plot drawn representing relationship tuning parameter 'lambda' value (x-axis) cross validation error (y-axis).","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot method for cv_plmm class — plot.cv_plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cvfit <- cv_plmm(design = admix_design) plot(cvfit)"},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Plot method for plmm class — plot.plmm","title":"Plot method for plmm class — plot.plmm","text":"Plot method plmm class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Plot method for plmm class — plot.plmm","text":"","code":"# S3 method for class 'plmm' plot(x, alpha = 1, log.l = FALSE, shade = TRUE, col, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Plot method for plmm class — plot.plmm","text":"x object class plmm alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. log.l Logical indicate plot returned natural log scale. Defaults log.l = FALSE. shade Logical indicate whether local nonconvex region shaded. Defaults TRUE. col Vector colors coefficient lines. ... 
Additional arguments.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Plot method for plmm class — plot.plmm","text":"Nothing returned; instead, plot coefficient paths drawn value lambda (one 'path' coefficient).","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/plot.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Plot method for plmm class — plot.plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design) plot(fit) plot(fit, log.l = TRUE)"},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Predict method for plmm class — predict.plmm","title":"Predict method for plmm class — predict.plmm","text":"Predict method plmm class","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Predict method for plmm class — predict.plmm","text":"","code":"# S3 method for class 'plmm' predict( object, newX, type = c(\"blup\", \"coefficients\", \"vars\", \"nvars\", \"lp\"), lambda, idx = 1:length(object$lambda), ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Predict method for plmm class — predict.plmm","text":"object object class plmm. newX Matrix values predictions made (used type=\"coefficients\" type settings predict). can either FBM object 'matrix' object. Note: Columns argument must named! type character argument indicating type prediction returned. Options \"lp,\" \"coefficients,\" \"vars,\" \"nvars,\" \"blup.\" See details. lambda numeric vector regularization parameter lambda values predictions requested. 
idx Vector indices penalty parameter lambda predictions required. default, indices returned. ... Additional optional arguments","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Predict method for plmm class — predict.plmm","text":"Depends type - see Details","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Predict method for plmm class — predict.plmm","text":"Define beta-hat coefficients estimated value lambda minimizes cross-validation error (CVE). options type follows: 'response' (default): uses product newX beta-hat predict new values outcome. incorporate correlation structure data. stats folks , simply linear predictor. 'blup' (acronym Best Linear Unbiased Predictor): adds 'response' value represents estimated random effect. addition way incorporating estimated correlation structure data prediction outcome. 'coefficients': returns estimated beta-hat 'vars': returns indices variables (e.g., SNPs) nonzero coefficients value lambda. EXCLUDES intercept. 'nvars': returns number variables (e.g., SNPs) nonzero coefficients value lambda. EXCLUDES intercept.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Predict method for plmm class — predict.plmm","text":"","code":"set.seed(123) train_idx <- sample(1:nrow(admix$X), 100) # Note: ^ shuffling is important here! Keeps test and train groups comparable.
train <- list(X = admix$X[train_idx,], y = admix$y[train_idx]) train_design <- create_design(X = train$X, y = train$y) test <- list(X = admix$X[-train_idx,], y = admix$y[-train_idx]) fit <- plmm(design = train_design) # make predictions for all lambda values pred1 <- predict(object = fit, newX = test$X, type = \"lp\") pred2 <- predict(object = fit, newX = test$X, type = \"blup\") # look at mean squared prediction error mspe <- apply(pred1, 2, function(c){crossprod(test$y - c)/length(c)}) min(mspe) #> [1] 2.87754 mspe_blup <- apply(pred2, 2, function(c){crossprod(test$y - c)/length(c)}) min(mspe_blup) # BLUP is better #> [1] 2.128471 # compare the MSPE of our model to a null model, for reference # null model = intercept only -> y_hat is always mean(y) crossprod(mean(test$y) - test$y)/length(test$y) #> [,1] #> [1,] 6.381748"},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":null,"dir":"Reference","previous_headings":"","what":"Predict method to use in cross-validation (within cvf) — predict_within_cv","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"Predict method use cross-validation (within cvf)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"","code":"predict_within_cv( fit, trainX, trainY = NULL, testX, std_X_details, type, fbm = FALSE, plink_flag = FALSE, Sigma_11 = NULL, Sigma_21 = NULL, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"fit list components returned plmm_fit. trainX training data, pre-standardization pre-rotation trainY training outcome, centered. 
needed type = 'blup' testX design matrix used computing predicted values (.e, test data). std_X_details list 3 items: 'center': centering values columns X 'scale': scaling values non-singular columns X 'ns': indices nonsingular columns std_X. Note: vector really need ! type character argument indicating type prediction returned. Passed cvf(), Options \"lp,\" \"coefficients,\" \"vars,\" \"nvars,\" \"blup.\" See details. fbm Logical: trainX FBM object? , function expects testX also FBM. two X matrices must stored way. Sigma_11 Variance-covariance matrix training data. Extracted estimated_Sigma generated using observations. Required type == 'blup'. Sigma_21 Covariance matrix training testing data. Extracted estimated_Sigma generated using observations. Required type == 'blup'. ... Additional optional arguments","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"numeric vector predicted values","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/predict_within_cv.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Predict method to use in cross-validation (within cvf) — predict_within_cv","text":"Define beta-hat coefficients estimated value lambda minimizes cross-validation error (CVE). options type follows: 'lp' (default): uses linear predictor (.e., product test data estimated coefficients) predict test values outcome. Note approach incorporate correlation structure data. 'blup' (acronym Best Linear Unbiased Predictor): adds 'lp' value represents estimated random effect. addition way incorporating estimated correlation structure data prediction outcome. Note: main difference function predict.plmm() method CV, predictions made standardized scale (.e., trainX testX data come std_X). 
predict.plmm() method makes predictions scale X (original scale)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/pretty_time.html","id":null,"dir":"Reference","previous_headings":"","what":"a function to format the time — pretty_time","title":"a function to format the time — pretty_time","text":"function format time","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/pretty_time.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"a function to format the time — pretty_time","text":"","code":"pretty_time()"},{"path":"https://pbreheny.github.io/plmmr/reference/pretty_time.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"a function to format the time — pretty_time","text":"string formatted current date time","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"Print method summary.cv_plmm objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"","code":"# S3 method for class 'summary.cv_plmm' print(x, digits, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"x object class summary.cv_plmm digits number digits use formatting output ... 
used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"Nothing returned; instead, message printed console summarizing results cross-validated model fit.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Print method for summary.cv_plmm objects — print.summary.cv_plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cv_fit <- cv_plmm(design = admix_design) print(summary(cv_fit)) #> lasso-penalized model with n=197 and p=101 #> At minimum cross-validation error (lambda=0.2168): #> ------------------------------------------------- #> Nonzero coefficients: 10 #> Cross-validation error (deviance): 1.96 #> Scale estimate (sigma): 1.399"},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to print the summary of a plmm model — print.summary.plmm","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"function print summary plmm model","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"","code":"# S3 method for class 'summary.plmm' print(x, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"x summary.plmm object ... 
used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"Nothing returned; instead, message printed console summarizing results model fit.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/print.summary.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A function to print the summary of a plmm model — print.summary.plmm","text":"","code":"lam <- rev(seq(0.01, 1, length.out=20)) |> round(2) # for sake of example admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design, lambda = lam) fit2 <- plmm(design = admix_design, penalty = \"SCAD\", lambda = lam) print(summary(fit, idx = 18)) #> lasso-penalized regression model with n=197, p=101 at lambda=0.1100 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 27 #> ------------------------------------------------- print(summary(fit2, idx = 18)) #> SCAD-penalized regression model with n=197, p=101 at lambda=0.1100 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 29 #> -------------------------------------------------"},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to read in large data files as an FBM — process_delim","title":"A function to read in large data files as an FBM — process_delim","text":"function read large data files FBM","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to read in large data 
files as an FBM — process_delim","text":"","code":"process_delim( data_dir, data_file, feature_id, rds_dir = data_dir, rds_prefix, logfile = NULL, overwrite = FALSE, quiet = FALSE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to read in large data files as an FBM — process_delim","text":"data_dir directory file. data_file file read , without filepath. file numeric values. Example: use data_file = \"myfile.txt\", data_file = \"~/mydirectory/myfile.txt\" Note: file headers/column names, set 'header = TRUE' – passed bigmemory::read.big.matrix(). feature_id string specifying column data X (feature data) row IDs (e.g., identifiers row/sample/participant/, etc.). duplicates allowed. rds_dir directory user wants create '.rds' '.bk' files Defaults data_dir rds_prefix String specifying user's preferred filename --created .rds file (create inside rds_dir folder) Note: 'rds_prefix' 'data_prefix' logfile Optional: name (character string) prefix logfile written. Defaults 'process_delim', .e. get 'process_delim.log' outfile. overwrite Logical: existing .bk/.rds files exist specified directory/prefix, overwritten? Defaults FALSE. Note: multiple .rds files names start \"std_prefix_...\", error . protect users accidentally deleting files saved results, one .rds file can removed option. quiet Logical: messages printed console silenced? Defaults FALSE. ... Optional: arguments passed bigmemory::read.big.matrix().
Note: 'sep' option pass , 'header'.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to read in large data files as an FBM — process_delim","text":"file path newly created '.rds' file","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_delim.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A function to read in large data files as an FBM — process_delim","text":"","code":"temp_dir <- tempdir() colon_dat <- process_delim(data_file = \"colon2.txt\", data_dir = find_example_data(parent = TRUE), overwrite = TRUE, rds_dir = temp_dir, rds_prefix = \"processed_colon2\", sep = \"\\t\", header = TRUE) #> #> Overwriting existing files:processed_colon2.bk/.rds/.desc #> There are 62 observations and 2001 features in the specified data files. #> At this time, plmmr::process_delim() does not not handle missing values in delimited data. #> Please make sure you have addressed missingness before you proceed. #> #> process_plink() completed #> Processed files now saved as /tmp/RtmpuyGgXe/processed_colon2.rds colon2 <- readRDS(colon_dat) str(colon2) #> List of 3 #> $ X:Formal class 'big.matrix.descriptor' [package \"bigmemory\"] with 1 slot #> .. ..@ description:List of 13 #> .. .. ..$ sharedType: chr \"FileBacked\" #> .. .. ..$ filename : chr \"processed_colon2.bk\" #> .. .. ..$ dirname : chr \"/tmp/RtmpuyGgXe/\" #> .. .. ..$ totalRows : int 62 #> .. .. ..$ totalCols : int 2001 #> .. .. ..$ rowOffset : num [1:2] 0 62 #> .. .. ..$ colOffset : num [1:2] 0 2001 #> .. .. ..$ nrow : num 62 #> .. .. ..$ ncol : num 2001 #> .. .. ..$ rowNames : NULL #> .. .. ..$ colNames : chr [1:2001] \"sex\" \"Hsa.3004\" \"Hsa.13491\" \"Hsa.13491.1\" ... #> .. .. ..$ type : chr \"double\" #> .. .. 
..$ separated : logi FALSE #> $ n: num 62 #> $ p: num 2001 #> - attr(*, \"class\")= chr \"processed_delim\""},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":null,"dir":"Reference","previous_headings":"","what":"Preprocess PLINK files using the bigsnpr package — process_plink","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"Preprocess PLINK files using bigsnpr package","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"","code":"process_plink( data_dir, data_prefix, rds_dir = data_dir, rds_prefix, logfile = NULL, impute = TRUE, impute_method = \"mode\", id_var = \"IID\", parallel = TRUE, quiet = FALSE, overwrite = FALSE, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"data_dir path bed/bim/fam data files, without trailing \"/\" (e.g., use data_dir = '~/my_dir', data_dir = '~/my_dir/') data_prefix prefix (character string) bed/fam data files (e.g., data_prefix = 'mydata') rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir rds_prefix String specifying user's preferred filename --created .rds file (create inside rds_dir folder) Note: 'rds_prefix' 'data_prefix' logfile Optional: name (character string) prefix logfile written 'rds_dir'. Default NULL (log file written). Note: supply file path argument, error \"file found\" error. supply string; e.g., want my_log.log, supply 'my_log', my_log.log file appear rds_dir. impute Logical: data imputed? Default TRUE. impute_method 'impute' = TRUE, argument specify kind imputation desired. Options : * mode (default): Imputes frequent call.
See bigsnpr::snp_fastImputeSimple() details. * random: Imputes sampling according allele frequencies. * mean0: Imputes rounded mean. * mean2: Imputes mean rounded 2 decimal places. * xgboost: Imputes using algorithm based local XGBoost models. See bigsnpr::snp_fastImpute() details. Note: can take several minutes, even relatively small data set. id_var String specifying column PLINK .fam file unique sample identifiers. Options \"IID\" (default) \"FID\" parallel Logical: computations within function run parallel? Defaults TRUE. See count_cores() ?bigparallelr::assert_cores details. particular, user aware much parallelization can make computations slower. quiet Logical: messages printed console silenced? Defaults FALSE overwrite Logical: existing .bk/.rds files exist specified directory/prefix, overwritten? Defaults FALSE. Set TRUE want change imputation method using, etc. Note: multiple .rds files names start \"std_prefix_...\", error . protect users accidentally deleting files saved results, one .rds file can removed option. ... Optional: additional arguments bigsnpr::snp_fastImpute() (relevant impute_method = \"xgboost\")","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"filepath '.rds' object created; see details explanation.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/process_plink.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Preprocess PLINK files using the bigsnpr package — process_plink","text":"Three files created location specified rds_dir: 'rds_prefix.rds': list three items: (1) X: filebacked bigmemory::big.matrix object pointing imputed genotype data. 
matrix type 'double', important downstream operations create_design() (2) map: data.frame PLINK 'bim' data (.e., variant information) (3) fam: data.frame PLINK 'fam' data (.e., pedigree information) 'prefix.bk': backingfile stores numeric data genotype matrix 'rds_prefix.desc': description file, needed Note process_plink() need run given set PLINK files; subsequent data analysis/scripts, get_data() access '.rds' file. example, see vignette processing PLINK files","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_data_files.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","title":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","text":"function read large file numeric file-backed matrix (FBM) Note: function wrapper bigstatsr::big_read()","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_data_files.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","text":"","code":"read_data_files( data_file, data_dir, rds_dir, rds_prefix, outfile, overwrite, quiet, ... )"},{"path":"https://pbreheny.github.io/plmmr/reference/read_data_files.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","text":"data_file name file read, including directory. Directory specified data_dir data_dir path directory 'file' rds_dir path directory want create new '.rds' '.bk' files. 
Defaults data_dir rds_prefix String specifying user's preferred filename --created .rds/.bk files (create inside rds_dir folder) Note: 'rds_prefix' 'data_file' outfile Optional: name (character string) prefix logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. overwrite Logical: existing .bk/.rds files exist specified directory/prefix, overwritten? Defaults FALSE. Set TRUE want change imputation method using, etc. quiet Logical: messages printed console? Defaults TRUE ... Optional: arguments passed bigmemory::read.big.matrix(). Note: 'sep' option pass .","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_data_files.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to read in a large file as a numeric file-backed matrix (FBM) Note: this function is a wrapper for bigstatsr::big_read() — read_data_files","text":"'.rds', '.bk', '.desc' files created data_dir, obj (filebacked bigmemory big.matrix object) returned. 
See bigmemory documentation info big.matrix class.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_plink_files.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to read in PLINK files using bigsnpr methods — read_plink_files","title":"A function to read in PLINK files using bigsnpr methods — read_plink_files","text":"function read PLINK files using bigsnpr methods","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_plink_files.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to read in PLINK files using bigsnpr methods — read_plink_files","text":"","code":"read_plink_files( data_dir, data_prefix, rds_dir, outfile, parallel, overwrite, quiet )"},{"path":"https://pbreheny.github.io/plmmr/reference/read_plink_files.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A function to read in PLINK files using bigsnpr methods — read_plink_files","text":"data_dir path bed/bim/fam data files, without trailing \"/\" (e.g., use data_dir = '~/my_dir', data_dir = '~/my_dir/') data_prefix prefix (character string) bed/fam data files (e.g., prefix = 'mydata') rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir outfile Optional: name (character string) prefix logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. parallel Logical: computations within function run parallel? Defaults TRUE. See count_cores() ?bigparallelr::assert_cores details. particular, user aware much parallelization can make computations slower. overwrite Logical: existing .bk/.rds files exist specified directory/prefix, overwritten? Defaults FALSE. Set TRUE want change imputation method using, etc. quiet Logical: messages printed console? 
Defaults TRUE","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/read_plink_files.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to read in PLINK files using bigsnpr methods — read_plink_files","text":"'.rds' '.bk' files created data_dir, obj (bigSNP object) returned. See bigsnpr documentation info bigSNP class.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculate a relatedness matrix — relatedness_mat","title":"Calculate a relatedness matrix — relatedness_mat","text":"Given matrix genotypes, function estimates genetic relatedness matrix (GRM, also known RRM, see Hayes et al. 2009, doi:10.1017/S0016672308009981 ) among subjects: XX'/p, X standardized.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculate a relatedness matrix — relatedness_mat","text":"","code":"relatedness_mat(X, std = TRUE, fbm = FALSE, ns = NULL, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculate a relatedness matrix — relatedness_mat","text":"X n x p numeric matrix genotypes (fully-imputed data). Note: matrix include non-genetic features. std Logical: X standardized? set FALSE (can done data stored memory), good reason , standardization best practice. fbm Logical: X stored FBM? Defaults FALSE ns Optional vector values indicating indices nonsingular features ... 
optional arguments bigstatsr::big_apply() (like ncores = ...)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculate a relatedness matrix — relatedness_mat","text":"n x n numeric matrix capturing genomic relatedness samples represented X. notation, call matrix K 'kinship'; also known GRM RRM.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/relatedness_mat.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculate a relatedness matrix — relatedness_mat","text":"","code":"RRM <- relatedness_mat(X = admix$X) RRM[1:5, 1:5] #> [,1] [,2] [,3] [,4] [,5] #> [1,] 0.81268908 -0.09098097 -0.07888910 0.06770613 0.08311777 #> [2,] -0.09098097 0.81764801 0.20480021 0.02112812 -0.02640295 #> [3,] -0.07888910 0.20480021 0.82177986 -0.02864226 0.18693970 #> [4,] 0.06770613 0.02112812 -0.02864226 0.89327266 -0.03541470 #> [5,] 0.08311777 -0.02640295 0.18693970 -0.03541470 0.79589686"},{"path":"https://pbreheny.github.io/plmmr/reference/rotate_filebacked.html","id":null,"dir":"Reference","previous_headings":"","what":"A function to rotate filebacked data — rotate_filebacked","title":"A function to rotate filebacked data — rotate_filebacked","text":"function rotate filebacked data","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/rotate_filebacked.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A function to rotate filebacked data — rotate_filebacked","text":"","code":"rotate_filebacked(prep, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/rotate_filebacked.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A function to rotate filebacked data — rotate_filebacked","text":"list 4 items: stdrot_X: X rotated re-standardized scale rot_y: y rotated scale (numeric vector) stdrot_X_center: numeric vector values used 
center rot_X stdrot_X_scale: numeric vector values used scale rot_X","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/setup_lambda.html","id":null,"dir":"Reference","previous_headings":"","what":"Compute sequence of lambda values — setup_lambda","title":"Compute sequence of lambda values — setup_lambda","text":"function allows compute sequence lambda values plmm models.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/setup_lambda.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Compute sequence of lambda values — setup_lambda","text":"","code":"setup_lambda( X, y, alpha, lambda_min, nlambda, penalty_factor, intercept = TRUE )"},{"path":"https://pbreheny.github.io/plmmr/reference/setup_lambda.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Compute sequence of lambda values — setup_lambda","text":"X Rotated standardized design matrix includes intercept column present. May include clinical covariates non-SNP data. can either 'matrix' 'FBM' object. y Continuous outcome vector. alpha Tuning parameter Mnet estimator controls relative contributions MCP/SCAD penalty ridge, L2 penalty. alpha=1 equivalent MCP/SCAD penalty, alpha=0 equivalent ridge regression. However, alpha=0 supported; alpha may arbitrarily small, exactly 0. lambda_min smallest value lambda, fraction lambda.max. Default .001 number observations larger number covariates .05 otherwise. value lambda_min = 0 supported. nlambda desired number lambda values sequence generated. penalty_factor multiplicative factor penalty applied coefficient. supplied, penalty_factor must numeric vector length equal number columns X. purpose penalty_factor apply differential penalization coefficients thought likely others model. particular, penalty_factor can 0, case coefficient always model without shrinkage. intercept Logical: X contain intercept column? 
Defaults TRUE.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/setup_lambda.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Compute sequence of lambda values — setup_lambda","text":"numeric vector lambda values, equally spaced log scale","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_filebacked.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to standardize a filebacked matrix — standardize_filebacked","title":"A helper function to standardize a filebacked matrix — standardize_filebacked","text":"helper function standardize filebacked matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_filebacked.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to standardize a filebacked matrix — standardize_filebacked","text":"","code":"standardize_filebacked( X, new_file, rds_dir, non_gen, complete_outcome, id_var, outfile, quiet, overwrite )"},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_filebacked.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to standardize a filebacked matrix — standardize_filebacked","text":"X list includes: (1) subset_X: big.matrix object subset &/additional predictors appended columns (2) ns: numeric vector indicating indices nonsingular columns subset_X new_file new_file (character string) bed/fam data files (e.g., new_file = 'mydata') rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir outfile Optional: name (character string) new_file logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. quiet Logical: messages printed console? Defaults FALSE (leaves print messages ...) 
overwrite Logical: existing .bk/.rds files exist specified directory/new_file, overwritten?","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_filebacked.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to standardize a filebacked matrix — standardize_filebacked","text":"list new component obj called 'std_X' - FBM column-standardized data. List also includes several indices/meta-data standardized matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to standardize matrices — standardize_in_memory","title":"A helper function to standardize matrices — standardize_in_memory","text":"helper function standardize matrices","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to standardize matrices — standardize_in_memory","text":"","code":"standardize_in_memory(X)"},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to standardize matrices — standardize_in_memory","text":"X matrix","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to standardize matrices — standardize_in_memory","text":"list standardized matrix, vectors centering/scaling values, vector indices nonsingular columns","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/standardize_in_memory.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"A helper function to standardize matrices — standardize_in_memory","text":"function adapted 
https://github.com/pbreheny/ncvreg/blob/master/R/std.R NOTE: function returns matrix memory. standardizing filebacked data, use big_std() – see src/big_standardize.cpp","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/subset_filebacked.html","id":null,"dir":"Reference","previous_headings":"","what":"A helper function to subset big.matrix objects — subset_filebacked","title":"A helper function to subset big.matrix objects — subset_filebacked","text":"helper function subset big.matrix objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/subset_filebacked.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A helper function to subset big.matrix objects — subset_filebacked","text":"","code":"subset_filebacked(X, new_file, complete_samples, ns, rds_dir, outfile, quiet)"},{"path":"https://pbreheny.github.io/plmmr/reference/subset_filebacked.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A helper function to subset big.matrix objects — subset_filebacked","text":"X filebacked big.matrix --standardized design matrix new_file Optional user-specified new_file --created .rds/.bk files. complete_samples Numeric vector indices marking rows original data non-missing entry 6th column .fam file ns Numeric vector indices non-singular columns vector created handle_missingness() rds_dir path directory want create new '.rds' '.bk' files. Defaults data_dir outfile Optional: name (character string) new_file logfile written. Defaults 'process_plink', .e. get 'process_plink.log' outfile. quiet Logical: messages printed console? Defaults FALSE (leaves print messages ...)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/subset_filebacked.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A helper function to subset big.matrix objects — subset_filebacked","text":"list two components. 
First, big.matrix object, 'subset_X', representing design matrix wherein: rows subset according user's specification handle_missing_phen columns subset constant features remain – important standardization downstream list also includes integer vector 'ns' marks columns original matrix 'non-singular' (.e. constant features). 'ns' index plays important role plmm_format() untransform() (helper functions model fitting)","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"A summary function for cv_plmm objects — summary.cv_plmm","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"summary function cv_plmm objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"","code":"# S3 method for class 'cv_plmm' summary(object, lambda = \"min\", ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"object cv_plmm object lambda regularization parameter value inference reported. Can choose numeric value, 'min', '1se'. Defaults 'min.' ... used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"return value object S3 class summary.cv_plmm. 
class print method contains following list elements: lambda_min: lambda value minimum cross validation error lambda.1se: maximum lambda value within 1 standard error minimum cross validation error penalty: penalty applied fitted model nvars: number non-zero coefficients selected lambda value cve: cross validation error folds min: minimum cross validation error fit: plmm fit used cross validation returnBiasDetails = TRUE, two items returned: bias: mean bias cross validation loss: loss value lambda","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.cv_plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A summary function for cv_plmm objects — summary.cv_plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) cv_fit <- cv_plmm(design = admix_design) summary(cv_fit) #> lasso-penalized model with n=197 and p=101 #> At minimum cross-validation error (lambda=0.2168): #> ------------------------------------------------- #> Nonzero coefficients: 10 #> Cross-validation error (deviance): 2.12 #> Scale estimate (sigma): 1.455"},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":null,"dir":"Reference","previous_headings":"","what":"A summary method for the plmm objects — summary.plmm","title":"A summary method for the plmm objects — summary.plmm","text":"summary method plmm objects","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A summary method for the plmm objects — summary.plmm","text":"","code":"# S3 method for class 'plmm' summary(object, lambda, idx, eps = 1e-05, ...)"},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A summary method for the plmm objects — summary.plmm","text":"object object class plmm lambda regularization parameter 
value inference reported. idx Alternatively, lambda may specified index; idx=10 means: report inference 10th value lambda along regularization path. lambda idx specified, lambda takes precedence. eps lambda given, eps tolerance difference given lambda value lambda value object. Defaults 0.00001 (1e-5) ... used","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A summary method for the plmm objects — summary.plmm","text":"return value object S3 class summary.plmm. class print method contains following list elements: penalty: penalty used plmm (e.g. SCAD, MCP, lasso) n: Number instances/observations std_X_n: number observations standardized data; time differ 'n' data PLINK external data include samples p: Number regression coefficients (including intercept) converged: Logical indicator whether model converged lambda: lambda value inference reported lambda_char: formatted character string indicating lambda value nvars: number nonzero coefficients (, including intercept) value lambda nonzero: column names indicating nonzero coefficients model specified value lambda","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/summary.plmm.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"A summary method for the plmm objects — summary.plmm","text":"","code":"admix_design <- create_design(X = admix$X, y = admix$y) fit <- plmm(design = admix_design) summary(fit, idx = 97) #> lasso-penalized regression model with n=197, p=101 at lambda=0.00054 #> ------------------------------------------------- #> The model converged #> ------------------------------------------------- #> # of non-zero coefficients: 98 #> -------------------------------------------------"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform.html","id":null,"dir":"Reference","previous_headings":"","what":"Untransform coefficient values back 
to the original scale — untransform","title":"Untransform coefficient values back to the original scale — untransform","text":"function unwinds initial standardization data obtain coefficient values original scale. called plmm_format().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Untransform coefficient values back to the original scale — untransform","text":"","code":"untransform( std_scale_beta, p, std_X_details, fbm_flag, plink_flag, use_names = TRUE )"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Untransform coefficient values back to the original scale — untransform","text":"std_scale_beta estimated coefficients standardized scale p number columns original design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' fbm_flag Logical: corresponding design matrix filebacked? plink_flag Logical: data come PLINK files? Note: flag matters non-genomic features handled PLINK files – data PLINK files, unpenalized columns counted p argument. delimited files, p include unpenalized columns. difference implications untransform() function determines appropriate dimensions estimated coefficient matrix returns. use_names Logical: names added? Defaults TRUE. 
Set FALSE inside cvf() helper, 'ns' vary within CV folds.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Untransform coefficient values back to the original scale — untransform","text":"matrix estimated coefficients, 'beta_vals', scale original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_delim.html","id":null,"dir":"Reference","previous_headings":"","what":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","text":"function unwinds initial standardization data obtain coefficient values original scale. called plmm_format().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_delim.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","text":"","code":"untransform_delim( std_scale_beta, p, std_X_details, plink_flag, use_names = TRUE )"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_delim.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","text":"std_scale_beta estimated coefficients standardized scale p number columns original design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' use_names Logical: names added? Defaults TRUE. 
Set FALSE inside cvf() helper, 'ns' vary within CV folds.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_delim.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_delim","text":"matrix estimated coefficients, 'beta_vals', scale original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_in_memory.html","id":null,"dir":"Reference","previous_headings":"","what":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","title":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","text":"function unwinds initial standardization data obtain coefficient values original scale. called plmm_format().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_in_memory.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","text":"","code":"untransform_in_memory(std_scale_beta, p, std_X_details, use_names = TRUE)"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_in_memory.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","text":"std_scale_beta estimated coefficients standardized scale p number columns original design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' use_names Logical: names added? Defaults TRUE. 
Set FALSE inside cvf() helper, 'ns' vary within CV folds.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_in_memory.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Untransform coefficient values back to the original scale In memory — untransform_in_memory","text":"matrix estimated coefficients, 'beta_vals', scale original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_plink.html","id":null,"dir":"Reference","previous_headings":"","what":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","text":"function unwinds initial standardization data obtain coefficient values original scale. called plmm_format().","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_plink.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","text":"","code":"untransform_plink( std_scale_beta, p, std_X_details, plink_flag, use_names = TRUE )"},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_plink.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","text":"std_scale_beta estimated coefficients standardized scale p number columns original design matrix std_X_details list 3 elements describing standardized design matrix rotation; elements 'scale', 'center', 'ns' use_names Logical: names added? Defaults TRUE. 
Set FALSE inside cvf() helper, 'ns' vary within CV folds.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/untransform_plink.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Untransform coefficient values back to the original scale for file-backed data — untransform_plink","text":"matrix estimated coefficients, 'beta_vals', scale original data.","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":null,"dir":"Reference","previous_headings":"","what":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"Linux/Unix MacOS , companion function unzip .gz files ship plmmr package","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"","code":"unzip_example_data(outdir)"},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"outdir file path directory .gz files written","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"Nothing returned; PLINK files ship plmmr package stored directory 
specified 'outdir'","code":""},{"path":"https://pbreheny.github.io/plmmr/reference/unzip_example_data.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"For Linux/Unix and MacOS only, here is a companion function to unzip the .gz files that ship with the plmmr package — unzip_example_data","text":"example function, look vignette('plink_files', package = \"plmmr\"). Note : function work Windows systems - Linux/Unix MacOS.","code":""},{"path":[]},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"bug-fixes-4-2-0","dir":"Changelog","previous_headings":"","what":"Bug fixes","title":"plmmr 4.2.0 (2024-12-13)","text":"recently caught couple bugs model fitting functions – apologize errors may caused downstream analysis, explain addressed issues : Bug BLUP: caught mathematical error earlier implementation best linear unbiased prediction. issue inconsistency scaling among terms used constructing predictor. issue impacted prediction within cross-validation well predict() method plmm class. recommend users used best linear unbiased prediction (BLUP) previous analysis re-run analysis using corrected version. Bug processing delimited files: noticed bug way models fit data delimited files. previous version correctly implementing transformation model results standardized scale original scale due inadvertent addition two rows beta_vals object (one row added, intercept). error corrected. recommend users used previous version plmmr analyze data delimited files re-run analyses.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"other-changes-4-2-0","dir":"Changelog","previous_headings":"","what":"Other changes","title":"plmmr 4.2.0 (2024-12-13)","text":"Change default settings prediction: default prediction method predict() cv_plmm() now ‘blup’ (best linear unbiased prediction). 
Change in the objects returned by plmm() by default: by default, the main model fitting function plmm() now returns std_X (a copy of the standardized design matrix), y (the outcome vector used to fit the model), and std_scale_beta (the estimated coefficients on the standardized scale). These components are used to construct the best linear unbiased predictor. The user can opt out of returning these items using the return_fit = FALSE and compact_save options. Change in the arguments passed to predict(): in tandem with the change in what plmm() returns by default, the predict() method no longer needs separate X and y arguments to be supplied for type = 'blup'. The components needed for the BLUP are returned by plmm() by default. Note that predict() is still in the early stages of development for filebacked data; given the complexities and particularities of how filebacked data are processed (particularly data with constant features), there are edge cases that the predict() method cannot handle yet. We will continue to work on developing this method; for now, there is an example of predict() with filebacked data in the vignette on delimited data. Note that in this particular example of delimited data, there are no constant features in the design matrix.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-410-2024-10-23","dir":"Changelog","previous_headings":"","what":"plmmr 4.1.0 (2024-10-23)","title":"plmmr 4.1.0 (2024-10-23)","text":"CRAN release: 2024-10-23 Restored plmm(X, y) syntax: version 4.0.0 required that create_design() always be called prior to plmm() or cv_plmm(); this update restores the X, y syntax to be consistent with other packages (e.g., glmnet, ncvreg). Note that this syntax is available only in the case where the design matrix is stored as an in-memory matrix or data.frame object. The create_design() function is still required in cases where the design matrix/dataset is stored in an external file. Bug fix: the 4.0.0 version of create_design() required X to have column names, and errored with an uninformative message if no names were supplied (see issue 61). This is now fixed – column names are not required unless the user wants to specify the argument unpen. Argument name change: in create_design(), the argument that specifies the outcome in the in-memory case has been renamed to y; this makes the syntax consistent, e.g., create_design(X, y). Note that this change is relevant for in-memory data only. 
Internal: fixed an LTO type mismatch bug.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-400-2024-10-07","dir":"Changelog","previous_headings":"","what":"plmmr 4.0.0 (2024-10-07)","title":"plmmr 4.0.0 (2024-10-07)","text":"CRAN release: 2024-10-11 Major re-structuring of the preprocessing pipeline: data from external files must now be processed with process_plink() or process_delim(). All data (including in-memory data) must then be prepared for analysis via create_design(). This change ensures that all data are funneled into a uniform format for analysis. Documentation updates: the vignettes for the package have been revised to include examples of the complete pipeline with the new create_design() syntax. There is now an article for each type of data input (matrix/data.frame, delimited file, PLINK). CRAN: the package is on CRAN now.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-320-2024-09-02","dir":"Changelog","previous_headings":"","what":"plmmr 3.2.0 (2024-09-02)","title":"plmmr 3.2.0 (2024-09-02)","text":"bigsnpr is now in Suggests, not Imports: all essential filebacking support is now done with bigmemory and bigalgebra. The bigsnpr package is used only for processing PLINK files. The dev branch gwas_scale has a version of the pipeline that runs completely file-backed.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-310-2024-07-13","dir":"Changelog","previous_headings":"","what":"plmmr 3.1.0 (2024-07-13)","title":"plmmr 3.1.0 (2024-07-13)","text":"Enhancement: to give plmmr better functionality for writing scripts, the functions process_plink(), plmm(), and cv_plmm() now (optionally) write '.log' files, as PLINK does. Enhancement: in cases where users are working with large datasets, it may not be practical or desirable for the results returned by plmm() and cv_plmm() to be saved in a single '.rds' file. There is now an option in the model fitting functions called 'compact_save', which gives users the option to save the output as multiple, smaller '.rds' files. 
Argument removed: the argument std_needed is no longer available in the plmm() and cv_plmm() functions.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-300-2024-06-27","dir":"Changelog","previous_headings":"","what":"plmmr 3.0.0 (2024-06-27)","title":"plmmr 3.0.0 (2024-06-27)","text":"Bug fix: cross-validation implementation issues have been fixed. Previously, the full set of eigenvalues was used inside the CV folds, which is not ideal because it involves information from outside the fold. Now, the entire modeling process is cross-validated: standardization, the eigendecomposition of the relatedness matrix, model fitting, and the backtransformation onto the original scale for prediction. Computational speedup: standardization and rotation of filebacked data are now much faster; bigalgebra and bigmemory are now used for these computations. Internal: on the standardized scale, the intercept of a PLMM is the mean of the outcome. This derivation considerably simplifies how the intercept is handled internally in model fitting.","code":""},{"path":"https://pbreheny.github.io/plmmr/news/index.html","id":"plmmr-221-2024-03-16","dir":"Changelog","previous_headings":"","what":"plmmr 2.2.1 (2024-03-16)","title":"plmmr 2.2.1 (2024-03-16)","text":"Name change: changed the package name to plmmr; note that plmm(), cv_plmm(), and the other functions starting with plmm_ did not change names.","code":""}]