This is an R package for performing the MetaSTAAR procedure in whole genome sequencing studies. MetaSTAARlite [Tutorial], a lightweight package implementing MetaSTAAR, provides a pipeline for functionally-informed meta-analysis of sequencing studies.
MetaSTAAR is an R package for performing the Meta-analysis of variant-Set Test for Association using Annotation infoRmation (MetaSTAAR) procedure in whole genome sequencing (WGS) studies. MetaSTAAR enables functionally-informed rare variant meta-analysis of large WGS studies using an efficient sparse matrix approach for storing summary statistics. It protects the data privacy of study participants because no subject-level data need to be shared. MetaSTAAR accounts for relatedness and population structure for both continuous and dichotomous traits, and boosts the power of rare variant meta-analysis by incorporating multiple variant functional annotations.
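To give a feel for why sparse storage suits rare variant summary statistics, the toy sketch below simulates a genotype matrix of rare variants and compares the memory footprint of its variant-by-variant cross-product in sparse and dense form. This only illustrates the general idea using the Matrix package; it is not MetaSTAAR's actual summary statistic format, and the sample size, variant count, and allele frequency are made-up values.

```r
library(Matrix)

set.seed(1)
n   <- 2000    # number of samples (hypothetical)
p   <- 500     # number of variants (hypothetical)
maf <- 0.001   # minor allele frequency shared by all variants (assumption)

# Simulated genotype dosages (0/1/2); with rare variants almost all entries are 0
G_dense  <- matrix(as.numeric(rbinom(n * p, size = 2, prob = maf)), nrow = n, ncol = p)
G_sparse <- Matrix(G_dense, sparse = TRUE)

# Variant-by-variant cross-product G'G, kept in sparse form
K_sparse <- crossprod(G_sparse)
# The same quantity computed and stored densely
K_dense  <- crossprod(G_dense)

size_sparse <- as.numeric(object.size(K_sparse))
size_dense  <- as.numeric(object.size(K_dense))

cat("Sparse storage (bytes):", size_sparse, "\n")
cat("Dense storage (bytes): ", size_dense, "\n")
```

Both objects hold the same numbers; only the storage differs. The gap widens as the number of variants grows, because most pairs of rare variants are never carried by the same individual.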
R (recommended version >= 3.5.1)
For optimal computational performance, it is recommended to use an R version configured with the Intel Math Kernel Library (or other fast BLAS/LAPACK libraries). See the instructions on building R with Intel MKL.
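A quick way to check these prerequisites from within R is sketched below. It uses only base R functions; the MKL test is a simple heuristic on the library path (an assumption, since file names vary across platforms) and an empty path only means R could not report one.

```r
# Compare the running R version with the recommended minimum
r_version_ok <- getRversion() >= "3.5.1"

# Paths of the BLAS and LAPACK libraries this R session is using
blas_path   <- extSoftVersion()[["BLAS"]]
lapack_path <- La_library()

# Heuristic only: look for "mkl" in either path (assumption about file naming)
looks_like_mkl <- grepl("mkl", paste(blas_path, lapack_path), ignore.case = TRUE)

cat("R version >= 3.5.1:", r_version_ok, "\n")
cat("BLAS library:      ", blas_path, "\n")
cat("LAPACK library:    ", lapack_path, "\n")
cat("Looks like MKL:    ", looks_like_mkl, "\n")
```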
MetaSTAAR links to the R packages Rcpp, RcppArmadillo, and STAAR, and imports the R packages Rcpp, STAAR, Matrix, dplyr, expm, and MASS. Install these dependencies before installing MetaSTAAR.
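The following sketch reports which of the dependencies listed above are not yet installed. The helper `is_installed()` is a hypothetical name introduced here, and the install commands are left as comments because the right source for each package (in particular for STAAR) should be taken from that package's own instructions.

```r
# Dependencies named in this README
deps <- c("Rcpp", "RcppArmadillo", "STAAR", "Matrix", "dplyr", "expm", "MASS")

# Hypothetical helper: TRUE if a package can be found in the library paths
# (does not load the package)
is_installed <- function(pkg) nzchar(system.file(package = pkg))

installed    <- vapply(deps, is_installed, logical(1))
missing_deps <- deps[!installed]

if (length(missing_deps) == 0) {
  cat("All MetaSTAAR dependencies are installed.\n")
} else {
  cat("Missing dependencies:", paste(missing_deps, collapse = ", "), "\n")
  # Install whatever is missing before installing MetaSTAAR, for example:
  # install.packages(setdiff(missing_deps, "STAAR"))
  # STAAR: follow the installation instructions of the STAAR package
}
```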
library(devtools)
devtools::install_github("xihaoli/MetaSTAAR",ref="main")
Please see the MetaSTAAR user manual for detailed usage of the MetaSTAAR package. The scripts used to generate the results in the manuscript are available on Zenodo.
The whole-genome functional annotation data, assembled from a variety of sources, and the precomputed annotation principal components are available at the Functional Annotation of Variant - Online Resource (FAVOR) site and in the FAVOR Essential Database.
The current version is 0.9.6.3 (February 5, 2024).
If you use MetaSTAAR for your work, please cite:
Xihao Li, Corbin Quick, Hufeng Zhou, Sheila M. Gaynor, Yaowu Liu, Han Chen, Margaret Sunitha Selvaraj, Ryan Sun, Rounak Dey, Donna K. Arnett, Lawrence F. Bielak, Joshua C. Bis, John Blangero, Eric Boerwinkle, Donald W. Bowden, Jennifer A. Brody, Brian E. Cade, Adolfo Correa, L. Adrienne Cupples, Joanne E. Curran, Paul S. de Vries, Ravindranath Duggirala, Barry I. Freedman, Harald H. H. Göring, Xiuqing Guo, Jeffrey Haessler, Rita R. Kalyani, Charles Kooperberg, Brian G. Kral, Leslie A. Lange, Ani Manichaikul, Lisa W. Martin, Stephen T. McGarvey, Braxton D. Mitchell, May E. Montasser, Alanna C. Morrison, Take Naseri, Jeffrey R. O'Connell, Nicholette D. Palmer, Patricia A. Peyser, Bruce M. Psaty, Laura M. Raffield, Susan Redline, Alexander P. Reiner, Muagututi’a Sefuiva Reupena, Kenneth M. Rice, Stephen S. Rich, Colleen M. Sitlani, Jennifer A. Smith, Kent D. Taylor, Ramachandran S. Vasan, Cristen J. Willer, James G. Wilson, Lisa R. Yanek, Wei Zhao, NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, TOPMed Lipids Working Group, Jerome I. Rotter, Pradeep Natarajan, Gina M. Peloso, Zilin Li, & Xihong Lin. (2023). Powerful, scalable and resource-efficient meta-analysis of rare variant associations in large whole genome sequencing studies. Nature Genetics, 55(1), 154-164. PMID: 36564505. PMCID: PMC10084891. DOI: 10.1038/s41588-022-01225-6.
This software is licensed under GPLv3.