Extreme p-values in SAIGE conditional analysis #153

Open
zrayw opened this issue Nov 19, 2024 · 0 comments

Dear Wei,

I encountered an issue with extreme p-values when running SAIGE Step 2 for conditional analysis, which appears similar to the previously mentioned issue.

To investigate further, I conducted additional checks on these extreme cases:

  1. I ran COJO conditional analysis using the unconditioned SAIGE summary statistics and the same genotype data as a reference. The results between COJO (p_COJO) and SAIGE (p.value_c) were highly consistent (Spearman’s ρ ~ 0.99), except for the extreme cases. For these, COJO estimated the conditional p-value at ~0.04, whereas SAIGE reported p-values < 1e-20.

  2. I examined the LD between the SNP I conditioned on and the extreme cases (R2 column). These SNPs are located far apart and are in complete LD.

  3. The extreme cases are common variants with allele frequency >0.01.

  4. Some of the extreme cases do have a large var_c, but not all of them (e.g., for some, var_c is ~4).
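For readers who want to repeat checks 1 and 2 on their own output, a minimal Python sketch (standard library only) could look like the following. The function names, the toy values, and the threshold of 5 orders of magnitude are illustrative assumptions; none of them come from SAIGE or COJO, and the inputs are assumed to be already-matched lists of p-values and genotype dosages.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the (tie-averaged) ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def r_squared(dosage_a, dosage_b):
    """LD r^2 between two variants from per-sample dosages (0..2)."""
    return pearson(dosage_a, dosage_b) ** 2

def flag_discordant(ids, p_saige, p_cojo, max_log10_gap=5.0):
    """List variants whose two p-values differ by more than
    max_log10_gap orders of magnitude (hypothetical threshold)."""
    floor = 1e-300  # guard against log10(0) when a p-value underflows
    flagged = []
    for vid, ps, pc in zip(ids, p_saige, p_cojo):
        gap = abs(math.log10(max(ps, floor)) - math.log10(max(pc, floor)))
        if gap > max_log10_gap:
            flagged.append((vid, gap))
    return flagged

# Toy values only, chosen to mimic the pattern described above.
ids = ["v1", "v2", "v3", "v4", "v5"]
p_saige = [0.5, 0.2, 1e-3, 1e-21, 0.03]
p_cojo = [0.45, 0.25, 2e-3, 0.04, 0.05]

print("Spearman rho:", spearman(p_saige, p_cojo))
print("Discordant:", flag_discordant(ids, p_saige, p_cojo))
```

Comparing on the log10 scale keeps the flagging rule symmetric: a variant is listed whether SAIGE or COJO reports the smaller p-value.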

[Figure: plot1]

Could you please provide insights or suggestions for addressing this issue? Your guidance would be greatly appreciated.

Best,
Ruiwen

Below are the session info and the Step 2 command I use for SAIGE:

Loading required package: RhpcBLASctl
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
locale:
[1] C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.12.8 optparse_1.7.3 RhpcBLASctl_0.21-247.1
[4] SAIGE_1.1.6.2
loaded via a namespace (and not attached):
[1] compiler_3.6.3 Matrix_1.5-1 Rcpp_1.0.7 getopt_1.20.3
[5] grid_3.6.3 RcppParallel_5.1.5 lattice_0.20-40
step2_SPAtests.R \
  --bgenFile="$saige_pfile_dir"/chr"$chr"_"$race".bgen \
  --bgenFileIndex="$saige_pfile_dir"/chr"$chr"_"$race".bgen.bgi \
  --sampleFile="$saige_pfile_dir"/chr"$chr"_"$race".sample \
  --AlleleOrder=ref-first --chrom=$chr --minMAF=0 --minMAC=20 \
  --is_Firth_beta=TRUE --pCutoffforFirth=0.01 --is_output_moreDetails=TRUE --LOCO=TRUE \
  --GMMATmodelFile="$saige_dir"/saige_step1_"$race".rda \
  --varianceRatioFile="$saige_dir"/saige_step1_"$race".varianceRatio.txt \
  --SAIGEOutputFile="$cond_dir"/cond_chr"$chr"_"$race".txt \
  --condition="chr5:132703440:C:A;rs2285700"
