Improve LD shrinkage efficiency #35

biona001 · 2024-08-02T20:22:11Z

Motivation

When a LD matrix is too large (e.g. >5000 variables), the function find_optimal_shrinkage becomes too slow.

Proposal

Rather than computing shrinkage on the entire correlation matrix, we can do so on just the group representative variables, since these variables are primary drivers for the dependencies across groups. This often reduce the LD matrix size by a factor of 10.

Potential issue

Often there are few representatives in a block (e.g. < 10). This can lead to large shrinkage values (e.g. >0.5) even though the LD matrix matches the Z-score exactly.

this PR

Now we compute shrinkage on the original matrix if the size is manageable (<1000 variables), otherwise we change to computing shrinkage value on the representatives.

Before merging

To check this approach is reasonable, we should re-run the Alzheimer's disease study to see if the shrinkage value is comparable to previously (which I think we around 0.01 to 0.02).

codecov-commenter · 2024-08-02T20:30:55Z

Codecov Report

Attention: Patch coverage is 62.96296% with 10 lines in your changes missing coverage. Please review.

Project coverage is 64.73%. Comparing base (6d00378) to head (1390806).

Files	Patch %	Lines
src/ghostbasil_parallel.jl	61.53%	10 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #35      +/-   ##
==========================================
- Coverage   65.09%   64.73%   -0.37%     
==========================================
  Files           5        5              
  Lines         679      689      +10     
==========================================
+ Hits          442      446       +4     
- Misses        237      243       +6

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

biona001 · 2024-08-21T02:28:03Z

I tested this on the Alzheimer's disease study and the mean shrinkage value is 0.01582485377913878 after this PR (before it was 0.015295867015486209). This difference is small because only 5/1703 regions actually trigger the new code. Even so, 4/5 regions returned similar answers, so I'll go ahead and merge anyways

Before this PR:

region 693 / 1703 (f = LD_start29737971_end30798167.h5): chr 6, nz beta = 19, nsnps = 1454, shrinkage = 0.0529
region 694 / 1703 (f = LD_start30798168_end31571217.h5): chr 6, nz beta = 15, nsnps = 1991, shrinkage = 0.0345
region 695 / 1703 (f = LD_start31571218_end32682663.h5): chr 6, nz beta = 20, nsnps = 1732, shrinkage = 0.1506
region 696 / 1703 (f = LD_start32682664_end33236496.h5): chr 6, nz beta = 16, nsnps = 1028, shrinkage = 0.0294
region 1521 / 1703 (f = LD_start69387817_end72672202.h5): chr 17, nz beta = 14, nsnps = 1278, shrinkage = 0.004

After this PR:

region 693 / 1703 (f = LD_start29737971_end30798167.h5): chr 6, nz beta = 26, nsnps = 1454, shrinkage = 0.0849
region 694 / 1703 (f = LD_start30798168_end31571217.h5): chr 6, nz beta = 17, nsnps = 1991, shrinkage = 0.0927
region 695 / 1703 (f = LD_start31571218_end32682663.h5): chr 6, nz beta = 18, nsnps = 1732, shrinkage = 0.1565
region 696 / 1703 (f = LD_start32682664_end33236496.h5): chr 6, nz beta = 13, nsnps = 1028, shrinkage = 0.8383
region 1521 / 1703 (f = LD_start69387817_end72672202.h5): chr 17, nz beta = 8, nsnps = 1278, shrinkage = 0.0

biona001 added 2 commits July 30, 2024 12:33

estimate LD shrinkage on reps to improve speed

43719ab

only shrink with reps when region has >1000 snps

1390806

biona001 merged commit 634b8a0 into main Aug 21, 2024
4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve LD shrinkage efficiency #35

Improve LD shrinkage efficiency #35

biona001 commented Aug 2, 2024 •

edited

Loading

codecov-commenter commented Aug 2, 2024

biona001 commented Aug 21, 2024 •

edited

Loading

Improve LD shrinkage efficiency #35

Improve LD shrinkage efficiency #35

Conversation

biona001 commented Aug 2, 2024 • edited Loading

Motivation

Proposal

Potential issue

this PR

Before merging

codecov-commenter commented Aug 2, 2024

Codecov Report

biona001 commented Aug 21, 2024 • edited Loading

biona001 commented Aug 2, 2024 •

edited

Loading

biona001 commented Aug 21, 2024 •

edited

Loading