Skip to content

Commit

Permalink
Update
Browse files Browse the repository at this point in the history
  • Loading branch information
xin-huang committed May 5, 2024
1 parent 798cb14 commit c0e5b26
Show file tree
Hide file tree
Showing 3 changed files with 34 additions and 5 deletions.
14 changes: 9 additions & 5 deletions docs/userguide/dfe.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,12 +4,16 @@

After obtaining the cache file [1KG.YRI.CEU.20.split_mig.sel.single.gamma.spectra.bpkl](https://github.com/xin-huang/dadi-cli/blob/revision/examples/results/caches/1KG.YRI.CEU.20.split_mig.sel.single.gamma.spectra.bpkl), we can fit the spectrum from nonsynonymous SNPs for DFE inference. Although our data come from two populations, we will first infer a one-dimensional DFE, which assumes that selection coefficients are equal in the two populations. An example command is below:

```
dadi-cli InferDFE --fs examples/results/fs/1KG.YRI.CEU.20.nonsynonymous.snps.unfold.fs --cache1d examples/results/caches/1KG.YRI.CEU.20.split_mig.sel.single.gamma.spectra.bpkl --pdf1d lognormal --p0 1 1 .5 --lbounds -10 0.01 0 --ubounds 10 10 0.5 --demo-popt examples/results/demog/1KG.YRI.CEU.20.split_mig.demog.params.InferDM.bestfits --ratio 2.31 --output-prefix examples/results/dfe/1KG.YRI.CEU.20.split_mig.dfe.1D_lognormal.params --optimizations 10 --maxeval 400 --check-convergence 5 --cpus 2
```
dadi-cli InferDFE --fs examples/results/fs/1KG.YRI.CEU.20.non.unfolded.fs --cache1d examples/results/caches/1KG.YRI.CEU.20.split_mig.sel.single.gamma.spectra.bpkl --pdf1d lognormal --p0 1 1 .5 --lbounds -10 0.01 0 --ubounds 10 10 0.5 --demo-popt examples/results/demog/1KG.YRI.CEU.20.split_mig.demog.params.InferDM.bestfits --ratio 2.31 --output-prefix examples/results/dfe/1KG.YRI.CEU.20.split_mig.dfe.1D_lognormal --optimizations 10 --maxeval 400 --check-convergence 5 --cpus 2
```

Many arguments are the same as those in the `InferDM` command.

We define the marginal DFE as a lognormal distribution with `--pdf1d`. We use `--ratio` to specify the ratio of the nonsynonymous SNPs to the synonymous SNPs to calculate the population-scaled mutation rate of the nonsynonymous SNPs. Our parameters are `log_mu` the mean of the lognormal distribution, `log_sigma` the standard deviation of the lognormal distribution, and `misid`.

## Output

The result is

```
Expand All @@ -34,9 +38,9 @@ Similar to the best fit parameters in `./examples/results/demog/1KG.YRI.CEU.spli
|------------|-----|-------|-------------------|
| -1389 | 5.5 | 7.6 | 0.017 |

# Joint DFE inference
## Joint DFE inference

## Inferring a bivariate lognormal joint DFE
### Inferring a bivariate lognormal joint DFE

Here we will infer a joint DFE with selection potentially being different in the two populations. We define the DFE as a bivariate lognormal distribution with `--pdf2d` and pass in a cache that assumes the population-scaled selection coefficients are different in the two populations through `--cache2d`. The bivariate lognormal has an extra parameter `rho`, the correlation of the DFE between the populations. We can allow `mu_log` and `sigma_log` be different or the same in our populations. `dadi-cli` will run either the symmetric (shared `mu_log` and `sigma_log`) or asymmetric (independent `mu_log` and `sigma_log`) bivariate lognormal based on the number of parameters. For the symmetric bivariate lognormal the parameters are `log_mu`, `log_sigma`, and `rho`, the asymmetric bivariate lognormal the parameters are `log_mu1`, `log_mu2`, `log_sigma1`, `log_sigma2`, and `rho`, where 1 denotes the first population and 2 denotes the second population.

Expand All @@ -52,7 +56,7 @@ An example of running an asymmetrical bivariate lognormal is:
dadi-cli InferDFE --fs ./examples/results/fs/1KG.YRI.CEU.20.nonsynonymous.snps.unfold.fs --cache2d ./examples/results/caches/1KG.YRI.CEU.20.split_mig.sel.spectra.bpkl --pdf2d biv_lognormal --p0 1 1 1 1 .5 .5 --lbounds -10 -10 0.01 0.01 0.001 0 --ubounds 10 10 10 10 0.999 0.5 --demo-popt ./examples/results/demog/1KG.YRI.CEU.20.split_mig.demog.params.InferDM.bestfits --ratio 2.31 --output ./examples/results/dfe/1KG.YRI.CEU.20.split_mig.dfe.bivariate_asym_lognormal.params --optimizations 15 --maxeval 400 --check-convergence 10
```

## Inferring mixture model joint DFE
### Inferring mixture model joint DFE

Here, we will use a mixture of a univariate lognormal and a bivariate lognormal distribution. To make the mixture we pass in options for both 1D and 2D: `--pdf1d`, `--pdf2d`, `--cache1d`, and `--cache2d`. Because the mixture model is assuming some proportion of the DFE is lognormal and the other is bivariate, the bivariate is symmeteric. The parameters for the mixture lognormal DFE are `log_mu`, `log_sigma`, `rho`,and `w`, the proportional weight of the bivariate lognormal DFE (1-`w` would be the weight of the univariate lognormal distribution). In this example we fix `rho` of the bivariate component to 0 with the `--constants` option.

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
# /Users/user/mambaforge/envs/dadi-cli-cpu/bin/dadi-cli InferDFE --fs examples/results/fs/1KG.YRI.CEU.20.non.unfolded.fs --cache1d examples/results/caches/1KG.YRI.CEU.20.split_mig.sel.single.gamma.spectra.bpkl --pdf1d lognormal --p0 1 1 .5 --lbounds -10 0.01 0 --ubounds 10 10 0.5 --demo-popt examples/results/demog/1KG.YRI.CEU.20.split_mig.demog.params.InferDM.bestfits --ratio 2.31 --output-prefix examples/results/dfe/1KG.YRI.CEU.20.split_mig.dfe.1D_lognormal --optimizations 10 --maxeval 400 --check-convergence 5 --cpus 2
# /Users/user/mambaforge/envs/dadi-cli-cpu/bin/dadi-cli InferDFE --fs examples/results/fs/1KG.YRI.CEU.20.non.unfolded.fs --cache1d examples/results/caches/1KG.YRI.CEU.20.split_mig.sel.single.gamma.spectra.bpkl --pdf1d lognormal --p0 1 1 .5 --lbounds -10 0.01 0 --ubounds 10 10 0.5 --demo-popt examples/results/demog/1KG.YRI.CEU.20.split_mig.demog.params.InferDM.bestfits --ratio 2.31 --output-prefix examples/results/dfe/1KG.YRI.CEU.20.split_mig.dfe.1D_lognormal --optimizations 10 --maxeval 400 --check-convergence 5 --cpus 2
#
# Converged results
# Log(likelihood) log_mu log_sigma misid theta
-1390.216558502869 5.7078081159785725 7.934694578672994 0.01647346102391192 15691.446711371886
-1390.2206755646514 5.696866719099689 7.910257538672334 0.016549081772691945 15691.446711371886
-1390.2321802957717 5.6985080744503795 7.915039953288335 0.01632034569197264 15691.446711371886
#
# Top 100 results
# Log(likelihood) log_mu log_sigma misid theta
-1390.216558502869 5.7078081159785725 7.934694578672994 0.01647346102391192 15691.446711371886
-1390.2206755646514 5.696866719099689 7.910257538672334 0.016549081772691945 15691.446711371886
-1390.2321802957717 5.6985080744503795 7.915039953288335 0.01632034569197264 15691.446711371886
-1390.5785860284573 5.838670032124382 8.202117596248334 0.016267679223638656 15691.446711371886
-1391.602004080164 5.883742893394162 8.301911448575662 0.015025267493752754 15691.446711371886
-2851.693923307018 0.9456483751405087 0.2546946309528162 0.03942740949097054 15691.446711371886
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
# /Users/user/mambaforge/envs/dadi-cli-cpu/bin/dadi-cli InferDFE --fs examples/results/fs/1KG.YRI.CEU.20.non.unfolded.fs --cache1d examples/results/caches/1KG.YRI.CEU.20.split_mig.sel.single.gamma.spectra.bpkl --pdf1d lognormal --p0 1 1 .5 --lbounds -10 0.01 0 --ubounds 10 10 0.5 --demo-popt examples/results/demog/1KG.YRI.CEU.20.split_mig.demog.params.InferDM.bestfits --ratio 2.31 --output-prefix examples/results/dfe/1KG.YRI.CEU.20.split_mig.dfe.1D_lognormal --optimizations 10 --maxeval 400 --check-convergence 5 --cpus 2
# Log(likelihood) log_mu log_sigma misid theta
-1390.2206755646514 5.696866719099689 7.910257538672334 0.016549081772691945 15691.446711371886
-2851.693923307018 0.9456483751405087 0.2546946309528162 0.03942740949097054 15691.446711371886
-1390.2321802957717 5.6985080744503795 7.915039953288335 0.01632034569197264 15691.446711371886
-1390.5785860284573 5.838670032124382 8.202117596248334 0.016267679223638656 15691.446711371886
-1391.602004080164 5.883742893394162 8.301911448575662 0.015025267493752754 15691.446711371886
-1390.216558502869 5.7078081159785725 7.934694578672994 0.01647346102391192 15691.446711371886

0 comments on commit c0e5b26

Please sign in to comment.