Commit

Use confidence for identifying correlated genes.
orenbenkiki committed Jun 17, 2024
1 parent 7a7e82f commit 888fa4e
Showing 7 changed files with 93 additions and 54 deletions.
1 change: 1 addition & 0 deletions Project.toml
@@ -10,6 +10,7 @@ Daf = "1375bf9c-a47d-45a1-aad5-626dd8629d98"
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
+Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
Reexport = "189a3867-3050-52da-a836-e630ba90ab69"
SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
2 changes: 1 addition & 1 deletion docs/v0.1.0/.documenter-siteinfo.json
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.10.4","generation_timestamp":"2024-06-16T14:04:57","documenter_version":"1.4.1"}}
+{"documenter":{"julia_version":"1.10.4","generation_timestamp":"2024-06-17T13:39:41","documenter_version":"1.4.1"}}
12 changes: 6 additions & 6 deletions docs/v0.1.0/boxes.html
@@ -162,10 +162,10 @@ <h1 id="Boxes">
daf::DafWriter;
min_significant_gene_UMIs::Integer = 40,
gene_fraction_regularization::AbstractFloat = 1e-5,
-confidence::AbstractFloat = 0.9,
+fold_confidence::AbstractFloat = 0.9,
max_box_span::AbstractFloat = 2.0,
max_neighborhood_span::AbstractFloat = 2.0,
-min_gene_correlation::AbstractFloat = 0.5,
+correlation_confidence::AbstractFloat = 0.99,
max_deviant_genes_fraction::AbstractFloat = 0.01,
overwrite::Bool = false,
)::Nothing
@@ -181,7 +181,7 @@ <h1 id="Boxes">
</code> (by default,
<code>1e-5
</code>). Since the fraction of the gene is a random variable, we decrease the high fraction and increase the low fraction by a factor based on the
-<code>confidence
+<code>fold_confidence
</code> of the test (by default, 0.9), assuming a multinomial distribution. In addition, if the sum of the total UMIs of the gene in both metacells is less than
<code>min_significant_gene_UMIs
</code> (by default,
@@ -206,9 +206,9 @@ <h1 id="Boxes">
<code>max_neighborhood_span
</code>. These neighborhoods may overlap. The main neighborhoods of different boxes may even be identical.
</li>
-<li>For each box, we compute the set of genes which have at least the
-<code>min_gene_correlation
-</code> with some other gene(s) in its main neighborhood. We restrict the correlated set of genes of each metacell to be the intersection of this set with the set from its box in the previous round.
+<li>For each box, we compute the set of genes which are correlated (using the
+<code>correlation_confidence
+</code>) with some other gene(s) in its main neighborhood.
</li>
<li>If the new sets of correlated genes only differ up to
<code>max_convergence_fraction
14 changes: 9 additions & 5 deletions docs/v0.1.0/identify_genes.html
@@ -266,7 +266,7 @@ <h1 id="Identify-Genes">
<code class="language-julia hljs">function identify_correlated_genes!(
daf::DafWriter;
gene_fraction_regularization::AbstractFloat = 1e-5,
-min_gene_correlation::AbstractFloat = 0.5,
+correlation_confidence::AbstractFloat = 0.9,
overwrite::Bool = false,
)::Nothing
</code>
@@ -284,11 +284,15 @@ <h1 id="Identify-Genes">
</li>
<li>Correlate this between all the pairs of genes.
</li>
-<li>Find the maximal absolute correlation for each gene (that is, strong anti-correlation also counts).
+<li>For each gene, shuffle its values along all metacells, and again correlate this between all the pairs of genes.
</li>
-<li>Identify the genes which have at least one gene with a correlation of at least
-<code>min_gene_correlation
-</code>.
+<li>Find the maximal absolute correlation for each gene in both cases (that is, strong anti-correlation also counts).
+</li>
+<li>Find the
+<code>correlation_confidence
+</code> quantile correlation of the shuffled data.
+</li>
+<li>Identify the genes that have at least that level of correlations in the unshuffled data.
</li>
</ol>
<p>CONTRACT
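
The shuffle-based procedure described in the updated `identify_correlated_genes!` documentation can be sketched roughly as follows. This is only an illustration of the listed steps, not the package's implementation: the function names `sketch_correlated_genes` and `max_abs_correlation`, the `rng` keyword, and the plain metacells-by-genes matrix input are all assumptions made for the example. It also assumes the values were already transformed as in the earlier (elided) steps, and that no gene is constant across metacells (which would yield `NaN` correlations).

```julia
using Random
using Statistics

# Hypothetical helper: maximal absolute correlation of each gene (column)
# with any *other* gene, so that strong anti-correlation also counts.
function max_abs_correlation(fractions::AbstractMatrix{<:AbstractFloat})::Vector{Float64}
    correlations = abs.(cor(fractions))  # genes × genes, columns are genes
    for gene in 1:size(correlations, 1)
        correlations[gene, gene] = 0.0   # ignore each gene's self-correlation
    end
    return vec(maximum(correlations; dims = 1))
end

# Hypothetical sketch of the documented steps; `fractions` is a
# metacells × genes matrix (an assumption about the data layout).
function sketch_correlated_genes(
    fractions::AbstractMatrix{<:AbstractFloat};
    correlation_confidence::AbstractFloat = 0.9,
    rng::AbstractRNG = Random.default_rng(),
)::BitVector
    # Shuffle each gene's values independently across the metacells.
    shuffled = similar(fractions)
    for gene in 1:size(fractions, 2)
        shuffled[:, gene] = shuffle(rng, fractions[:, gene])
    end

    # The threshold is the requested quantile of the shuffled maximal correlations.
    threshold = quantile(max_abs_correlation(shuffled), correlation_confidence)

    # Keep the genes whose real maximal correlation reaches that threshold.
    return max_abs_correlation(fractions) .>= threshold
end
```

Compared with the fixed `min_gene_correlation` cutoff that this commit removes, a threshold taken from shuffled data adapts to the number of metacells and genes: with few metacells, chance correlations are larger, so the bar rises accordingly.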