Add compute_boxes_data.
orenbenkiki committed Jun 20, 2024
1 parent 7ab237a commit 24b27e5
Showing 9 changed files with 153 additions and 31 deletions.
2 changes: 1 addition & 1 deletion docs/v0.1.0/.documenter-siteinfo.json
@@ -1 +1 @@
{"documenter":{"julia_version":"1.10.4","generation_timestamp":"2024-06-20T17:56:23","documenter_version":"1.4.1"}}
{"documenter":{"julia_version":"1.10.4","generation_timestamp":"2024-06-20T21:08:18","documenter_version":"1.4.1"}}
96 changes: 90 additions & 6 deletions docs/v0.1.0/boxes.html
@@ -245,23 +245,23 @@ <h1 id="Boxes">
</p>
<p>
<strong>gene @ divergence
</strong>::AbstractFloat (required): Scale fold factors of each gene by multiplying with (1 - divergence) of the gene.
</strong>::Union{Float32, Float64} (required): Scale fold factors of each gene by multiplying with (1 - divergence) of the gene.
</p>
<p>
<strong>metacell @ total_UMIs
</strong>::Unsigned (required): The total number of UMIs used to estimate the fraction of all the genes in each metacell.
</strong>::Union{UInt16, UInt32, UInt64, UInt8} (required): The total number of UMIs used to estimate the fraction of all the genes in each metacell.
</p>
<p>
<strong>Matrices
</strong>
</p>
<p>
<strong>gene, metacell @ fraction
</strong>::AbstractFloat (required): The estimated fraction of the UMIs of each gene in each metacell.
</strong>::Union{Float32, Float64} (required): The estimated fraction of the UMIs of each gene in each metacell.
</p>
<p>
<strong>gene, metacell @ total_UMIs
</strong>::Unsigned (required): The total number of UMIs used to estimate the fraction of each gene in each metacell.
</strong>::Union{UInt16, UInt32, UInt64, UInt8} (required): The total number of UMIs used to estimate the fraction of each gene in each metacell.
</p>
<p>
<strong>Outputs
@@ -293,7 +293,7 @@ <h1 id="Boxes">
</p>
<p>
<strong>neighborhood @ span
</strong>::AbstractFloat (guaranteed): The span (fold factor) used to compute the neighborhood.
</strong>::Union{Float32, Float64} (guaranteed): The span (fold factor) used to compute the neighborhood.
</p>
<p>
<strong>Matrices
@@ -305,7 +305,7 @@ <h1 id="Boxes">
</p>
<p>
<strong>box, box @ distance
</strong>::AbstractFloat (guaranteed): The distance (fold factor) between the most different metacell genes between the boxes.
</strong>::Union{Float32, Float64} (guaranteed): The distance (fold factor) between the most different metacell genes between the boxes.
</p>
<p>
<strong>box, neighborhood @ is_member
@@ -314,6 +314,84 @@ <h1 id="Boxes">
</div>
</section>
</article>
<article class="docstring">
<header>
<a class="docstring-article-toggle-button fa-solid fa-chevron-down" href="javascript:;" title="Collapse docstring">
</a>
<a class="docstring-binding" id="Metacells.Boxes.compute_boxes_data!" href="#Metacells.Boxes.compute_boxes_data!">
<code>Metacells.Boxes.compute_boxes_data!
</code>
</a>
<span class="docstring-category">Function
</span>
</header>
<section>
<div>
<pre>
<code class="language-julia hljs">function compute_boxes_data!(daf::DafWriter)::Nothing
</code>
</pre>
<p>Compute aggregated per-box data based on per-metacell data. This is separated from
<a href="boxes.html#Metacells.Boxes.compute_boxes!">
<code>compute_boxes!
</code>
</a> to allow computing this data for all genes (instead of just for the subset of genes fed to
<a href="boxes.html#Metacells.Boxes.compute_boxes!">
<code>compute_boxes!
</code>
</a>).
</p>
<p>
<strong>Inputs
</strong>
</p>
<p>
<strong>Axes
</strong>
</p>
<p>
<strong>gene
</strong> (required): Sequenced genes.
</p>
<p>
<strong>metacell
</strong> (required): Minimal-sized groups of cells for robust point estimates.
</p>
<p>
<strong>box
</strong> (required): Distinct groups of metacells with &quot;very close&quot; estimated cell state.
</p>
<p>
<strong>Matrices
</strong>
</p>
<p>
<strong>gene, metacell @ fraction
</strong>::Union{Float32, Float64} (required): The estimated fraction of the UMIs of each gene in each metacell.
</p>
<p>
<strong>gene, metacell @ total_UMIs
</strong>::Union{UInt16, UInt32, UInt64, UInt8} (required): The total number of UMIs used to estimate the fraction of each gene in each metacell.
</p>
<p>
<strong>Outputs
</strong>
</p>
<p>
<strong>Matrices
</strong>
</p>
<p>
<strong>gene, box @ fraction
</strong>::Union{Float32, Float64} (guaranteed): The estimated fraction of the UMIs of each gene in each box.
</p>
<p>
<strong>gene, box @ total_UMIs
</strong>::Union{UInt16, UInt32, UInt64, UInt8} (guaranteed): The total number of UMIs used to estimate the fraction of each gene in each box.
</p>
</div>
</section>
</article>
<h2 id="Index">
<a class="docs-heading-anchor" href="#Index">Index
</a>
@@ -335,6 +413,12 @@ <h2 id="Index">
</code>
</a>
</li>
<li>
<a href="boxes.html#Metacells.Boxes.compute_boxes_data!">
<code>Metacells.Boxes.compute_boxes_data!
</code>
</a>
</li>
</ul>
</article>
<nav class="docs-footer">
10 changes: 5 additions & 5 deletions docs/v0.1.0/identify_genes.html
@@ -218,7 +218,7 @@ <h1 id="Identify-Genes">
</p>
<p>
<strong>gene, metacell @ fraction
</strong>::AbstractFloat (required): The estimated fraction of the UMIs of each gene in each metacell.
</strong>::Union{Float32, Float64} (required): The estimated fraction of the UMIs of each gene in each metacell.
</p>
<p>
<strong>Outputs
@@ -230,7 +230,7 @@ <h1 id="Identify-Genes">
</p>
<p>
<strong>gene @ divergence
</strong>::AbstractFloat (guaranteed): Scale fold factors of each gene by multiplying with (1 - divergence) of the gene.
</strong>::Union{Float32, Float64} (guaranteed): Scale fold factors of each gene by multiplying with (1 - divergence) of the gene.
</p>
</div>
</section>
@@ -304,15 +304,15 @@ <h1 id="Identify-Genes">
</p>
<p>
<strong>gene @ divergence
</strong>::AbstractFloat (required): Scale fold factors of each gene by multiplying with (1 - divergence) of the gene.
</strong>::Union{Float32, Float64} (required): Scale fold factors of each gene by multiplying with (1 - divergence) of the gene.
</p>
<p>
<strong>Matrices
</strong>
</p>
<p>
<strong>gene, metacell @ fraction
</strong>::AbstractFloat (required): The estimated fraction of the UMIs of each gene in each metacell.
</strong>::Union{Float32, Float64} (required): The estimated fraction of the UMIs of each gene in each metacell.
</p>
<p>
<strong>Outputs
@@ -397,7 +397,7 @@ <h1 id="Identify-Genes">
</p>
<p>
<strong>gene, metacell @ fraction
</strong>::AbstractFloat (required): The estimated fraction of the UMIs of each gene in each metacell.
</strong>::Union{Float32, Float64} (required): The estimated fraction of the UMIs of each gene in each metacell.
</p>
<p>
<strong>Outputs
6 changes: 6 additions & 0 deletions docs/v0.1.0/index.html
@@ -222,6 +222,12 @@ <h1 id="Index">
</a>
</li>
<li>
<a href="boxes.html#Metacells.Boxes.compute_boxes_data!">
<code>Metacells.Boxes.compute_boxes_data!
</code>
</a>
</li>
<li>
<a href="contracts.html#Metacells.Contracts.box_axis">
<code>Metacells.Contracts.box_axis
</code>
Binary file modified docs/v0.1.0/objects.inv
Binary file not shown.
2 changes: 1 addition & 1 deletion docs/v0.1.0/search_index.js

Large diffs are not rendered by default.

25 changes: 25 additions & 0 deletions src/boxes.jl
@@ -7,6 +7,7 @@ of "similar" boxes for further analysis.
module Boxes

export compute_boxes!
export compute_boxes_data!

using ..Contracts
using ..IdentifyGenes
@@ -725,4 +726,28 @@ end
return is_member_of_boxes_in_neighborhoods
end

"""
    function compute_boxes_data!(daf::DafWriter)::Nothing

Compute aggregated per-box data based on per-metacell data. This is separated from [`compute_boxes!`](@ref) to allow
computing this data for all genes (instead of just for the subset of genes fed to [`compute_boxes!`](@ref)).

$(CONTRACT)
"""
@computation Contract(
    axes = [gene_axis(RequiredInput), metacell_axis(RequiredInput), box_axis(RequiredInput)],
    data = [
        gene_metacell_fraction_matrix(RequiredInput),
        gene_metacell_total_UMIs_matrix(RequiredInput),
        gene_box_fraction_matrix(GuaranteedOutput),
        gene_box_total_UMIs_matrix(GuaranteedOutput),
    ],
) function compute_boxes_data!(daf::DafWriter)::Nothing  # untested
    fraction_of_genes_in_boxes = daf["/ metacell / gene : fraction @ box ! %> GeoMean eps 1e-5"]
    set_matrix!(daf, "box", "gene", "fraction", fraction_of_genes_in_boxes; overwrite = true)

    total_UMIs_of_genes_in_boxes = daf["/ metacell / gene : total_UMIs @ box ! %> Sum"]
    set_matrix!(daf, "box", "gene", "total_UMIs", total_UMIs_of_genes_in_boxes; overwrite = true)

    return nothing
end

end # module
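
As a rough illustration of what the two queries in `compute_boxes_data!` appear to compute, here is a minimal sketch using plain Julia arrays instead of a `Daf` data set. It assumes that `GeoMean eps 1e-5` is a geometric mean regularized by adding `1e-5` before taking logs and subtracting it afterwards, and that `@ box` groups metacells by the box each belongs to. The toy data and the helper names `geomean_with_eps` and `aggregate_per_box` are hypothetical and are not part of this commit.

```julia
# Hypothetical toy data: 2 genes (rows) x 4 metacells (columns).
fraction_of_genes_in_metacells = [
    0.10 0.20 0.40 0.40
    0.01 0.01 0.02 0.08
]
total_UMIs_of_genes_in_metacells = UInt32[
    10 20 40 40
    1 1 2 8
]

# Assumed grouping: the box index of each metacell.
box_of_metacells = [1, 1, 2, 2]
n_boxes = 2

# Assumed meaning of "GeoMean eps": a geometric mean regularized by a small constant.
function geomean_with_eps(values; regularization)
    mean_log = sum(log.(values .+ regularization)) / length(values)
    return exp(mean_log) - regularization
end

# Reduce the metacell columns of each box, one gene at a time, into a genes x boxes matrix.
function aggregate_per_box(reduction, matrix, box_of_metacells, n_boxes)
    columns = [
        [reduction(matrix[gene, box_of_metacells .== box]) for gene in axes(matrix, 1)]
        for box in 1:n_boxes
    ]
    return hcat(columns...)
end

fraction_of_genes_in_boxes = aggregate_per_box(
    values -> geomean_with_eps(values; regularization = 1e-5),
    fraction_of_genes_in_metacells,
    box_of_metacells,
    n_boxes,
)
total_UMIs_of_genes_in_boxes =
    aggregate_per_box(sum, total_UMIs_of_genes_in_metacells, box_of_metacells, n_boxes)
```

The sketch only mirrors the shape of the computation (a per-box geometric mean for fractions and a per-box sum for UMI counts); the committed function delegates both steps to `Daf` queries and stores the results with `set_matrix!`.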
1 change: 1 addition & 0 deletions src/boxes.md
@@ -3,6 +3,7 @@
```@docs
Metacells.Boxes
Metacells.Boxes.compute_boxes!
Metacells.Boxes.compute_boxes_data!
```

## Index
42 changes: 24 additions & 18 deletions src/contracts.jl
@@ -115,11 +115,8 @@ is used when considering the distance between gene expressions. Therefore genes
a higher fold factor
"""
function gene_divergence_vector(expectation::ContractExpectation)::Pair{VectorKey, DataSpecification}
return ("gene", "divergence") => (
expectation,
AbstractFloat,
"Scale fold factors of each gene by multiplying with (1 - divergence) of the gene.",
)
return ("gene", "divergence") =>
(expectation, StorageFloat, "Scale fold factors of each gene by multiplying with (1 - divergence) of the gene.")
end

"""
@@ -190,7 +187,7 @@ the robustness of the estimates.
function metacell_total_UMIs_vector(expectation::ContractExpectation)::Pair{VectorKey, DataSpecification}
return ("metacell", "total_UMIs") => (
expectation,
Unsigned,
StorageUnsigned,
"The total number of UMIs used to estimate the fraction of all the genes in each metacell.",
)
end
@@ -234,8 +231,11 @@ boxes may share the same main neighborhood. If the samples are sufficiently spar
only just the single box (which itself may include just a single metacell).
"""
function box_total_UMIs_vector(expectation::ContractExpectation)::Pair{VectorKey, DataSpecification} # untested
return ("box", "neighborhood.main") =>
(expectation, Unsigned, "The total number of UMIs used to estimate the fraction of all the genes in each box.")
return ("box", "neighborhood.main") => (
expectation,
StorageUnsigned,
"The total number of UMIs used to estimate the fraction of all the genes in each box.",
)
end

"""
@@ -265,7 +265,7 @@ simple fixed span will not do, due to the curse of multi-dimensionality).
"""
function neighborhood_span_vector(expectation::ContractExpectation)::Pair{VectorKey, DataSpecification}
return ("neighborhood", "span") =>
(expectation, AbstractFloat, "The span (fold factor) used to compute the neighborhood.")
(expectation, StorageFloat, "The span (fold factor) used to compute the neighborhood.")
end

"""
@@ -277,7 +277,7 @@ batch effects, purging doublets, and compensating for any other technical artifa
"""
function gene_metacell_fraction_matrix(expectation::ContractExpectation)::Pair{MatrixKey, DataSpecification}
return ("gene", "metacell", "fraction") =>
(expectation, AbstractFloat, "The estimated fraction of the UMIs of each gene in each metacell.")
(expectation, StorageFloat, "The estimated fraction of the UMIs of each gene in each metacell.")
end

"""
@@ -289,8 +289,11 @@ estimates) to be some minimum, and possibly adjust the fold factor according to
multinomial sampling distribution).
"""
function gene_metacell_total_UMIs_matrix(expectation::ContractExpectation)::Pair{MatrixKey, DataSpecification}
return ("gene", "metacell", "total_UMIs") =>
(expectation, Unsigned, "The total number of UMIs used to estimate the fraction of each gene in each metacell.")
return ("gene", "metacell", "total_UMIs") => (
expectation,
StorageUnsigned,
"The total number of UMIs used to estimate the fraction of each gene in each metacell.",
)
end

"""
@@ -299,9 +302,9 @@ end
The estimated fraction of the UMIs of each gene in each box. Each box is a sample of the manifold, representing a real
biological state, which is different from the state of any other box.
"""
function gene_box_fraction_matrix(expectation::ContractExpectation)::Pair{MatrixKey, DataSpecification} # untested
function gene_box_fraction_matrix(expectation::ContractExpectation)::Pair{MatrixKey, DataSpecification}
return ("gene", "box", "fraction") =>
(expectation, AbstractFloat, "The estimated fraction of the UMIs of each gene in each box.")
(expectation, StorageFloat, "The estimated fraction of the UMIs of each gene in each box.")
end

"""
@@ -312,9 +315,12 @@ robustness of the estimate. When computing fold factors, we require the total nu
estimates) to be some minimum, and possibly adjust the fold factor according to some confidence level (assuming a
multinomial sampling distribution).
"""
function gene_box_total_UMIs_matrix(expectation::ContractExpectation)::Pair{MatrixKey, DataSpecification} # untested
return ("gene", "box", "total_UMIs") =>
(expectation, Unsigned, "The total number of UMIs used to estimate the fraction of each gene in each box.")
function gene_box_total_UMIs_matrix(expectation::ContractExpectation)::Pair{MatrixKey, DataSpecification}
return ("gene", "box", "total_UMIs") => (
expectation,
StorageUnsigned,
"The total number of UMIs used to estimate the fraction of each gene in each box.",
)
end

"""
@@ -342,7 +348,7 @@ genes that are also correlated in the main neighborhood of either of the boxes).
function box_box_distance(expectation::ContractExpectation)::Pair{MatrixKey, DataSpecification}
return ("box", "box", "distance") => (
expectation,
AbstractFloat,
StorageFloat,
"The distance (fold factor) between the most different metacell genes between the boxes.",
)
end
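
The contract changes above replace the abstract `AbstractFloat` and `Unsigned` with `StorageFloat` and `StorageUnsigned`, which the regenerated docs render as `Union{Float32, Float64}` and `Union{UInt16, UInt32, UInt64, UInt8}`. The sketch below shows what that narrowing would mean in plain Julia. The aliases `ExampleStorageFloat` and `ExampleStorageUnsigned` are hypothetical stand-ins that copy the rendered unions; they are not the definitions used by the package.

```julia
# Hypothetical stand-ins mirroring the unions shown in the regenerated docs.
const ExampleStorageFloat = Union{Float32, Float64}
const ExampleStorageUnsigned = Union{UInt16, UInt32, UInt64, UInt8}

# Types accepted by both the old abstract types and the narrower unions.
accepted_by_both = (Float32 <: AbstractFloat) && (Float32 <: ExampleStorageFloat)

# Types accepted by the old abstract types but not by the narrower unions.
float16_old = Float16 <: AbstractFloat
float16_new = Float16 <: ExampleStorageFloat
uint128_old = UInt128 <: Unsigned
uint128_new = UInt128 <: ExampleStorageUnsigned

println("Float32 accepted by both: ", accepted_by_both)
println("Float16: old = ", float16_old, ", new = ", float16_new)
println("UInt128: old = ", uint128_old, ", new = ", uint128_new)
```

Under this reading, the new contracts would still accept the common storage element types while rejecting less usual ones such as `Float16`, `BigFloat`, or `UInt128`.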
