Merge pull request #391 from ReactiveBayes/sharp-bits

Add sharp bits section and invite users to the discussions upon error
ReactiveBayes · Dec 23, 2024 · a6b3dfa
2 parents 77c1cab + 8d82cba
commit a6b3dfa
Showing 9 changed files with 434 additions and 18 deletions.
diff --git a/docs/Project.toml b/docs/Project.toml
@@ -4,6 +4,7 @@ BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
 Cairo = "159f3aea-2a34-519c-b102-8c37f9878175"
 Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
 Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
+DocumenterMermaid = "a078cd44-4d9c-4618-b545-3ab9d77f9177"
 ExponentialFamily = "62312e5e-252a-4322-ace9-a5f4bf9b357b"
 ExponentialFamilyProjection = "17f509fa-9a96-44ba-99b2-1c5f01f0931b"
 GraphPPL = "b3f8163a-e979-4e85-b43e-1f63d8c8b42c"

diff --git a/docs/make.jl b/docs/make.jl
@@ -1,5 +1,6 @@
 using RxInfer
 using Documenter
+using DocumenterMermaid
 
 ## https://discourse.julialang.org/t/generation-of-documentation-fails-qt-qpa-xcb-could-not-connect-to-display/60988
 ## https://gr-framework.org/workstations.html#no-output
@@ -85,8 +86,10 @@ ExamplesPages = map(collect(pairs(ExamplesCategoriesPages))) do (label, info)
     return info.title => info.pages
 end
 
+draft = get(ENV, "DOCS_DRAFT", "false") == "true"
+
 makedocs(;
-    draft = false,
+    draft = draft,
     warnonly = false,
     modules = [RxInfer],
     authors = "Bagaev Dmitry <[email protected]> and contributors",
@@ -111,7 +114,13 @@ makedocs(;
             "Inference specification"   => ["Overview" => "manuals/inference/overview.md", "Static inference" => "manuals/inference/static.md", "Streamline inference" => "manuals/inference/streamlined.md", "Initialization" => "manuals/inference/initialization.md", "Auto-updates" => "manuals/inference/autoupdates.md", "Deterministic nodes" => "manuals/inference/delta-node.md", "Non-conjugate inference" => "manuals/inference/nonconjugate.md", "Undefined message update rules" => "manuals/inference/undefinedrules.md"],
             "Inference customization"   => ["Defining a custom node and rules" => "manuals/customization/custom-node.md", "Inference results postprocessing" => "manuals/customization/postprocess.md"],
             "Debugging"                 => "manuals/debugging.md",
-            "Migration from v2 to v3"   => "manuals/migration-guide-v2-v3.md"
+            "Migration from v2 to v3"   => "manuals/migration-guide-v2-v3.md",
+            "Sharp bits of RxInfer"     => [
+                "Overview" => "manuals/sharpbits/overview.md",
+                "Rule Not Found Error" => "manuals/sharpbits/rule-not-found.md",
+                "Stack Overflow in Message Computations" => "manuals/sharpbits/stack-overflow-inference.md",
+                "Using `=` instead of `:=` for deterministic nodes" => "manuals/sharpbits/usage-colon-equality.md"
+            ]
         ],
         "Library" => [
             "Model construction" => "library/model-construction.md",

diff --git a/docs/src/manuals/comparison.md b/docs/src/manuals/comparison.md
@@ -67,7 +67,12 @@ end
 end
 ```
 
-- **Expressiveness**: `RxInfer.jl` empowers users to elegantly and concisely craft models, closely mirroring probabilistic notation, thanks to Julia's macro capabilities. To illustrate this, let's consider the following model:
+- **Expressiveness**: `RxInfer.jl` empowers users to elegantly and concisely craft models, closely mirroring probabilistic notation, thanks to Julia's macro capabilities.
+
+!!! note
+    RxInfer uses `:=` for deterministic relationships (see [Using `=` instead of `:=` for deterministic nodes](@ref usage-colon-equality)) which might differ from other frameworks but enables powerful message-passing capabilities.
+
+To illustrate the expressiveness, let's consider the following model:
 
 $$\begin{aligned}
  x & \sim \mathrm{Normal}(0.0, 1.0)\\

diff --git a/docs/src/manuals/model-specification.md b/docs/src/manuals/model-specification.md
@@ -171,8 +171,9 @@ y ~ Normal(mean = x, variance = 1.0)
 ```
 
 Using `x = exp(t)` directly would be incorrect and would most likely result in a `MethodError` because `t` does not have a definite value at model creation time
-(remember that our models create a factor graph under the hood and latent states do not have a value until the inference is performed). At the model creation time, 
-`t` holds a reference to a node in the graph, instead of an actual value sample from the `Normal` distribution.
+(remember that our models create a factor graph under the hood and latent states do not have a value until the inference is performed).
+
+See [Using `=` instead of `:=` for deterministic nodes](@ref usage-colon-equality) for a detailed explanation of this design choice.
 
 ### [Control flow statements](@id user-guide-model-specification-node-creation-control-flow)
 

diff --git a/docs/src/manuals/sharpbits/overview.md b/docs/src/manuals/sharpbits/overview.md
@@ -0,0 +1,50 @@
+# Sharp bits of RxInfer
+
+This page serves as a collection of sharp bits - potential pitfalls and common issues you might encounter while using RxInfer. While RxInfer is designed to be user-friendly, there are certain scenarios where you might encounter unexpected behavior or errors. Understanding these can help you avoid common problems and debug your code more effectively.
+
+- [Rule Not Found Error](@ref rule-not-found)
+    - What causes it
+    - How to diagnose and fix it
+    - Common scenarios
+
+- [Stack Overflow during inference](@ref stack-overflow-inference)
+    - Understanding the potential cause
+    - Prevention strategies
+
+- [Using `=` instead of `:=` for deterministic nodes](@ref usage-colon-equality)
+    - Why not `=`?
+
+!!! note
+    This is a community document that will be updated as we identify more common issues and their solutions. If you encounter a problem that isn't covered here, please consider opening an [issue/discussion](https://github.com/rxinfer/rxinfer/discussions) or contributing to this guide.
+
+## How to contribute
+
+If you have a sharp bit to share, please consider opening an [issue/discussion](https://github.com/rxinfer/rxinfer/discussions) or contributing to this guide.
+To write a new section, create a new file in the `docs/src/manuals/sharpbits` directory. Use `@id` to specify the ID of the section and `@ref` to reference it later.
+
+```md
+# [New section](@id new-section)
+
+This is a new section.
+```
+
+Then add a new entry to the `pages` array in the `docs/make.jl` file.
+
+```julia
+"Sharp bits of RxInfer" => [
+    "Overview" => "manuals/sharpbits/overview.md",
+    "Rule Not Found Error" => "manuals/sharpbits/rule-not-found.md",
+    "Stack Overflow in Message Computations" => "manuals/sharpbits/stack-overflow-inference.md",
+    "Using `=` instead of `:=` for deterministic nodes" => "manuals/sharpbits/usage-colon-equality.md",
+    # ...
+    "New section" => "manuals/sharpbits/new-section.md",
+]
+```
+
+In the `overview.md` file, add a new list entry with the title and the ID of the section, using an `@ref` link to reference the ID.
+
+```md
+- [New section](@ref new-section)
+    - What this section is about
+    - ...
+```
diff --git a/docs/src/manuals/sharpbits/rule-not-found.md b/docs/src/manuals/sharpbits/rule-not-found.md
@@ -0,0 +1,143 @@
+# [Rule Not Found Error](@id rule-not-found)
+
+When using RxInfer, you might encounter a `RuleNotFoundError`. This error occurs during message-passing inference when the system cannot find appropriate update rules for computing messages between nodes in your factor graph. Let's understand why this happens and how to resolve it.
+
+## Why does this happen?
+
+Message-passing inference works by exchanging messages between nodes in a factor graph. Each message represents a probability distribution, and the rules for computing these messages depend on:
+
+1. The type of the factor node (e.g., `Normal`, `Gamma`, etc.)
+2. The types of incoming messages (e.g., `Normal`, `PointMass`, etc.) 
+3. The interface through which the message is being computed
+4. The inference method being used (Belief Propagation or Variational Message Passing)
+
+The last point is particularly important: some message update rules may exist for Variational Message Passing (VMP) but not for Belief Propagation (BP), or vice versa. This is because BP aims to compute exact posterior distributions through message passing (when possible), while VMP approximates the posterior under additional factorization constraints on the variational distribution. For a detailed mathematical treatment of these differences, see our [Bethe Free Energy implementation](@ref lib-bethe-free-energy) guide.
+
+For example, consider this simple model:
+
+```julia
+@model function problematic_model()
+    μ ~ Normal(mean = 0.0, variance = 1.0)
+    τ ~ Gamma(shape = 1.0, rate = 1.0)
+    y ~ Normal(mean = μ, precision = τ)
+end
+```
+
+This model will fail with a `RuleNotFoundError` because no belief propagation update rules are available for this combination of distributions; only variational message passing rules exist. Even though the model looks simple, the messages needed for exact inference do not have a closed form.
+
+## Common scenarios
+
+You're likely to encounter this error when:
+
+1. Using non-conjugate pairs of distributions (e.g., a `Beta` prior on the precision of a `Normal` likelihood)
+2. Working with custom distributions or factor nodes without defining all necessary update rules
+3. Using complex transformations between variables that don't have defined message computations
+4. Mixing different types of distributions in ways that don't have analytical solutions
+
+## Design Philosophy
+
+RxInfer prioritizes performance over generality in its message-passing implementation. By default, it only uses analytically derived message update rules, even in cases where numerical approximations might be possible. This design choice:
+
+- Ensures fast and reliable inference when rules exist
+- Avoids potential numerical instabilities from approximations
+- Throws an error when analytical solutions don't exist
+
+This means you may encounter `RuleNotFoundError` even in cases where approximate solutions could theoretically work. This is intentional - RxInfer will tell you explicitly when you need to consider alternative approaches rather than silently falling back to potentially slower or less reliable approximations. See the [Solutions](@ref rule-not-found-solutions) section below for more details.
+
+## Visualizing the message passing graph
+
+To better understand where message passing rules are needed, let's look at a simple factor graph visualization:
+
+```mermaid
+graph LR
+    %% Other parts of the graph
+    g1[g] -.-> x
+    h1[h] -.-> z
+    y -.-> g2[p]
+    
+    %% Main focus area
+    x((x)) -.- m1[["μ<sub>x→f</sub>"]] --> f[f]
+    f --> m2[["μ<sub>f→y</sub>"]] -.- y((y))
+    z((z)) -.- m3[["μ<sub>z→f</sub>"]] --> f
+
+    %% Styling
+    classDef variable fill:#b3e0ff,stroke:#333,stroke-width:2px;
+    classDef factor fill:#ff9999,stroke:#333,stroke-width:2px,shape:square;
+    classDef otherFactor fill:#ff9999,stroke:#333,stroke-width:2px,opacity:0.3;
+    classDef message fill:none,stroke:none;
+    class x,y,z variable;
+    class f factor;
+    class g1,g2,h1 otherFactor;
+    class m1,m2,m3 message;
+```
+
+In this example:
+- Variables (`x`, `y`, `z`) are represented as circles
+- The factor node (`f`) is represented as a square
+- Messages (μ) flow along the edges between variables and factors, with subscripts indicating direction (e.g., x→f flows from x to f)
+- Faded nodes (`g`, `h`, `p`) represent other parts of the factor graph that aren't relevant for this local message computation
+
+To compute the outgoing message `f→y`, RxInfer needs:
+1. Rules for how to process incoming messages `x→f` and `z→f`
+2. Rules for combining these messages based on the factor `f`'s type
+3. Rules for producing the outgoing message type that `y` expects
+
+A `RuleNotFoundError` occurs when any of these rules is missing. For example, if `x` sends a `Normal` message but `f` doesn't know how to process `Normal` inputs, or if `f` can't produce the type of message that `y` expects.
+
+## [Solutions](@id rule-not-found-solutions)
+
+### 1. Convert to conjugate pairs
+
+First, try to reformulate your model using conjugate prior-likelihood pairs. Conjugate pairs have analytical solutions for message passing and are well-supported in RxInfer. For example, instead of using a `Normal` likelihood with `Beta` prior on its precision, use a `Normal-Gamma` conjugate pair. See [Conjugate prior - Wikipedia](https://en.wikipedia.org/wiki/Conjugate_prior#Table_of_conjugate_distributions) for a comprehensive list of conjugate distributions.
+
+### 2. Check available rules
+
+If conjugate pairs aren't suitable, verify if your combination of distributions and message types is supported. RxInfer provides many predefined rules, but not all combinations are possible. A good starting point is to check the [List of available nodes](https://reactivebayes.github.io/ReactiveMP.jl/stable/lib/nodes/#lib-predefined-nodes) section in the documentation of ReactiveMP.jl.
+
+### 3. Create custom update rules
+
+If you need specific message computations, you can define your own update rules. See [Creating your own custom nodes](@ref create-node) for a detailed guide on implementing custom nodes and their update rules.
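+
+As a minimal sketch of what such rules look like, the snippet below registers a node and two rules for its `out` interface, one taking an incoming message (`m_` prefix) and one taking a marginal (`q_` prefix). `HypotheticalNode` and the returned distributions are assumptions made up for illustration; they are not part of RxInfer and carry no statistical meaning.
+
+```julia
+using RxInfer
+
+# Hypothetical node used only for illustration; it is not part of RxInfer.
+struct HypotheticalNode end
+
+# Register the node with two interfaces: `out` and `x`.
+@node HypotheticalNode Stochastic [out, x]
+
+# Rule that consumes an incoming *message* on `x` (`m_` prefix).
+@rule HypotheticalNode(:out, Marginalisation) (m_x::PointMass,) = NormalMeanVariance(mean(m_x), 1.0)
+
+# Rule that consumes a *marginal* on `x` (`q_` prefix), as used in variational updates.
+@rule HypotheticalNode(:out, Marginalisation) (q_x::NormalMeanVariance,) = NormalMeanVariance(mean(q_x), 1.0)
+```
+
+A rule is selected by the node type, the target interface, and the types of its inputs, so with only these two rules defined, an input of any other type on `x` would still be expected to raise a `RuleNotFoundError`.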
+
+### 4. Use approximations
+
+When exact message updates aren't available, consider:
+
+- Using simpler distribution pairs that have defined rules
+- Employing approximation techniques like moment matching or the methods described in [Meta Specification](@ref user-guide-meta-specification) and [Deterministic nodes](@ref delta-node-manual)
+
+### 5. Use variational inference
+
+Sometimes, adding appropriate factorization constraints can help avoid problematic message computations:
+
+```julia
+constraints = @constraints begin
+    q(μ, τ) = q(μ)q(τ)  # Mean-field assumption
+end
+
+result = infer(
+    model = problematic_model(),
+    constraints = constraints,
+)
+```
+
+!!! note
+    When using variational constraints, you will likely need to initialize certain messages or marginals to handle loops in the factor graph. See [Initialization](@ref initialization) for details on how to properly initialize your model.
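+
+As a hedged sketch of how constraints and initialization fit together, the example below uses a variant of the model in which `y` is observed. The model name `iid_normal`, the synthetic data, the initial marginals, and the number of iterations are assumptions chosen for illustration only.
+
+```julia
+using RxInfer
+
+# Variant of the model above with `y` passed in as data (hypothetical name).
+@model function iid_normal(y)
+    μ ~ Normal(mean = 0.0, variance = 1.0)
+    τ ~ Gamma(shape = 1.0, rate = 1.0)
+    for i in eachindex(y)
+        y[i] ~ Normal(mean = μ, precision = τ)
+    end
+end
+
+# Mean-field factorization between μ and τ.
+constraints = @constraints begin
+    q(μ, τ) = q(μ)q(τ)
+end
+
+# Initial marginals used to start the iterative updates (illustrative values).
+init = @initialization begin
+    q(μ) = NormalMeanVariance(0.0, 1.0)
+    q(τ) = GammaShapeRate(1.0, 1.0)
+end
+
+result = infer(
+    model = iid_normal(),
+    data = (y = randn(100),),  # synthetic data, for illustration only
+    constraints = constraints,
+    initialization = init,
+    iterations = 10,
+)
+```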
+
+For more details on constraints and variational inference, see:
+
+- [Constraints Specification](@ref user-guide-constraints-specification) for a complete guide on using constraints
+- [Bethe Free Energy](@ref lib-bethe-free-energy) for the mathematical background on variational inference and message passing
+
+## Implementation details
+
+When RxInfer reports a missing rule, one of the following is undefined for the node and input types involved:
+
+1. A `@rule` definition for the specific message direction and types
+2. A `@marginalrule` for computing joint marginals
+3. An `@average_energy` implementation for free energy computation
+
+You can add these using the methods described in [Creating your own custom nodes](@ref create-node).
+
+!!! note
+    Not all message-passing rules have analytical solutions. In such cases, you might need to use numerical approximations or choose different model structures.
+
diff --git a/docs/src/manuals/sharpbits/stack-overflow-inference.md b/docs/src/manuals/sharpbits/stack-overflow-inference.md
@@ -0,0 +1,95 @@
+# [Stack Overflow during inference](@id stack-overflow-inference)
+
+When working with large probabilistic models in RxInfer, you might encounter a `StackOverflowError`. This section explains why this happens and how to prevent it.
+
+## The Problem
+
+RxInfer uses reactive streams to compute messages between nodes in the factor graph. The subscription to these streams happens recursively, which means:
+
+1. Each node subscribes to its input messages or posteriors
+2. Those input messages may need to subscribe to their own inputs
+3. This continues until all dependencies are resolved
+
+For large models, this recursive subscription process can consume the entire stack space, resulting in a `StackOverflowError`.
+
+## Example Error
+
+When this occurs, you'll see an error message that looks something like this:
+
+```julia
+ERROR: Stack overflow error occurred during the inference procedure. 
+```
+
+## Solution: Limiting Stack Depth
+
+RxInfer provides a solution through the `limit_stack_depth` option in the inference options. This option limits the recursion depth at the cost of some performance overhead.
+
+### How to Use
+
+You can enable stack depth limiting by passing it through the `options` parameter to the `infer` function:
+
+```@example stack-overflow-inference
+using RxInfer
+
+@model function long_state_space_model(y)
+    x[1] ~ Normal(mean = 0.0, var = 1.0)
+    y[1] ~ Normal(mean = x[1], var = 1.0)
+    for i in 2:length(y)
+        x[i] ~ Normal(mean = x[i - 1], var = 1.0)
+        y[i] ~ Normal(mean = x[i], var = 1.0)
+    end
+end
+
+data = (y = rand(10000), )
+
+using Test #hide
+@test_throws StackOverflowError infer(model = long_state_space_model(), data = data) #hide
+
+results = infer(
+    model = long_state_space_model(),
+    data = data,
+    options = (
+        limit_stack_depth = 100, # note the comma
+    )
+)
+```
+
+!!! note
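+    Note the comma after `limit_stack_depth = 100`. It is required because Julia parses `(limit_stack_depth = 100,)` as a one-element named tuple, whereas `(limit_stack_depth = 100)` without the comma is just a parenthesized assignment that evaluates to `100`.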
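+
+The effect of the trailing comma can be checked in plain Julia, independently of RxInfer. The variable names `with_comma` and `without_comma` are illustrative only.
+
+```julia
+# With a trailing comma, the parentheses build a one-element NamedTuple.
+with_comma = (limit_stack_depth = 100,)
+
+# Without it, the parentheses only group an assignment expression,
+# so the right-hand side evaluates to the integer 100.
+without_comma = (limit_stack_depth = 100)
+
+with_comma isa NamedTuple     # true
+without_comma isa NamedTuple  # false
+```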
+
+Without `limit_stack_depth` enabled, the inference will fail with a `StackOverflowError`:
+
+```julia
+results = infer(
+    model = long_state_space_model(),
+    data = data
+)
+```
+
+```julia
+ERROR: Stack overflow error occurred during the inference procedure. 
+```
+
+### Performance Considerations
+
+When `limit_stack_depth` is enabled:
+- The recursive subscription process is split into multiple steps
+- This prevents stack overflow but introduces some performance overhead; the size of that overhead depends on the model, so measure it for your use case
+- For very large models, this option might be essential for successful execution
+
+## When to Use
+
+Consider using `limit_stack_depth` when:
+- Working with large models (many nodes/variables)
+- Encountering `StackOverflowError`
+- Processing deep hierarchical models
+- Dealing with long sequences or time series
+
+!!! tip
+    If you're not sure whether you need this option, try running your model without it first. Only enable `limit_stack_depth` if you encounter stack overflow issues.
+
+## Further Reading
+
+For more details about inference options and execution, see:
+- [Static Inference](@ref manual-static-inference) documentation
+- The `options` parameter in the [`infer`](@ref) function documentation