diff --git a/docs/Project.toml b/docs/Project.toml
index e2ff121ed..111718ea0 100644
--- a/docs/Project.toml
+++ b/docs/Project.toml
@@ -4,6 +4,7 @@ BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
 Cairo = "159f3aea-2a34-519c-b102-8c37f9878175"
 Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
 Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
+DocumenterMermaid = "a078cd44-4d9c-4618-b545-3ab9d77f9177"
 ExponentialFamily = "62312e5e-252a-4322-ace9-a5f4bf9b357b"
 ExponentialFamilyProjection = "17f509fa-9a96-44ba-99b2-1c5f01f0931b"
 GraphPPL = "b3f8163a-e979-4e85-b43e-1f63d8c8b42c"
diff --git a/docs/make.jl b/docs/make.jl
index 74c088500..cfbbf8490 100644
--- a/docs/make.jl
+++ b/docs/make.jl
@@ -1,5 +1,6 @@
 using RxInfer
 using Documenter
+using DocumenterMermaid
 
 ## https://discourse.julialang.org/t/generation-of-documentation-fails-qt-qpa-xcb-could-not-connect-to-display/60988
 ## https://gr-framework.org/workstations.html#no-output
@@ -85,8 +86,10 @@ ExamplesPages = map(collect(pairs(ExamplesCategoriesPages))) do (label, info)
     return info.title => info.pages
 end
 
+draft = get(ENV, "DOCS_DRAFT", "false") == "true"
+
 makedocs(;
-    draft = false,
+    draft = draft,
     warnonly = false,
     modules = [RxInfer],
     authors = "Bagaev Dmitry and contributors",
@@ -111,7 +114,13 @@ makedocs(;
         "Inference specification" => ["Overview" => "manuals/inference/overview.md", "Static inference" => "manuals/inference/static.md", "Streamline inference" => "manuals/inference/streamlined.md", "Initialization" => "manuals/inference/initialization.md", "Auto-updates" => "manuals/inference/autoupdates.md", "Deterministic nodes" => "manuals/inference/delta-node.md", "Non-conjugate inference" => "manuals/inference/nonconjugate.md", "Undefined message update rules" => "manuals/inference/undefinedrules.md"],
         "Inference customization" => ["Defining a custom node and rules" => "manuals/customization/custom-node.md", "Inference results postprocessing" => "manuals/customization/postprocess.md"],
         "Debugging" => "manuals/debugging.md",
-        "Migration from v2 to v3" => "manuals/migration-guide-v2-v3.md"
+        "Migration from v2 to v3" => "manuals/migration-guide-v2-v3.md",
+        "Sharp bits of RxInfer" => [
+            "Overview" => "manuals/sharpbits/overview.md",
+            "Rule Not Found Error" => "manuals/sharpbits/rule-not-found.md",
+            "Stack Overflow in Message Computations" => "manuals/sharpbits/stack-overflow-inference.md",
+            "Using `=` instead of `:=` for deterministic nodes" => "manuals/sharpbits/usage-colon-equality.md"
+        ]
     ],
     "Library" => [
         "Model construction" => "library/model-construction.md",
diff --git a/docs/src/manuals/comparison.md b/docs/src/manuals/comparison.md
index f6befb941..ca0d67c30 100644
--- a/docs/src/manuals/comparison.md
+++ b/docs/src/manuals/comparison.md
@@ -67,7 +67,12 @@ end
 end
 ```
 
-- **Expressiveness**: `RxInfer.jl` empowers users to elegantly and concisely craft models, closely mirroring probabilistic notation, thanks to Julia's macro capabilities. To illustrate this, let's consider the following model:
+- **Expressiveness**: `RxInfer.jl` empowers users to elegantly and concisely craft models, closely mirroring probabilistic notation, thanks to Julia's macro capabilities.
+
+!!! note
+    RxInfer uses `:=` for deterministic relationships (see [Using `=` instead of `:=` for deterministic nodes](@ref usage-colon-equality)), which might differ from other frameworks but enables powerful message-passing capabilities.
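+
+    For instance, here is a minimal sketch of the two operators side by side (a fragment from inside a hypothetical `@model` block; variable names are purely illustrative):
+
+    ```julia
+    t ~ Normal(mean = 0.0, variance = 1.0)   # `~` declares a stochastic relationship
+    x := exp(t)                              # `:=` declares a deterministic relationship
+    y ~ Normal(mean = x, variance = 1.0)     # downstream usage is identical
+    ```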
+
+To illustrate the expressiveness, let's consider the following model:
 
 $$\begin{aligned}
 x & \sim \mathrm{Normal}(0.0, 1.0)\\
diff --git a/docs/src/manuals/model-specification.md b/docs/src/manuals/model-specification.md
index cf0da647a..8cf57d263 100644
--- a/docs/src/manuals/model-specification.md
+++ b/docs/src/manuals/model-specification.md
@@ -171,8 +171,9 @@ y ~ Normal(mean = x, variance = 1.0)
 ```
 
 Using `x = exp(t)` directly would be incorrect and most likely would result in an `MethodError` because `t` does not have a definitive value at the model creation time
-(remember that our models create a factor graph under the hood and latent states do not have a value until the inference is performed). At the model creation time,
-`t` holds a reference to a node in the graph, instead of an actual value sample from the `Normal` distribution.
+(remember that our models create a factor graph under the hood and latent states do not have a value until the inference is performed).
+
+See [Using `=` instead of `:=` for deterministic nodes](@ref usage-colon-equality) for a detailed explanation of this design choice.
 
 ### [Control flow statements](@id user-guide-model-specification-node-creation-control-flow)
 
diff --git a/docs/src/manuals/sharpbits/overview.md b/docs/src/manuals/sharpbits/overview.md
new file mode 100644
index 000000000..3e389e4af
--- /dev/null
+++ b/docs/src/manuals/sharpbits/overview.md
@@ -0,0 +1,50 @@
+# Sharp bits of RxInfer
+
+This page serves as a collection of sharp bits - potential pitfalls and common issues you might encounter while using RxInfer. While RxInfer is designed to be user-friendly, there are certain scenarios where you may run into unexpected behavior or errors. Understanding these can help you avoid common problems and debug your code more effectively.
+
+- [Rule Not Found Error](@ref rule-not-found)
+    - What causes it
+    - How to diagnose and fix it
+    - Common scenarios
+
+- [Stack Overflow during inference](@ref stack-overflow-inference)
+    - Understanding the potential cause
+    - Prevention strategies
+
+- [Using `=` instead of `:=` for deterministic nodes](@ref usage-colon-equality)
+    - Why not `=`?
+
+!!! note
+    This is a community document that will be updated as we identify more common issues and their solutions. If you encounter a problem that isn't covered here, please consider opening an [issue/discussion](https://github.com/ReactiveBayes/RxInfer.jl/discussions) or contributing to this guide.
+
+## How to contribute
+
+If you have a sharp bit to share, please open an [issue/discussion](https://github.com/ReactiveBayes/RxInfer.jl/discussions) or contribute a section to this guide directly.
+To write a new section, create a new file in the `docs/src/manuals/sharpbits` directory. Use `@id` to specify the ID of the section and `@ref` to reference it later.
+
+```md
+# [New section](@id new-section)
+
+This is a new section.
+```
+
+Then add a new entry to the `pages` array in the `docs/make.jl` file.
+
+```julia
+"Sharp bits of RxInfer" => [
+    "Overview" => "manuals/sharpbits/overview.md",
+    "Rule Not Found Error" => "manuals/sharpbits/rule-not-found.md",
+    "Stack Overflow in Message Computations" => "manuals/sharpbits/stack-overflow-inference.md",
+    "Using `=` instead of `:=` for deterministic nodes" => "manuals/sharpbits/usage-colon-equality.md",
+    # ...
+    "New section" => "manuals/sharpbits/new-section.md",
+]
+```
+
+In the `overview.md` file, add a new section with the title and the ID of the section. Use the `@ref` macro to reference the ID.
+
+```md
+- [New section](@ref new-section)
+    - What this section is about
+    - ...
+```
diff --git a/docs/src/manuals/sharpbits/rule-not-found.md b/docs/src/manuals/sharpbits/rule-not-found.md
new file mode 100644
index 000000000..3f796ebb5
--- /dev/null
+++ b/docs/src/manuals/sharpbits/rule-not-found.md
@@ -0,0 +1,143 @@
+# [Rule Not Found Error](@id rule-not-found)
+
+When using RxInfer, you might encounter a `RuleNotFoundError`. This error occurs during message-passing inference when the system cannot find appropriate update rules for computing messages between nodes in your factor graph. Let's understand why this happens and how to resolve it.
+
+## Why does this happen?
+
+Message-passing inference works by exchanging messages between nodes in a factor graph. Each message represents a probability distribution, and the rules for computing these messages depend on:
+
+1. The type of the factor node (e.g., `Normal`, `Gamma`, etc.)
+2. The types of incoming messages (e.g., `Normal`, `PointMass`, etc.)
+3. The interface through which the message is being computed
+4. The inference method being used (Belief Propagation or Variational Message Passing)
+
+The last point is particularly important - some message update rules may exist for Variational Message Passing (VMP) but not for Belief Propagation (BP), or vice versa. This is because BP aims to compute exact posterior distributions through message passing (when possible), while VMP approximates the posterior using the Bethe approximation. For a detailed mathematical treatment of these differences, see our [Bethe Free Energy implementation](@ref lib-bethe-free-energy) guide.
+
+For example, consider this simple model:
+
+```julia
+@model function problematic_model()
+    μ ~ Normal(mean = 0.0, variance = 1.0)
+    τ ~ Gamma(shape = 1.0, rate = 1.0)
+    y ~ Normal(mean = μ, precision = τ)
+end
+```
+
+This model will fail with a `RuleNotFoundError` because there are no belief propagation message passing update rules available for this combination of distributions - only variational message passing rules exist. Even though the model looks simple, the message passing rules needed for exact inference do not exist in closed form.
+
+## Common scenarios
+
+You're likely to encounter this error when:
+
+1. Using non-conjugate pairs of distributions (e.g., a `Beta` prior on the precision of a `Normal` likelihood)
+2. Working with custom distributions or factor nodes without defining all necessary update rules
+3. Using complex transformations between variables that don't have defined message computations
+4. Mixing different types of distributions in ways that don't have analytical solutions
+
+## Design Philosophy
+
+RxInfer prioritizes performance over generality in its message-passing implementation. By default, it only uses analytically derived message update rules, even in cases where numerical approximations might be possible. This design choice:
+
+- Ensures fast and reliable inference when rules exist
+- Avoids potential numerical instabilities from approximations
+- Throws an error when analytical solutions don't exist
+
+This means you may encounter `RuleNotFoundError` even in cases where approximate solutions could theoretically work. This is intentional - RxInfer will tell you explicitly when you need to consider alternative approaches rather than silently falling back to potentially slower or less reliable approximations. See the [Solutions](@ref rule-not-found-solutions) section below for more details.
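+
+As a rough sketch of how the error typically surfaces in practice, the same model can be written with `y` exposed as a model argument so it can be bound to observed data (the exact error text may vary between versions):
+
+```julia
+@model function problematic_model(y)
+    μ ~ Normal(mean = 0.0, variance = 1.0)
+    τ ~ Gamma(shape = 1.0, rate = 1.0)
+    y ~ Normal(mean = μ, precision = τ)
+end
+
+# Plain belief propagation has no analytical update rules for this combination,
+# so this call is expected to throw a `RuleNotFoundError`
+infer(model = problematic_model(), data = (y = 1.0, ))
+```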
+
+## Visualizing the message passing graph
+
+To better understand where message passing rules are needed, let's look at a simple factor graph visualization:
+
+```mermaid
+graph LR
+    %% Other parts of the graph
+    g1[g] -.-> x
+    h1[h] -.-> z
+    y -.-> g2[p]
+
+    %% Main focus area
+    x((x)) -.- m1[["μx→f"]] --> f[f]
+    f --> m2[["μf→y"]] -.- y((y))
+    z((z)) -.- m3[["μz→f"]] --> f
+
+    %% Styling
+    classDef variable fill:#b3e0ff,stroke:#333,stroke-width:2px;
+    classDef factor fill:#ff9999,stroke:#333,stroke-width:2px,shape:square;
+    classDef otherFactor fill:#ff9999,stroke:#333,stroke-width:2px,opacity:0.3;
+    classDef message fill:none,stroke:none;
+    class x,y,z variable;
+    class f factor;
+    class g1,g2,h1 otherFactor;
+    class m1,m2,m3 message;
+```
+
+In this example:
+- Variables (`x`, `y`, `z`) are represented as circles
+- The factor node (`f`) is represented as a square
+- Messages (μ) flow along the edges between variables and factors, with subscripts indicating direction (e.g., x→f flows from x to f)
+- Faded nodes (`g`, `h`, `p`) represent other parts of the factor graph that aren't relevant for this local message computation
+
+To compute the outgoing message `f→y`, RxInfer needs:
+1. Rules for how to process incoming messages `x→f` and `z→f`
+2. Rules for combining these messages based on the factor `f`'s type
+3. Rules for producing the outgoing message type that `y` expects
+
+A `RuleNotFoundError` occurs when any of these rules are missing. For example, if `x` sends a `Normal` message but `f` doesn't know how to process `Normal` inputs, or if `f` can't produce the type of message that `y` expects.
+
+## [Solutions](@id rule-not-found-solutions)
+
+### 1. Convert to conjugate pairs
+
+First, try to reformulate your model using conjugate prior-likelihood pairs. Conjugate pairs have analytical solutions for message passing and are well-supported in RxInfer. For example, instead of using a `Normal` likelihood with a `Beta` prior on its precision, use a `Normal-Gamma` conjugate pair. See [Conjugate prior - Wikipedia](https://en.wikipedia.org/wiki/Conjugate_prior#Table_of_conjugate_distributions) for a comprehensive list of conjugate distributions.
+
+### 2. Check available rules
+
+If conjugate pairs aren't suitable, verify whether your combination of distributions and message types is supported. RxInfer provides many predefined rules, but not all combinations are possible. A good starting point is to check the [List of available nodes](https://reactivebayes.github.io/ReactiveMP.jl/stable/lib/nodes/#lib-predefined-nodes) section in the documentation of ReactiveMP.jl.
+
+### 3. Create custom update rules
+
+If you need specific message computations, you can define your own update rules. See [Creating your own custom nodes](@ref create-node) for a detailed guide on implementing custom nodes and their update rules.
+
+### 4. Use approximations
+
+When exact message updates aren't available, consider:
+
+- Using simpler distribution pairs that have defined rules
+- Employing approximation techniques like moment matching or the methods described in [Meta Specification](@ref user-guide-meta-specification) and [Deterministic nodes](@ref delta-node-manual)
+
+### 5. Use variational inference
+
+Sometimes, adding appropriate factorization constraints can help avoid problematic message computations:
+
+```julia
+constraints = @constraints begin
+    q(μ, τ) = q(μ)q(τ) # Mean-field assumption
+end
+
+result = infer(
+    model = problematic_model(),
+    constraints = constraints,
+)
+```
+
+!!! note
+    When using variational constraints, you will likely need to initialize certain messages or marginals to handle loops in the factor graph. See [Initialization](@ref initialization) for details on how to properly initialize your model.
+
+For more details on constraints and variational inference, see:
+
+- [Constraints Specification](@ref user-guide-constraints-specification) for a complete guide on using constraints
+- [Bethe Free Energy](@ref lib-bethe-free-energy) for the mathematical background on variational inference and message passing
+
+## Implementation details
+
+When RxInfer encounters a missing rule, it means one of these is missing:
+
+1. A `@rule` definition for the specific message direction and types
+2. A `@marginalrule` for computing joint marginals
+3. An `@average_energy` implementation for free energy computation
+
+You can add these using the methods described in [Creating your own custom nodes](@ref create-node).
+
+!!! note
+    Not all message-passing rules have analytical solutions. In such cases, you might need to use numerical approximations or choose different model structures.
+
diff --git a/docs/src/manuals/sharpbits/stack-overflow-inference.md b/docs/src/manuals/sharpbits/stack-overflow-inference.md
new file mode 100644
index 000000000..d403d0b56
--- /dev/null
+++ b/docs/src/manuals/sharpbits/stack-overflow-inference.md
@@ -0,0 +1,95 @@
+# [Stack Overflow during inference](@id stack-overflow-inference)
+
+When working with large probabilistic models in RxInfer, you might encounter a `StackOverflowError`. This section explains why this happens and how to prevent it.
+
+## The Problem
+
+RxInfer uses reactive streams to compute messages between nodes in the factor graph. The subscription to these streams happens recursively, which means:
+
+1. Each node subscribes to its input messages or posteriors
+2. Those input messages may need to subscribe to their own inputs
+3. This continues until all dependencies are resolved
+
+For large models, this recursive subscription process can consume the entire stack space, resulting in a `StackOverflowError`.
+
+## Example Error
+
+When this occurs, you'll see an error message that looks something like this:
+
+```julia
+ERROR: Stack overflow error occurred during the inference procedure.
+```
+
+## Solution: Limiting Stack Depth
+
+RxInfer provides a solution through the `limit_stack_depth` option in the inference options. This option limits the recursion depth at the cost of some performance overhead.
+
+### How to Use
+
+You can enable stack depth limiting by passing it through the `options` parameter to the `infer` function:
+
+```@example stack-overflow-inference
+using RxInfer
+
+@model function long_state_space_model(y)
+    x[1] ~ Normal(mean = 0.0, var = 1.0)
+    y[1] ~ Normal(mean = x[1], var = 1.0)
+    for i in 2:length(y)
+        x[i] ~ Normal(mean = x[i - 1], var = 1.0)
+        y[i] ~ Normal(mean = x[i], var = 1.0)
+    end
+end
+
+data = (y = rand(10000), )
+
+using Test #hide
+@test_throws StackOverflowError infer(model = long_state_space_model(), data = data) #hide
+
+results = infer(
+    model = long_state_space_model(),
+    data = data,
+    options = (
+        limit_stack_depth = 100, # note the comma
+    )
+)
+```
+
+!!! note
+    Note the comma after `limit_stack_depth = 100`. The trailing comma makes Julia parse `(limit_stack_depth = 100, )` as a named tuple with a single entry; without it, the parentheses would contain a plain assignment instead of the `options` named tuple.
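+
+    As a small illustration of this Julia syntax point (plain Julia, nothing RxInfer-specific):
+
+    ```julia
+    typeof((limit_stack_depth = 100, ))   # NamedTuple{(:limit_stack_depth,), Tuple{Int64}}, a one-element named tuple
+    typeof((limit_stack_depth = 100))     # Int64, without the comma this is just an assignment wrapped in parentheses
+    ```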
+
+Without `limit_stack_depth` enabled, the inference will fail with a `StackOverflowError`:
+
+```julia
+results = infer(
+    model = long_state_space_model(),
+    data = data
+)
+```
+
+```julia
+ERROR: Stack overflow error occurred during the inference procedure.
+```
+
+### Performance Considerations
+
+When `limit_stack_depth` is enabled:
+- The recursive subscription process is split into multiple steps
+- This prevents stack overflow but introduces performance overhead (you should verify this in your use case)
+- For very large models, this option might be essential for successful execution
+
+## When to Use
+
+Consider using `limit_stack_depth` when:
+- Working with large models (many nodes/variables)
+- Encountering `StackOverflowError`
+- Processing deep hierarchical models
+- Dealing with long sequences or time series
+
+!!! tip
+    If you're not sure whether you need this option, try running your model without it first. Only enable `limit_stack_depth` if you encounter stack overflow issues.
+
+## Further Reading
+
+For more details about inference options and execution, see:
+- [Static Inference](@ref manual-static-inference) documentation
+- The `options` parameter in the [`infer`](@ref) function documentation
diff --git a/docs/src/manuals/sharpbits/usage-colon-equality.md b/docs/src/manuals/sharpbits/usage-colon-equality.md
new file mode 100644
index 000000000..373132e8f
--- /dev/null
+++ b/docs/src/manuals/sharpbits/usage-colon-equality.md
@@ -0,0 +1,83 @@
+# [Using `=` instead of `:=` for deterministic nodes](@id usage-colon-equality)
+
+When specifying probabilistic models in RxInfer, you might be tempted to use the `=` operator for deterministic relationships between variables. While this may seem natural from a programming perspective (especially if you're coming from other frameworks - see [Comparison to other packages](@ref comparison)), it doesn't align with how Bayesian inference and factor graphs work. Let's explore why RxInfer uses a different approach and how it enables powerful probabilistic modeling.
+
+## The Problem
+
+Consider this seemingly reasonable model specification:
+
+```julia
+@model function wrong_model(θ)
+    x ~ MvNormal(mean = [ 0.0, 0.0 ], cov = [ 1.0 0.0; 0.0 1.0 ])
+    y = dot(x, θ) # This won't work!
+    z ~ Normal(y, 1.0)
+end
+```
+
+This code will fail because:
+1. During model creation, `x` is not an actual vector of numbers - it's a reference to a node in the factor graph
+2. Julia's `dot` function expects a vector input, not a graph node
+3. The `=` operator performs immediate assignment and executes the `dot` function, which isn't what we want for building factor graphs
+
+## The Solution
+
+Use the `:=` operator for deterministic relationships:
+
+```julia
+@model function correct_model(θ)
+    x ~ MvNormal(mean = [ 0.0, 0.0 ], cov = [ 1.0 0.0; 0.0 1.0 ])
+    y := dot(x, θ) # This is correct!
+    z ~ Normal(y, 1.0)
+end
+```
+
+The `:=` operator:
+- Creates a deterministic node in the factor graph
+- Properly tracks dependencies between variables
+- Allows RxInfer to handle the computation during inference
+
+!!! tip
+    If you're coming from other probabilistic programming frameworks like Turing.jl, remember that RxInfer uses `:=` for deterministic relationships. While this might seem unusual at first, it's a deliberate design choice that enables powerful message-passing inference algorithms.
+
+## Why Not `=`?
+
+RxInfer's design is based on factor graphs, which are probabilistic graphical models that represent the factorization of a joint probability distribution. In a factor graph:
+
+- Variables are represented as nodes (vertices) in the graph
+- Factor nodes connect variables and encode their relationships
+- Edges represent the dependencies between variables and factors
+- Both probabilistic (`~`) and deterministic (`:=`) relationships create specific types of factor nodes
+
+When you specify a model, RxInfer constructs this graph structure where:
+- Each `~` creates a factor node representing that probability distribution
+- Each `:=` creates a deterministic factor node representing that transformation
+- Variables are automatically connected to their relevant factors
+- The graph captures the complete probabilistic model structure
+
+This explicit graph-based design brings several key benefits:
+- **Efficient Message Passing**: The graph structure enables localized belief propagation, where each node only needs to communicate with its immediate neighbors
+- **Lazy Evaluation**: Factor nodes compute messages only when needed during inference, avoiding unnecessary calculations
+- **Flexible Inference**: The same graph structure can support different message-passing schedules and inference algorithms
+- **Modular Updates**: Changes in one part of the graph only affect the connected components
+
+Using `=` would break this design because:
+- It executes computations immediately during model specification, before the graph is built
+- It prevents RxInfer from properly tracking the probabilistic dependencies
+- It makes message passing impossible since there's no graph structure to pass messages through
+
+## Implementation Details
+
+When you write:
+```julia
+y := dot(x, θ)
+```
+
+RxInfer creates:
+1. A deterministic factor node representing the `dot` function with `x` and `θ` as arguments (edges)
+2. A node for `y`, if it has not been created yet
+3. Proper edges connecting `x` and `θ` to this node, and this node to `y`
+4. Message passing rules for propagating beliefs through this transformation
+
+This structured approach enables efficient inference and maintains the mathematical rigor of the probabilistic model.
+
+For more details about model specification, see the [Model Specification](@ref user-guide-model-specification) guide, particularly the section on [Deterministic relationships](@ref user-guide-model-specification-node-creation-deterministic).
\ No newline at end of file
diff --git a/src/inference/inference.jl b/src/inference/inference.jl
index 3a5d24711..71081fb06 100644
--- a/src/inference/inference.jl
+++ b/src/inference/inference.jl
@@ -92,24 +92,50 @@ function inference_process_error(error)
 end
 
 function inference_process_error(error, rethrow)
-    if rethrow
-        Base.rethrow(error)
+    if error isa StackOverflowError
+        @error """
+        Stack overflow error detected during inference. This can happen with large model graphs
+        due to recursive message updates.
+
+        Possible solution:
+        Try using the `limit_stack_depth` inference option. If this does not resolve the issue,
+        please open a GitHub issue at https://github.com/ReactiveBayes/RxInfer.jl/issues and we'll help investigate.
+
+        For more details:
+        • Stack overflow guide: https://reactivebayes.github.io/RxInfer.jl/stable/manuals/sharpbits/stack-overflow-inference/
+        • See `infer` function docs for options
+        """
     end
-    return error, catch_backtrace()
-end
-
-# We want to show an extra hint in case the error is of type `StackOverflowError`
-function inference_process_error(err::StackOverflowError, rethrow)
     @error """
-    Stack overflow error occurred during the inference procedure.
-    The inference engine may execute message update rules recursively, hence, the model graph size might be causing this error.
-    To resolve this issue, try using `limit_stack_depth` inference option for model creation. See `?inference` documentation for more details.
-    The `limit_stack_depth` option does not help against over stack overflow errors that might happening outside of the model creation or message update rules execution.
+    We encountered an error during inference, but don't worry - we're here to help! 🤝
+
+    Here are some helpful resources to get you back on track:
+
+    1. Check our Sharp bits documentation which covers common issues:
+       https://reactivebayes.github.io/RxInfer.jl/stable/manuals/sharpbits/overview/
+
+    2. Browse our existing issues - your question may already be answered:
+       https://github.com/ReactiveBayes/RxInfer.jl/issues
+
+    Still stuck? We'd love to help! You can:
+    - Start a discussion for questions and help. Feedback and questions from new users are also welcome! If you are stuck, please reach out and we will solve it together.
+      https://github.com/ReactiveBayes/RxInfer.jl/discussions
+    - Report a bug or request a feature:
+      https://github.com/ReactiveBayes/RxInfer.jl/issues
+
+    Note that we use GitHub discussions not just for technical questions! We welcome all kinds of discussions,
+    whether you're new to Bayesian inference, have questions about use cases, or just want to share your experience.
+
+    To help us help you, please include:
+    - A minimal example that reproduces the issue
+    - The complete error message and stack trace
+
+    Together we'll get your inference working! 💪
     """
     if rethrow
-        Base.rethrow(err) # Shows the original stack trace
+        Base.rethrow(error)
     end
-    return err, catch_backtrace()
+    return error, catch_backtrace()
 end
 
 function inference_check_itertype(::Symbol, ::Union{Nothing, Tuple, Vector})
@@ -225,6 +251,9 @@ include("streaming.jl")
 
 This function provides a generic way to perform probabilistic inference for batch/static and streamline/online scenarios. Returns either an [`InferenceResult`](@ref) (batch setting) or [`RxInferenceEngine`](@ref) (streamline setting) based on the parameters used.
 
+!!! note
+    Before using this function, you may want to review common issues and solutions in the Sharp bits of RxInfer section of the documentation.
+
 ## Arguments
 
 Check the official documentation for more information about some of the arguments.