Working with bounds instead of point observations in RxInfer #312
-
Hi everyone, I am trying to use RxInfer to implement inference, but I am running into a small problem. The observations I have are not point values but bounds, so the model would be something like this:

```julia
@model function distribution(y)
    # something along the lines of y[i] ~ cdf(Gamma)
end

bounds = [[0.1, 0.2], [0.2, 0.3], [0.1, 0.3]]

result = infer(
    model = distribution(),
    data  = (y = bounds, )
)
```

My question is thus: is this possible with RxInfer?
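Concretely, what each bound observation should contribute is the interval probability of the latent Gamma variable falling between the bounds. A quick standalone sanity check of that quantity in plain Julia (not from the thread; `rand_gamma` and `interval_prob` are illustrative names, and the sampler is the standard Marsaglia-Tsang scheme for shape >= 1):

```julia
# Marsaglia-Tsang sampler for Gamma(shape, rate), valid for shape >= 1.
function rand_gamma(shape::Float64, rate::Float64)
    d = shape - 1/3
    c = 1 / sqrt(9d)
    while true
        x = randn()
        v = (1 + c * x)^3
        v <= 0 && continue
        u = rand()
        if log(u) < 0.5 * x^2 + d - d * v + d * log(v)
            return d * v / rate
        end
    end
end

# Monte Carlo estimate of P(a <= z <= b) under Gamma(shape, rate)
function interval_prob(a, b, shape, rate; n = 200_000)
    hits = 0
    for _ in 1:n
        z = rand_gamma(shape, rate)
        hits += (a <= z <= b)
    end
    return hits / n
end

# Gamma(1, 1) is Exponential(1), so P(z <= 1) = 1 - exp(-1)
p = interval_prob(0.0, 1.0, 1.0, 1.0)
```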
Replies: 3 comments 7 replies
-
Hey @lionelkiel, thanks for checking out RxInfer! Okay, if I understand correctly, your model would look something like this:

```julia
@model function distribution(y)
    α ~ GammaShapeRate(1, 1) # Not a conjugate prior, but that is not the point of the discussion; we can make this work
    β ~ GammaShapeRate(1, 1)
    local z
    for i in eachindex(y)
        z[i] ~ GammaShapeRate(α, β)
        z[i] ~ Uniform(y[i, 1], y[i, 2])
    end
end
```

The main problem is the messages towards …
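Whatever form those messages take, the product of the Gamma prior on `z[i]` and the Uniform bound factor is just the Gamma density restricted (truncated) to the interval. A standalone numeric sketch of that product's normalizer (not from the thread; values are arbitrary, and shape 2 is chosen so the integral has a closed form):

```julia
# Unnormalized density of Gamma(shape = 2, rate = β) restricted to [a, b]:
# f(z) = β^2 * z * exp(-β z) on [a, b], zero elsewhere.
β, a, b = 1.5, 0.2, 0.9
f(z) = β^2 * z * exp(-β * z)

# Numeric normalizer via the trapezoid rule
zs = range(a, b; length = 10_001)
Z_num = sum((f(zs[i]) + f(zs[i + 1])) / 2 * step(zs) for i in 1:length(zs) - 1)

# Analytic normalizer: the antiderivative of β^2 z e^{-βz} is -(βz + 1) e^{-βz}
F(z) = -(β * z + 1) * exp(-β * z)
Z_an = F(b) - F(a)
```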
-
Can you explain the generative process in this model before we jump into the inference? Because your …
-
Hi Lionel, I have been playing around with your model and got some sensible results. However, I did use some dirty tricks that I'll try to explain here.

First of all, I used this model:

```julia
@model function lionels_model(lower_lims, upper_lims)
    α ~ GammaShapeRate(1.0, 1.0)
    β ~ GammaShapeRate(1.0, 1.0)
    local y
    for i in 1:length(lower_lims)
        y[i] ~ GammaShapeRate(α, β)
        y[i] ~ Uniform(lower_lims[i], upper_lims[i])
    end
end
```

Now, the problem with this model is that a Gamma distribution does not have a nice closed-form conjugate prior for its shape parameter, so the message towards this variable is a bit awkward to handle. What I did is approximate the product between this message and the Gamma prior with importance sampling followed by moment matching, so the result is again a Gamma distribution. I did the same for the product between the Uniform factor and the Gamma messages.

**Initialization**

```julia
using RxInfer
using FastGaussQuadrature
using Roots
using StableRNGs

rng = StableRNG(500)
```

**Importance sampling projection**

This is the dirty trick part; it's okay if you don't understand it:

```julia
function is_project(left, right::GammaDistributionsFamily)
    f = (x) -> exp(max(logpdf(left, x), -36.0) + logpdf(right, x) + x)
    x, w = gausslaguerre(31)
    Z = sum(w .* f.(x))
    normalized_f = (x) -> exp(max(logpdf(left, x), -36.0) + logpdf(right, x) + x - log(Z))
    expectation_x = sum(w .* normalized_f.(x) .* x)
    expectation_logx = sum(w .* normalized_f.(x) .* log.(x))
    gss = GammaSufficientStatistics(expectation_x, expectation_logx)
    return solve_logpartition_identity(gss, right)
end
```
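The moment-matching step recovers Gamma parameters from the expectations E[x] and E[log x]. A self-contained sketch of that identity in plain Julia (the digamma series and bisection stand in for the SpecialFunctions and Roots calls used above; all names here are illustrative):

```julia
# Digamma via recurrence plus asymptotic series (stand-in for a library digamma)
function digamma_approx(x::Float64)
    r = 0.0
    while x < 6.0
        r -= 1 / x
        x += 1.0
    end
    t = 1 / (x * x)
    return r + log(x) - 0.5 / x - t * (1/12 - t * (1/120 - t / 252))
end

# Given E[x] and E[log x] of a Gamma, solve for the shape α from
#   digamma(α) - log(α / E[x]) = E[log x]
# by bisection (the left side is increasing in α), then set rate β = α / E[x].
function gamma_from_moments(mx::Float64, mlogx::Float64)
    g(α) = digamma_approx(α) - log(α / mx) - mlogx
    lo, hi = 1e-3, 1e3
    for _ in 1:200
        mid = (lo + hi) / 2
        g(mid) < 0 ? (lo = mid) : (hi = mid)
    end
    α = (lo + hi) / 2
    return α, α / mx   # shape, rate
end

# For Gamma(shape = 3, rate = 0.5): E[x] = 6 and E[log x] = digamma(3) + log(2)
alpha_hat, beta_hat = gamma_from_moments(6.0, digamma_approx(3.0) + log(2.0))
```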
```julia
# Route both argument orders of the product to the importance-sampling projection
BayesBase.prod(::GenericProd, left::GammaDistributionsFamily, right::ContinuousUnivariateLogPdf) = BayesBase.prod(GenericProd(), right, left)
BayesBase.prod(::GenericProd, left::ContinuousUnivariateLogPdf, right::GammaDistributionsFamily) = is_project(left, right)
BayesBase.prod(::GenericProd, left::GammaDistributionsFamily, right::Uniform) = BayesBase.prod(GenericProd(), right, left)
BayesBase.prod(::GenericProd, left::Uniform, right::GammaDistributionsFamily) = is_project(left, right)

struct GammaSufficientStatistics{T}
    x::T
    logx::T
end
```
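The paired `BayesBase.prod` overloads follow a common multiple-dispatch pattern: the real work is implemented for one argument order, and the flipped order just forwards to it. A toy standalone version of the same pattern (types and names are mine, purely for illustration):

```julia
struct A end
struct B end

combine(::A, ::B) = "projected"      # the real work happens for this order
combine(b::B, a::A) = combine(a, b)  # the flipped order forwards to it
```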
```julia
function solve_logpartition_identity(statistics::GammaSufficientStatistics, initial_guess::GammaDistributionsFamily)
    f = let statistics = statistics
        (α) -> RxInfer.ReactiveMP.digamma(α) - log(α / statistics.x) - statistics.logx
    end
    α = find_zero(f, shape(initial_guess), Roots.Order0())
    β = α / statistics.x
    return GammaShapeScale(α, inv(β))
end
```

**ReactiveMP rules**

These rules were not built in, so we define them ourselves:

```julia
@rule GammaShapeRate(:β, Marginalisation) (q_out::GammaDistributionsFamily, q_α::GammaDistributionsFamily) = GammaShapeRate(1 + mean(q_α), mean(q_out))

@rule GammaShapeRate(:α, Marginalisation) (q_out::GammaDistributionsFamily, q_β::GammaDistributionsFamily) = begin
    return ContinuousUnivariateLogPdf(RxInfer.ReactiveMP.DomainSets.HalfLine(), (α) -> α * mean(log, q_β) + (α - 1) * mean(log, q_out) - RxInfer.ReactiveMP.loggamma(α))
end
```
```julia
@rule GammaShapeRate(:out, Marginalisation) (q_α::Any, q_β::Any) = GammaShapeRate(mean(q_α), mean(q_β))
```

**Model and inference constraints**

Your model only works when doing Variational Message Passing, so we need a set of inference constraints and an initial state for the iterative update procedure:

```julia
@model function lionels_model(lower_lims, upper_lims)
    α ~ GammaShapeRate(1.0, 1.0)
    β ~ GammaShapeRate(1.0, 1.0)
    local y
    for i in 1:length(lower_lims)
        y[i] ~ GammaShapeRate(α, β)
        y[i] ~ Uniform(lower_lims[i], upper_lims[i])
    end
end
```
```julia
constraints = @constraints begin
    q(α, β, y) = q(α)q(β)q(y)
end

initialization = @initialization begin
    q(α) = GammaShapeRate(1.0, 1.0)
    q(β) = GammaShapeRate(1.0, 1.0)
    q(y) = GammaShapeRate(1.0, 1.0)
end
```

**Data generation**

Now we're in a shape where we can generate some (demo) data. We draw a random α and β, sample from the Gamma distribution, and then generate some arbitrary upper and lower bounds:

```julia
α = 5 * rand(rng)
β = 5 * rand(rng)
n = 100
y = rand(rng, GammaShapeRate(α, β), n)
lower_lims = y .- rand(rng, n)
upper_lims = y .+ rand(rng, n)
```

**Inference**

The inference can now be done by:

```julia
result = infer(
    model = lionels_model(),
    data = (lower_lims = lower_lims, upper_lims = upper_lims),
    constraints = constraints,
    initialization = initialization,
    iterations = 100
)
```

**Results**

```julia
println("True α: $α, estimated α: $(mean(last(result.posteriors[:α])))")
```
```julia
println("True β: $β, estimated β: $(mean(last(result.posteriors[:β])))")
println("True distribution mean: $(α / β), estimated distribution mean: $(mean(last(result.posteriors[:α])) / mean(last(result.posteriors[:β])))")
```

Gives me the following results:

Which looks okay to me. Inference will never be perfect because of the dirty tricks we went through, but maybe @Nimrais would be able to do some magical projection which will increase the inference quality. Hope this solves your problem!
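As a side note, the custom `@rule GammaShapeRate(:β, ...)` above follows from the standard variational message passing update; a quick derivation (mine, not from the original post):

$$\log \mathcal{G}(y \mid \alpha, \beta) = \alpha \log \beta - \log \Gamma(\alpha) + (\alpha - 1)\log y - \beta y$$

Taking the expectation with respect to $q(y)\,q(\alpha)$ and keeping only the terms that depend on $\beta$ leaves

$$\mathbb{E}[\alpha] \log \beta - \beta\, \mathbb{E}[y],$$

so the message is proportional to $\beta^{\mathbb{E}[\alpha]} e^{-\beta \mathbb{E}[y]}$, a Gamma kernel with shape $\mathbb{E}[\alpha] + 1$ and rate $\mathbb{E}[y]$, which is exactly `GammaShapeRate(1 + mean(q_α), mean(q_out))`.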