
Pseudo-Observation Parametrisations #121

Open · wants to merge 10 commits into master

Conversation

willtebbutt (Member)

As discussed on the call the other day, I've implemented a couple of pseudo-observation parametrisations.
The first has been used in a few places now, so I really think that we should support it. The thing I refer to as the "decoupled pseudo-observation" parametrisation is quite non-standard (I've not actually seen it used anywhere before, just been using it in my own work), but it's a really obvious extension, so I figure why not?

I've had to add an additional abstract type to make it possible to have different fields from those in `SparseVariationalApproximation`; that's why there are quite a lot of LOC changes.

Also, I've expanded on the parametrisation explanations, and provided some code examples that actually run and are now CI'd, so they won't go stale.

Keen to know what people make of this.

willtebbutt and others added 2 commits March 24, 2022 13:53

codecov bot commented Mar 24, 2022

Codecov Report

Merging #121 (a7a7143) into master (490ece8) will increase coverage by 0.79%.
The diff coverage is 100.00%.

❗ Current head a7a7143 differs from pull request most recent head d47c809. Consider uploading reports for the commit d47c809 to get more accurate results

@@            Coverage Diff             @@
##           master     #121      +/-   ##
==========================================
+ Coverage   93.61%   94.41%   +0.79%     
==========================================
  Files           5        5              
  Lines         329      376      +47     
==========================================
+ Hits          308      355      +47     
  Misses         21       21              
Impacted Files Coverage Δ
src/SparseVariationalApproximationModule.jl 93.89% <100.00%> (+3.41%) ⬆️


willtebbutt reopened this Mar 24, 2022
# pseudo-observation covariance matrix `Ŝ` such that ``\hat{C} = C - C (C + \hat{S})^{-1} C``.
#
# However, this is not necessarily a problem: if the likelihood used in the model is
# log-concave then the optimal choice for `Ĉ` can always be represented using this
Member

do we have a citation for this? 🤔

Member Author

I was wondering about that. I've definitely seen the result floating around (and it's easy enough to prove) -- will have a hunt.

Member

if it's that easy to prove, just do it here in the docs 😂
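As an aside, the algebraic identity behind the parametrisation is easy to check numerically. Here is a minimal sketch (in Python/NumPy purely for illustration, not the package's Julia API; `random_spd`, `C`, `S_hat` are made-up names) confirming that `Ĉ = C - C (C + Ŝ)⁻¹ C` agrees with the information form `(C⁻¹ + Ŝ⁻¹)⁻¹` and stays positive-definite whenever `Ŝ` is:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_spd(n):
    # Random symmetric positive-definite matrix via A A^T plus jitter.
    a = rng.standard_normal((n, n))
    return a @ a.T + n * np.eye(n)

n = 5
C = random_spd(n)      # prior covariance over the pseudo-points
S_hat = random_spd(n)  # pseudo-observation covariance

# Pseudo-observation form of the approximate posterior covariance.
C_hat = C - C @ np.linalg.solve(C + S_hat, C)

# Equivalent "information" form, via the Woodbury identity.
C_hat_info = np.linalg.inv(np.linalg.inv(C) + np.linalg.inv(S_hat))

assert np.allclose(C_hat, C_hat_info)

# C_hat is positive-definite and never "larger" than the prior C.
assert np.all(np.linalg.eigvalsh(C_hat) > 0)
assert np.all(np.linalg.eigvalsh(C - C_hat) >= -1e-8)
```

The last assertion holds because `C - Ĉ = C (C + Ŝ)⁻¹ C` is itself positive semi-definite.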

Comment on lines +164 to +167
# The reason to think that this parametrisation will do something sensible is this property.
# Obviously when ``\mathbf{v} \neq \mathbf{x}`` the optimal approximate posterior cannot be
# recovered; the hope, however, is that there exists a small pseudo-dataset which gets
# close to the optimum.
Member

What's the advantage of this parametrisation?

)
return AbstractGPs.elbo(sva, l_fx, ys; kwargs...)
end

_get_prior(approx::SparseVariationalApproximation) = approx.fz.f
Member

How about just calling it

Suggested change
_get_prior(approx::SparseVariationalApproximation) = approx.fz.f
prior(approx::SparseVariationalApproximation) = approx.fz.f

?
I think it shows up in a bunch more places (within AbstractGPs' VFE/DTC code too)...

PseudoObsSparseVariationalApproximation(
f::AbstractGP,
z::AbstractVector,
S::AbstractMatrix{<:Real},
Member

Hm, it's got to be a symmetric, pos-def matrix, right?
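Right: for `N(y; f(v), S)` to be a well-defined Gaussian likelihood, `S` must be symmetric positive-definite, or its Cholesky factorisation (and hence the log-density) fails. A quick illustrative check (Python/NumPy, hypothetical matrices) of what goes wrong otherwise:

```python
import numpy as np

# A symmetric positive-definite S admits a Cholesky factor, as any valid
# Gaussian covariance must.
S_ok = np.array([[2.0, 0.5],
                 [0.5, 1.0]])
L = np.linalg.cholesky(S_ok)  # succeeds
assert np.allclose(L @ L.T, S_ok)

# An indefinite matrix does not: N(y; f(v), S) would not be a valid density.
S_bad = np.array([[1.0, 2.0],
                  [2.0, 1.0]])  # eigenvalues 3 and -1
try:
    np.linalg.cholesky(S_bad)
    valid = True
except np.linalg.LinAlgError:
    valid = False
assert not valid
```

So it might be worth constraining the type (or at least documenting the requirement) rather than accepting any `AbstractMatrix{<:Real}`.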

S::AbstractMatrix{<:Real}, v::AbstractVector, y::AbstractVector{<:Real}
)

Chooses `likelihood(u) = N(y; f(v), S)` where `length(y)` need not be equal to the number
Member

I don't understand how this makes any sense. On the left-hand side you have u; on the right-hand side you don't. How are the two coupled?

Member Author
@willtebbutt willtebbutt Mar 25, 2022

Ah, so it's implicit via f. u := f(z), so by making noisy observations of f(v), you learn something about f(z). Not clear from what I wrote though.

Member

....so then the inducing points are actually v? why are u/z needed then?
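The coupling is purely through the prior: conditioning `u = f(z)` on noisy observations of `f(v)` flows through the prior cross-covariance between `f(z)` and `f(v)`. A minimal NumPy sketch of that conditioning (illustrative kernel and numbers, not the package's API):

```python
import numpy as np

# Toy squared-exponential kernel (an illustrative choice).
def k(a, b):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2)

z = np.array([-1.0, 0.0, 1.0])  # inducing inputs, u := f(z)
v = np.array([-0.5, 0.5])       # pseudo-observation inputs
y = np.array([0.3, -0.2])       # pseudo-observations of f(v)
S = 0.1 * np.eye(len(v))        # pseudo-observation covariance

Czz, Czv, Cvv = k(z, z), k(z, v), k(v, v)

# Conditioning u on y = f(v) + eps couples them through the prior
# cross-covariance Czv, even though y never mentions u directly.
A = Czv @ np.linalg.inv(Cvv + S)
mean_u = A @ y                 # posterior mean of f(z)
cov_u = Czz - A @ Czv.T        # posterior covariance of f(z)

# If Czv were zero (z and v far apart under the kernel), y would tell us
# nothing about u; here it clearly does.
assert np.linalg.norm(mean_u) > 0
assert np.all(np.linalg.eigvalsh(cov_u) > 0)
```

So `v` plays the role of observation locations while `z` remains where the approximate posterior over `u` is represented; decoupling the two is exactly what the parametrisation adds.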

y = approx.likelihood.y
S = approx.likelihood.S
v = approx.likelihood.v
return posterior(AbstractGPs.VFE(f(z, 1e-9)), f(v, S), y)
Member

Could we name all these magic constants floating around? Should they be consistent? Should they be configurable? Here it's 1e-9, above it's zero, below it's 1e-18....

Comment on lines +227 to +233
approx_centered = SparseVariationalApproximation(
Centered(), f(z, 1e-12), qu
)
approx_post_centered = posterior(approx_centered)
approx_centered = SparseVariationalApproximation(
Centered(), f(z, 1e-12), qu
)
Member

Suggested change
approx_centered = SparseVariationalApproximation(
Centered(), f(z, 1e-12), qu
)
approx_post_centered = posterior(approx_centered)
approx_centered = SparseVariationalApproximation(
Centered(), f(z, 1e-12), qu
)
approx_centered = SparseVariationalApproximation(
Centered(), f(z, 1e-12), qu
)
approx_post_centered = posterior(approx_centered)

Comment on lines +271 to +273
approx_centered = SparseVariationalApproximation(
Centered(), f(z, 1e-12), qu
)
Member

Suggested change
approx_centered = SparseVariationalApproximation(
Centered(), f(z, 1e-12), qu
)

@st-- (Member)

st-- commented Mar 25, 2022

but it's a really obvious extension, so I figure why not?

I'm rather more hesitant about it, because any added code incurs a maintenance burden throughout its future lifetime... so if you'd like to add these, I'd like to be more convinced that they're genuinely useful. Are they easier/faster to optimise? What do you gain from using them?

@rossviljoen (Collaborator)

Whether or not we end up adding these specific parameterisations, I think it would still be good to add the `AbstractSparseVariationalApproximation` type anyway, to allow people to add parameterisations like this.

3 participants