
Clarify naming of Bayesian solvers #41

Open
wcwitt opened this issue Jan 19, 2023 · 9 comments


wcwitt commented Jan 19, 2023

I've experimented with a few ways of implementing Bayesian ridge (for example), and the proliferation of functions has gotten a bit confusing. Now that more people are starting to use them, it's crucial to reorganize and document them.


wcwitt commented Jan 31, 2023

First draft of the revised API. I am proposing collapsing BRR and ARD into a single user interface, controlled by the ard_threshold parameter. Please weigh in if you don't like this or have any other feedback.

@casv2 @cortner @bernstei @jameskermode

"""
    bayesian_linear_regression(A, Y; <keyword arguments>)

Perform Bayesian linear regression, possibly with automatic relevance determination.

# Arguments

- `A::Matrix{<:AbstractFloat}`: design matrix, with observations as rows
- `Y::Vector{<:AbstractFloat}`: target vector
- `ard_threshold::AbstractFloat=0.0`: if nonzero, prune the basis with automatic relevance determination. Must be greater than `min_var_coeff`.
- `tol::AbstractFloat=1e-3`: tolerance for terminating the evidence maximization.
- `max_iter::Int=1000`: iteration limit for evidence maximization.
- `opt_method::String="LBFGS"`: method for evidence maximization.
- `min_var_coeff::AbstractFloat=1e-8`: lower bound for the variance on the coefficient prior.
- `min_var_noise::AbstractFloat=1e-8`: lower bound for the variance on the model noise.
- `factorization::String="cholesky"`: if "cholesky" performs poorly, try "svd" or "qr".
- `ret_covar::Bool=false`: whether to supply the covariance matrix in the results.
- `comm_size::Int=0`: if nonzero, sample from the posterior and include a committee in the results.
- `verbose::Bool=true`: whether to print progress information.
"""
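To make the proposed behavior concrete, here is a minimal NumPy sketch of Bayesian ridge regression via evidence maximization (MacKay-style fixed-point updates). This is an illustration of the underlying technique only, not the ACE1pack code; the function name and defaults are hypothetical, and it omits ARD pruning, the variance floors, and the factorization options described above.

```python
import numpy as np

def bayesian_ridge(A, y, tol=1e-6, max_iter=300):
    """Evidence maximization for Bayesian ridge regression.
    Returns the posterior mean of the coefficients, the learned prior
    variance 1/alpha, and the learned noise variance 1/beta."""
    n, p = A.shape
    alpha, beta = 1.0, 1.0                  # coefficient and noise precisions
    AtA, Aty = A.T @ A, A.T @ y
    for _ in range(max_iter):
        # Posterior over coefficients given the current hyperparameters.
        S = np.linalg.inv(alpha * np.eye(p) + beta * AtA)   # covariance
        m = beta * S @ Aty                                  # mean
        # Effective number of well-determined parameters.
        gamma = p - alpha * np.trace(S)
        # Fixed-point updates that (locally) maximize the log evidence.
        alpha_new = gamma / (m @ m)
        beta_new = (n - gamma) / np.sum((y - A @ m) ** 2)
        converged = (abs(alpha_new - alpha) < tol * alpha
                     and abs(beta_new - beta) < tol * beta)
        alpha, beta = alpha_new, beta_new
        if converged:
            break
    return m, 1.0 / alpha, 1.0 / beta
```

In this parameterization the hyperparameters are precisions internally, but the return values are the variances, matching the variance-based interface proposed in the docstring.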


cortner commented Jan 31, 2023

I like this a lot. How will I construct a solver object? In Julia style rather than JSON style?


wcwitt commented Jan 31, 2023

See below, with the caveat that it sounds like the name will change to BLR:

ACEsuit/ACE1pack.jl@88b6efd

@jameskermode

Looks good. Is there a reason you’ve parametrised in terms of variances rather than precisions as is more common?

@bernstei

> Looks good. Is there a reason you’ve parametrised in terms of variances rather than precisions as is more common?

We should be quite careful with this. When I was experimenting with the Python re-implementation, I got quite different performance after transforming those parameters to 1/x. That was with very ill-conditioned matrices, which may behave much better with the purified ACE1x basis, but it's not just a matter of notation.


jameskermode commented Jan 31, 2023 via email


wcwitt commented Jan 31, 2023

My personal hyperprior on the noise is:

I basically have no idea. Maybe something roughly equivalent to 1 meV/atom? I will definitely suspect overfitting if the LML maximization returns something less than, say, 0.01 meV/atom.

When I went to translate that prior into code, it was simplest to enforce a lower bound on the variances - that's the entire explanation. Of course, we could convert the same intuition into an upper bound on the precisions, and I'm definitely open to that if it's better behaved.

So I would argue that the interface should use variances - because I'd be surprised if any of us reason in terms of precisions - but for the actual optimization we should do whatever works best numerically.
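To make the equivalence concrete, here is a hypothetical Python sketch (the name `MIN_VAR_NOISE` and the unit conversion are illustrative, not from the codebase): a lower bound on a variance and an upper bound on the corresponding precision express the same constraint, so the interface can use variances while the optimizer uses whichever parameterization behaves better numerically.

```python
# Floor on the noise variance, e.g. "0.01 meV/atom" in eV/atom, squared.
# (Illustrative value only.)
MIN_VAR_NOISE = (0.01e-3) ** 2

def clip_variance(var):
    """Enforce the floor in the variance parameterization."""
    return max(var, MIN_VAR_NOISE)

def clip_precision(beta):
    """Enforce the equivalent ceiling in the precision parameterization,
    since precision = 1 / variance."""
    return min(beta, 1.0 / MIN_VAR_NOISE)
```

Either clip leaves unconstrained values untouched, so the two parameterizations only differ in how the optimizer navigates the objective, not in the feasible region.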


wcwitt commented Feb 6, 2023

TODO: we need to handle the transition between the overconstrained and underconstrained regimes more gracefully; we should do that as part of this refactoring.


wcwitt commented Mar 7, 2023

Recording from discussion elsewhere: "They should all get [a maxiter parameter] please, and then they should fail with a nice user-friendly message, something along the lines of 'even when the solver hasn't converged the quality of the solution may be good, please test this before changing solver parameters'."
