[ENH] Bayesian Linear Regression using Normal Conjugate Prior #500

Merged
merged 57 commits on Jan 12, 2025
Commits
9589ef8
Initialized bayesian conjugate class
Nov 21, 2024
2d58430
Added docstring example
Nov 21, 2024
08ae9d8
Added BayesianConjugateLinearRegressor
Nov 21, 2024
153f561
Changed parameterization
Nov 21, 2024
de68ef5
Removed dependency
Nov 21, 2024
fc3baac
added init
Nov 21, 2024
419f0c7
Changed test
Nov 22, 2024
82a8c75
Changed test
Nov 22, 2024
c3e6c56
Changed test
Nov 22, 2024
17d8e6a
Removed numpy array conversion in init
Nov 22, 2024
fc07dd9
Only allows initialization using Normal distribution from skpro
Nov 22, 2024
f35d064
Naming changes
Nov 22, 2024
87ca46a
Changed the shape of the test samples
Nov 22, 2024
ed5f81f
Changed the initiation process to numpy
Nov 22, 2024
ab59af2
Reworked logic
Nov 22, 2024
df7d0bb
Reworked recentering logic
Nov 22, 2024
cdb46b9
Make y_pred a series
Nov 22, 2024
4bae2f3
Only infer num_features during fit
Nov 22, 2024
d765033
Change col names
Nov 22, 2024
270be35
Change col names
Nov 22, 2024
fdd246f
Change col names
Nov 22, 2024
fb9f8db
Change col names
Nov 22, 2024
6439dc9
added second test param
Nov 22, 2024
9801221
changed update logic
Nov 22, 2024
8d2bb57
changed update logic
Nov 22, 2024
5e8c069
Remove centering
Nov 22, 2024
86b3aa0
Added example notebook
Nov 22, 2024
b20a255
Clarified that Normal doesn't result in multivariate normal
Nov 22, 2024
dc6cc54
removed kernelspec
Nov 22, 2024
f9a942f
reverted changes to 03_ example notebook
Dec 6, 2024
f32f887
formatting
Dec 6, 2024
4e998df
restructured example folder
Dec 6, 2024
d6342b9
Renamed notebook for consistency with the name of estimator
Dec 6, 2024
dddaa6c
Added line on BayesianMCMCLinearRegressor
Dec 6, 2024
9807a93
Renamed subfolder to abyesian
Dec 6, 2024
4ee08e6
Changed estimator name removed MCMC
Dec 6, 2024
0044efa
Changed the paths for Bayesian estimators
Jan 3, 2025
e5a6643
Renamed variables
Jan 3, 2025
6069012
Updated docstrings by specifying Gaussian conjugacy
Jan 3, 2025
6d12dca
Modified examples
Jan 3, 2025
a93c963
Modified examples
Jan 3, 2025
bdf5687
Ensured the provided mu prior is a column vector
Jan 3, 2025
8af237e
Ensured the provided mu prior is a column vector
Jan 3, 2025
eaa67ac
Ensured the provided mu prior is a column vector
Jan 3, 2025
54f7d97
Added example notebooks
Jan 3, 2025
6430f08
Added example notebooks
Jan 3, 2025
b03df94
Merge remote-tracking branch 'upstream/main' into pymc_dev_conjugate
Jan 3, 2025
bda0dc5
Added example notebooks
Jan 3, 2025
93c5ad9
Added example notebooks
Jan 3, 2025
a297acb
Added artefacts for example notebooks
Jan 4, 2025
b2f2f19
Removed notebooks and artefacts from this PR
Jan 9, 2025
4b3c84b
Added math info on docstring
Jan 9, 2025
490a100
Renamed submodule, making it private
Jan 9, 2025
d962798
Changed imports
Jan 9, 2025
f7afaa6
Changed imports
Jan 9, 2025
ff2bbb7
changed API name to BayesianRegressor
Jan 10, 2025
3466182
Merge remote-tracking branch 'upstream/main' into pymc_dev_conjugate
Jan 10, 2025
12 changes: 12 additions & 0 deletions docs/source/api_reference/regression.rst
@@ -195,3 +195,15 @@ Base classes
:template: class.rst

BaseProbaRegressor

BayesianRegressor
-----------------

.. currentmodule:: skpro.regression.bayesian

.. autosummary::
:toctree: auto_generated/
:template: class.rst

BayesianConjugateLinearRegressor
BayesianLinearRegressor
8 changes: 8 additions & 0 deletions skpro/regression/bayesian/__init__.py
@@ -0,0 +1,8 @@
"""Base classes for Bayesian probabilistic regression."""
# copyright: skpro developers, BSD-3-Clause License (see LICENSE file)

__all__ = ["BayesianConjugateLinearRegressor"]

from skpro.regression.bayesian._bayesian_conjugate import (
BayesianConjugateLinearRegressor,
)
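
With this `__init__.py`, the estimator is importable from the package path rather than from the private `_bayesian_conjugate` module; a minimal sanity check, assuming skpro is installed from this branch:

```python
# Import via the public package path exposed by the __init__.py above
from skpro.regression.bayesian import BayesianConjugateLinearRegressor

print(BayesianConjugateLinearRegressor)  # confirms the re-export works
```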
322 changes: 322 additions & 0 deletions skpro/regression/bayesian/_bayesian_conjugate.py
@@ -0,0 +1,322 @@
"""Bayesian Conjugate Linear Regression Estimator."""

__author__ = ["meraldoantonio"]

import numpy as np
import pandas as pd

from skpro.distributions import Normal
from skpro.regression.base import BaseProbaRegressor


class BayesianConjugateLinearRegressor(BaseProbaRegressor):
"""
Bayesian probabilistic estimator for linear regression.

This estimator models the relationship between features `X` and target `y` using
a Bayesian linear regression framework with conjugate priors.
Specifically, the prior and posterior of the coefficients (`w`) are
both multivariate Gaussians.

Model Assumptions
-----------------
- **Prior for coefficients (`w`)**:
The coefficients `w` (including the intercept) follow a multivariate Gaussian prior:
```
w ~ N(coefs_prior_mu, coefs_prior_cov)
```
where:
- `coefs_prior_mu`: Mean vector of the prior distribution for `w`.
- `coefs_prior_cov`: Covariance matrix of the prior distribution for `w`.

- **Likelihood**:
The likelihood of target `y` given features `X` and coefficients `w` is Gaussian:
```
y | X, w ~ N(X @ w, sigma^2)
```
where:
- `sigma^2 = 1 / noise_precision`: Variance of the noise in the data.
- `noise_precision`: Known precision (inverse variance) of the noise.

- **Posterior for coefficients (`w`)**:
By Bayesian conjugacy, the posterior of `w` remains Gaussian:
```
w | X, y ~ N(coefs_posterior_mu, coefs_posterior_cov)
```
with:
```
coefs_posterior_cov = (coefs_prior_precision + noise_precision * X.T @ X)^(-1)
coefs_posterior_mu = coefs_posterior_cov @ (
coefs_prior_precision @ coefs_prior_mu + noise_precision * X.T @ y
)
```
where:
- `coefs_prior_precision = inv(coefs_prior_cov)`.

- **Predictive distribution**:
For a new observation `x`, the predictive distribution of `y` is:
```
y_pred | x ~ N(x.T @ coefs_posterior_mu, pred_var)
```
where:
```
pred_var = x.T @ coefs_posterior_cov @ x + 1 / noise_precision
```

Example
-------
>>> import numpy as np # doctest: +SKIP
>>> from skpro.regression.bayesian import (
... BayesianConjugateLinearRegressor,
... ) # doctest: +SKIP
>>> from sklearn.datasets import load_diabetes # doctest: +SKIP
>>> from sklearn.model_selection import train_test_split # doctest: +SKIP

>>> # Load dataset
>>> X, y = load_diabetes(return_X_y=True, as_frame=True) # doctest: +SKIP
>>> X_train, X_test, y_train, y_test = train_test_split(X, y) # doctest: +SKIP

>>> # Center the features using the training mean
>>> X_train_mean = X_train.mean(axis=0) # doctest: +SKIP
>>> X_train -= X_train_mean # doctest: +SKIP
>>> X_test -= X_train_mean # doctest: +SKIP

>>> # Initialize model
>>> bayes_model = BayesianConjugateLinearRegressor(
... coefs_prior_mu=None,
... coefs_prior_cov=np.eye(X_train.shape[1]),
... noise_precision=1.0,
... ) # doctest: +SKIP

>>> # Fit the model
>>> bayes_model.fit(X_train, y_train) # doctest: +SKIP

>>> # Predict probabilities (returns an skpro Normal distribution)
>>> y_test_pred_proba = bayes_model.predict_proba(X_test) # doctest: +SKIP

>>> # Predict point estimates (mean of the predicted distribution)
>>> y_test_pred = bayes_model.predict(X_test) # doctest: +SKIP
"""

_tags = {
"authors": ["meraldoantonio"],
"capability:multioutput": False,
"capability:missing": True,
"X_inner_mtype": "pd_DataFrame_Table",
"y_inner_mtype": "pd_DataFrame_Table",
}

def __init__(self, coefs_prior_cov, coefs_prior_mu=None, noise_precision=1):
"""Initialize regressor by providing coefficent priors and noise precision.

Parameters
----------
coefs_prior_cov : 2D np.ndarray, required
Covariance matrix of the prior for intercept and coefficients.
Must be positive-definite.

coefs_prior_mu : np.ndarray column vector, optional
Mean vector of the prior for intercept and coefficients.
The zeroth element of the vector is the prior for the intercept.
If not provided, assumed to be a column vector of zeroes.

noise_precision : float
Known precision of the Gaussian likelihood noise (beta).
This is the inverse of the noise variance.
"""
if coefs_prior_cov is None:
raise ValueError("`coefs_prior_cov` must be provided.")
else:
self.coefs_prior_cov = coefs_prior_cov

# Set coefs_prior_mu to a zero vector if not provided
if coefs_prior_mu is None:
self.coefs_prior_mu = np.zeros((self.coefs_prior_cov.shape[0], 1))
# Ensure coefs_prior_mu is a column vector
elif coefs_prior_mu.ndim != 2 or coefs_prior_mu.shape[1] != 1:
raise ValueError(
"coefs_prior_mu must be a column vector with shape (n_features, 1)."
)
else:
self.coefs_prior_mu = coefs_prior_mu

# Validate dimensions of coefs_prior_mu and coefs_prior_cov
if self.coefs_prior_mu.shape[0] != self.coefs_prior_cov.shape[0]:
raise ValueError(
"Dimensionality of `coefs_prior_mu` and `coefs_prior_cov` must match."
)

self.noise_precision = noise_precision

super().__init__()

def _fit(self, X, y):
"""Fit the Bayesian Linear Regressor to the observed data.

Parameters
----------
X : pandas DataFrame
Feature matrix (n_samples, n_features).
y : pandas DataFrame
Target values, shape (n_samples, 1).

Returns
-------
self : reference to self
"""
self._y_cols = y.columns

# Construct the prior mean and covariance
self._coefs_prior_mu = self.coefs_prior_mu
self._coefs_prior_cov = self.coefs_prior_cov
self._coefs_prior_precision = np.linalg.inv(self._coefs_prior_cov)

# Perform Bayesian inference
(
self._coefs_posterior_mu,
self._coefs_posterior_cov,
) = self._perform_bayesian_inference(
X, y, self._coefs_prior_mu, self._coefs_prior_cov
)
return self

def _predict_proba(self, X):
"""Predict the distribution over outputs for given features.

Parameters
----------
X : pandas DataFrame
Feature matrix (n_samples, n_features).

Returns
-------
y_pred : Normal
Predicted Normal distribution for outputs.
"""
idx = X.index
if isinstance(X, pd.DataFrame):
X = X.values

# Predictive mean: X @ posterior_mu
pred_mu = X @ self._coefs_posterior_mu

# Compute predictive variance for each data point
pred_var_all_x_i = []
for i in range(X.shape[0]):
x_i = X[i, :].reshape(1, -1)
pred_var_x_i = (
x_i @ self._coefs_posterior_cov @ x_i.T + 1 / self.noise_precision
)
pred_var_all_x_i.append(pred_var_x_i.item())

pred_var_all_x_i = np.array(pred_var_all_x_i)
pred_sigma = np.sqrt(pred_var_all_x_i) # Compute standard deviation

self._pred_mu = pred_mu
self._pred_sigma = pred_sigma
self._pred_var = pred_var_all_x_i

mus = pred_mu.reshape(-1, 1).tolist() # Convert to list of lists
sigmas = pred_sigma.reshape(-1, 1).tolist() # Convert to list of lists

# Return results as skpro Normal distribution
self._y_pred = Normal(
mu=mus,
sigma=sigmas,
columns=self._y_cols,
index=idx,
)
return self._y_pred

def _perform_bayesian_inference(self, X, y, coefs_prior_mu, coefs_prior_cov):
"""Perform Bayesian inference for linear regression.

Obtains the posterior distribution using normal conjugacy formula.

Parameters
----------
X : pandas DataFrame
Feature matrix (n_samples, n_features).
y : pandas DataFrame
Observed target values, shape (n_samples, 1).
coefs_prior_mu : np.ndarray
Mean vector of the prior Normal distribution for coefficients.
coefs_prior_cov : np.ndarray
Covariance matrix of the prior Normal distribution for coefficients.

Returns
-------
coefs_posterior_mu : np.ndarray
Mean vector of the posterior Normal distribution for coefficients.
coefs_posterior_cov : np.ndarray
Covariance matrix of the posterior Normal distribution for coefficients.
"""
X = np.array(X)
y = np.array(y)

# Compute prior precision from prior covariance
coefs_prior_precision = np.linalg.inv(coefs_prior_cov)

# Compute posterior precision and covariance
coefs_posterior_precision = coefs_prior_precision + self.noise_precision * (
X.T @ X
)
coefs_posterior_cov = np.linalg.inv(coefs_posterior_precision)
coefs_posterior_mu = coefs_posterior_cov @ (
coefs_prior_precision @ coefs_prior_mu + self.noise_precision * X.T @ y
)

return coefs_posterior_mu, coefs_posterior_cov

def update(self, X, y):
"""Update the posterior with new data.

Parameters
----------
X : pandas DataFrame
New feature matrix.
y : pandas Series or DataFrame
New target vector.

Returns
-------
self : reference to self
"""
# Ensure y is a DataFrame
if isinstance(y, pd.Series):
y = y.to_frame(name="y_train")

(
self._coefs_posterior_mu,
self._coefs_posterior_cov,
) = self._perform_bayesian_inference(
X, y, self._coefs_posterior_mu, self._coefs_posterior_cov
)
return self

@classmethod
def get_test_params(cls, parameter_set="default"):
"""Return testing parameter settings for the estimator.

Parameters
----------
parameter_set : str, default="default"
Name of the set of test parameters to return, for use in tests. If no
special parameters are defined for a value, will return `"default"` set.

Returns
-------
params : dict or list of dict, default = {}
Parameters to create testing instances of the class
"""
params1 = {
"coefs_prior_mu": np.zeros(10).reshape(-1, 1),
"coefs_prior_cov": np.eye(10),
"noise_precision": 1.0,
}

params2 = {
"coefs_prior_mu": np.zeros(10).reshape(-1, 1),
"coefs_prior_cov": np.eye(10),
"noise_precision": 0.5,
}

return [params1, params2]
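
For reference, below is a minimal standalone numpy sketch of the conjugate posterior update and predictive variance formulas documented in the class docstring above, run on illustrative toy data; the data and variable names are not part of the PR. These are the same quantities that `_perform_bayesian_inference` and `_predict_proba` compute internally.

```python
# Minimal sketch of the Gaussian conjugate update for linear regression,
# following the formulas in the BayesianConjugateLinearRegressor docstring.
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d))
w_true = np.array([[1.0], [-2.0], [0.5]])
noise_precision = 1.0
y = X @ w_true + rng.normal(scale=1.0, size=(n, 1))

# Prior: w ~ N(0, I)
prior_mu = np.zeros((d, 1))
prior_cov = np.eye(d)
prior_precision = np.linalg.inv(prior_cov)

# Posterior: coefs_posterior_cov = (prior_precision + beta * X.T X)^-1
posterior_precision = prior_precision + noise_precision * (X.T @ X)
posterior_cov = np.linalg.inv(posterior_precision)
posterior_mu = posterior_cov @ (
    prior_precision @ prior_mu + noise_precision * X.T @ y
)

# Predictive mean and variance for a new point x_new
x_new = rng.normal(size=(1, d))
pred_mean = (x_new @ posterior_mu).item()
pred_var = (x_new @ posterior_cov @ x_new.T + 1.0 / noise_precision).item()
print(pred_mean, pred_var)
```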