
[WIP] Adds Linear, Ridge and Lasso regressions #10

Closed · wants to merge 5 commits
5 changes: 4 additions & 1 deletion Cargo.toml
@@ -13,6 +13,8 @@ keywords = ["machine-learning", "linfa", "ai", "ml"]
categories = ["algorithms", "mathematics", "science"]

[dependencies]

linfa-supervised = {path = "linfa-supervised"}
@remram44 (Contributor) · Dec 14, 2019:

You need to add a version number, or publishing the linfa crate to crates.io will fail 😉

Suggested change
linfa-supervised = {path = "linfa-supervised"}
linfa-supervised = { path = "linfa-supervised", version = "0.1" }

linfa-clustering = { path = "linfa-clustering", version = "0.1" }

[dev-dependencies]
@@ -22,7 +24,8 @@ rand_isaac = "0.2.0"
ndarray-npy = { version = "0.5", default-features = false }

[workspace]
members = ["linfa-clustering"]
members = ["linfa-clustering", "linfa-supervised"]

[profile.release]
opt-level = 3
lto = true
16 changes: 16 additions & 0 deletions linfa-supervised/Cargo.toml
@@ -0,0 +1,16 @@
[package]
name = "linfa-supervised"
Contributor:

I would probably prefer this to be named linfa-linear - we will have many more algorithms belonging to the family of supervised learning, and we don't want all of them in the same sub-crate.

version = "0.1.0"
authors = ["Khalil HADJI <[email protected]>"]
edition = "2018"
description = "A collection of supervised learning algorithms"
license = "MIT/Apache-2.0"

repository = "https://github.com/LukeMathWalker/linfa"

keywords = ["supervised", "machine-learning", "linfa", "regression", "linear", "ridge", "lasso"]
categories = ["algorithms", "mathematics", "science"]

[dependencies]
ndarray = { version = "0.13" , features = ["rayon"] }
ndarray-linalg = { version = "0.12", features = ["openblas"] }
Contributor:

We don't want to force the usage of a certain BLAS implementation over the other ones (e.g. intel-mkl) - I would suggest using a features-strategy similar to what I put down here: https://github.com/rust-ndarray/ndarray-examples/blob/d7d43e018cade1f8bf0d92220081ba66963bab07/linear_regression/Cargo.toml#L8
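
For reference, a minimal sketch of that features-strategy (the backend feature names assume the openblas/netlib/intel-mkl features exposed by ndarray-linalg; versions are illustrative):

[dependencies]
ndarray = { version = "0.13", features = ["rayon"] }
ndarray-linalg = { version = "0.12", default-features = false }

[features]
default = []
openblas = ["ndarray-linalg/openblas"]
netlib = ["ndarray-linalg/netlib"]
intel-mkl = ["ndarray-linalg/intel-mkl"]

A consumer then picks a backend at build time, e.g. cargo build --features openblas.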

40 changes: 40 additions & 0 deletions linfa-supervised/examples/main.rs
@@ -0,0 +1,40 @@
use linfa_supervised::LinearRegression;
use linfa_supervised::RidgeRegression;
use ndarray::array;

fn linear_regression() {
let mut linear_regression = LinearRegression::new(false);
let x = array![[1.0], [2.0], [3.0], [4.0]];
let y = array![1.0, 2.0, 3.0, 4.0];
linear_regression.fit(&x, &y);
let x_hat = array![[6.0], [7.0]];
println!("{:#?}", linear_regression.predict(&x_hat));

let mut linear_regression2 = LinearRegression::new(true);
Contributor:
I would suggest splitting this example (and the one below) into two separate examples, each with a descriptive name (e.g. with/without intercept).
It would be ideal to have them in separate files as well, under the examples folder.
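
A minimal sketch of what one of those standalone examples could look like, using the API from this PR (the file name examples/linear_regression_with_intercept.rs is just a suggestion):

use linfa_supervised::LinearRegression;
use ndarray::array;

fn main() {
    // Fit y = x + 1, which needs a non-zero intercept
    let mut model = LinearRegression::new(true);
    let x = array![[1.0], [2.0], [3.0], [4.0]];
    let y = array![2.0, 3.0, 4.0, 5.0];
    model.fit(&x, &y);
    // Expect predictions close to [7.0, 8.0]
    println!("{:#?}", model.predict(&array![[6.0], [7.0]]));
}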

let x2 = array![[1.0], [2.0], [3.0], [4.0]];
let y2 = array![2.0, 3.0, 4.0, 5.0];
linear_regression2.fit(&x2, &y2);
let x2_hat = array![[6.0], [7.0]];
println!("{:#?}", linear_regression2.predict(&x2_hat));
}

fn ridge_regression() {
let mut ridge_regression = RidgeRegression::new(0.0);
let x = array![[1.0], [2.0], [3.0], [4.0]];
let y = array![1.0, 2.0, 3.0, 4.0];
ridge_regression.fit(&x, &y);
let x_hat = array![[6.0], [7.0]];
println!("{:#?}", ridge_regression.predict(&x_hat));

let mut ridge_regression2 = RidgeRegression::new(1.0);
let x2 = array![[1.0], [2.0], [3.0], [4.0]];
let y2 = array![2.0, 3.0, 4.0, 5.0];
ridge_regression2.fit(&x2, &y2);
let x2_hat = array![[6.0], [7.0]];
println!("{:#?}", ridge_regression2.predict(&x2_hat));
}

fn main() {
linear_regression();
ridge_regression();
}
5 changes: 5 additions & 0 deletions linfa-supervised/src/lib.rs
@@ -0,0 +1,5 @@
mod linear_regression;
mod ridge_regression;

pub use linear_regression::*;
pub use ridge_regression::*;
119 changes: 119 additions & 0 deletions linfa-supervised/src/linear_regression/algorithm.rs
@@ -0,0 +1,119 @@
#![allow(non_snake_case)]
use ndarray::{stack, Array, Array1, ArrayBase, Axis, Data, Ix1, Ix2};
use ndarray_linalg::Solve;
/* I will probably change the implementation to an enum for more type safety.
I have to make sure it is a great idea when it comes to Python interoperability
Contributor:
There is considerable freedom in how to wrap the Rust version for Python consumption - as detailed in #8, we shouldn't let Python move our Rust design in directions which are not idiomatic. The wrapping code can do the bridging when required 😀

So I'd definitely suggest going with the commented-out version, which uses an enum (LinearRegression::new(false) is much more confusing than LinearRegression::new(Intercept::NoIntercept)).
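
To make the call-site contrast concrete, a sketch using the names from the commented-out version below (the bindings are illustrative):

// With a bare bool the intent is easy to misread:
let m1 = LinearRegression::new(false);
// With the enum the intent is explicit:
let m2 = LinearRegression::new(Intercept::NoIntercept);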

enum Intercept {
NoIntercept,
Intercept(Array1<f64>)
}
pub struct LinearRegressor {
beta : Option<Array1<f64>>,
intercept : Intercept,
}
*/

/*
If fit_intercept is false, we assume that the regression passes through the origin
*/
/*
The simple linear regression model is
y = bX + e where e ~ N(0, sigma^2 * I)
In probabilistic terms this corresponds to
y - bX ~ N(0, sigma^2 * I)
y | X, b ~ N(bX, sigma^2 * I)
The loss for the model is simply the squared error between the model
predictions and the true values:
Loss = ||y - bX||^2
The maximum likelihood estimation for the model parameters `beta` can be computed
in closed form via the normal equation:
b = (X^T X)^{-1} X^T y
where (X^T X)^{-1} X^T is known as the pseudoinverse or Moore-Penrose inverse.
*/
pub struct LinearRegression {
beta: Option<Array1<f64>>,
fit_intercept: bool,
}

impl LinearRegression {
pub fn new(fit_intercept: bool) -> LinearRegression {
LinearRegression {
beta: None,
fit_intercept,
}
}
/* Instead of assert_eq we should probably return a Result; we first need a generic error type shared by all algorithms */
pub fn fit<A, B>(&mut self, X: &ArrayBase<A, Ix2>, Y: &ArrayBase<B, Ix1>)
where
A: Data<Elem = f64>,
B: Data<Elem = f64>,
{
let (n_samples, _) = X.dim();

// We have to make sure that the dimensions match
assert_eq!(Y.dim(), n_samples);

self.beta = if self.fit_intercept {
let dummy_column: Array<f64, _> = Array::ones((n_samples, 1));
/*
if x has 2 features and 3 samples
x = [[1,2]
,[3,4]
,[5,6]]
dummy_column = [[1]
,[1]
,[1]]
*/
let X_with_ones = stack(Axis(1), &[dummy_column.view(), X.view()]).unwrap();
Some(LinearRegression::fit_beta(&X_with_ones, Y))
} else {
Some(LinearRegression::fit_beta(X, Y))
}
}
fn fit_beta<A, B>(X: &ArrayBase<A, Ix2>, y: &ArrayBase<B, Ix1>) -> Array1<f64>
where
A: Data<Elem = f64>,
B: Data<Elem = f64>,
{
let rhs = X.t().dot(y);
let linear_operator = X.t().dot(X);
linear_operator.solve_into(rhs).unwrap()
}

pub fn predict<A>(&self, X: &ArrayBase<A, Ix2>) -> Array1<f64>
where
A: Data<Elem = f64>,
{
let (n_samples, _) = X.dim();

// If we are fitting the intercept, we need an additional column
if self.fit_intercept {
let dummy_column: Array<f64, _> = Array::ones((n_samples, 1));
let X = stack(Axis(1), &[dummy_column.view(), X.view()]).unwrap();
match &self.beta {
None => panic!("The linear regression estimator has to be fitted first!"),
Contributor:
This is not ideal - can we refactor the way we build LinearRegression to make sure that beta is always there when a user calls predict?

The easiest way to do this is reusing the same approach I put down in KMeans: a structure holding the hyperparameters of the model (built with or without the builder pattern, depending on how many hyperparameters there are) and a fit method on it that returns a fitted LinearRegression - basically, using fit as the constructor. In this way we can be sure that beta is there and we have one less panic condition 😁
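
A rough sketch of that shape, reusing the fit_beta helper already defined in this file (the type names LinearRegressionParams/FittedLinearRegression are hypothetical):

// Hyperparameters only; `fit` acts as the constructor of the fitted model,
// so `beta` no longer needs to be an Option.
pub struct LinearRegressionParams {
    fit_intercept: bool,
}

pub struct FittedLinearRegression {
    beta: Array1<f64>,
    fit_intercept: bool,
}

impl LinearRegressionParams {
    pub fn fit<A, B>(&self, X: &ArrayBase<A, Ix2>, y: &ArrayBase<B, Ix1>) -> FittedLinearRegression
    where
        A: Data<Elem = f64>,
        B: Data<Elem = f64>,
    {
        let (n_samples, _) = X.dim();
        let beta = if self.fit_intercept {
            // Same dummy column of ones as in the current `fit`
            let dummy_column: Array<f64, _> = Array::ones((n_samples, 1));
            let X = stack(Axis(1), &[dummy_column.view(), X.view()]).unwrap();
            LinearRegression::fit_beta(&X, y)
        } else {
            LinearRegression::fit_beta(X, y)
        };
        FittedLinearRegression {
            beta,
            fit_intercept: self.fit_intercept,
        }
    }
}

predict then moves to FittedLinearRegression and can drop both panic branches.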

Some(beta) => X.dot(beta),
}
} else {
match &self.beta {
None => panic!("The linear regression estimator has to be fitted first!"),
Some(beta) => X.dot(beta),
}
}
}
}

#[cfg(test)]
mod test {
use super::*;
use ndarray::array;
#[test]
fn linear_regression_test() {
let mut linear_regression = LinearRegression::new(false);
let x = array![[1.0], [2.0], [3.0], [4.0]];
let y = array![1.0, 2.0, 3.0, 4.0];
linear_regression.fit(&x, &y);
let x_hat = array![[6.0]];
assert_eq!(linear_regression.predict(&x_hat), array![6.0])
}
}
3 changes: 3 additions & 0 deletions linfa-supervised/src/linear_regression/mod.rs
@@ -0,0 +1,3 @@
mod algorithm;

pub use algorithm::*;
44 changes: 44 additions & 0 deletions linfa-supervised/src/ridge_regression/algorithm.rs
@@ -0,0 +1,44 @@
#![allow(non_snake_case)]
use ndarray::{Array, Array1, ArrayBase, Data, Ix1, Ix2};
use ndarray_linalg::Solve;
/* The difference between a linear regression and a Ridge regression is
that ridge regression has an L2 penalisation term to having some features
Contributor:
Typo? avoid having?

"taking all the credit" for the output. It is also a way to deal with over-fitting by adding bias.
Some details ...
b = (X^T X + aI)^{-1} X^T y with a being the regularisation/penalisation term (a = 0 recovers ordinary least squares)
*/

pub struct RidgeRegression {
beta: Option<Array1<f64>>,
alpha: f64,
}

impl RidgeRegression {
pub fn new(alpha: f64) -> RidgeRegression {
Contributor:
Shouldn't we have an intercept parameter here as well?
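
One possible shape, mirroring LinearRegression above (the fit_intercept field and the two-argument new are hypothetical):

pub struct RidgeRegression {
    beta: Option<Array1<f64>>,
    alpha: f64,
    fit_intercept: bool,
}

impl RidgeRegression {
    pub fn new(alpha: f64, fit_intercept: bool) -> RidgeRegression {
        RidgeRegression {
            beta: None,
            alpha,
            fit_intercept,
        }
    }
}

fit would then prepend the same dummy column of ones that LinearRegression uses before solving the regularised normal equations.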

RidgeRegression {
beta: None,
alpha,
}
}

pub fn fit<A, B>(&mut self, X: &ArrayBase<A, Ix2>, Y: &ArrayBase<B, Ix1>)
where
A: Data<Elem = f64>,
B: Data<Elem = f64>,
{
let second_term = X.t().dot(Y);
let (_, identity_size) = X.dim();
let linear_operator = X.t().dot(X) + self.alpha * Array::eye(identity_size);
self.beta = Some(linear_operator.solve_into(second_term).unwrap());
}

pub fn predict<A>(&self, X: &ArrayBase<A, Ix2>) -> Array1<f64>
where
A: Data<Elem = f64>,
{
match &self.beta {
None => panic!("The ridge regression estimator has to be fitted first!"),
Some(beta) => X.dot(beta),
}
}
}
3 changes: 3 additions & 0 deletions linfa-supervised/src/ridge_regression/mod.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
mod algorithm;

pub use algorithm::*;
6 changes: 6 additions & 0 deletions linfa-supervised/src/utils.rs
@@ -0,0 +1,6 @@
trait GradientDescent {
Contributor:
Is this used anywhere?

fn gradient();

fn optimise() {}

}
4 changes: 4 additions & 0 deletions src/lib.rs
@@ -32,3 +32,7 @@
pub mod clustering {
pub use linfa_clustering::*;
}

pub mod supervised {
pub use linfa_supervised::*;
}