[XP] Least Squares
- `least_squares` on the full training set gave us an RMSE of 26.16.
- `least_squares` with polynomial expansion on the reduced dataset (cf. @PizzaWhisperer) gives us an RMSE of 0.71 with degree 1, 0.74 with degree 2, and RMSE > 1 for degree 3.
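For reference, a minimal NumPy sketch of this experiment (the `build_poly` helper and the toy data below are placeholders for illustration, not our actual pipeline):

```python
import numpy as np

def build_poly(x, degree):
    """Polynomial feature expansion: columns [1, x, x^2, ..., x^degree]."""
    return np.hstack([x ** d for d in range(degree + 1)])

def least_squares(y, tx):
    """Solve the normal equations tx^T tx w = tx^T y."""
    return np.linalg.solve(tx.T @ tx, tx.T @ y)

def rmse(y, tx, w):
    e = y - tx @ w
    return np.sqrt(np.mean(e ** 2))

# toy data just to make the sketch runnable
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 1))
y = 3 * x[:, 0] + rng.normal(scale=0.1, size=100)

tx = build_poly(x, degree=1)
w = least_squares(y, tx)
print("RMSE:", rmse(y, tx, w))
```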
[XP] Ridge regression
- `ridge_regression` with polynomial expansion, using 4-fold cross-validation, gave an RMSE of 0.71 (resp. 0.74) for degree 1 (resp. degree 2), using `lambda = 1e-6`. The test error increases with increasing `lambda`.
- Conclusion: polynomial expansion of degree 1, no regularizer.
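A sketch of this setup, assuming the plain closed-form ridge solution (the exact scaling of `lambda` in our implementation may differ) together with a simple k-fold loop:

```python
import numpy as np

def ridge_regression(y, tx, lambda_):
    """Closed-form ridge solution; the scaling convention for lambda may differ in our code."""
    d = tx.shape[1]
    return np.linalg.solve(tx.T @ tx + lambda_ * np.eye(d), tx.T @ y)

def rmse(y, tx, w):
    e = y - tx @ w
    return np.sqrt(np.mean(e ** 2))

def cv_rmse(y, tx, lambda_, k=4, seed=1):
    """Average test RMSE of ridge_regression over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = ridge_regression(y[train], tx[train], lambda_)
        errs.append(rmse(y[test], tx[test], w))
    return np.mean(errs)

# e.g. cv_rmse(y, build_poly(x, 1), lambda_=1e-6) with the helpers sketched above
```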
[XP] Lasso
- `lasso` gives an RMSE of ~26 as well. Maybe there is a problem with taking the square root somewhere, because this value seems especially far from what we got with `ridge_regression`.
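One way to rule out the missing square root (a sketch assuming scikit-learn is available; our own `lasso` implementation may differ) is to fit an off-the-shelf Lasso and print MSE and RMSE side by side:

```python
import numpy as np
from sklearn.linear_model import Lasso

def rmse(y_true, y_pred):
    # the square root is easy to forget; keeping it in one helper makes the bug easy to rule out
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# toy data standing in for our feature matrix / targets
rng = np.random.default_rng(0)
tx = rng.normal(size=(200, 5))
y = tx @ np.array([1.0, 0.0, -2.0, 0.0, 0.5]) + rng.normal(scale=0.1, size=200)

model = Lasso(alpha=1e-3)   # alpha plays the role of lambda here
model.fit(tx, y)
pred = model.predict(tx)
print("MSE :", np.mean((y - pred) ** 2))
print("RMSE:", rmse(y, pred))   # RMSE should be the square root of the MSE
```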
[XP] MAE
- In progress...
[EB] Fully connected neural networks
Implementation of fully connected (dense) neural networks. The input layer takes the full data matrix; we then have a certain number of hidden layers (see below for the choice of the number of layers and the number of neurons per layer).
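A minimal sketch of such a network, assuming a Keras-style API (the framework, layer sizes and number of hidden layers below are placeholders; see the next sections for the actual choices):

```python
from tensorflow import keras
from tensorflow.keras import layers

n_features = 100   # placeholder: number of columns of the data matrix

model = keras.Sequential([
    layers.Input(shape=(n_features,)),
    layers.Dense(64, activation="relu"),   # hidden layers; sizes are placeholders
    layers.Dense(64, activation="relu"),
    layers.Dense(1),                       # single output for regression
])
model.compile(optimizer="adam", loss="mse")
```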
[EB] Train / Test data split
To verify our results and ensure we didn't overfit, we split the data into a training set containing roughly 90% of the data and a test set containing the remaining 10%.
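A sketch of the split, assuming a simple shuffle-then-cut helper (the helper name and the seed are placeholders):

```python
import numpy as np

def split_data(x, y, ratio=0.9, seed=0):
    """Shuffle, then keep `ratio` of the rows for training and the rest for testing."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    cut = int(ratio * len(y))
    train, test = idx[:cut], idx[cut:]
    return x[train], y[train], x[test], y[test]
```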
[EB] Activators
The activation function is one of the key choices in a neural network. As our task is a regression task, we went with the common setup for this case and train the network with the MSE.
[EB] Regularizers
Neural networks are often prone to overfitting when they have a large number of tunable parameters. We already observed this overfitting when training without a test set. To mitigate it, we add an L2 regularizer to our layers (see the sketch after the next section).
[EB] Dropout layers
In the literature, dropout layers are a recent type of layer which, combined with L2 regularizers, helps reduce the effect of overfitting.
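A sketch combining both points, the L2 regularizer from the previous section and the dropout layers, again assuming a Keras-style API; the regularization strength and dropout rate are placeholders, not our tuned values:

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

n_features = 100   # placeholder

model = keras.Sequential([
    layers.Input(shape=(n_features,)),
    # L2 penalty on the weights of each dense layer (strength is a placeholder)
    layers.Dense(64, activation="relu", kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dropout(0.5),   # randomly drop half of the activations during training
    layers.Dense(64, activation="relu", kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dropout(0.5),
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
```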
[EB] Cross validation of neural networks
Another good method to reduce overfitting is k-fold cross-validation. We tried to apply this technique to our network. Unfortunately, due to the high number of features, we got a `MemoryError`. Computing with a reduced set of features works but gives bad results. We are waiting for feature selection before pushing this point further.
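A sketch of the k-fold loop we attempted, assuming scikit-learn's `KFold` and a `build_model()` helper like the sketches above; the `MemoryError` shows up when this runs on the full feature set:

```python
import numpy as np
from sklearn.model_selection import KFold

def cross_validate(x, y, build_model, k=5, epochs=20):
    """Average test RMSE of a freshly built model over k folds."""
    scores = []
    for train_idx, test_idx in KFold(n_splits=k, shuffle=True, random_state=0).split(x):
        model = build_model()   # new, untrained model for each fold
        model.fit(x[train_idx], y[train_idx], epochs=epochs, verbose=0)
        pred = model.predict(x[test_idx]).ravel()
        scores.append(np.sqrt(np.mean((y[test_idx] - pred) ** 2)))
    return np.mean(scores)
```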