[XP] Least Squares
- `least_squares` on the full training set gave us an RMSE of 26.16.
- `least_squares` with polynomial expansion on the reduced dataset (cf. @PizzaWhisperer) gives us an RMSE of 0.71 with degree 1, 0.74 with degree 2, and RMSE > 1 for degree 3.
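For reference, a minimal NumPy sketch of this experiment (the `build_poly` helper and the toy data below are placeholders for illustration, not our actual pipeline):

```python
import numpy as np

def build_poly(x, degree):
    """Polynomial feature expansion: columns [1, x, x^2, ..., x^degree]."""
    return np.hstack([x ** d for d in range(degree + 1)])

def least_squares(y, tx):
    """Solve the normal equations tx^T tx w = tx^T y."""
    return np.linalg.solve(tx.T @ tx, tx.T @ y)

def rmse(y, tx, w):
    e = y - tx @ w
    return np.sqrt(np.mean(e ** 2))

# toy data just to make the sketch runnable
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 1))
y = 3 * x[:, 0] + rng.normal(scale=0.1, size=100)

tx = build_poly(x, degree=1)
w = least_squares(y, tx)
print("RMSE:", rmse(y, tx, w))
```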
[XP] Ridge regression
- `ridge_regression` with polynomial expansion, using 4-fold cross-validation, gave an RMSE of 0.71 (resp. 0.74) for degree 1 (resp. degree 2), using `lambda = 1e-6`. The test error increases with increasing `lambda`.
- Conclusion: polynomial expansion of degree 1, no regularizer.
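A sketch of this setup, assuming the plain closed-form ridge solution (the exact scaling of `lambda` in our implementation may differ) together with a simple k-fold loop:

```python
import numpy as np

def ridge_regression(y, tx, lambda_):
    """Closed-form ridge solution; the scaling convention for lambda may differ in our code."""
    d = tx.shape[1]
    return np.linalg.solve(tx.T @ tx + lambda_ * np.eye(d), tx.T @ y)

def rmse(y, tx, w):
    e = y - tx @ w
    return np.sqrt(np.mean(e ** 2))

def cv_rmse(y, tx, lambda_, k=4, seed=1):
    """Average test RMSE of ridge_regression over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = ridge_regression(y[train], tx[train], lambda_)
        errs.append(rmse(y[test], tx[test], w))
    return np.mean(errs)

# e.g. cv_rmse(y, build_poly(x, 1), lambda_=1e-6) with the helpers sketched above
```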
[XP] Lasso
- `lasso` gives an RMSE of ~26 as well. Maybe there is a problem with taking the square root somewhere, because this value seems especially far from what we got with `ridge_regression`.
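One way to rule out the missing square root (a sketch assuming scikit-learn is available; our own `lasso` implementation may differ) is to fit an off-the-shelf Lasso and print MSE and RMSE side by side:

```python
import numpy as np
from sklearn.linear_model import Lasso

def rmse(y_true, y_pred):
    # the square root is easy to forget; keeping it in one helper makes the bug easy to rule out
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# toy data standing in for our feature matrix / targets
rng = np.random.default_rng(0)
tx = rng.normal(size=(200, 5))
y = tx @ np.array([1.0, 0.0, -2.0, 0.0, 0.5]) + rng.normal(scale=0.1, size=200)

model = Lasso(alpha=1e-3)   # alpha plays the role of lambda here
model.fit(tx, y)
pred = model.predict(tx)
print("MSE :", np.mean((y - pred) ** 2))
print("RMSE:", rmse(y, pred))   # RMSE should be the square root of the MSE
```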
[XP] MAE
- In progress...
[EB] Fully connected neural networks
Implementation of fully connected (dense) neural networks. The input layer takes the full data matrix; we then have a certain number of hidden layers (see below for the choice of the number of layers and the number of neurons per layer).
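A minimal sketch of such a network, assuming a Keras-style API (the framework, layer sizes and number of hidden layers below are placeholders; see the next sections for the actual choices):

```python
from tensorflow import keras
from tensorflow.keras import layers

n_features = 100   # placeholder: number of columns of the data matrix

model = keras.Sequential([
    layers.Input(shape=(n_features,)),
    layers.Dense(64, activation="relu"),   # hidden layers; sizes are placeholders
    layers.Dense(64, activation="relu"),
    layers.Dense(1),                       # single output for regression
])
model.compile(optimizer="adam", loss="mse")
```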
[EB] Train / Test data split
To verify our results and ensure we didn't overfit, we split the data into a training set containing roughly 90% of the data and a test set containing the remaining 10%.
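A sketch of the split, assuming a simple shuffle-then-cut helper (the helper name and the seed are placeholders):

```python
import numpy as np

def split_data(x, y, ratio=0.9, seed=0):
    """Shuffle, then keep `ratio` of the rows for training and the rest for testing."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    cut = int(ratio * len(y))
    train, test = idx[:cut], idx[cut:]
    return x[train], y[train], x[test], y[test]
```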
[EB] Activators
The activation function is one of the key choices in a neural network. As our task is a regression task, we went with the common setup for this case and train the network with the MSE.
[EB] Regularizers
Neural networks are often prone to overfitting when they have a large number of tunable parameters. We already observed this overfitting when training without a test set. To mitigate it, we add an L2 regularizer to our layers (see the sketch after the next section).
[EB] Dropout layers
In the literature, dropout layers are a recent type of layer which, combined with L2 regularizers, helps reduce the effect of overfitting.
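A sketch combining both points, the L2 regularizer from the previous section and the dropout layers, again assuming a Keras-style API; the regularization strength and dropout rate are placeholders, not our tuned values:

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

n_features = 100   # placeholder

model = keras.Sequential([
    layers.Input(shape=(n_features,)),
    # L2 penalty on the weights of each dense layer (strength is a placeholder)
    layers.Dense(64, activation="relu", kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dropout(0.5),   # randomly drop half of the activations during training
    layers.Dense(64, activation="relu", kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dropout(0.5),
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
```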
[EB] Cross validation of neural networks
Another good method to reduce overfitting is k-fold cross-validation. We tried to apply this technique to our network. Unfortunately, due to the high number of features, we got a `MemoryError`. Computing with a reduced set of features works but gives bad results. We are waiting for feature selection before pushing this point further.
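A sketch of the k-fold loop we attempted, assuming scikit-learn's `KFold` and a `build_model()` helper like the sketches above; the `MemoryError` shows up when this runs on the full feature set:

```python
import numpy as np
from sklearn.model_selection import KFold

def cross_validate(x, y, build_model, k=5, epochs=20):
    """Average test RMSE of a freshly built model over k folds."""
    scores = []
    for train_idx, test_idx in KFold(n_splits=k, shuffle=True, random_state=0).split(x):
        model = build_model()   # new, untrained model for each fold
        model.fit(x[train_idx], y[train_idx], epochs=epochs, verbose=0)
        pred = model.predict(x[test_idx]).ravel()
        scores.append(np.sqrt(np.mean((y[test_idx] - pred) ** 2)))
    return np.mean(scores)
```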