You are already familiar with complete ML pipelines (both supervised and unsupervised) from past labs. However, every dataset is different, and as your experience grows you will be able to choose better solutions for different scenarios. So keep practicing with as many datasets as you can find.
Linear regression is not a silver bullet for every supervised learning problem. In this lab we present a problem scenario where other supervised learning models are more appropriate. You will conduct a complete supervised learning analysis, apply different models, and compare their performance.
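As a rough illustration of applying several models and comparing them, here is a minimal sketch. It runs on synthetic categorical data, not the lab's dataset. The column names `feature_a` and `feature_b`, the toy labeling rule, and the hyperparameters are all assumptions made for the example. Only the scikit-learn classes and functions are real.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption, not the lab's dataset):
# two categorical features and a binary label that depends on one of them.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "feature_a": rng.choice(["a", "f", "n"], size=n),
    "feature_b": rng.choice(["w", "y", "g"], size=n),
})
y = (df["feature_a"] == "f").astype(int)  # toy labeling rule

# One-hot encode the categorical columns so the estimators get numeric input.
X = pd.get_dummies(df)

# Hold out a stratified test set so both models are scored on the same rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "confusion_matrix": confusion_matrix(y_test, pred),
    }
    print(name, results[name]["accuracy"])
    print(results[name]["confusion_matrix"])
```

Scoring every model on the same held-out split keeps the comparison fair. The confusion matrix shows which kind of error each model makes, which a single accuracy number hides.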
Open the main.ipynb file in the your-code directory. Follow the instructions and add your code and explanations as necessary. At the end, in addition to completing the cells, please also save your random forest (RF) model as a pickle file.
- main.ipynb with your responses.
- mushroom.sav file of your RF model.
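One possible way to produce the mushroom.sav deliverable is sketched below, using the standard `pickle` module. The training data is a placeholder from `make_classification`, and the variable names and hyperparameters are assumptions. In the lab you would pickle the RF model fitted in your notebook.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Placeholder data (an assumption): substitute the features and labels
# prepared in your notebook.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

rf_model = RandomForestClassifier(n_estimators=50, random_state=0)
rf_model.fit(X, y)

# Serialize the fitted model to the deliverable file name.
with open("mushroom.sav", "wb") as f:
    pickle.dump(rf_model, f)

# Reload it and check that the restored model predicts identically.
with open("mushroom.sav", "rb") as f:
    restored_model = pickle.load(f)

same_predictions = bool((restored_model.predict(X) == rf_model.predict(X)).all())
print("Round trip OK:", same_predictions)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code. Load the model with the same scikit-learn version that saved it where possible.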
Upon completion, add your deliverables to git. Then commit your changes and push your branch to the remote.
Mushroom Classification @Kaggle
Consequences of multicollinearity
Chi-Square Test of Independence
sklearn.model_selection.train_test_split
sklearn.ensemble.RandomForestClassifier
sklearn.metrics.confusion_matrix
sklearn.ensemble.GradientBoostingClassifier
pickle - Python object serialization