CS 7641 coursework
Data Sources
Prerequisites:
- Install the required Python libraries: scikit-learn, matplotlib, numpy and pandas
- util.py provides utility functions, shared by the other files, for reading the input data and printing learning curves
- Store the .py files and data in the same folder
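For orientation, here is a minimal sketch of how a learning-curve helper could be built on scikit-learn. It is not the code in util.py: the function name learning_curve_means is hypothetical, and scikit-learn's bundled digits data (a copy of the UCI optical recognition test set) stands in for the data files used by this project.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeClassifier


def learning_curve_means(estimator, X, y, cv=3):
    """Hypothetical helper: mean train/validation accuracy per training-set size."""
    sizes, train_scores, val_scores = learning_curve(
        estimator,
        X,
        y,
        cv=cv,
        train_sizes=np.linspace(0.2, 1.0, 5),
        scoring="accuracy",
    )
    # Each score array has one row per training size and one column per CV fold.
    return sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)


# Stand-in data; the project itself reads its datasets through util.py.
X, y = load_digits(return_X_y=True)
sizes, train_mean, val_mean = learning_curve_means(
    DecisionTreeClassifier(random_state=0), X, y
)
for n, tr, va in zip(sizes, train_mean, val_mean):
    print(f"{int(n):5d} samples  train={tr:.3f}  validation={va:.3f}")
```

The returned arrays can be passed straight to matplotlib (for example plt.plot(sizes, val_mean)) to draw the curve.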
Decision Tree:
- In DecisionTree.py, replace line 85 with a call to draw_learning_curve_1() or draw_learning_curve_2() to create the learning curves for the Phishing Dataset and the Optical Recognition Dataset, respectively. Modify line 84 to use X1, Y1 or X2, Y2 to pick the dataset.
- Additionally, a 3-D graph can be generated with testBothParams(), and a max depth graph can be generated by calling testMaxDepth().
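As a rough illustration of what a max depth experiment involves, the sketch below sweeps max_depth with scikit-learn's validation_curve. It is an assumption about the general approach, not the body of testMaxDepth(); the depth values and the digits stand-in dataset are chosen only for the example.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeClassifier

# Stand-in data; DecisionTree.py selects X1, Y1 or X2, Y2 instead.
X, y = load_digits(return_X_y=True)

depths = [1, 2, 4, 8, 16]  # illustrative values only
train_scores, val_scores = validation_curve(
    DecisionTreeClassifier(random_state=0),
    X,
    y,
    param_name="max_depth",
    param_range=depths,
    cv=3,
    scoring="accuracy",
)

# Average over the CV folds to get one accuracy per depth.
train_mean = train_scores.mean(axis=1)
val_mean = val_scores.mean(axis=1)
for depth, tr, va in zip(depths, train_mean, val_mean):
    print(f"max_depth={depth:2d}  train={tr:.3f}  validation={va:.3f}")
```

A widening gap between the train and validation columns as depth grows is the usual sign of overfitting that such a graph is meant to show.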
KNN:
- Follow the same steps as for Decision Tree, calling the corresponding functions in KNN.py to create the graphs
SVM:
- Use either plot function to plot the graphs
Neural Network : Multilayer Perceptron:
- The code in MultiLayerPerceptron.py generates epoch curves for the Phishing Dataset. To generate them for the Optical Recognition Dataset, uncomment the last block and comment out lines 85 to 98.
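One common way to produce an epoch curve with scikit-learn is to call partial_fit once per epoch and record the accuracy after each pass. The sketch below assumes that approach; it is not taken from MultiLayerPerceptron.py, and the hidden layer size, epoch count and digits stand-in dataset are illustrative choices.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Stand-in data, scaled to [0, 1] (digits pixel values range from 0 to 16).
X, y = load_digits(return_X_y=True)
X = X / 16.0
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

mlp = MLPClassifier(hidden_layer_sizes=(32,), random_state=0)
classes = np.unique(y)

train_acc, test_acc = [], []
for epoch in range(30):
    # Each partial_fit call performs one pass over the training data.
    mlp.partial_fit(X_train, y_train, classes=classes)
    train_acc.append(mlp.score(X_train, y_train))
    test_acc.append(mlp.score(X_test, y_test))

print(f"epoch  1: train={train_acc[0]:.3f}  test={test_acc[0]:.3f}")
print(f"epoch 30: train={train_acc[-1]:.3f}  test={test_acc[-1]:.3f}")
```

Plotting train_acc and test_acc against the epoch index gives the epoch curve.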
Boosting:
- Uncomment lines 41 to 62 to create graphs for the Optical Recognition Dataset. Change learning_rate or max_depth to create graphs for a higher or lower learning rate or depth.
- Comment out the block for the Phishing Dataset, or create two subplots.
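To show how learning_rate and max_depth interact, the sketch below trains a small grid of boosted models and reports test accuracy for each combination. It assumes scikit-learn's GradientBoostingClassifier, which may differ from the booster used in this project; the grid values, n_estimators and the digits stand-in dataset are illustrative.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Stand-in data for the Optical Recognition Dataset.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

results = {}
for learning_rate in (0.05, 0.5):
    for max_depth in (1, 3):
        model = GradientBoostingClassifier(
            n_estimators=20,  # kept small so the example runs quickly
            learning_rate=learning_rate,
            max_depth=max_depth,
            random_state=0,
        )
        model.fit(X_train, y_train)
        results[(learning_rate, max_depth)] = model.score(X_test, y_test)

for (learning_rate, max_depth), acc in results.items():
    print(f"learning_rate={learning_rate}  max_depth={max_depth}  test accuracy={acc:.3f}")
```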
References:
- Scikit-learn documentation
- Darraghdog's code