My activities for the Data Science Learning Club
Learning activities for which I have written blog posts can be found here
The activities contained in this repo are explained in the following sections.
Contains visualisations of NFL data, mainly focusing on teams.
Trained a Naive Bayes classifier on the mushroom data set.
Wrote the report in R Markdown.
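To illustrate the idea, here is a minimal sketch of a categorical Naive Bayes classifier in base R. It is not the code from the report: the toy data frame, the column names (`odor`, `cap`) and the helper functions `fit_nb` and `predict_nb` are made up for this example, and Laplace smoothing is an assumption on my part.

```r
# Toy stand-in for the mushroom data: every column is categorical.
# The values are invented for illustration only.
toy <- data.frame(
  class = c("edible", "edible", "poisonous", "poisonous", "edible", "poisonous"),
  odor  = c("none", "almond", "foul", "foul", "none", "pungent"),
  cap   = c("flat", "convex", "convex", "flat", "convex", "flat"),
  stringsAsFactors = TRUE
)

# Fit: class priors plus one Laplace-smoothed conditional table per feature.
fit_nb <- function(data, target, laplace = 1) {
  y <- data[[target]]
  features <- setdiff(names(data), target)
  cond <- lapply(features, function(f) {
    counts <- table(y, data[[f]]) + laplace  # rows: class, columns: feature level
    counts / rowSums(counts)                 # P(level | class), each row sums to 1
  })
  names(cond) <- features
  list(prior = table(y) / length(y), cond = cond)
}

# Predict: pick the class with the highest sum of log-probabilities.
# Feature levels that were not seen during fitting will raise an error.
predict_nb <- function(model, newdata) {
  classes <- names(model$prior)
  vapply(seq_len(nrow(newdata)), function(i) {
    scores <- log(as.numeric(model$prior))
    for (f in names(model$cond)) {
      level <- as.character(newdata[[f]][i])
      scores <- scores + log(model$cond[[f]][, level])
    }
    classes[which.max(scores)]
  }, character(1))
}

model <- fit_nb(toy, "class")
new_mushrooms <- data.frame(odor = c("foul", "none"), cap = c("flat", "convex"))
print(predict_nb(model, new_mushrooms))
```

Working with log-probabilities avoids numeric underflow once many features are multiplied together, which matters for a data set with as many categorical columns as the mushroom data.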
Fitted a linear regression on the salary data set.
Wrote the report in R Markdown. It shows the use of ggplot, ggpairs (GGally), reshape2 and lm, and deals with correlation, visualisation of numeric and categorical data, details of lm(), and the computation and visualisation of train and test error.
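The train and test error computation can be sketched roughly as follows. This is a simplified example on synthetic data, not the code from the report: the columns `experience` and `rank`, the 70/30 split and the use of RMSE as the error measure are all assumptions made for illustration.

```r
set.seed(42)  # reproducible toy data and split

# Hypothetical salary data with one numeric and one categorical predictor.
n <- 200
salaries <- data.frame(
  experience = runif(n, 0, 20),
  rank = factor(sample(c("junior", "senior"), n, replace = TRUE))
)
salaries$salary <- 40000 + 2500 * salaries$experience +
  15000 * (salaries$rank == "senior") + rnorm(n, sd = 5000)

# Hold out 30% of the rows as a test set.
test_idx <- sample(n, size = round(0.3 * n))
train <- salaries[-test_idx, ]
test  <- salaries[test_idx, ]

# lm() turns the factor into a dummy variable automatically.
fit <- lm(salary ~ experience + rank, data = train)

# Root mean squared error between observed and predicted values.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
train_error <- rmse(train$salary, predict(fit, newdata = train))
test_error  <- rmse(test$salary,  predict(fit, newdata = test))

print(coef(fit))
print(c(train = train_error, test = test_error))
```

A test error that is much larger than the train error would hint at overfitting. With a model this small the two values should be of similar size.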
I based my implementation on the notebook "Computing the optimal road trip across the U.S.", (c) Randal S. Olson (http://www.randalolson.com/, rhiever on GitHub). (License info).
I ported his algorithm to R and used the data that his notebook stores as a .tsv file. The code runs a genetic algorithm on a data set of US cities to search for the shortest route that visits all cities. Being a heuristic, it finds a short route but does not guarantee the optimal one.
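The general shape of such a genetic algorithm can be sketched in a few lines of base R. This is a deliberately small, mutation-only variant written for illustration and not the ported code: the random coordinates stand in for the distances from the .tsv file, and the names `route_length`, `mutate` and `run_ga` as well as all parameter values are assumptions.

```r
set.seed(1)

# Hypothetical stand-in for the distance table: random points in the unit
# square with Euclidean distances between them.
n_cities <- 8
coords <- matrix(runif(2 * n_cities), ncol = 2)
dist_mat <- as.matrix(dist(coords))

# Length of the round trip that visits the cities in the given order.
route_length <- function(route) {
  from <- route
  to <- c(route[-1], route[1])
  sum(dist_mat[cbind(from, to)])
}

# Mutation: swap two randomly chosen positions in the route.
mutate <- function(route) {
  idx <- sample(length(route), 2)
  route[idx] <- route[rev(idx)]
  route
}

run_ga <- function(generations = 200, pop_size = 50, n_keep = 10) {
  # Start from random permutations of the city indices.
  population <- replicate(pop_size, sample(n_cities), simplify = FALSE)
  best_per_gen <- numeric(generations)
  for (g in seq_len(generations)) {
    lengths <- vapply(population, route_length, numeric(1))
    best_per_gen[g] <- min(lengths)
    # Selection: the n_keep shortest routes survive unchanged (elitism).
    survivors <- population[order(lengths)[seq_len(n_keep)]]
    # Fill the rest of the population with mutated copies of survivors.
    offspring <- lapply(
      sample(survivors, pop_size - n_keep, replace = TRUE),
      mutate
    )
    population <- c(survivors, offspring)
  }
  lengths <- vapply(population, route_length, numeric(1))
  list(route = population[[which.min(lengths)]],
       length = min(lengths),
       history = best_per_gen)
}

result <- run_ga()
print(result$route)
print(result$length)
```

Because the best routes are carried over unchanged, the best length per generation in `result$history` never gets worse. The result is still only a good route found by random search, not a proven optimum.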