diff --git a/preparation/Broom_R_Package.Rmd b/preparation/Broom_R_Package.Rmd
new file mode 100644
index 0000000..e007b45
--- /dev/null
+++ b/preparation/Broom_R_Package.Rmd
@@ -0,0 +1,341 @@
+---
+title: "Broom"
+subtitle: "Statistics With R"
+author: "R Workshop"
+output:
+  prettydoc::html_pretty:
+    theme: cayman
+    highlight: github
+---
+
+```{r setup, include=FALSE}
+knitr::opts_chunk$set(echo = TRUE)
+library(prettydoc)
+library(tidyverse)
+library(janitor)
+```
+
+Broom
+============
+
+#### Convert Statistical Analysis Objects into Tidy Data Frames
+
+Convert statistical analysis objects from R into tidy data frames, so that they can more easily be combined, reshaped and otherwise processed with tools like 'dplyr', 'tidyr' and 'ggplot2'.
+
+The package provides three S3 generics:
+
+* **``tidy``**: summarizes a model's statistical findings, such as the coefficients of a regression;
+* **``augment``**: adds columns to the original data, such as predictions, residuals and cluster assignments;
+* **``glance``**: provides a one-row summary of model-level statistics.
+
+
+
+```R
+# Run once, then comment out
+install.packages("broom")
+
+library(magrittr)
+
+library(broom)
+```
+
+    Installing package into '/home/nbcommon/R'
+    (as 'lib' is unspecified)
+
+
+### Fit a Linear Model
+* Use a simple example based on a built-in data set (`mtcars`)
+* Save the fitted model as `myModel`
+
+
+```R
+lm(mpg ~ wt + cyl, data=mtcars)
+```
+
+
+
+    Call:
+    lm(formula = mpg ~ wt + cyl, data = mtcars)
+
+    Coefficients:
+    (Intercept)           wt          cyl
+         39.686       -3.191       -1.508
+
+
+
+
+```R
+myModel <- lm(mpg ~ wt + cyl, data=mtcars)
+summary(myModel)
+```
+
+
+
+    Call:
+    lm(formula = mpg ~ wt + cyl, data = mtcars)
+
+    Residuals:
+        Min      1Q  Median      3Q     Max
+    -4.2893 -1.5512 -0.4684  1.5743  6.1004
+
+    Coefficients:
+                Estimate Std. Error t value Pr(>|t|)
+    (Intercept)  39.6863     1.7150  23.141  < 2e-16 ***
+    wt           -3.1910     0.7569  -4.216 0.000222 ***
+    cyl          -1.5078     0.4147  -3.636 0.001064 **
+    ---
+    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
+
+    Residual standard error: 2.568 on 29 degrees of freedom
+    Multiple R-squared:  0.8302, Adjusted R-squared:  0.8185
+    F-statistic: 70.91 on 2 and 29 DF,  p-value: 6.809e-12
+
+
+
+#### 1. The `tidy()` function
+
+**``tidy()``** constructs a data frame that summarizes the model's statistical findings. This includes coefficients and p-values for each term in a regression, per-cluster information in clustering applications, or per-test information for multtest functions.
+
+
+
+```R
+tidy(myModel)
+```
+
+
+
+| term        | estimate  | std.error | statistic | p.value      |
+|-------------|-----------|-----------|-----------|--------------|
+| (Intercept) | 39.686261 | 1.7149840 | 23.140893 | 3.043182e-20 |
+| wt          | -3.190972 | 0.7569065 | -4.215808 | 2.220200e-04 |
+| cyl         | -1.507795 | 0.4146883 | -3.635972 | 1.064282e-03 |
+
+
+
+
+
+
+```R
+myTidyModel <- tidy(myModel)
+```
+
+
+```R
+class(myTidyModel)
+```
+
+
+'data.frame'
+
+
+
+```R
+names(myTidyModel)
+```
+
+
+
+ - 'term'
+ - 'estimate'
+ - 'std.error'
+ - 'statistic'
+ - 'p.value'
+
+
+
+
+
+```R
+myTidyModel$p.value %>% round(4)
+```
+
+
+
+ - 0
+ - 2e-04
+ - 0.0011
+
+
+
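+Because `tidy()` returns an ordinary data frame, its output can be passed straight to 'dplyr' verbs. The sketch below is illustrative only: the object names `fit`, `coefs` and `significant` and the 0.05 cut-off are assumptions, not part of the workshop material. It also uses the `conf.int` argument of `tidy()` to add confidence limits for each term.
+
+```R
+library(broom)
+library(dplyr)
+
+# Same model as above, refitted so the example is self-contained
+fit <- lm(mpg ~ wt + cyl, data = mtcars)
+
+# conf.int = TRUE adds the columns conf.low and conf.high
+coefs <- tidy(fit, conf.int = TRUE, conf.level = 0.95)
+
+# Keep the non-intercept terms with p < 0.05 (arbitrary cut-off)
+significant <- coefs %>%
+  filter(term != "(Intercept)", p.value < 0.05) %>%
+  select(term, estimate, conf.low, conf.high)
+
+significant
+```
+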
+
+#### 2. The `augment()` function
+
+**``augment()``** adds columns to the original data that was modeled.
+This includes predictions, residuals, and cluster assignments.
+
+
+
+```R
+myModel <- lm(mpg ~ wt+cyl, mtcars)
+augment(myModel)
+
+```
+
+
+
+| .rownames | mpg | wt | cyl | .fitted | .se.fit | .resid | .hat | .sigma | .cooksd | .std.resid |
+|---|---|---|---|---|---|---|---|---|---|---|
+| Mazda RX4 | 21.0 | 2.620 | 6 | 22.27914 | 0.6011667 | -1.27914467 | 0.05482311 | 2.601105 | 0.0050772590 | -0.51244825 |
+| Mazda RX4 Wag | 21.0 | 2.875 | 6 | 21.46545 | 0.4976294 | -0.46544677 | 0.03756521 | 2.611423 | 0.0004442585 | -0.18478693 |
+| Datsun 710 | 22.8 | 2.320 | 4 | 26.25203 | 0.7252444 | -3.45202624 | 0.07978891 | 2.522911 | 0.0567764620 | -1.40157794 |
+| Hornet 4 Drive | 21.4 | 3.215 | 6 | 20.38052 | 0.4602669 | 1.01948376 | 0.03213611 | 2.605613 | 0.0018029260 | 0.40360828 |
+| Hornet Sportabout | 18.7 | 3.440 | 8 | 16.64696 | 0.7752706 | 2.05304242 | 0.09117599 | 2.581072 | 0.0235271472 | 0.83877393 |
+| Valiant | 18.1 | 3.460 | 6 | 19.59873 | 0.5178496 | -1.49872807 | 0.04068001 | 2.596911 | 0.0050205614 | -0.59597493 |
+| Duster 360 | 14.3 | 3.570 | 8 | 16.23213 | 0.7267482 | -1.93213120 | 0.08012014 | 2.585079 | 0.0178733213 | -0.78461743 |
+| Merc 240D | 24.4 | 3.190 | 4 | 23.47588 | 1.0000172 | 0.92411952 | 0.15170109 | 2.606073 | 0.0091033181 | 0.39078743 |
+| Merc 230 | 22.8 | 3.150 | 4 | 23.60352 | 0.9793969 | -0.80351937 | 0.14550945 | 2.607793 | 0.0065061176 | -0.33855530 |
+| Merc 280 | 19.2 | 3.440 | 6 | 19.66255 | 0.5108741 | -0.46254751 | 0.03959146 | 2.611439 | 0.0004643600 | -0.18382951 |
+| Merc 280C | 17.8 | 3.440 | 6 | 19.66255 | 0.5108741 | -1.86254751 | 0.03959146 | 2.588159 | 0.0075293380 | -0.74022926 |
+| Merc 450SE | 16.4 | 4.070 | 8 | 14.63665 | 0.6544576 | 1.76335487 | 0.06497359 | 2.590136 | 0.0116847953 | 0.71025562 |
+| Merc 450SL | 17.3 | 3.730 | 8 | 15.72158 | 0.6819424 | 1.57842434 | 0.07054547 | 2.594578 | 0.0102875723 | 0.63767089 |
+| Merc 450SLC | 15.2 | 3.780 | 8 | 15.56203 | 0.6718159 | -0.36202705 | 0.06846591 | 2.612000 | 0.0005228914 | -0.14609271 |
+| Cadillac Fleetwood | 10.4 | 5.250 | 8 | 10.87130 | 1.1525645 | -0.47129800 | 0.20151356 | 2.611060 | 0.0035498738 | -0.20542284 |
+| Lincoln Continental | 10.4 | 5.424 | 8 | 10.31607 | 1.2633704 | 0.08393115 | 0.24212252 | 2.612898 | 0.0001501537 | 0.03755005 |
+| Chrysler Imperial | 14.7 | 5.345 | 8 | 10.56816 | 1.2125441 | 4.13184435 | 0.22303287 | 2.458216 | 0.3189363624 | 1.82570047 |
+| Fiat 128 | 32.4 | 2.200 | 4 | 26.63494 | 0.7270859 | 5.76505710 | 0.08019462 | 2.353101 | 0.1592990291 | 2.34122168 |
+| Honda Civic | 30.4 | 1.615 | 4 | 28.50166 | 0.8820281 | 1.89833840 | 0.11801538 | 2.584888 | 0.0276449872 | 0.78728146 |
+| Toyota Corolla | 33.9 | 1.835 | 4 | 27.79965 | 0.7988791 | 6.10035227 | 0.09681350 | 2.314308 | 0.2233281268 | 2.50007531 |
+| Toyota Corona | 21.5 | 2.465 | 4 | 25.78934 | 0.7380797 | -4.28933528 | 0.08263810 | 2.472103 | 0.0913548207 | -1.74424120 |
+| Dodge Challenger | 15.5 | 3.520 | 8 | 16.39168 | 0.7442464 | -0.89167980 | 0.08402476 | 2.607023 | 0.0040263378 | -0.36287242 |
+| AMC Javelin | 15.2 | 3.435 | 8 | 16.66291 | 0.7773252 | -1.46291244 | 0.09165987 | 2.596811 | 0.0120218543 | -0.59783451 |
+| Camaro Z28 | 13.3 | 3.840 | 8 | 15.37057 | 0.6623197 | -2.07056872 | 0.06654404 | 2.581383 | 0.0165559199 | -0.83469849 |
+| Pontiac Firebird | 19.2 | 3.845 | 8 | 15.35461 | 0.6616629 | 3.84538614 | 0.06641213 | 2.502378 | 0.0569730451 | 1.55006265 |
+| Fiat X1-9 | 27.3 | 1.935 | 4 | 27.48055 | 0.7700721 | -0.18055052 | 0.08995733 | 2.612717 | 0.0001790454 | -0.07371481 |
+| Porsche 914-2 | 26.0 | 2.140 | 4 | 26.82640 | 0.7322422 | -0.82640123 | 0.08133608 | 2.607877 | 0.0033281614 | -0.33581456 |
+| Lotus Europa | 30.4 | 1.513 | 4 | 28.82714 | 0.9282190 | 1.57285924 | 0.13069974 | 2.593440 | 0.0216355209 | 0.65704006 |
+| Ford Pantera L | 15.8 | 3.170 | 8 | 17.50852 | 0.9023791 | -1.70852005 | 0.12352416 | 2.590102 | 0.0237336584 | -0.71078295 |
+| Ferrari Dino | 19.7 | 2.770 | 6 | 21.80050 | 0.5342815 | -2.10049885 | 0.04330261 | 2.581252 | 0.0105550987 | -0.83641545 |
+| Maserati Bora | 15.0 | 3.570 | 8 | 16.23213 | 0.7267482 | -1.23213120 | 0.08012014 | 2.601659 | 0.0072685192 | -0.50035506 |
+| Volvo 142E | 21.4 | 2.780 | 4 | 24.78418 | 0.8176667 | -3.38417906 | 0.10142065 | 2.524357 | 0.0727399065 | -1.39047125 |
+
+
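+The augmented columns are ordinary data frame columns, so they can be checked and sorted like any others. The following is a minimal sketch, not part of the original notebook: the names `fit`, `aug` and `top3` are assumptions. It confirms that `.resid` is the observed value minus `.fitted`, then lists the three rows with the largest Cook's distance (`.cooksd`).
+
+```R
+library(broom)
+
+# Same model as above, refitted so the example is self-contained
+fit <- lm(mpg ~ wt + cyl, data = mtcars)
+aug <- augment(fit)
+
+# .resid should equal observed mpg minus the fitted value
+resid_ok <- isTRUE(all.equal(unname(aug$.resid), unname(aug$mpg - aug$.fitted)))
+resid_ok
+
+# Rows with the largest Cook's distance are the most influential ones
+ordered <- aug[order(aug$.cooksd, decreasing = TRUE), ]
+top3 <- head(ordered[, c("mpg", ".fitted", ".resid", ".cooksd")], 3)
+top3
+```
+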
+
+
+
+#### 3. The ``glance()`` function
+
+**``glance()``** constructs a concise one-row summary of the model. This typically contains values such as R^2, adjusted R^2, and residual standard error, each of which is computed once for the entire model.
+
+
+
+```R
+glance(myModel)
+
+
+```
+
+
+
+| p.value      | df | logLik    |
+|--------------|----|-----------|
+| 6.808955e-12 | 3  | -74.00503 |
+
+
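+Since each model yields exactly one row, `glance()` output from several models can be stacked to compare them. The sketch below is an illustration under stated assumptions: the second model `mpg ~ wt` and the names `m1`, `m2`, `g1`, `g2` and `comparison` are hypothetical and do not appear in the workshop material.
+
+```R
+library(broom)
+
+m1 <- lm(mpg ~ wt, data = mtcars)        # hypothetical simpler model
+m2 <- lm(mpg ~ wt + cyl, data = mtcars)  # the model used above
+
+# One row per model; add a label column before stacking
+g1 <- as.data.frame(glance(m1))
+g1$model <- "mpg ~ wt"
+g2 <- as.data.frame(glance(m2))
+g2$model <- "mpg ~ wt + cyl"
+
+comparison <- rbind(g1, g2)
+comparison[, c("model", "r.squared", "adj.r.squared", "sigma", "p.value")]
+```
+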
+
+
+
+## K-Means
+
+
+```R
+kmeans(iris[,1:4],3 )
+```
+
+
+    K-means clustering with 3 clusters of sizes 96, 33, 21
+
+    Cluster means:
+      Sepal.Length Sepal.Width Petal.Length Petal.Width
+    1     6.314583    2.895833     4.973958   1.7031250
+    2     5.175758    3.624242     1.472727   0.2727273
+    3     4.738095    2.904762     1.790476   0.3523810
+
+    Clustering vector:
+      [1] 2 3 3 3 2 2 2 2 3 3 2 2 3 3 2 2 2 2 2 2 2 2 2 2 3 3 2 2 2 3 3 2 2 2 3 2 2
+     [38] 2 3 2 2 3 3 2 2 3 2 3 2 2 1 1 1 1 1 1 1 3 1 1 3 1 1 1 1 1 1 1 1 1 1 1 1 1
+     [75] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 1 1 1 1 3 1 1 1 1 1 1 1 1 1 1 1 1
+    [112] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
+    [149] 1 1
+
+    Within cluster sum of squares by cluster:
+    [1] 118.651875   6.432121  17.669524
+     (between_SS / total_SS =  79.0 %)
+
+    Available components:
+
+    [1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
+    [6] "betweenss"    "size"         "iter"         "ifault"
+
+
+
+```R
+KMmodel <- kmeans(iris[,1:4],3 )
+summary(KMmodel)
+```
+
+
+                 Length Class  Mode
+    cluster      150    -none- numeric
+    centers       12    -none- numeric
+    totss          1    -none- numeric
+    withinss       3    -none- numeric
+    tot.withinss   1    -none- numeric
+    betweenss      1    -none- numeric
+    size           3    -none- numeric
+    iter           1    -none- numeric
+    ifault         1    -none- numeric
+
+
+
+```R
+tidy(KMmodel)
+```
+
+
+
+| x1       | x2       | x3       | x4       | size | withinss | cluster |
+|----------|----------|----------|----------|------|----------|---------|
+| 5.006000 | 3.428000 | 1.462000 | 0.246000 | 50   | 15.15100 | 1       |
+| 5.901613 | 2.748387 | 4.393548 | 1.433871 | 62   | 39.82097 | 2       |
+| 6.850000 | 3.073684 | 5.742105 | 2.071053 | 38   | 23.87947 | 3       |
+
+
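+`kmeans()` chooses its starting centres at random, and the two `kmeans()` calls above fitted the model separately. That is why the first printout reports cluster sizes of 96, 33 and 21 while the `tidy()` table reports 50, 62 and 38. A minimal sketch of one way to make runs repeatable follows; the seed value 42, `nstart = 25` and the names `km` and `km_tidy` are arbitrary assumptions.
+
+```R
+library(broom)
+
+set.seed(42)  # arbitrary seed: fixes the random starting centres
+
+# nstart = 25 tries 25 random starts and keeps the best solution
+km <- kmeans(iris[, 1:4], centers = 3, nstart = 25)
+
+# One row per cluster: centre coordinates, size and within-cluster SS
+km_tidy <- tidy(km)
+km_tidy
+```
+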
+
+
+
+
+```R
+# augment(Model,Data)
+```
+
+
+```R
+augment(KMmodel, iris[, 1:4]) %>% head()
+```
+
+
+
+| Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | .cluster |
+|--------------|-------------|--------------|-------------|----------|
+| 5.1          | 3.5         | 1.4          | 0.2         | 1        |
+| 4.9          | 3.0         | 1.4          | 0.2         | 1        |
+| 4.7          | 3.2         | 1.3          | 0.2         | 1        |
+| 4.6          | 3.1         | 1.5          | 0.2         | 1        |
+| 5.0          | 3.6         | 1.4          | 0.2         | 1        |
+| 5.4          | 3.9         | 1.7          | 0.4         | 1        |
+
+
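+Once the `.cluster` column is attached to each observation, it can be compared with any other variable. As an illustrative sketch (the names `km`, `clustered` and `counts`, the seed and `nstart` are assumptions), the cluster assignments are cross-tabulated here against the known `Species` labels, which the clustering itself never saw.
+
+```R
+library(broom)
+
+set.seed(42)  # arbitrary seed so the clustering is repeatable
+km <- kmeans(iris[, 1:4], centers = 3, nstart = 25)
+
+# augment(model, data) appends a .cluster column to the data
+clustered <- augment(km, iris[, 1:4])
+
+# Rows: assigned cluster; columns: true species
+counts <- table(cluster = clustered$.cluster, species = iris$Species)
+counts
+```
+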
+
+
+
+
+```R
+glance(KMmodel)
+```
+
+
+
+| totss    | tot.withinss | betweenss | iter |
+|----------|--------------|-----------|------|
+| 681.3706 | 78.85144     | 602.5192  | 2    |
+
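+Because `glance()` reduces each fit to one row, the summaries for several values of k can be collected into a single table, for example to look for an "elbow" in `tot.withinss`. This is only a sketch under assumptions: the range 1 to 6, the seed, `nstart = 10` and the names `ks`, `fits` and `elbow` are hypothetical choices.
+
+```R
+library(broom)
+
+set.seed(1)  # arbitrary seed
+ks <- 1:6    # hypothetical range of cluster counts to try
+
+# Fit one k-means model per value of k
+fits <- lapply(ks, function(k) kmeans(iris[, 1:4], centers = k, nstart = 10))
+
+# Stack the one-row summaries and record which k each row belongs to
+elbow <- do.call(rbind, lapply(fits, function(f) as.data.frame(glance(f))))
+elbow$k <- ks
+
+elbow[, c("k", "totss", "tot.withinss", "betweenss")]
+```
+
+The total sum of squares (`totss`) does not depend on k, so only `tot.withinss` and `betweenss` change from row to row.
+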
+