-
Notifications
You must be signed in to change notification settings - Fork 7
/
README.Rmd
178 lines (135 loc) · 5.93 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# tidyAML <img src="man/figures/logo.png" width="147" height="170" align="right" />
<!-- badges: start -->
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/tidyAML)](https://cran.r-project.org/package=tidyAML)
![](https://cranlogs.r-pkg.org/badges/tidyAML)
![](https://cranlogs.r-pkg.org/badges/grand-total/tidyAML)
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html##experimental)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)](https://makeapullrequest.com)
<!-- badges: end -->
## Introduction
Welcome to __`{tidyAML}`__ which is a new R package that makes it easy to use the `tidymodels` ecosystem to perform automated machine learning (AutoML). This package provides a simple and intuitive interface that allows users to quickly generate machine learning models without worrying about the underlying details. It also includes a safety mechanism that ensures that the package will fail gracefully if any required extension packages are not installed on the user's machine. With `{tidyAML}`, users can easily build high-quality machine learning models in just a few lines of code. Whether you are a beginner or an experienced machine learning practitioner, `{tidyAML}` has something to offer.
Some ideas are that we should be able to generate regression
models on the fly without having to actually go through the process of building the
specification, especially if it is a non-tuning model, meaning we are not planing
on tuning hyper-parameters like `penalty` and `cost`.
The idea is not to re-write the excellent work the `tidymodels` team has done (because
it's not possible) but rather to try and make an enhanced easy to use set of functions
that do what they say and can generate many models and predictions at once.
This is similar to the great `h2o` package, but, `{tidyAML}` does not require java
to be setup properly like `h2o` because `{tidyAML}` is built on `tidymodels`.
## Thanks
Thank you [Garrick Aden-Buie](https://fosstodon.org/@grrrck/109479826278916014) for the easy name change suggestion.
## Installation
You can install `{tidyAML}` like so:
```{r message=FALSE, warning=FALSE}
#install.packages("tidyAML")
```
Or the development version from GitHub
```{r, warning=FALSE, message=FALSE}
# install.packages("devtools")
#devtools::install_github("spsanderson/tidyAML")
```
## Examples
Part of the reason to use `{tidyAML}` is so that you can generate many models of
your data set. One way of modeling a data set is using regression for some numeric
output. There is a convienent function in __tidyAML__ that will generate a set of
non-tuning models for _fast regression_. Let's take a look below.
First let's load the library
```{r example}
library(tidyAML)
```
Now lets see the function in action.
```{r message=FALSE, warning=FALSE}
fast_regression_parsnip_spec_tbl(.parsnip_fns = "linear_reg")
fast_regression_parsnip_spec_tbl(.parsnip_eng = c("lm","glm"))
fast_regression_parsnip_spec_tbl(.parsnip_eng = c("lm","glm","gee"),
.parsnip_fns = "linear_reg")
```
As shown we can easily select the models we want either by choosing the supported
`parsnip` function like `linear_reg()` or by choose the desired `engine`, you can
also use them both in conjunction with each other!
This function also does add a class to the output. Let's see it.
```{r message=FALSE, warning=FALSE}
class(fast_regression_parsnip_spec_tbl())
```
We see that there are two added classes, first `fst_reg_spec_tbl` because this
creates a set of non-tuning regression models and then `tidyaml_mod_spec_tbl` because
this is a model specification tibble built with `{tidyAML}`
Now, what if you want to create a non-tuning model spec without using the
`fast_regression_parsnip_spec_tbl()` function. Well, you can. The function is called
`create_model_spec()`.
```{r message=FALSE, warning=FALSE}
create_model_spec(
.parsnip_eng = list("lm","glm","glmnet","cubist"),
.parsnip_fns = list(
"linear_reg",
"linear_reg",
"linear_reg",
"cubist_rules"
)
)
create_model_spec(
.parsnip_eng = list("lm","glm","glmnet","cubist"),
.parsnip_fns = list(
"linear_reg",
"linear_reg",
"linear_reg",
"cubist_rules"
),
.return_tibble = FALSE
)
```
Now the reason we are here. Let's take a look at the first function for modeling
with `{tidyAML}`, __`fast_regression()`__.
```{r, warning=FALSE, message=FALSE}
library(recipes)
library(dplyr)
rec_obj <- recipe(mpg ~ ., data = mtcars)
frt_tbl <- fast_regression(
.data = mtcars,
.rec_obj = rec_obj,
.parsnip_eng = c("lm","glm","gee"),
.parsnip_fns = "linear_reg",
.drop_na = FALSE
)
glimpse(frt_tbl)
```
As we see above, one of the models has gracefully failed, thanks in part to the
function `purrr::safely()`, which was used to make what I call __safe_make__ functions.
Let's look at the fitted workflow predictions.
```{r message=FALSE, warning=FALSE}
frt_tbl$pred_wflw
```
Now let's load the `multilevelmod` library so that we can run the `gee` linear regression.
```{r message=FALSE, warning=FALSE}
library(multilevelmod)
rec_obj <- recipe(mpg ~ ., data = mtcars)
frt_tbl <- fast_regression(
.data = mtcars,
.rec_obj = rec_obj,
.parsnip_eng = c("lm","glm","gee"),
.parsnip_fns = "linear_reg"
)
extract_wflw_pred(frt_tbl, 1:3)
```
_Getting Regression Residuals_
Getting residuals is easy with `{tidyAML}`. Let's take a look.
```{r message=FALSE, warning=FALSE}
extract_regression_residuals(frt_tbl)
```
You can also pivot them into a long format making plotting easy with `ggplot2`.
```{r message=FALSE, warning=FALSE}
extract_regression_residuals(frt_tbl, .pivot_long = TRUE)
```