The ``train()`` Function
===========================================
One of the primary tools in the package is the ``train()`` function, which can be used to:

* evaluate, using resampling, the effect of model tuning parameters on performance
* choose the "optimal" model across these parameters
* estimate model performance from a training set

This function sets up a grid of tuning parameters for a number of classification and regression routines, fits each model, and calculates a resampling-based performance measure.

* The ``train()`` function can be used to tune models by picking the complexity parameters that are associated with the optimal resampling statistics.
* For a particular model, a grid of parameters (if any) is created, and the model is trained on slightly different data for each candidate combination of tuning parameters.
* For each resampled data set, the performance on the held-out samples is calculated, and the mean and standard deviation are summarized for each combination.
* The combination with the optimal resampling statistic is chosen for the final model, and the entire training set is used to fit that final model.

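
The resampling procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's actual implementation: the simulated data, the choice of polynomial degree as the tuning parameter, and all variable names (`dat`, `grid`, `rmse`, `results`, `best`, `final_model`) are hypothetical.

```r
# A base-R sketch of tuning by resampling; this is NOT caret's own code.
set.seed(1)

# Simulated regression data (hypothetical): a noisy cubic.
n <- 200
dat <- data.frame(x = runif(n, -2, 2))
dat$y <- dat$x^3 - dat$x + rnorm(n, sd = 0.5)

# Grid of candidate tuning parameter values: the polynomial degree.
grid <- data.frame(degree = 1:6)
n_resamples <- 25

# One row per candidate, one column per bootstrap resample.
rmse <- matrix(NA_real_, nrow = nrow(grid), ncol = n_resamples)

for (b in seq_len(n_resamples)) {
  # Bootstrap sample: rows drawn with replacement; the rest are held out.
  in_train <- sample(n, replace = TRUE)
  held_out <- setdiff(seq_len(n), in_train)
  for (i in seq_len(nrow(grid))) {
    d <- grid$degree[i]
    fit <- lm(y ~ poly(x, d), data = dat[in_train, ])
    pred <- predict(fit, newdata = dat[held_out, ])
    rmse[i, b] <- sqrt(mean((dat$y[held_out] - pred)^2))
  }
}

# Summarize the held-out performance for each candidate.
results <- data.frame(
  degree = grid$degree,
  RMSE   = rowMeans(rmse),
  RMSESD = apply(rmse, 1, sd)
)

# Choose the candidate with the lowest mean held-out RMSE ...
best <- results$degree[which.min(results$RMSE)]

# ... and refit it on the entire training set.
final_model <- lm(y ~ poly(x, best), data = dat)

print(results)
print(best)
```

Each column of `rmse` corresponds to one "slightly different" data set, and the `results` table holds the mean and standard deviation that the selection step compares.
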
#### Syntax
<!-- http://www.jstatsoft.org/v28/i05/paper -->
The ``train()`` function has the following arguments:

* ``x``: a matrix or data frame of predictors. Currently, the function only accepts numeric values (i.e., no factors or character variables). In some cases, the ``model.matrix`` function may be needed to generate a data frame or matrix of purely numeric data.
* ``y``: a numeric or factor vector of outcomes. The function determines the type of problem (classification or regression) from the type of the response given in this argument.
* ``method``: a character string specifying the type of model to be used.
* ``metric``: a character string with values of ``"Accuracy"``, ``"Kappa"``, ``"RMSE"`` or ``"Rsquared"``. The metric value determines the objective function used to select the final model. For example, selecting ``"Kappa"`` makes the function select the tuning parameters with the largest value of the mean Kappa statistic computed from the held-out samples.
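
As a rough illustration of these arguments, the call below tunes a k-nearest-neighbour classifier on the built-in `iris` data. It is a sketch under assumptions: the `PetalSize` factor is a hypothetical predictor added only to show `model.matrix`, and the `tuneGrid` and `trControl` arguments (with `trainControl()`) are part of caret but are not described in the list above.

```r
library(caret)

set.seed(42)

# Predictors containing a factor column (PetalSize is hypothetical,
# created here only to demonstrate the conversion step).
raw <- data.frame(
  iris[, 1:4],
  PetalSize = cut(iris$Petal.Length, 3,
                  labels = c("small", "medium", "large"))
)

# model.matrix() expands the factor into numeric dummy columns;
# the intercept column is dropped because it is not a predictor.
x <- model.matrix(~ ., data = raw)[, -1]

# A factor outcome, so train() treats this as a classification problem.
y <- iris$Species

fit <- train(
  x = x,
  y = y,
  method = "knn",                                  # type of model
  metric = "Kappa",                                # statistic used for selection
  tuneGrid = data.frame(k = c(3, 5, 7, 9)),        # candidate values (assumed)
  trControl = trainControl(method = "boot", number = 10)
)

# Resampled mean and standard deviation of each metric per candidate k.
print(fit$results)

# The value of k with the largest mean Kappa.
print(fit$bestTune)
```

Because `metric = "Kappa"` is set, the row of `fit$results` with the largest mean Kappa determines `fit$bestTune`, and the final model is refit on all of `x` and `y` with that value.
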