Arguments
- algo
-사용자가 임의로 지정할 알고리즘명 (default: "XGBoost")
+The name of the algorithm, which can be customized by the user (default: "XGBoost").
- engine
-모델을 생성할 때 사용할 패키지 ("xgboost" (default))
+The name of the software used to fit the model ("xgboost" (default)).
- mode
-분석 유형 ("classification" (default), "regression")
+The model type, either "classification" (default) or "regression".
- trainingData
-훈련데이터 셋
+The training data.
- splitedData
-train-test 데이터 분할 정보를 포함하고 있는 전체 데이터 셋
+A data frame including the train-test split metadata.
- formula
-모델링을 위한 수식
+The formula used for modeling.
- rec
-데이터, 전처리 정보를 포함한 recipe object
+A recipe object containing preprocessing information for cross-validation.
- v
-v-fold cross validation을 진행 (default: 5, 각 fold 별로 30개 이상의 observations가 있어야 유효한 모델링 결과를 얻을 수 있습니다.)
+The number of folds (v) used for cross-validation during modeling (default: 5).
+- gridNum
+The initial number of iterations to run before starting the optimization algorithm.
+- iter
+The maximum number of search iterations.
- metric
-모델의 성능을 평가할 기준지표 (classification: "roc_auc" (default), "accuracy" / regression: "rmse" (default), "rsq")
+The metric used to evaluate model performance (classification: "roc_auc" (default), "accuracy" / regression: "rmse" (default), "rsq").
-- ...
-hyperparameters의 범위에 대한 Min, Max, Levels 값에 해당하는 파라미터를 지정합니다.
+- seed
+Seed for reproducible results.
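The arguments above correspond one-to-one to the parameters of `stove::xgBoost()`. As a minimal usage sketch (not taken verbatim from the package), the call below follows the split, recipe, and model workflow from the package README; `cleaned_data` and the target variable `TG` are borrowed from that README example and stand in for any preprocessed data frame.

```r
library(stove)

# Assumed: `cleaned_data` is a preprocessed data frame with a factor target "TG",
# as prepared in the package README example.
formula <- "TG ~ ."                                   # formula for modeling

split_tmp  <- stove::trainTestSplit(data = cleaned_data,
                                    target = "TG",
                                    prop = 0.7,       # share of data used for training
                                    seed = 1234)
data_train <- split_tmp[[1]]                          # training data
data_split <- split_tmp[[3]]                          # whole data with split information

rec <- stove::prepForCV(data = data_train,            # recipe object for cross-validation
                        formula = formula,
                        imputation = TRUE,
                        normalization = TRUE,
                        seed = 1234)

finalized <- stove::xgBoost(
  algo         = "XGBoost",        # user-defined algorithm name
  engine       = "xgboost",        # software used to fit the model
  mode         = "classification", # or "regression"
  trainingData = data_train,
  splitedData  = data_split,
  formula      = formula,
  rec          = rec,
  v            = 5,                # 5-fold cross-validation
  gridNum      = 5,                # initial iterations before the optimization starts
  iter         = 10,               # maximum number of search iterations
  metric       = "roc_auc",        # "accuracy", or "rmse"/"rsq" for regression
  seed         = 1234              # seed for reproducible results
)
```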
diff --git a/docs/search.json b/docs/search.json
index cd191a6..e8e2fde 100644
--- a/docs/search.json
+++ b/docs/search.json
@@ -1 +1 @@
-[{"path":"/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Yeonchan Seong. Author, maintainer.","code":""},{"path":"/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Seong Y (2023). stove: Stove. R package version 0.0.0.9000, https://github.com/statgarten/stove.","code":"@Manual{, title = {stove: Stove}, author = {Yeonchan Seong}, year = {2023}, note = {R package version 0.0.0.9000}, url = {https://github.com/statgarten/stove}, }"},{"path":"/index.html","id":"yellow_heart-stove-","dir":"","previous_headings":"","what":"Stove","title":"Stove","text":"Description stove","code":""},{"path":"/index.html","id":"wrench-install","dir":"","previous_headings":"","what":"🔧 Install","title":"Stove","text":"","code":"# install.packages(\"devtools\") devtools::install_github(\"statgarten/stove\")"},{"path":"/index.html","id":"example-code","dir":"","previous_headings":"","what":"Example Code","title":"Stove","text":"documents contain example code ML workflows using stove.","code":""},{"path":"/index.html","id":"sample-data-import","dir":"","previous_headings":"Example Code","what":"Sample Data Import","title":"Stove","text":"Example code Sample Data Import","code":""},{"path":"/index.html","id":"data-split-and-define-preprocessing","dir":"","previous_headings":"Example Code","what":"Data split and Define preprocessing","title":"Stove","text":"Example code Data split Define preprocessing","code":""},{"path":"/index.html","id":"modeling","dir":"","previous_headings":"Example Code","what":"Modeling","title":"Stove","text":"Example code Modeling","code":""},{"path":"/index.html","id":"clipboard-dependency","dir":"","previous_headings":"","what":"📋 Dependency","title":"Stove","text":"assertthat - 0.2.1base64enc - 0.1-3 … sessioninfo::package_info()","code":""},{"path":"/index.html","id":"blush-authors","dir":"","previous_headings":"","what":"😊 Authors","title":"Stove","text":"Yeonchan Seong @ycseong07","code":""},{"path":"/index.html","id":"memo-license","dir":"","previous_headings":"","what":"📝 License","title":"Stove","text":"Copyright ©️ 2022 Yeonchan Seong project MIT licensed","code":""},{"path":"/LICENSE.html","id":null,"dir":"","previous_headings":"","what":"MIT License","title":"MIT License","text":"Copyright (c) 2023 stove authors Permission hereby granted, free charge, person obtaining copy software associated documentation files (“Software”), deal Software without restriction, including without limitation rights use, copy, modify, merge, publish, distribute, sublicense, /sell copies Software, permit persons Software furnished , subject following conditions: copyright notice permission notice shall included copies substantial portions Software. SOFTWARE PROVIDED “”, WITHOUT WARRANTY KIND, EXPRESS IMPLIED, INCLUDING LIMITED WARRANTIES MERCHANTABILITY, FITNESS PARTICULAR PURPOSE NONINFRINGEMENT. 
EVENT SHALL AUTHORS COPYRIGHT HOLDERS LIABLE CLAIM, DAMAGES LIABILITY, WHETHER ACTION CONTRACT, TORT OTHERWISE, ARISING , CONNECTION SOFTWARE USE DEALINGS SOFTWARE.","code":""},{"path":"/reference/bayesOptCV.html","id":null,"dir":"Reference","previous_headings":"","what":"Bayesian optimization with cross validation — bayesOptCV","title":"Bayesian optimization with cross validation — bayesOptCV","text":"Bayesian optimization cross validation","code":""},{"path":"/reference/bayesOptCV.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Bayesian optimization with cross validation — bayesOptCV","text":"","code":"bayesOptCV( rec = NULL, model = NULL, v = NULL, trainingData = NULL, gridNum = NULL, iter = NULL, seed = NULL )"},{"path":"/reference/bayesOptCV.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Bayesian optimization with cross validation — bayesOptCV","text":"rec 데이터, 전처리 정보를 포함한 recipe object model hyperparameters, ngine, mode 정보가 포함된 model object v v-fold cross validation을 진행 (default: 5, 각 fold 별로 30개 이상의 observations가 있어야 유효한 모델링 결과를 얻을 수 있습니다.) trainingData 훈련데이터 셋 iter grid search를 수행할 때 각 hyperparameter의 값을 담은 object seed seed값 설정 initial 몇 개의 grid로","code":""},{"path":"/reference/bayesOptCV.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Bayesian optimization with cross validation — bayesOptCV","text":"교차검증 수행 과정에서, Bayesian optimization을 통해 모델의 하이퍼파라미터를 최적화합니다.","code":""},{"path":"/reference/clusteringVis.html","id":null,"dir":"Reference","previous_headings":"","what":"clusteringVis — clusteringVis","title":"clusteringVis — clusteringVis","text":"clusteringVis","code":""},{"path":"/reference/clusteringVis.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"clusteringVis — clusteringVis","text":"","code":"clusteringVis( data = NULL, model = NULL, maxK = \"15\", nBoot = \"100\", selectOptimal = \"silhouette\", seedNum = \"6471\" )"},{"path":"/reference/clusteringVis.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"clusteringVis — clusteringVis","text":"data data model model maxK maxK nStart nStart","code":""},{"path":"/reference/clusteringVis.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"clusteringVis — clusteringVis","text":"Deprecated","code":""},{"path":"/reference/decisionTree.html","id":null,"dir":"Reference","previous_headings":"","what":"Decision Tree — decisionTree","title":"Decision Tree — decisionTree","text":"Decision Tree","code":""},{"path":"/reference/decisionTree.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Decision Tree — decisionTree","text":"","code":"decisionTree( algo = \"Decision Tree\", engine = \"rpart\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 10, metric = NULL, seed = 1234 )"},{"path":"/reference/decisionTree.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Decision Tree — decisionTree","text":"algo 사용자가 임의로 지정할 알고리즘명 (default: \"Decision Tree\") engine 모델을 생성할 때 사용할 패키지 (\"rpart\" (default), \"C50\", \"partykit\") mode 분석 유형 (\"classification\" (default), \"regression\") trainingData 훈련데이터 셋 splitedData train-test 데이터 분할 정보를 포함하고 있는 전체 데이터 셋 formula 모델링을 위한 수식 rec 데이터, 전처리 정보를 포함한 recipe object v v-fold cross validation을 진행 (default: 5, 각 fold 별로 30개 
이상의 observations가 있어야 유효한 모델링 결과를 얻을 수 있습니다.) metric 모델의 성능을 평가할 기준지표 (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\") ... hyperparameters의 범위에 대한 Min, Max, Levels 값에 해당하는 파라미터를 지정합니다.","code":""},{"path":"/reference/decisionTree.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Decision Tree — decisionTree","text":"의사결정나무 알고리즘 함수. 의사 결정 규칙 (Decision rule)을 나무 형태로 분류해 나가는 분석 기법을 말합니다. hyperparameters: tree_depth: 최종 예측값에 다다르기까지 몇 번 트리를 분할할지 설정합니다. min_n: 트리를 분할하기 위해 필요한 관측값의 최소 개수를 설정합니다. cost_complexity: 트리 분할을 위해 필요한 비용을 설정합니다. 0일 경우, 가능한 모든 분할이 수행됩니다.","code":""},{"path":"/reference/evalMetricsR.html","id":null,"dir":"Reference","previous_headings":"","what":"Evaluation metrics for Regression — evalMetricsR","title":"Evaluation metrics for Regression — evalMetricsR","text":"Evaluation metrics Regression","code":""},{"path":"/reference/evalMetricsR.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Evaluation metrics for Regression — evalMetricsR","text":"","code":"evalMetricsR(modelsList, targetVar)"},{"path":"/reference/evalMetricsR.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Evaluation metrics for Regression — evalMetricsR","text":"modelsList ML 모델 리스트 targetVar 타겟 변수","code":""},{"path":"/reference/evalMetricsR.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Evaluation metrics for Regression — evalMetricsR","text":"ML 모델 리스트로부터 Regression 모델들에 대한 Evaluation metrics를 생성합니다.","code":""},{"path":"/reference/fitBestModel.html","id":null,"dir":"Reference","previous_headings":"","what":"fitting in best model — fitBestModel","title":"fitting in best model — fitBestModel","text":"fitting best model","code":""},{"path":"/reference/fitBestModel.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"fitting in best model — fitBestModel","text":"","code":"fitBestModel( optResult, metric, model, formula, trainingData, splitedData, algo )"},{"path":"/reference/fitBestModel.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"fitting in best model — fitBestModel","text":"optResult gridSearchCV의 결과값 metric 모델의 성능을 평가할 기준지표 (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\") model hyperparameters, ngine, mode 정보가 포함된 model object formula 모델링을 위한 수식 trainingData 훈련데이터 셋 splitedData train-test 데이터 분할 정보를 포함하고 있는 전체 데이터 셋 algo 사용자가 임의로 지정할 알고리즘명 (default: \"linear Regression\")","code":""},{"path":"/reference/fitBestModel.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"fitting in best model — fitBestModel","text":"gridSearchCV 함수 리턴값을 받아 가장 성능이 좋은 모델을 fitting합니다.","code":""},{"path":"/reference/gridSearchCV.html","id":null,"dir":"Reference","previous_headings":"","what":"Grid search with cross validation — gridSearchCV","title":"Grid search with cross validation — gridSearchCV","text":"Grid search cross validation","code":""},{"path":"/reference/gridSearchCV.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Grid search with cross validation — gridSearchCV","text":"","code":"gridSearchCV( rec = NULL, model = NULL, v = NULL, trainingData = NULL, parameterGrid = NULL, seed = NULL 
)"},{"path":"/reference/gridSearchCV.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Grid search with cross validation — gridSearchCV","text":"rec 데이터, 전처리 정보를 포함한 recipe object model hyperparameters, ngine, mode 정보가 포함된 model object v v-fold cross validation을 진행 (default: 5, 각 fold 별로 30개 이상의 observations가 있어야 유효한 모델링 결과를 얻을 수 있습니다.) trainingData 훈련데이터 셋 seed seed값 설정 parameter_grid grid search를 수행할 때 각 hyperparameter의 값을 담은 object","code":""},{"path":"/reference/gridSearchCV.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Grid search with cross validation — gridSearchCV","text":"하이퍼파라미터를 탐색하는 Grid Search와 데이터 셋을 나누어 평가하는 cross validation을 함께 수행합니다.","code":""},{"path":"/reference/kMeansClustering.html","id":null,"dir":"Reference","previous_headings":"","what":"K means clustering — kMeansClustering","title":"K means clustering — kMeansClustering","text":"K means clustering","code":""},{"path":"/reference/kMeansClustering.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"K means clustering — kMeansClustering","text":"","code":"kMeansClustering( data, maxK = 15, nStart = 25, iterMax = 10, nBoot = 100, algorithm = \"Hartigan-Wong\", selectOptimal = \"silhouette\", seedNum = 6471 )"},{"path":"/reference/kMeansClustering.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"K means clustering — kMeansClustering","text":"data 전처리가 완료된 데이터 maxK 클러스터링 수행 시 군집을 2, 3, ..., maxK개로 분할 (default: 15) iterMax 반복계산을 수행할 최대 횟수 (default: 10) nBoot gap statictic을 사용해 클러스터링을 수행할 때 Monte Carlo (bootstrap) 샘플의 개수 (selectOptimal == \"gap_stat\" 일 경우에만 지정, default: 100) algorithm K means를 수행할 알고리즘 선택 (\"Hartigan-Wong\" (default), \"Lloyd\", \"Forgy\", \"MacQueen\") selectOptimal 최적의 K값을 선정할 때 사용할 method 선택 (\"silhouette\" (default), \"gap_stat\") seedNum seed값 설정 nstart 랜덤 샘플에 대해 초기 클러스터링을 nstart번 시행 (default: 25)","code":""},{"path":"/reference/kMeansClustering.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"K means clustering — kMeansClustering","text":"K means clustering selectOptimal: silhouette, gap_stat hyperparameters: maxK, nstart","code":""},{"path":"/reference/KNN.html","id":null,"dir":"Reference","previous_headings":"","what":"K-Nearest Neighbors — KNN","title":"K-Nearest Neighbors — KNN","text":"K-Nearest Neighbors","code":""},{"path":"/reference/KNN.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"K-Nearest Neighbors — KNN","text":"","code":"KNN( algo = \"KNN\", engine = \"kknn\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 10, metric = NULL, seed = 1234 )"},{"path":"/reference/KNN.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"K-Nearest Neighbors — KNN","text":"algo 사용자가 임의로 지정할 알고리즘명 (default: \"KNN\") engine 모델을 생성할 때 사용할 패키지 (\"kknn\" (default)) mode 분석 유형 (\"classification\" (default), \"regression\") trainingData 훈련데이터 셋 splitedData train-test 데이터 분할 정보를 포함하고 있는 전체 데이터 셋 formula 모델링을 위한 수식 rec 데이터, 전처리 정보를 포함한 recipe object v v-fold cross validation을 진행 (default: 5, 각 fold 별로 30개 이상의 observations가 있어야 유효한 모델링 결과를 얻을 수 있습니다.) metric 모델의 성능을 평가할 기준지표 (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\") ... 
hyperparameters의 범위에 대한 Min, Max, Levels 값에 해당하는 파라미터를 지정합니다.","code":""},{"path":"/reference/KNN.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"K-Nearest Neighbors — KNN","text":"KNN 알고리즘 함수. 데이터로부터 거리가 가까운 K개의 다른 데이터의 레이블을 참조하여 분류하는 알고리즘 hyperparameters: neighbors","code":""},{"path":"/reference/lightGbm.html","id":null,"dir":"Reference","previous_headings":"","what":"Light GBM — lightGbm","title":"Light GBM — lightGbm","text":"Light GBM","code":""},{"path":"/reference/lightGbm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Light GBM — lightGbm","text":"","code":"lightGbm( algo = \"lightGBM\", engine = \"lightgbm\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 15, metric = NULL, seed = 1234 )"},{"path":"/reference/lightGbm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Light GBM — lightGbm","text":"algo 사용자가 임의로 지정할 알고리즘명 (default: \"lightGBM\") engine 모델을 생성할 때 사용할 패키지 (\"lightgbm\" (default)) mode 분석 유형 (\"classification\" (default), \"regression\") trainingData 훈련데이터 셋 splitedData train-test 데이터 분할 정보를 포함하고 있는 전체 데이터 셋 formula 모델링을 위한 수식 rec 데이터, 전처리 정보를 포함한 recipe object v v-fold cross validation을 진행 (default: 5, 각 fold 별로 30개 이상의 observations가 있어야 유효한 모델링 결과를 얻을 수 있습니다.) metric 모델의 성능을 평가할 기준지표 (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\") ... hyperparameters의 범위에 대한 Min, Max, Levels 값에 해당하는 파라미터를 지정합니다.","code":""},{"path":"/reference/lightGbm.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Light GBM — lightGbm","text":"Light GBM","code":""},{"path":"/reference/linearRegression.html","id":null,"dir":"Reference","previous_headings":"","what":"Linear Regression — linearRegression","title":"Linear Regression — linearRegression","text":"Linear Regression","code":""},{"path":"/reference/linearRegression.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Linear Regression — linearRegression","text":"","code":"linearRegression( algo = \"Linear Regression\", engine = \"glmnet\", mode = \"regression\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 10, metric = \"rmse\", seed = 1234 )"},{"path":"/reference/linearRegression.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Linear Regression — linearRegression","text":"algo 사용자가 임의로 지정할 알고리즘명 (default: \"Linear Regression\") engine 모델을 생성할 때 사용할 패키지 (\"glmnet\" (default), \"lm\", \"glm\", \"stan\") mode 분석 유형 (\"regression\" (default)) trainingData 훈련데이터 셋 splitedData train-test 데이터 분할 정보를 포함하고 있는 전체 데이터 셋 formula 모델링을 위한 수식 rec 데이터, 전처리 정보를 포함한 recipe object v v-fold cross validation을 진행 (default: 5, 각 fold 별로 30개 이상의 observations가 있어야 유효한 모델링 결과를 얻을 수 있습니다.) metric 모델의 성능을 평가할 기준지표 (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\") ... hyperparameters의 범위에 대한 Min, Max, Levels 값에 해당하는 파라미터를 지정합니다.","code":""},{"path":"/reference/linearRegression.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Linear Regression — linearRegression","text":"선형 회귀 알고리즘 함수. 선형회귀는 다음과 같은 가정을 합니다. 1) target - features 간의 선형성, 2) features 간 작은 다중공선성, 3) 등분산성 가정, 4) 오차항의 정규분포, 5) 오차항 간 적은 상관성. 만약 데이터가 이 가정을 충족하지 않는 경우 성능이 저하될 수 있습니다. 
hyperparameters: penalty, mixture","code":""},{"path":"/reference/logisticRegression.html","id":null,"dir":"Reference","previous_headings":"","what":"logistic Regression — logisticRegression","title":"logistic Regression — logisticRegression","text":"로지스틱 회귀 알고리즘 함수. 예측 변수들이 정규분포를 따르지 않아도 사용할 수 있습니다. 그러나 이 알고리즘은 결과 변수가 선형적으로 구분되며, 예측 변수들의 값이 결과 변수와 선형 관계를 갖는다고 가정합니다. 만약 데이터가 이 가정을 충족하지 않는 경우 성능이 저하될 수 있습니다.","code":""},{"path":"/reference/logisticRegression.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"logistic Regression — logisticRegression","text":"","code":"logisticRegression( algo = \"logistic Regression\", engine = \"glmnet\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 10, metric = \"roc_auc\", seed = 1234 )"},{"path":"/reference/logisticRegression.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"logistic Regression — logisticRegression","text":"algo 사용자가 임의로 지정할 알고리즘명 (default: \"logistic Regression\") engine 모델을 생성할 때 사용할 패키지 (\"glmnet\" (default)) mode 분석 유형 (\"classification\" (default)) trainingData 훈련데이터 셋 splitedData train-test 데이터 분할 정보를 포함하고 있는 전체 데이터 셋 formula 모델링을 위한 수식 rec 데이터, 전처리 정보를 포함한 recipe object v v-fold cross validation을 진행 (default: 5, 각 fold 별로 30개 이상의 observations가 있어야 유효한 모델링 결과를 얻을 수 있습니다.) metric 모델의 성능을 평가할 기준지표 (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\") ... hyperparameters의 범위에 대한 Min, Max, Levels 값에 해당하는 파라미터를 지정합니다.","code":""},{"path":"/reference/logisticRegression.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"logistic Regression — logisticRegression","text":"로지스틱 회귀 알고리즘 함수. 예측 변수들이 정규분포를 따르지 않아도 사용할 수 있습니다. 그러나 이 알고리즘은 결과 변수가 선형적으로 구분되며, 예측 변수들의 값이 결과 변수와 선형 관계를 갖는다고 가정합니다. 만약 데이터가 이 가정을 충족하지 않는 경우 성능이 저하될 수 있습니다. 필요 hyperparameters: penalty, mixture","code":""},{"path":"/reference/MLP.html","id":null,"dir":"Reference","previous_headings":"","what":"neural network — MLP","title":"neural network — MLP","text":"neural network","code":""},{"path":"/reference/MLP.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"neural network — MLP","text":"","code":"MLP( algo = \"MLP\", engine = \"nnet\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 10, metric = NULL, seed = 1234 )"},{"path":"/reference/MLP.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"neural network — MLP","text":"algo 사용자가 임의로 지정할 알고리즘명 (default: \"MLP\") engine 모델을 생성할 때 사용할 패키지 (\"nnet\" (default)) mode 분석 유형 (\"classification\" (default), \"regression\") trainingData 훈련데이터 셋 splitedData train-test 데이터 분할 정보를 포함하고 있는 전체 데이터 셋 formula 모델링을 위한 수식 rec 데이터, 전처리 정보를 포함한 recipe object v v-fold cross validation을 진행 (default: 5, 각 fold 별로 30개 이상의 observations가 있어야 유효한 모델링 결과를 얻을 수 있습니다.) metric 모델의 성능을 평가할 기준지표 (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\") ... hyperparameters의 범위에 대한 Min, Max, Levels 값에 해당하는 파라미터를 지정합니다.","code":""},{"path":"/reference/MLP.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"neural network — MLP","text":"neural network 알고리즘 함수. neural network 모델은 생물학적인 뉴런을 수학적으로 모델링한 것. 
여러개의 뉴런으로부터 입력값을 받아서 세포체에 저장하다가 자신의 용량을 넘어서면 외부로 출력값을 내보내는 것처럼, 인공신경망 뉴런은 여러 입력값을 받아서 일정 수준이 넘어서면 활성화되어 출력값을 내보낸다. hyperparameters: hidden_units, penalty, dropout, epochs, activation, learn_rate","code":""},{"path":"/reference/naiveBayes.html","id":null,"dir":"Reference","previous_headings":"","what":"Naive Bayes — naiveBayes","title":"Naive Bayes — naiveBayes","text":"Naive Bayes","code":""},{"path":"/reference/naiveBayes.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Naive Bayes — naiveBayes","text":"","code":"naiveBayes( algo = \"Naive Bayes\", engine = \"klaR\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 10, metric = NULL, seed = 1234 )"},{"path":"/reference/naiveBayes.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Naive Bayes — naiveBayes","text":"algo 사용자가 임의로 지정할 알고리즘명 (default: \"Naive Bayes\") engine 모델을 생성할 때 사용할 패키지 (\"klaR\" (default), naivebayes) mode 분석 유형 (\"classification\" (default)) trainingData 훈련데이터 셋 splitedData train-test 데이터 분할 정보를 포함하고 있는 전체 데이터 셋 formula 모델링을 위한 수식 rec 데이터, 전처리 정보를 포함한 recipe object v v-fold cross validation을 진행 (default: 5, 각 fold 별로 30개 이상의 observations가 있어야 유효한 모델링 결과를 얻을 수 있습니다.) metric 모델의 성능을 평가할 기준지표 (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\") ... hyperparameters의 범위에 대한 Min, Max, Levels 값에 해당하는 파라미터를 지정합니다.","code":""},{"path":"/reference/naiveBayes.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Naive Bayes — naiveBayes","text":"Naive Bayes hyperparameters: smoothness, Laplace","code":""},{"path":"/reference/pipe.html","id":null,"dir":"Reference","previous_headings":"","what":"AUC-ROC Curve — %>%","title":"AUC-ROC Curve — %>%","text":"AUC-ROC Curve Confusion matrix Regression plot Evaluation metrics Classification","code":""},{"path":"/reference/pipe.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"AUC-ROC Curve — %>%","text":"","code":"rocCurve(modelsList, targetVar) confusionMatrix(modelName, modelsList, targetVar) regressionPlot(modelName, modelsList, targetVar) evalMetricsC(modelsList, targetVar)"},{"path":"/reference/pipe.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"AUC-ROC Curve — %>%","text":"modelsList ML 모델 리스트 targetVar 타겟 변수 modelName 모델명","code":""},{"path":"/reference/pipe.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"AUC-ROC Curve — %>%","text":"ML 모델 리스트로부터 AUC-ROC Curve를 생성합니다. ML 모델 리스트 내 특정 모델에 대해 Confusion matrix를 생성합니다. ML 모델 리스트 내 특정 모델에 대해 Regression plot를 생성합니다. 
ML 모델 리스트로부터 Classification 모델들에 대한 Evaluation metrics를 생성합니다.","code":""},{"path":"/reference/prepForCV.html","id":null,"dir":"Reference","previous_headings":"","what":"Preprocessing for cross validation — prepForCV","title":"Preprocessing for cross validation — prepForCV","text":"Preprocessing cross validation","code":""},{"path":"/reference/prepForCV.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Preprocessing for cross validation — prepForCV","text":"","code":"prepForCV( data = NULL, formula = NULL, imputation = FALSE, normalization = FALSE, nominalImputationType = \"mode\", numericImputationType = \"mean\", normalizationType = \"range\", seed = \"4814\" )"},{"path":"/reference/prepForCV.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Preprocessing for cross validation — prepForCV","text":"data data formula formula imputation imputation normalization normalization normalizationType normalizationType seed seed imputationType imputationType","code":""},{"path":"/reference/prepForCV.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Preprocessing for cross validation — prepForCV","text":"Deprecated","code":""},{"path":"/reference/randomForest.html","id":null,"dir":"Reference","previous_headings":"","what":"Random Forest — randomForest","title":"Random Forest — randomForest","text":"Random Forest","code":""},{"path":"/reference/randomForest.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Random Forest — randomForest","text":"","code":"randomForest( algo = \"Random Forest\", engine = \"ranger\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 10, metric = NULL, seed = 1234 )"},{"path":"/reference/randomForest.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Random Forest — randomForest","text":"algo 사용자가 임의로 지정할 알고리즘명 (default: \"Random Forest\") engine 모델을 생성할 때 사용할 패키지 (\"rpart\" (default), \"randomForest\", \"partykit\") mode 분석 유형 (\"classification\" (default), \"regression\") trainingData 훈련데이터 셋 splitedData train-test 데이터 분할 정보를 포함하고 있는 전체 데이터 셋 formula 모델링을 위한 수식 rec 데이터, 전처리 정보를 포함한 recipe object v v-fold cross validation을 진행 (default: 5, 각 fold 별로 30개 이상의 observations가 있어야 유효한 모델링 결과를 얻을 수 있습니다.) metric 모델의 성능을 평가할 기준지표 (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\") ... hyperparameters의 범위에 대한 Min, Max, Levels 값에 해당하는 파라미터를 지정합니다.","code":""},{"path":"/reference/randomForest.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Random Forest — randomForest","text":"랜덤 포레스트 알고리즘 함수. 여러개의 Decision Tree를 형성. 새로운 데이터 포인트를 각 트리에 동시에 통과 시켜 각 트리가 분류한 결과에서 투표를 실시하여 가장 많이 득표한 결과를 최종 분류 결과로 선택 hyperparameters: trees: 결정트리의 개수를 지정합니다. min_n: 트리를 분할하기 위해 필요한 관측값의 최소 개수를 설정합니다. 
mtry: 트리를 분할하기 위해 필요한 feature의 수를 설정합니다.","code":""},{"path":"/reference/SVMLinear.html","id":null,"dir":"Reference","previous_headings":"","what":"SVMLinear — SVMLinear","title":"SVMLinear — SVMLinear","text":"SVMLinear","code":""},{"path":"/reference/SVMLinear.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"SVMLinear — SVMLinear","text":"","code":"SVMLinear( algo = \"SVM\", engine = \"kernlab\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 15, metric = NULL, seed = 1234 )"},{"path":"/reference/SVMLinear.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"SVMLinear — SVMLinear","text":"SVMLinear","code":""},{"path":"/reference/SVMPoly.html","id":null,"dir":"Reference","previous_headings":"","what":"SVMPoly — SVMPoly","title":"SVMPoly — SVMPoly","text":"SVMPoly","code":""},{"path":"/reference/SVMPoly.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"SVMPoly — SVMPoly","text":"","code":"SVMPoly( algo = \"SVM\", engine = \"kernlab\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 15, metric = NULL, seed = 1234 )"},{"path":"/reference/SVMPoly.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"SVMPoly — SVMPoly","text":"SVMPoly","code":""},{"path":"/reference/SVMRbf.html","id":null,"dir":"Reference","previous_headings":"","what":"SVMRbf — SVMRbf","title":"SVMRbf — SVMRbf","text":"SVMRbf","code":""},{"path":"/reference/SVMRbf.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"SVMRbf — SVMRbf","text":"","code":"SVMRbf( algo = \"SVM\", engine = \"kernlab\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 15, metric = NULL, seed = 1234 )"},{"path":"/reference/SVMRbf.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"SVMRbf — SVMRbf","text":"SVMRbf","code":""},{"path":"/reference/trainTestSplit.html","id":null,"dir":"Reference","previous_headings":"","what":"Train-Test Split — trainTestSplit","title":"Train-Test Split — trainTestSplit","text":"Train-Test Split","code":""},{"path":"/reference/trainTestSplit.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Train-Test Split — trainTestSplit","text":"","code":"trainTestSplit(data = NULL, target = NULL, prop, seed = \"4814\")"},{"path":"/reference/trainTestSplit.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Train-Test Split — trainTestSplit","text":"data 전처리가 완료된 전체 data target 타겟 변수 prop 전체 데이터 중 훈련 데이터로 사용할 비율 seed seed값 설정","code":""},{"path":"/reference/trainTestSplit.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Train-Test Split — trainTestSplit","text":"Data를 Train set과 Test set으로 분리합니다.","code":""},{"path":"/reference/xgBoost.html","id":null,"dir":"Reference","previous_headings":"","what":"XGBoost — xgBoost","title":"XGBoost — xgBoost","text":"XGBoost","code":""},{"path":"/reference/xgBoost.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"XGBoost — xgBoost","text":"","code":"xgBoost( algo = \"XGBoost\", engine = \"xgboost\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, 
iter = 10, metric = NULL, seed = 1234 )"},{"path":"/reference/xgBoost.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"XGBoost — xgBoost","text":"algo 사용자가 임의로 지정할 알고리즘명 (default: \"XGBoost\") engine 모델을 생성할 때 사용할 패키지 (\"xgboost\" (default)) mode 분석 유형 (\"classification\" (default), \"regression\") trainingData 훈련데이터 셋 splitedData train-test 데이터 분할 정보를 포함하고 있는 전체 데이터 셋 formula 모델링을 위한 수식 rec 데이터, 전처리 정보를 포함한 recipe object v v-fold cross validation을 진행 (default: 5, 각 fold 별로 30개 이상의 observations가 있어야 유효한 모델링 결과를 얻을 수 있습니다.) metric 모델의 성능을 평가할 기준지표 (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\") ... hyperparameters의 범위에 대한 Min, Max, Levels 값에 해당하는 파라미터를 지정합니다.","code":""},{"path":"/reference/xgBoost.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"XGBoost — xgBoost","text":"XGBoost","code":""}]
+[{"path":"/LICENSE.html","id":null,"dir":"","previous_headings":"","what":"MIT License","title":"MIT License","text":"Copyright (c) 2023 stove authors Permission hereby granted, free charge, person obtaining copy software associated documentation files (“Software”), deal Software without restriction, including without limitation rights use, copy, modify, merge, publish, distribute, sublicense, /sell copies Software, permit persons Software furnished , subject following conditions: copyright notice permission notice shall included copies substantial portions Software. SOFTWARE PROVIDED “”, WITHOUT WARRANTY KIND, EXPRESS IMPLIED, INCLUDING LIMITED WARRANTIES MERCHANTABILITY, FITNESS PARTICULAR PURPOSE NONINFRINGEMENT. EVENT SHALL AUTHORS COPYRIGHT HOLDERS LIABLE CLAIM, DAMAGES LIABILITY, WHETHER ACTION CONTRACT, TORT OTHERWISE, ARISING , CONNECTION SOFTWARE USE DEALINGS SOFTWARE.","code":""},{"path":"/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Yeonchan Seong. Author, maintainer.","code":""},{"path":"/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Seong Y (2023). stove: Stove. R package version 1.1, https://github.com/statgarten/stove.","code":"@Manual{, title = {stove: Stove}, author = {Yeonchan Seong}, year = {2023}, note = {R package version 1.1}, url = {https://github.com/statgarten/stove}, }"},{"path":"/index.html","id":"yellow_heart-stove-","dir":"","previous_headings":"","what":"Stove","title":"Stove","text":"stove package provides functions ML modeling. Packages Tidymodels used, configured easy ML beginners use. Although belongs statgarten whose packages incorporated shiny app, stove package also can used console.","code":""},{"path":"/index.html","id":"wrench-install","dir":"","previous_headings":"","what":"🔧 Install","title":"Stove","text":"","code":"# install.packages(\"devtools\") devtools::install_github(\"statgarten/stove\")"},{"path":[]},{"path":"/index.html","id":"id_1-sample-data-import","dir":"","previous_headings":"Example Code","what":"1. Sample Data Import","title":"Stove","text":"```{r} # remotes::install_github(“statgarten/datatoys”) library(stove) library(datatoys) library(dplyr) set.seed(1234) cleaned_data <- datatoys::bloodTest cleaned_data <- cleaned_data %>% mutate_at(vars(SEX, ANE, IHD, STK), factor) %>% mutate(TG = ifelse(TG < 150, 0, 1)) %>% mutate_at(vars(TG), factor) %>% group_by(TG) %>% sample_n(500) # TG(0):TG(1) = 500:500","code":"### 2. Data split and Define preprocessing ```{r} target_var <- \"TG\" train_set_ratio <- 0.7 seed <- 1234 formula <- paste0(target_var, \" ~ .\") # Split data split_tmp <- stove::trainTestSplit(data = cleaned_data, target = target_var, prop = train_set_ratio, seed = seed ) data_train <- split_tmp[[1]] # train data data_test <- split_tmp[[2]] # test data data_split <- split_tmp[[3]] # whole data with split information # Define preprocessing recipe for cross validation rec <- stove::prepForCV(data = data_train, formula = formula, imputation = T, normalization = T, seed = seed )"},{"path":"/index.html","id":"id_3-modeling","dir":"","previous_headings":"Example Code","what":"3. 
Modeling","title":"Stove","text":"```{r} # User input mode <- “classification” algo <- “logisticRegression” # Custom name engine <- “glmnet” # glmnet (default) v <- 2 metric <- “roc_auc” # roc_auc (default), accuracy gridNum <- 5 iter <- 10 seed <- 1234","code":""},{"path":"/index.html","id":"modeling-using-logistic-regression-algorithm","dir":"","previous_headings":"","what":"Modeling using logistic regression algorithm","title":"Stove","text":"finalized <- stove::logisticRegression( algo = algo, engine = engine, mode = mode, trainingData = data_train, splitedData = data_split, formula = formula, rec = rec, v = v, gridNum = gridNum, iter = iter, metric = metric, seed = seed ) ``` can compare several models’ performance visualize . documents contain example codes modeling workflow using stove.","code":""},{"path":"/index.html","id":"white_check_mark-recommendation","dir":"","previous_headings":"","what":"✅ Recommendation","title":"Stove","text":"training ML model, amount data required depends complexity task want solve complexity learning algorithm. ‘stove’ support training process without cross-validation. recommend training model data least 1,000 rows.","code":""},{"path":"/index.html","id":"blush-authors","dir":"","previous_headings":"","what":"😊 Authors","title":"Stove","text":"Yeonchan Seong @ycseong07","code":""},{"path":"/index.html","id":"memo-license","dir":"","previous_headings":"","what":"📝 License","title":"Stove","text":"Copyright ©️ 2022 Yeonchan Seong project MIT licensed","code":""},{"path":"/index.html","id":"clipboard-dependency","dir":"","previous_headings":"","what":"📋 Dependency","title":"Stove","text":"assertthat - 0.2.1 base64enc - 0.1-3 bayesplot - 1.10.0 boot - 1.3-28.1 C50 - 0.1.7 callr - 3.7.3 class - 7.3-20 cli - 3.6.0 cluster - 2.1.4 codetools - 0.2-18 colorspace - 2.0-3 colourpicker - 1.2.0 combinat - 0.0-8 cowplot - 1.1.1 crayon - 1.5.2 crosstalk - 1.2.0 Cubist - 0.4.1 data.table - 1.14.6 DBI - 1.1.3 dials - 1.1.0 DiceDesign - 1.9 digest - 0.6.31 discrim - 1.0.0 dplyr - 1.0.10 DT - 0.26 dygraphs - 1.1.1.6 ellipsis - 0.3.2 factoextra - 1.0.7 fansi - 1.0.3 fastmap - 1.1.0 forcats - 0.5.2 foreach - 1.5.2 Formula - 1.2-4 furrr - 0.3.1 future - 1.30.0 future.apply - 1.10.0 generics - 0.1.3 ggplot2 - 3.4.0 ggrepel - 0.9.2 glmnet - 4.1-6 globals - 0.16.2 glue - 1.6.2 gower - 1.0.1 GPfit - 1.0-8 gridExtra - 2.3 gtable - 0.3.1 gtools - 3.9.4 hardhat - 1.2.0 haven - 2.5.1 highr - 0.1 hms - 1.1.2 htmltools - 0.5.4 htmlwidgets - 1.6.1 httpuv - 1.6.7 igraph - 1.3.5 inline - 0.3.19 inum - 1.0-4 ipred - 0.9-13 iterators - 1.0.14 kknn - 1.3.1 klaR - 1.7-1 labelled - 2.10.0 later - 1.3.0 lattice - 0.20-45 lava - 1.7.1 lhs - 1.1.6 libcoin - 1.0-9 lifecycle - 1.0.3 listenv - 0.9.0 lme4 - 1.1-31 loo - 2.5.1 lubridate - 1.9.0 magrittr - 2.0.3 markdown - 1.4 MASS - 7.3-58.1 Matrix - 1.5-3 matrixStats - 0.63.0 mime - 0.12 miniUI - 0.1.1.1 minqa - 1.2.5 munsell - 0.5.0 mvtnorm - 1.1-3 naivebayes - 0.9.7 nlme - 3.1-161 nloptr - 2.0.3 nnet - 7.3-18 parallelly - 1.33.0 parsnip - 1.0.3 partykit - 1.2-16 pillar - 1.8.1 pkgbuild - 1.4.0 pkgconfig - 2.0.3 plyr - 1.8.8 prettyunits - 1.1.1 processx - 3.8.0 prodlim - 2019.11.13 promises - 1.2.0.1 ps - 1.7.0 purrr - 0.3.4 questionr - 0.7.7 R6 - 2.5.1 randomForest - 4.7-1.1 ranger - 0.14.1 RColorBrewer - 1.1-3 Rcpp - 1.0.9 RcppParallel - 5.1.6 recipes - 1.0.3 reshape2 - 1.4.4 rlang - rpart - 4.1.19 rsample - 1.1.1 rstan - 2.21.7 rstanarm - 2.21.3 rstantools - 2.2.0 rstudioapi - 0.14 scales - 1.2.1 sessioninfo - 1.2.2 shape - 1.4.6 shiny - 1.7.4 
shinyjs - 2.1.0 shinystan - 2.6.0 shinythemes - 1.2.0 StanHeaders - 2.21.0-7 stringi - 1.7.8 stringr - 1.5.0 survival - 3.5-0 threejs - 0.3.3 tibble - 3.1.8 tidyr - 1.2.1 tidyselect - 1.2.0 timechange - 0.1.1 timeDate - 4022.108 treesnip - 0.1.0.9001 tune - 1.0.1 utf8 - 1.2.2 vctrs - 0.5.1 withr - 2.5.0 workflows - 1.1.2 xtable - 1.8-4 xts - 0.12.2 yardstick - 1.1.0 zoo - 1.8-11","code":""},{"path":"/reference/KNN.html","id":null,"dir":"Reference","previous_headings":"","what":"K-Nearest Neighbors — KNN","title":"K-Nearest Neighbors — KNN","text":"K-Nearest Neighbors","code":""},{"path":"/reference/KNN.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"K-Nearest Neighbors — KNN","text":"","code":"KNN( algo = \"KNN\", engine = \"kknn\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 10, metric = NULL, seed = 1234 )"},{"path":"/reference/KNN.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"K-Nearest Neighbors — KNN","text":"algo name algorithm can customized user (default: \"KNN\"). engine name software used fit model (\"kknn\" (default)). mode model type. \"classification\" \"regression\" (\"classification\" (default), \"regression\"). trainingData training data. splitedData data frame including metadata split. formula formula modeling rec Recipe object containing preprocessing information cross-validation. v Applying v-fold cross validation modeling process (default: 5). gridNum Initial number iterations run starting optimization algorithm. iter maximum number search iterations. metric Metric evaluate performance (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\"). seed Seed reproducible results.","code":""},{"path":"/reference/KNN.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"K-Nearest Neighbors — KNN","text":"function training user-defined K-Nearest Neighbors model. Hyperparameters tuning: neighbors","code":""},{"path":"/reference/MLP.html","id":null,"dir":"Reference","previous_headings":"","what":"neural network — MLP","title":"neural network — MLP","text":"neural network","code":""},{"path":"/reference/MLP.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"neural network — MLP","text":"","code":"MLP( algo = \"MLP\", engine = \"nnet\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 10, metric = NULL, seed = 1234 )"},{"path":"/reference/MLP.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"neural network — MLP","text":"algo name algorithm can customized user (default: \"MLP\"). engine name software used fit model (\"nnet\" (default)). mode model type. \"classification\" \"regression\" (\"classification\" (default), \"regression\"). trainingData training data. splitedData data frame including metadata split. formula formula modeling rec Recipe object containing preprocessing information cross-validation. v Applying v-fold cross validation modeling process (default: 5). gridNum Initial number iterations run starting optimization algorithm. iter maximum number search iterations. metric Metric evaluate performance (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\"). 
seed Seed reproducible results.","code":""},{"path":"/reference/MLP.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"neural network — MLP","text":"function training user-defined MLP model. Hyperparameters tuning: hidden_units, penalty, epochs","code":""},{"path":"/reference/SVMLinear.html","id":null,"dir":"Reference","previous_headings":"","what":"SVMLinear — SVMLinear","title":"SVMLinear — SVMLinear","text":"SVMLinear","code":""},{"path":"/reference/SVMLinear.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"SVMLinear — SVMLinear","text":"","code":"SVMLinear( algo = \"SVMLinear\", engine = \"kernlab\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 15, metric = NULL, seed = 1234 )"},{"path":"/reference/SVMLinear.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"SVMLinear — SVMLinear","text":"algo name algorithm can customized user (default: \"SVMLinear\"). engine name software used fit model (\"kernlab\" (default)). mode model type. \"classification\" \"regression\" (\"classification\" (default), \"regression\"). trainingData training data. splitedData data frame including metadata split. formula formula modeling rec Recipe object containing preprocessing information cross-validation. v Applying v-fold cross validation modeling process (default: 5). gridNum Initial number iterations run starting optimization algorithm. iter maximum number search iterations. metric Metric evaluate performance (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\"). seed Seed reproducible results.","code":""},{"path":"/reference/SVMLinear.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"SVMLinear — SVMLinear","text":"function training user-defined SVM Linear model.","code":""},{"path":"/reference/SVMPoly.html","id":null,"dir":"Reference","previous_headings":"","what":"SVMPoly — SVMPoly","title":"SVMPoly — SVMPoly","text":"SVMPoly","code":""},{"path":"/reference/SVMPoly.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"SVMPoly — SVMPoly","text":"","code":"SVMPoly( algo = \"SVMPoly\", engine = \"kernlab\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 15, metric = NULL, seed = 1234 )"},{"path":"/reference/SVMPoly.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"SVMPoly — SVMPoly","text":"algo name algorithm can customized user (default: \"SVMPoly\"). engine name software used fit model (\"kernlab\" (default)). mode model type. \"classification\" \"regression\" (\"classification\" (default), \"regression\"). trainingData training data. splitedData data frame including metadata split. formula formula modeling rec Recipe object containing preprocessing information cross-validation. v Applying v-fold cross validation modeling process (default: 5). gridNum Initial number iterations run starting optimization algorithm. iter maximum number search iterations. metric Metric evaluate performance (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\"). 
seed Seed reproducible results.","code":""},{"path":"/reference/SVMPoly.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"SVMPoly — SVMPoly","text":"function training user-defined SVM Poly model.","code":""},{"path":"/reference/SVMRbf.html","id":null,"dir":"Reference","previous_headings":"","what":"SVMRbf — SVMRbf","title":"SVMRbf — SVMRbf","text":"SVMRbf","code":""},{"path":"/reference/SVMRbf.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"SVMRbf — SVMRbf","text":"","code":"SVMRbf( algo = \"SVMRbf\", engine = \"kernlab\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 15, metric = NULL, seed = 1234 )"},{"path":"/reference/SVMRbf.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"SVMRbf — SVMRbf","text":"algo name algorithm can customized user (default: \"SVMRbf\"). engine name software used fit model (\"kernlab\" (default)). mode model type. \"classification\" \"regression\" (\"classification\" (default), \"regression\"). trainingData training data. splitedData data frame including metadata split. formula formula modeling rec Recipe object containing preprocessing information cross-validation. v Applying v-fold cross validation modeling process (default: 5). gridNum Initial number iterations run starting optimization algorithm. iter maximum number search iterations. metric Metric evaluate performance (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\"). seed Seed reproducible results.","code":""},{"path":"/reference/SVMRbf.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"SVMRbf — SVMRbf","text":"function training user-defined SVM Rbf model.","code":""},{"path":"/reference/bayesOptCV.html","id":null,"dir":"Reference","previous_headings":"","what":"Bayesian optimization with cross validation — bayesOptCV","title":"Bayesian optimization with cross validation — bayesOptCV","text":"Bayesian optimization cross validation","code":""},{"path":"/reference/bayesOptCV.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Bayesian optimization with cross validation — bayesOptCV","text":"","code":"bayesOptCV( rec = NULL, model = NULL, v = NULL, trainingData = NULL, gridNum = NULL, iter = NULL, seed = NULL )"},{"path":"/reference/bayesOptCV.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Bayesian optimization with cross validation — bayesOptCV","text":"rec recipe object including local preprocessing. model model object including list hyperparameters, engine mode. v Perform cross-validation dividing training data v folds. trainingData training data. gridNum Initial number iterations run starting optimization algorithm. iter maximum number search iterations. 
seed Seed reproducible results.","code":""},{"path":"/reference/bayesOptCV.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Bayesian optimization with cross validation — bayesOptCV","text":"Optimize hyperparameters model Cross Validation Bayesian optimization.","code":""},{"path":"/reference/clusteringVis.html","id":null,"dir":"Reference","previous_headings":"","what":"clusteringVis — clusteringVis","title":"clusteringVis — clusteringVis","text":"clusteringVis","code":""},{"path":"/reference/clusteringVis.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"clusteringVis — clusteringVis","text":"","code":"clusteringVis( data = NULL, model = NULL, maxK = \"15\", nBoot = \"100\", selectOptimal = \"silhouette\", seedNum = \"6471\" )"},{"path":"/reference/clusteringVis.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"clusteringVis — clusteringVis","text":"data data model model maxK maxK nStart nStart","code":""},{"path":"/reference/clusteringVis.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"clusteringVis — clusteringVis","text":"Deprecated","code":""},{"path":"/reference/decisionTree.html","id":null,"dir":"Reference","previous_headings":"","what":"Decision Tree — decisionTree","title":"Decision Tree — decisionTree","text":"Decision Tree","code":""},{"path":"/reference/decisionTree.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Decision Tree — decisionTree","text":"","code":"decisionTree( algo = \"Decision Tree\", engine = \"rpart\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 10, metric = NULL, seed = 1234 )"},{"path":"/reference/decisionTree.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Decision Tree — decisionTree","text":"algo name algorithm can customized user (default: \"Decision Tree\"). engine name software used fit model (\"rpart\" (default), \"C50\", \"partykit\"). mode model type. \"classification\" \"regression\" (\"classification\" (default), \"regression\"). trainingData training data. splitedData data frame including metadata split. formula formula modeling rec Recipe object containing preprocessing information cross-validation. v Applying v-fold cross validation modeling process (default: 5). gridNum Initial number iterations run starting optimization algorithm. iter maximum number search iterations. metric Metric evaluate performance (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\"). seed Seed reproducible results.","code":""},{"path":"/reference/decisionTree.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Decision Tree — decisionTree","text":"function training user-defined Decision Tree model. 
Hyperparameters tuning: tree_depth, min_n, cost_complexity","code":""},{"path":"/reference/evalMetricsR.html","id":null,"dir":"Reference","previous_headings":"","what":"Evaluation metrics for Regression — evalMetricsR","title":"Evaluation metrics for Regression — evalMetricsR","text":"Evaluation metrics Regression","code":""},{"path":"/reference/evalMetricsR.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Evaluation metrics for Regression — evalMetricsR","text":"","code":"evalMetricsR(modelsList, targetVar)"},{"path":"/reference/evalMetricsR.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Evaluation metrics for Regression — evalMetricsR","text":"modelsList ML 모델 리스트 targetVar 타겟 변수","code":""},{"path":"/reference/evalMetricsR.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Evaluation metrics for Regression — evalMetricsR","text":"ML 모델 리스트로부터 Regression 모델들에 대한 Evaluation metrics를 생성합니다.","code":""},{"path":"/reference/fitBestModel.html","id":null,"dir":"Reference","previous_headings":"","what":"fitting in best model — fitBestModel","title":"fitting in best model — fitBestModel","text":"fitting best model","code":""},{"path":"/reference/fitBestModel.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"fitting in best model — fitBestModel","text":"","code":"fitBestModel( optResult, metric, model, formula, trainingData, splitedData, modelName )"},{"path":"/reference/fitBestModel.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"fitting in best model — fitBestModel","text":"optResult result object bayesOptCV metric Baseline metric evaluating model performance (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\") model model object including list hyperparameters, engine mode. formula formula modeling trainingData training data. splitedData whole dataset including information fold modelName name model defined algorithm engine selected user","code":""},{"path":"/reference/fitBestModel.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"fitting in best model — fitBestModel","text":"Get bayesOptCV function's return value fit model.","code":""},{"path":"/reference/gridSearchCV.html","id":null,"dir":"Reference","previous_headings":"","what":"Grid search with cross validation — gridSearchCV","title":"Grid search with cross validation — gridSearchCV","text":"Grid search cross validation","code":""},{"path":"/reference/gridSearchCV.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Grid search with cross validation — gridSearchCV","text":"","code":"gridSearchCV( rec = NULL, model = NULL, v = NULL, trainingData = NULL, parameterGrid = NULL, seed = NULL )"},{"path":"/reference/gridSearchCV.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Grid search with cross validation — gridSearchCV","text":"rec 데이터, 전처리 정보를 포함한 recipe object model hyperparameters, ngine, mode 정보가 포함된 model object v v-fold cross validation을 진행 (default: 5, 각 fold 별로 30개 이상의 observations가 있어야 유효한 모델링 결과를 얻을 수 있습니다.) 
trainingData 훈련데이터 셋 seed seed값 설정 parameter_grid grid search를 수행할 때 각 hyperparameter의 값을 담은 object","code":""},{"path":"/reference/gridSearchCV.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Grid search with cross validation — gridSearchCV","text":"하이퍼파라미터를 탐색하는 Grid Search와 데이터 셋을 나누어 평가하는 cross validation을 함께 수행합니다.","code":""},{"path":"/reference/kMeansClustering.html","id":null,"dir":"Reference","previous_headings":"","what":"K means clustering — kMeansClustering","title":"K means clustering — kMeansClustering","text":"K means clustering","code":""},{"path":"/reference/kMeansClustering.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"K means clustering — kMeansClustering","text":"","code":"kMeansClustering( data, maxK = 15, nStart = 25, iterMax = 10, nBoot = 100, algorithm = \"Hartigan-Wong\", selectOptimal = \"silhouette\", seedNum = 6471 )"},{"path":"/reference/kMeansClustering.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"K means clustering — kMeansClustering","text":"data 전처리가 완료된 데이터 maxK 클러스터링 수행 시 군집을 2, 3, ..., maxK개로 분할 (default: 15) iterMax 반복계산을 수행할 최대 횟수 (default: 10) nBoot gap statictic을 사용해 클러스터링을 수행할 때 Monte Carlo (bootstrap) 샘플의 개수 (selectOptimal == \"gap_stat\" 일 경우에만 지정, default: 100) algorithm K means를 수행할 알고리즘 선택 (\"Hartigan-Wong\" (default), \"Lloyd\", \"Forgy\", \"MacQueen\") selectOptimal 최적의 K값을 선정할 때 사용할 method 선택 (\"silhouette\" (default), \"gap_stat\") seedNum seed값 설정 nstart 랜덤 샘플에 대해 초기 클러스터링을 nstart번 시행 (default: 25)","code":""},{"path":"/reference/kMeansClustering.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"K means clustering — kMeansClustering","text":"function K means clustering. parameters tuning: maxK, nstart","code":""},{"path":"/reference/lightGbm.html","id":null,"dir":"Reference","previous_headings":"","what":"Light GBM — lightGbm","title":"Light GBM — lightGbm","text":"Light GBM","code":""},{"path":"/reference/lightGbm.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Light GBM — lightGbm","text":"","code":"lightGbm( algo = \"lightGBM\", engine = \"lightgbm\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 15, metric = NULL, seed = 1234 )"},{"path":"/reference/lightGbm.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Light GBM — lightGbm","text":"algo name algorithm can customized user. (default: \"lightGBM\"). engine name software used fit model(\"lightgbm\" (default)). mode model type. \"classification\" \"regression\" (\"classification\" (default), \"regression\"). trainingData training data. splitedData data frame including metadata split. formula formula modeling rec Recipe object containing preprocessing information cross-validation. v Applying v-fold cross validation modeling process (default: 5). gridNum Initial number iterations run starting optimization algorithm. iter maximum number search iterations. metric Metric evaluate performance (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\"). seed Seed reproducible results.","code":""},{"path":"/reference/lightGbm.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Light GBM — lightGbm","text":"function training user-defined Light GBM model. 
Hyperparameters tuning: tree_depth, trees, learn_rate, mtry, min_n, loss_reduction","code":""},{"path":"/reference/linearRegression.html","id":null,"dir":"Reference","previous_headings":"","what":"Linear Regression — linearRegression","title":"Linear Regression — linearRegression","text":"Linear Regression","code":""},{"path":"/reference/linearRegression.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Linear Regression — linearRegression","text":"","code":"linearRegression( algo = \"Linear Regression\", engine = \"glmnet\", mode = \"regression\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 10, metric = \"rmse\", seed = 1234 )"},{"path":"/reference/linearRegression.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Linear Regression — linearRegression","text":"algo name algorithm can customized user (default: \"Linear Regression\"). engine name software used fit model (\"glmnet\" (default), \"lm\", \"glm\", \"stan\"). mode model type. \"classification\" \"regression\" (\"regression\" (default)). trainingData training data. splitedData data frame including metadata split. formula formula modeling rec Recipe object containing preprocessing information cross-validation. v Applying v-fold cross validation modeling process (default: 5). gridNum Initial number iterations run starting optimization algorithm. iter maximum number search iterations. metric Metric evaluate performance (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\"). seed Seed reproducible results.","code":""},{"path":"/reference/linearRegression.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Linear Regression — linearRegression","text":"function training user-defined Linear Regression model. Hyperparameters tuning: penalty, mixture","code":""},{"path":"/reference/logisticRegression.html","id":null,"dir":"Reference","previous_headings":"","what":"Logistic Regression — logisticRegression","title":"Logistic Regression — logisticRegression","text":"function training user-defined Logistic regression model. function supports: binary classification","code":""},{"path":"/reference/logisticRegression.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Logistic Regression — logisticRegression","text":"","code":"logisticRegression( algo = \"Logistic Regression\", engine = \"glmnet\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 10, metric = \"roc_auc\", seed = 1234 )"},{"path":"/reference/logisticRegression.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Logistic Regression — logisticRegression","text":"algo name algorithm can customized user (default: \"Logistic Regression\"). engine name software used fit model (Option: \"glmnet\" (default)). mode model type. \"classification\" \"regression\" (Option: \"classification\" (default)). trainingData training data. splitedData whole dataset including information fold formula formula modeling rec Recipe object containing preprocessing information cross-validation v Applying v-fold cross validation modeling process (default: 5) gridNum Initial number iterations run starting optimization algorithm. iter maximum number search iterations. 
metric Metric evaluate performance (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\"). seed Seed reproducible results.","code":""},{"path":"/reference/logisticRegression.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Logistic Regression — logisticRegression","text":"Hyperparameters tuning: penalty, mixture","code":""},{"path":"/reference/multinomialRegression.html","id":null,"dir":"Reference","previous_headings":"","what":"Multinomial Regression — multinomialRegression","title":"Multinomial Regression — multinomialRegression","text":"function training user-defined Multinomial regression model. function supports: multinomial classification","code":""},{"path":"/reference/multinomialRegression.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Multinomial Regression — multinomialRegression","text":"","code":"multinomialRegression( algo = \"Multinomial Regression\", engine = \"glmnet\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 10, metric = \"roc_auc\", seed = 1234 )"},{"path":"/reference/multinomialRegression.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Multinomial Regression — multinomialRegression","text":"algo name algorithm can customized user (default: \"Multinomial Regression\"). engine name software used fit model (Option: \"glmnet\" (default)). mode model type. \"classification\" \"regression\" (Option: \"classification\" (default)). trainingData data frame training. splitedData data frame including metadata split. formula formula modeling. rec Recipe object containing preprocessing information cross-validation. v Applying v-fold cross validation modeling process (default: 5). gridNum Initial number iterations run starting optimization algorithm. iter maximum number search iterations. metric Metric evaluate performance (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\"). seed Seed reproducible results.","code":""},{"path":"/reference/multinomialRegression.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Multinomial Regression — multinomialRegression","text":"Hyperparameters tuning: penalty, mixture","code":""},{"path":"/reference/naiveBayes.html","id":null,"dir":"Reference","previous_headings":"","what":"Naive Bayes — naiveBayes","title":"Naive Bayes — naiveBayes","text":"Naive Bayes","code":""},{"path":"/reference/naiveBayes.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Naive Bayes — naiveBayes","text":"","code":"naiveBayes( algo = \"Naive Bayes\", engine = \"klaR\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 10, metric = NULL, seed = 1234 )"},{"path":"/reference/naiveBayes.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Naive Bayes — naiveBayes","text":"algo name algorithm can customized user (default: \"Naive Bayes\"). engine name software used fit model (\"klaR\" (default), naivebayes). mode model type. \"classification\" \"regression\" (\"classification\" (default)). trainingData training data. splitedData data frame including metadata split. formula formula modeling rec Recipe object containing preprocessing information cross-validation. 
v Applying v-fold cross validation modeling process (default: 5). gridNum Initial number iterations run starting optimization algorithm. iter maximum number search iterations. metric Metric evaluate performance (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\"). seed Seed reproducible results.","code":""},{"path":"/reference/naiveBayes.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Naive Bayes — naiveBayes","text":"function training user-defined Naive Bayes model. Hyperparameters tuning: smoothness, Laplace","code":""},{"path":"/reference/pipe.html","id":null,"dir":"Reference","previous_headings":"","what":"AUC-ROC Curve — %>%","title":"AUC-ROC Curve — %>%","text":"AUC-ROC Curve Confusion matrix Regression plot Evaluation metrics Classification","code":""},{"path":"/reference/pipe.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"AUC-ROC Curve — %>%","text":"","code":"rocCurve(modelsList, targetVar) confusionMatrix(modelName, modelsList, targetVar) regressionPlot(modelName, modelsList, targetVar) evalMetricsC(modelsList, targetVar)"},{"path":"/reference/pipe.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"AUC-ROC Curve — %>%","text":"modelsList list ML models targetVar target variable modelName name model","code":""},{"path":"/reference/pipe.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"AUC-ROC Curve — %>%","text":"Create AUC-ROC Curve list ML models. Create Confusion matrix specific model list ML models. Create Regression plot specific model list ML models. Create Evaluation metrics Classification models list ML models.","code":""},{"path":"/reference/plotRmseComparison.html","id":null,"dir":"Reference","previous_headings":"","what":"rmsePlot — plotRmseComparison","title":"rmsePlot — plotRmseComparison","text":"rmsePlot","code":""},{"path":"/reference/plotRmseComparison.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"rmsePlot — plotRmseComparison","text":"","code":"plotRmseComparison(tunedResultsList, v = v, iter = iter)"},{"path":"/reference/plotRmseComparison.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"rmsePlot — plotRmseComparison","text":"rmsePlot","code":""},{"path":"/reference/prepForCV.html","id":null,"dir":"Reference","previous_headings":"","what":"Preprocessing for cross validation — prepForCV","title":"Preprocessing for cross validation — prepForCV","text":"Preprocessing cross validation","code":""},{"path":"/reference/prepForCV.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Preprocessing for cross validation — prepForCV","text":"","code":"prepForCV( data = NULL, formula = NULL, imputation = FALSE, normalization = FALSE, nominalImputationType = \"mode\", numericImputationType = \"mean\", normalizationType = \"range\", seed = \"4814\" )"},{"path":"/reference/prepForCV.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Preprocessing for cross validation — prepForCV","text":"data Training dataset apply local preprocessing recipe. formula formula modeling imputation \"imputation = TRUE\", model trained using cross-validation imputation. 
normalization \"normalization = TRUE\", model trained using cross-validation normalization nominalImputationType Imputation method nominal variable (Option: mode(default), bag, knn) numericImputationType Imputation method numeric variable (Option: mean(default), bag, knn, linear, lower, median, roll) normalizationType Normalization method (Option: range(default), center, normalization, scale) seed Seed reproducible results.","code":""},{"path":"/reference/prepForCV.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Preprocessing for cross validation — prepForCV","text":"Define local preprocessing method applied training data fold training data divided several folds.","code":""},{"path":"/reference/randomForest.html","id":null,"dir":"Reference","previous_headings":"","what":"Random Forest — randomForest","title":"Random Forest — randomForest","text":"Random Forest","code":""},{"path":"/reference/randomForest.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Random Forest — randomForest","text":"","code":"randomForest( algo = \"Random Forest\", engine = \"ranger\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 10, metric = NULL, seed = 1234 )"},{"path":"/reference/randomForest.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Random Forest — randomForest","text":"algo name algorithm can customized user (default: \"Random Forest\"). engine name software used fit model (\"ranger\" (default), \"randomForest\", \"partykit\"). mode model type. \"classification\" \"regression\" (\"classification\" (default), \"regression\"). trainingData training data. splitedData data frame including metadata split. formula formula modeling rec Recipe object containing preprocessing information cross-validation. v Applying v-fold cross validation modeling process (default: 5). gridNum Initial number iterations run starting optimization algorithm. iter maximum number search iterations. metric Metric evaluate performance (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\"). seed Seed reproducible results.","code":""},{"path":"/reference/randomForest.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Random Forest — randomForest","text":"function training user-defined Random Forest model. Hyperparameters tuning: trees, min_n, mtry","code":""},{"path":"/reference/trainTestSplit.html","id":null,"dir":"Reference","previous_headings":"","what":"Train-Test Split — trainTestSplit","title":"Train-Test Split — trainTestSplit","text":"Train-Test Split","code":""},{"path":"/reference/trainTestSplit.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Train-Test Split — trainTestSplit","text":"","code":"trainTestSplit(data = NULL, target = NULL, prop, seed = \"4814\")"},{"path":"/reference/trainTestSplit.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Train-Test Split — trainTestSplit","text":"data Full data set global preprocess completed. target target variable. prop Proportion total data used training data. 
seed Seed reproducible results.","code":""},{"path":"/reference/trainTestSplit.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Train-Test Split — trainTestSplit","text":"Separate entire data training set test set.","code":""},{"path":"/reference/xgBoost.html","id":null,"dir":"Reference","previous_headings":"","what":"XGBoost — xgBoost","title":"XGBoost — xgBoost","text":"function training user-defined XGBoost model. Hyperparameters tuning: tree_depth, trees, learn_rate, mtry, min_n, loss_reduction, sample_size","code":""},{"path":"/reference/xgBoost.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"XGBoost — xgBoost","text":"","code":"xgBoost( algo = \"XGBoost\", engine = \"xgboost\", mode = \"classification\", trainingData = NULL, splitedData = NULL, formula = NULL, rec = NULL, v = 5, gridNum = 5, iter = 10, metric = NULL, seed = 1234 )"},{"path":"/reference/xgBoost.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"XGBoost — xgBoost","text":"algo name algorithm can customized user (default: \"XGBoost\"). engine name software used fit model (\"xgboost\" (default)). mode model type. \"classification\" \"regression\" (\"classification\" (default), \"regression\"). trainingData training data. splitedData data frame including metadata split. formula formula modeling rec Recipe object containing preprocessing information cross-validation. v Applying v-fold cross validation modeling process (default: 5). gridNum Initial number iterations run starting optimization algorithm. iter maximum number search iterations. metric Metric evaluate performance (classification: \"roc_auc\" (default), \"accuracy\" / regression: \"rmse\" (default), \"rsq\"). seed Seed reproducible results.","code":""},{"path":"/reference/xgBoost.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"XGBoost — xgBoost","text":"XGBoost","code":""}]
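
The reference entries above give the full argument lists for the stove wrappers but no end-to-end call sequence, so a short usage sketch may help tie them together. The sketch below is illustrative only: the function signatures follow the documentation above, while the input data frame (cleaned_df), the target column ("Class"), the string form of the formula, and the element names pulled out of the trainTestSplit() result are assumptions rather than documented behaviour.

library(stove)

## Assumed input: cleaned_df, a preprocessed data frame with a factor column "Class".
splitResult <- trainTestSplit(data = cleaned_df, target = "Class", prop = 0.7, seed = "4814")
trainData <- splitResult$train       # assumed element name for the training portion
splitData <- splitResult$dataSplit   # assumed element name for the split metadata

## Local preprocessing recipe, re-applied inside each cross-validation fold.
rec <- prepForCV(
  data = trainData,
  formula = "Class ~ .",   # whether a string or a formula object is expected is not stated above
  imputation = TRUE,
  normalization = TRUE,
  seed = "4814"
)

## Fit one supervised learner; randomForest(), lightGbm(), logisticRegression(), ... share this interface.
xgbFit <- xgBoost(
  mode = "classification",
  trainingData = trainData,
  splitedData = splitData,
  formula = "Class ~ .",
  rec = rec,
  v = 5, gridNum = 5, iter = 10,
  metric = "roc_auc",
  seed = 1234
)

## The evaluation helpers take a list of fitted models plus the target variable name.
modelsList <- list(xgbFit)           # assumed structure of modelsList
evalMetricsC(modelsList = modelsList, targetVar = "Class")
rocCurve(modelsList = modelsList, targetVar = "Class")

For the unsupervised path, kMeansClustering() is self-contained; the call below only repeats the documented defaults, with scaledData standing in for a numeric, fully preprocessed data frame (an assumption, like the choice of maxK).

## Assumed input: scaledData, a numeric data frame with no missing values.
kmResult <- kMeansClustering(
  data = scaledData,
  maxK = 10,                      # evaluate k = 2, ..., 10
  nStart = 25, iterMax = 10,
  selectOptimal = "silhouette",   # "gap_stat" would additionally use nBoot
  seedNum = 6471
)
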
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index fb35e7d..00f4fc6 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -3,6 +3,12 @@
/404.html
+ /LICENSE-text.html
+ /LICENSE.html
/authors.html
@@ -10,10 +16,19 @@
/index.html
- /LICENSE-text.html
+ /reference/KNN.html
- /LICENSE.html
+ /reference/MLP.html
+ /reference/SVMLinear.html
+ /reference/SVMPoly.html
+ /reference/SVMRbf.html
/reference/bayesOptCV.html
@@ -39,9 +54,6 @@
/reference/kMeansClustering.html
- /reference/KNN.html
/reference/lightGbm.html
@@ -52,7 +64,7 @@
/reference/logisticRegression.html
- /reference/MLP.html
+ /reference/multinomialRegression.html
/reference/naiveBayes.html
@@ -61,19 +73,13 @@
/reference/pipe.html
- /reference/prepForCV.html
+ /reference/plotRmseComparison.html
- /reference/randomForest.html
- /reference/SVMLinear.html
- /reference/SVMPoly.html
+ /reference/prepForCV.html
- /reference/SVMRbf.html
+ /reference/randomForest.html
/reference/trainTestSplit.html