From 1e38439ff60601bfd66331a0bb978da13b02602c Mon Sep 17 00:00:00 2001 From: valer1435 Date: Sat, 8 Jun 2024 21:22:40 +0300 Subject: [PATCH 1/6] add ai gen files --- docs/source/examples/index.rst | 2 +- .../examples/simple/api_classification.rst | 106 +++++ docs/source/examples/simple/api_explain.rst | 83 ++++ .../examples/simple/api_forecasting.rst | 124 ++++++ .../source/examples/simple/api_regression.rst | 123 ++++++ docs/source/examples/simple/cgru.rst | 116 +++++ .../classification_with_api_builder.rst | 90 ++++ .../simple/classification_with_tuning.rst | 93 ++++ .../examples/simple/cli_call_example.rst | 105 +++++ docs/source/examples/simple/fitted_values.rst | 102 +++++ .../simple/image_classification_problem.rst | 103 +++++ docs/source/examples/simple/index.rst | 32 ++ .../examples/simple/multiclass_prediction.rst | 156 +++++++ .../simple/multiple_ts_forecasting_tasks.rst | 73 ++++ .../pipeline_and_history_visualization.rst | 111 +++++ .../examples/simple/pipeline_explain.rst | 102 +++++ .../simple/pipeline_import_export.rst | 227 ++++++++++ docs/source/examples/simple/pipeline_log.rst | 85 ++++ docs/source/examples/simple/pipeline_tune.rst | 116 +++++ .../simple/pipeline_tuning_with_iopt.rst | 125 ++++++ .../simple/pipeline_visualization.rst | 144 +++++++ .../simple/regression_with_tuning.rst | 85 ++++ .../examples/simple/resample_example.rst | 130 ++++++ docs/source/examples/simple/ts_pipelines.rst | 398 ++++++++++++++++++ .../examples/simple/tuning_pipelines.rst | 132 ++++++ 25 files changed, 2962 insertions(+), 1 deletion(-) create mode 100644 docs/source/examples/simple/api_classification.rst create mode 100644 docs/source/examples/simple/api_explain.rst create mode 100644 docs/source/examples/simple/api_forecasting.rst create mode 100644 docs/source/examples/simple/api_regression.rst create mode 100644 docs/source/examples/simple/cgru.rst create mode 100644 docs/source/examples/simple/classification_with_api_builder.rst create mode 100644 
docs/source/examples/simple/classification_with_tuning.rst create mode 100644 docs/source/examples/simple/cli_call_example.rst create mode 100644 docs/source/examples/simple/fitted_values.rst create mode 100644 docs/source/examples/simple/image_classification_problem.rst create mode 100644 docs/source/examples/simple/index.rst create mode 100644 docs/source/examples/simple/multiclass_prediction.rst create mode 100644 docs/source/examples/simple/multiple_ts_forecasting_tasks.rst create mode 100644 docs/source/examples/simple/pipeline_and_history_visualization.rst create mode 100644 docs/source/examples/simple/pipeline_explain.rst create mode 100644 docs/source/examples/simple/pipeline_import_export.rst create mode 100644 docs/source/examples/simple/pipeline_log.rst create mode 100644 docs/source/examples/simple/pipeline_tune.rst create mode 100644 docs/source/examples/simple/pipeline_tuning_with_iopt.rst create mode 100644 docs/source/examples/simple/pipeline_visualization.rst create mode 100644 docs/source/examples/simple/regression_with_tuning.rst create mode 100644 docs/source/examples/simple/resample_example.rst create mode 100644 docs/source/examples/simple/ts_pipelines.rst create mode 100644 docs/source/examples/simple/tuning_pipelines.rst diff --git a/docs/source/examples/index.rst b/docs/source/examples/index.rst index 9218b72b0d..165ae88be8 100644 --- a/docs/source/examples/index.rst +++ b/docs/source/examples/index.rst @@ -7,7 +7,6 @@ In this section you can find notebooks and useful pipeline structures for variou .. 
toctree:: :glob: :maxdepth: 1 - :caption: Contents classification_example regression_example @@ -17,3 +16,4 @@ In this section you can find notebooks and useful pipeline structures for variou classification_pipelines regression_pipelines ts_pipelines + simple/index diff --git a/docs/source/examples/simple/api_classification.rst b/docs/source/examples/simple/api_classification.rst new file mode 100644 index 0000000000..72842d1d58 --- /dev/null +++ b/docs/source/examples/simple/api_classification.rst @@ -0,0 +1,106 @@ + +Fedot Classification Example +============================ + +This example demonstrates how to use the Fedot framework for automated machine learning (AutoML) to perform classification tasks. It compares a baseline model with a more sophisticated automated model that includes hyperparameter tuning. + +Overview +-------- + +The example uses the Fedot library to train and evaluate classification models on a dataset. It first sets up a baseline model using a predefined Random Forest model and then sets up an automated model with hyperparameter tuning. The performance of both models is evaluated, and the automated model's predictions are visualized if the `visualization` parameter is set to `True`. + +Step-by-Step Guide +------------------ + +1. **Import Necessary Modules** + + .. code-block:: python + + from fedot import Fedot + from fedot.core.utils import fedot_project_root, set_random_seed + +2. **Define the `run_classification_example` Function** + + This function takes three parameters: `timeout`, `visualization`, and `with_tuning`. It sets up the problem type, data paths, and initializes the models. + + .. code-block:: python + + def run_classification_example(timeout: float = None, visualization=False, with_tuning=True): + problem = 'classification' + train_data_path = f'{fedot_project_root()}/cases/data/scoring/scoring_train.csv' + test_data_path = f'{fedot_project_root()}/cases/data/scoring/scoring_test.csv' + +3. 
**Initialize and Train the Baseline Model** + + The baseline model is initialized with a predefined Random Forest model and trained on the training data. + + .. code-block:: python + + baseline_model = Fedot(problem=problem, timeout=timeout) + baseline_model.fit(features=train_data_path, target='target', predefined_model='rf') + +4. **Make Predictions with the Baseline Model** + + The baseline model makes predictions on the test data and its metrics are printed. + + .. code-block:: python + + baseline_model.predict(features=test_data_path) + print(baseline_model.get_metrics()) + +5. **Initialize and Train the Automated Model** + + The automated model is initialized with settings for best quality, hyperparameter tuning, and other parameters. It is trained on the training data. + + .. code-block:: python + + auto_model = Fedot(problem=problem, timeout=timeout, n_jobs=-1, preset='best_quality', + max_pipeline_fit_time=5, metric=['roc_auc', 'precision'], with_tuning=with_tuning) + auto_model.fit(features=train_data_path, target='target') + +6. **Make Predictions with the Automated Model** + + The automated model makes probability predictions on the test data, and its metrics are printed with a specified rounding order. + + .. code-block:: python + + prediction = auto_model.predict_proba(features=test_data_path) + print(auto_model.get_metrics(rounding_order=4)) + +7. **Visualize the Predictions (Optional)** + + If `visualization` is set to `True`, the predictions of the automated model are visualized. + + .. code-block:: python + + if visualization: + auto_model.plot_prediction() + +8. **Return the Predictions** + + The function returns the predictions made by the automated model. + + .. code-block:: python + + return prediction + +9. **Run the Example** + + The example is executed with a specified timeout and visualization. + + .. 
code-block:: python + + if __name__ == '__main__': + set_random_seed(42) + run_classification_example(timeout=10.0, visualization=True) + +Usage +----- + +To use this example, you can copy and paste the code into your Python environment. Ensure that the Fedot library is installed and that the paths to the datasets are correct. You can modify the `timeout`, `visualization`, and `with_tuning` parameters to suit your needs. + +.. note:: + This example assumes that the required datasets are available at the specified paths within the Fedot project structure. If you are using different datasets, you will need to adjust the `train_data_path` and `test_data_path` variables accordingly. + +.. note:: + For more information on the Fedot library and its capabilities, please refer to the `Fedot documentation `_. diff --git a/docs/source/examples/simple/api_explain.rst b/docs/source/examples/simple/api_explain.rst new file mode 100644 index 0000000000..0592a58e0c --- /dev/null +++ b/docs/source/examples/simple/api_explain.rst @@ -0,0 +1,83 @@ + +Fedot API Explain Example +========================= + +Overview +-------- + +This example demonstrates how to use the Fedot framework to build a classification model and explain its predictions using a surrogate decision tree. The example uses a dataset from the Fedot project's cases directory, specifically the cancer training dataset. The model is built using a predefined Random Forest model and then explained using a surrogate decision tree method. + +Step-by-Step Guide +------------------ + +1. **Import Necessary Libraries** + + The first step is to import the necessary libraries, which include `pandas` for data manipulation and `Fedot` for the machine learning framework. + + .. code-block:: python + + import pandas as pd + from fedot import Fedot + from fedot.core.utils import fedot_project_root + +2. 
**Define the Function to Run the Example** + + The function `run_api_explain_example` is defined with parameters for visualization, timeout, and whether to perform model tuning. + + .. code-block:: python + + def run_api_explain_example(visualization=False, timeout=None, with_tuning=True): + +3. **Load the Training Data** + + The training data is loaded from a CSV file located in the Fedot project's cases/data/cancer directory. + + .. code-block:: python + + train_data = pd.read_csv(f'{fedot_project_root()}/cases/data/cancer/cancer_train.csv', index_col=0) + +4. **Prepare Visualization Parameters** + + The feature and class names are extracted for visualization purposes. + + .. code-block:: python + + feature_names = train_data.columns.tolist() + target_name = feature_names.pop() + target = train_data[target_name] + class_names = target.unique().astype(str).tolist() + +5. **Build the Classification Model** + + A Fedot model is initialized with the problem type 'classification', a timeout, and a flag that controls model tuning. The model is then fitted using the training data and a predefined Random Forest model. + + .. code-block:: python + + model = Fedot(problem='classification', timeout=timeout, with_tuning=with_tuning) + model.fit(features=train_data, target=target_name, predefined_model='rf') + +6. **Explain the Model Predictions** + + The model's predictions are explained using a surrogate decision tree method. If visualization is enabled, the explanation is saved as an image to `figure_path`, which must be defined before this call (the snippets in this guide do not set it). + + .. code-block:: python + + explainer = model.explain( + method='surrogate_dt', visualization=visualization, + save_path=figure_path, dpi=200, feature_names=feature_names, + class_names=class_names, precision=6 + ) + +7. **Run the Example** + + The example is executed with visualization enabled and a timeout of 5 minutes (Fedot's `timeout` is given in minutes). + + .. 
code-block:: python + + if __name__ == '__main__': + run_api_explain_example(visualization=True, timeout=5) + +Conclusion +---------- + +This example showcases the use of the Fedot framework for building a classification model and explaining its predictions. It demonstrates how to load data, build a model, and use a surrogate decision tree to explain the model's decisions. The example can be easily adapted for different datasets and models by modifying the data loading and model configuration sections. diff --git a/docs/source/examples/simple/api_forecasting.rst b/docs/source/examples/simple/api_forecasting.rst new file mode 100644 index 0000000000..b07af15f02 --- /dev/null +++ b/docs/source/examples/simple/api_forecasting.rst @@ -0,0 +1,124 @@ + +Fedot Time Series Forecasting Example +================================================================= + +Summary +------- + +This example demonstrates how to use the Fedot framework for time series forecasting. It includes the process of loading a time series dataset, setting up the forecasting task, training a model, and visualizing the results. The example is structured to be easily adaptable for different datasets and forecasting horizons. + +Step-by-Step Guide +------------------ + +### Importing Necessary Libraries + +.. code-block:: python + + import logging + import random + + import numpy as np + import pandas as pd + from matplotlib import pyplot as plt + + from fedot import Fedot + from fedot.core.data.data import InputData + from fedot.core.data.data_split import train_test_data_setup + from fedot.core.repository.dataset_types import DataTypesEnum + from fedot.core.repository.tasks import TsForecastingParams, Task, TaskTypesEnum + from fedot.core.utils import fedot_project_root + + logging.raiseExceptions = False + +### Defining Paths and Datasets + +.. 
code-block:: python + + _TS_EXAMPLES_DATA_PATH = fedot_project_root().joinpath('examples/data/ts') + + TS_DATASETS = { + 'm4_daily': _TS_EXAMPLES_DATA_PATH.joinpath('M4Daily.csv'), + 'm4_monthly': _TS_EXAMPLES_DATA_PATH.joinpath('M4Monthly.csv'), + 'm4_quarterly': _TS_EXAMPLES_DATA_PATH.joinpath('M4Quarterly.csv'), + 'm4_weekly': _TS_EXAMPLES_DATA_PATH.joinpath('M4Weekly.csv'), + 'm4_yearly': _TS_EXAMPLES_DATA_PATH.joinpath('M4Yearly.csv'), + 'australia': _TS_EXAMPLES_DATA_PATH.joinpath('australia.csv'), + 'beer': _TS_EXAMPLES_DATA_PATH.joinpath('beer.csv'), + 'salaries': _TS_EXAMPLES_DATA_PATH.joinpath('salaries.csv'), + 'stackoverflow': _TS_EXAMPLES_DATA_PATH.joinpath('stackoverflow.csv'), + 'test_sea': fedot_project_root().joinpath('test', 'data', 'simple_sea_level.csv') + } + +### Function to Load and Prepare Time Series Data + +.. code-block:: python + + def get_ts_data(dataset='m4_monthly', horizon: int = 30, m4_id=None, validation_blocks=None): + time_series = pd.read_csv(TS_DATASETS[dataset]) + + task = Task(TaskTypesEnum.ts_forecasting, + TsForecastingParams(forecast_length=horizon)) + if 'm4' in dataset: + if not m4_id: + label = random.choice(np.unique(time_series['label'])) + else: + label = m4_id + print(label) + time_series = time_series[time_series['label'] == label] + idx = time_series['datetime'].values + else: + label = dataset + if dataset not in ['australia']: + idx = pd.to_datetime(time_series['idx'].values) + else: + # non datetime indexes + idx = time_series['idx'].values + + time_series = time_series['value'].values + train_input = InputData(idx=idx, + features=time_series, + target=time_series, + task=task, + data_type=DataTypesEnum.ts) + train_data, test_data = train_test_data_setup(train_input, validation_blocks=validation_blocks) + return train_data, test_data, label + +### Main Function for Time Series Forecasting + +.. 
code-block:: python + + def run_ts_forecasting_example(dataset='australia', horizon: int = 30, timeout: float = None, + visualization=False, validation_blocks=2, with_tuning=True): + train_data, test_data, label = get_ts_data(dataset, horizon, validation_blocks=validation_blocks) + # init model for the time series forecasting + + model = Fedot(problem='ts_forecasting', + task_params=Task(TaskTypesEnum.ts_forecasting, + TsForecastingParams(forecast_length=horizon)).task_params, + timeout=timeout, + n_jobs=-1, + metric='mae', + with_tuning=with_tuning) + + model.fit(train_data) + + pred_fedot = model.forecast(test_data) + if visualization: + model.current_pipeline.show() + plt.plot(train_data.idx, train_data.features, label='features') + plt.plot(test_data.idx, test_data.target, label='target') + plt.plot(test_data.idx, pred_fedot, label='fedot') + plt.grid() + plt.legend() + plt.show() + + return pred_fedot + +### Running the Example + +.. code-block:: python + + if __name__ == '__main__': + run_ts_forecasting_example(dataset='m4_monthly', horizon=14, timeout=2., validation_blocks=None, visualization=True) + +This documentation page provides a comprehensive guide to using the Fedot framework for time series forecasting. It includes all necessary code snippets and explanations to ensure that users can easily understand and adapt the example for their own purposes. \ No newline at end of file diff --git a/docs/source/examples/simple/api_regression.rst b/docs/source/examples/simple/api_regression.rst new file mode 100644 index 0000000000..d3990fbdab --- /dev/null +++ b/docs/source/examples/simple/api_regression.rst @@ -0,0 +1,123 @@ +.. _regression_example: + +========================================================================= +Regression Example with Fedot Framework +========================================================================= + +This example demonstrates how to use the Fedot framework to perform regression analysis on a dataset. 
The code provided imports necessary modules, loads data, sets up a regression task, trains a model, makes predictions, and visualizes the results. + +.. code-block:: python + + import logging + + from fedot import Fedot + from fedot.core.data.data import InputData + from fedot.core.data.data_split import train_test_data_setup + from fedot.core.repository.tasks import TaskTypesEnum, Task + from fedot.core.utils import fedot_project_root + + def run_regression_example(visualise: bool = False, with_tuning: bool = True, + timeout: float = 2., preset: str = 'auto'): + data_path = f'{fedot_project_root()}/cases/data/cholesterol/cholesterol.csv' + + data = InputData.from_csv(data_path, + task=Task(TaskTypesEnum.regression)) + train, test = train_test_data_setup(data) + problem = 'regression' + + composer_params = {'history_dir': 'custom_history_dir', 'preset': preset} + auto_model = Fedot(problem=problem, seed=42, timeout=timeout, logging_level=logging.FATAL, + with_tuning=with_tuning, **composer_params) + + auto_model.fit(features=train, target='target') + prediction = auto_model.predict(features=test) + if visualise: + auto_model.history.save('saved_regression_history.json') + auto_model.plot_prediction() + print(auto_model.get_metrics()) + return prediction + + if __name__ == '__main__': + run_regression_example(visualise=True) + +Step-by-Step Guide +------------------ + +1. **Importing Modules** + + The first block imports necessary modules from the Fedot framework and the Python standard library. + + .. code-block:: python + + import logging + + from fedot import Fedot + from fedot.core.data.data import InputData + from fedot.core.data.data_split import train_test_data_setup + from fedot.core.repository.tasks import TaskTypesEnum, Task + from fedot.core.utils import fedot_project_root + +2. **Defining the Function** + + The function `run_regression_example` is defined with parameters for visualization, model tuning, timeout, and preset configuration. + + .. 
code-block:: python + + def run_regression_example(visualise: bool = False, with_tuning: bool = True, + timeout: float = 2., preset: str = 'auto'): + +3. **Loading and Preparing Data** + + The data is loaded from a CSV file and prepared for the regression task. The data is then split into training and testing sets. + + .. code-block:: python + + data_path = f'{fedot_project_root()}/cases/data/cholesterol/cholesterol.csv' + + data = InputData.from_csv(data_path, + task=Task(TaskTypesEnum.regression)) + train, test = train_test_data_setup(data) + problem = 'regression' + +4. **Configuring and Training the Model** + + A Fedot model is configured with specified parameters and trained on the training data. + + .. code-block:: python + + composer_params = {'history_dir': 'custom_history_dir', 'preset': preset} + auto_model = Fedot(problem=problem, seed=42, timeout=timeout, logging_level=logging.FATAL, + with_tuning=with_tuning, **composer_params) + + auto_model.fit(features=train, target='target') + +5. **Making Predictions and Visualizing Results** + + The model makes predictions on the test data. If `visualise` is set to True, the history is saved and a prediction plot is generated. + + .. code-block:: python + + prediction = auto_model.predict(features=test) + if visualise: + auto_model.history.save('saved_regression_history.json') + auto_model.plot_prediction() + +6. **Printing Metrics and Returning Prediction** + + The model's metrics are printed, and the prediction results are returned. + + .. code-block:: python + + print(auto_model.get_metrics()) + return prediction + +7. **Running the Example** + + The example is executed with visualization enabled. + + .. code-block:: python + + if __name__ == '__main__': + run_regression_example(visualise=True) + +This documentation page provides a comprehensive overview of the regression example using the Fedot framework. Users can copy and paste the provided code to apply regression analysis to their own datasets. 
\ No newline at end of file diff --git a/docs/source/examples/simple/cgru.rst b/docs/source/examples/simple/cgru.rst new file mode 100644 index 0000000000..b5c90be5d0 --- /dev/null +++ b/docs/source/examples/simple/cgru.rst @@ -0,0 +1,116 @@ + +.. _cgru_forecasting_example: + +========================================================================= +Example: CGru Forecasting with Fedot Pipeline +========================================================================= + +Overview +-------- + +This example demonstrates the use of the Fedot framework to build and apply a forecasting pipeline using the CGru model for time series data. The pipeline is designed to predict future values of a time series based on historical data. The example includes data preparation, model fitting, prediction, and visualization of the results. + +Step-by-Step Guide +------------------ + +1. Importing Necessary Libraries +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + import numpy as np + from sklearn.metrics import mean_squared_error, mean_absolute_error + + from examples.advanced.time_series_forecasting.composing_pipelines import visualise, get_border_line_info + from examples.simple.time_series_forecasting.api_forecasting import get_ts_data + from fedot.core.pipelines.pipeline_builder import PipelineBuilder + +This block imports the required libraries and functions for the example. It includes NumPy for numerical operations, scikit-learn for calculating error metrics, and specific functions and classes from the Fedot framework for time series forecasting and pipeline construction. + +2. Defining the Forecasting Function +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + def cgru_forecasting(): + """ Example of cgru pipeline serialization """ + horizon = 12 + window_size = 200 + train_data, test_data, _ = get_ts_data('salaries', horizon) + +This function initializes the forecasting process. 
It sets the forecasting horizon and window size, and retrieves the training and testing data for the 'salaries' time series. + +3. Building the Pipeline +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + pipeline = PipelineBuilder().add_node("lagged", params={'window_size': window_size}).add_node("cgru").build() + +The pipeline is constructed using the PipelineBuilder. It includes a 'lagged' preprocessing node with a specified window size and a 'cgru' model node for the actual forecasting. + +4. Fitting the Model and Making Predictions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + pipeline.fit(train_data) + prediction = pipeline.predict(test_data).predict[0] + +The pipeline is fitted on the training data, and predictions are made on the test data. + +5. Preparing Data for Visualization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + plot_info = [ + {'idx': np.concatenate([train_data.idx, test_data.idx]), + 'series': np.concatenate([test_data.features, test_data.target]), + 'label': 'Actual time series'}, + {'idx': test_data.idx, + 'series': np.ravel(prediction), + 'label': 'prediction'}, + get_border_line_info(test_data.idx[0], + prediction, + np.ravel(np.concatenate([test_data.features, test_data.target])), + 'Border line') + ] + +Data is prepared for visualization, including the actual time series, predictions, and a border line. + +6. Calculating Error Metrics +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + rmse = mean_squared_error(test_data.target, prediction, squared=False) + mae = mean_absolute_error(test_data.target, prediction) + print(f'RMSE - {rmse:.4f}') + print(f'MAE - {mae:.4f}') + +Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) are calculated and printed. + +7. Visualizing the Results +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + visualise(plot_info) + +The results are visualized using the `visualise` function. 
+ +8. Running the Example +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + if __name__ == '__main__': + cgru_forecasting() + +The example is executed if the script is run as the main program. + +Conclusion +---------- + +This example provides a comprehensive guide on how to use the Fedot framework to create a forecasting pipeline with the CGru model. It covers data retrieval, pipeline construction, model fitting, prediction, error calculation, and visualization. Users can adapt this example to their own time series forecasting tasks by modifying the data source and pipeline configuration. \ No newline at end of file diff --git a/docs/source/examples/simple/classification_with_api_builder.rst b/docs/source/examples/simple/classification_with_api_builder.rst new file mode 100644 index 0000000000..f447f44790 --- /dev/null +++ b/docs/source/examples/simple/classification_with_api_builder.rst @@ -0,0 +1,90 @@ + +.. _fedot_classification_example: + +==================================================================== +Classification Example with FEDOT Framework +==================================================================== + +This example demonstrates how to use the FEDOT framework to perform a classification task. The example uses a predefined dataset for training and testing, and it showcases the setup and execution of a classification pipeline with hyperparameter tuning and evaluation metrics. + +Overview +-------- + +The FEDOT framework is designed to automate the process of building and optimizing machine learning pipelines. This example specifically focuses on a classification problem, where the goal is to predict a categorical target variable based on input features. + +The example is structured into several logical blocks: + +1. **Initialization and Configuration**: Setting up the FEDOT instance with the desired problem type, configuration options, and evaluation metrics. +2. 
**Data Loading**: Specifying the paths to the training and testing datasets. +3. **Model Training**: Fitting the FEDOT pipeline to the training data. +4. **Prediction**: Generating probability predictions on the test data. +5. **Visualization**: Plotting the prediction results. + +Step-by-Step Guide +------------------ + +Below is a detailed breakdown of the code, ensuring that each line is explained and can be easily understood and replicated. + +Initialization and Configuration +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + from fedot import FedotBuilder + from fedot.core.utils import fedot_project_root + + if __name__ == '__main__': + train_data_path = f'{fedot_project_root()}/cases/data/scoring/scoring_train.csv' + test_data_path = f'{fedot_project_root()}/cases/data/scoring/scoring_test.csv' + + fedot = (FedotBuilder(problem='classification') + .setup_composition(timeout=10, with_tuning=True, preset='best_quality') + .setup_pipeline_evaluation(max_pipeline_fit_time=5, metric=['roc_auc', 'precision']) + .build()) + +In this block, the FEDOT framework is imported, and the paths to the training and testing datasets are defined. The FEDOT instance is then configured for a classification problem, with a timeout for the composition process, hyperparameter tuning enabled, and a preset for the best quality. The evaluation metrics are set to `roc_auc` and `precision`. + +Data Loading +^^^^^^^^^^^^ + +The data loading is implicit in the paths defined in the initialization block. The paths are constructed using the `fedot_project_root()` function, which returns the root directory of the FEDOT project, and then the paths to the specific CSV files are appended. + +Model Training +^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + fedot.fit(features=train_data_path, target='target') + +This line fits the FEDOT pipeline to the training data. 
The `features` parameter is set to the path of the training dataset, and the `target` parameter specifies the name of the target column in the dataset. + +Prediction +^^^^^^^^^^ + +.. code-block:: python + + fedot.predict_proba(features=test_data_path) + +Here, the FEDOT pipeline is used to generate probability predictions for the test dataset. The `features` parameter is set to the path of the test dataset. + +Visualization +^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + fedot.plot_prediction() + +The final line of the example plots the prediction results, providing a visual representation of the model's performance. + +Conclusion +---------- + +This example provides a comprehensive guide on how to use the FEDOT framework for a classification task. By following this guide, users can replicate the example and adapt it to their own datasets and classification problems. + +.. note:: + Ensure that the required datasets are available at the specified paths and that the FEDOT framework is properly installed and configured. + +.. seealso:: + For more information on the FEDOT framework, visit the `official documentation `_. + +This documentation page is formatted in .rst (reStructuredText) for use in Sphinx-based documentation systems, which are commonly used for Python projects. It provides a clear and structured explanation of the code example, ensuring that users can understand and apply the example to their own classification tasks. \ No newline at end of file diff --git a/docs/source/examples/simple/classification_with_tuning.rst b/docs/source/examples/simple/classification_with_tuning.rst new file mode 100644 index 0000000000..a44719d6d5 --- /dev/null +++ b/docs/source/examples/simple/classification_with_tuning.rst @@ -0,0 +1,93 @@ + +.. 
_classification_tuning_example: + +==================================================================== +Classification Tuning Example with Fedot +==================================================================== + +This example demonstrates how to use the Fedot framework for tuning a classification pipeline. The example generates synthetic classification datasets with varying parameters and applies a Random Forest classifier, which is then optionally tuned using a simultaneous tuner. + +.. note:: + This example requires the `Fedot` framework and its dependencies to be installed. + +.. contents:: Table of Contents + :depth: 2 + :local: + +Setup and Imports +----------------- + +The first step is to import necessary modules and libraries: + +.. code-block:: python + + import numpy as np + from golem.core.tuning.simultaneous import SimultaneousTuner + from sklearn.metrics import roc_auc_score as roc_auc + from sklearn.model_selection import train_test_split + + from examples.simple.classification.classification_pipelines import classification_random_forest_pipeline + from fedot.core.data.data import InputData + from fedot.core.pipelines.tuning.tuner_builder import TunerBuilder + from fedot.core.repository.dataset_types import DataTypesEnum + from fedot.core.repository.metrics_repository import ClassificationMetricsEnum + from fedot.core.repository.tasks import Task, TaskTypesEnum + from fedot.core.utils import set_random_seed + from fedot.utilities.synth_dataset_generator import classification_dataset + +Data Preparation +---------------- + +The `get_classification_dataset` function prepares synthetic classification datasets with specified parameters: + +.. code-block:: python + + def get_classification_dataset(features_options, samples_amount=250, + features_amount=5, classes_amount=2, weights=None): + ... 
+ return x_data_train, y_data_train, x_data_test, y_data_test + +Model Prediction Conversion +--------------------------- + +The `convert_to_labels` function converts model predictions to binary labels: + +.. code-block:: python + + def convert_to_labels(root_operation, prediction): + ... + return preds + +Classification Tuning Experiment +-------------------------------- + +The `run_classification_tuning_experiment` function runs the classification tuning experiment: + +.. code-block:: python + + def run_classification_tuning_experiment(pipeline, tuner=None): + ... + if tuner is not None: + ... + print('Obtained metrics after tuning:') + print(f"{roc_auc(y_test, preds_tuned):.4f}\n") + +Running the Example +------------------- + +To run the example, execute the following script: + +.. code-block:: python + + if __name__ == '__main__': + set_random_seed(2020) + run_classification_tuning_experiment(pipeline=classification_random_forest_pipeline(), + tuner=SimultaneousTuner) + +.. note:: + Ensure you have the necessary permissions and dependencies installed to run the script. + +Conclusion +---------- + +This example showcases the use of Fedot for tuning a classification pipeline, demonstrating how to generate synthetic datasets, apply a classifier, and optionally tune the pipeline for better performance. \ No newline at end of file diff --git a/docs/source/examples/simple/cli_call_example.rst b/docs/source/examples/simple/cli_call_example.rst new file mode 100644 index 0000000000..ccb7f3195b --- /dev/null +++ b/docs/source/examples/simple/cli_call_example.rst @@ -0,0 +1,105 @@ + +Fedot CLI Execution Example +=========================== + +This example demonstrates how to execute Fedot tasks (time series forecasting, classification, and regression) from the command line interface (CLI) using .bat files. The code provided manipulates the Python environment path in the .bat files to ensure correct execution and saves the predictions to a CSV file. 
+ +Overview +-------- + +The example consists of several functions that handle the execution of .bat files for different Fedot tasks. The main functions are: + +- `change_python_path`: Modifies the Python environment path in a .bat file. +- `run_console`: Executes a .bat file, waits for it to complete, and returns the predictions as a pandas DataFrame. +- `run_cli_ts_forecasting`, `run_cli_classification`, `run_cli_regression`: Specific functions to run .bat files for time series forecasting, classification, and regression tasks, respectively. + +Step-by-Step Guide +------------------ + +1. **Import Necessary Libraries** + :: + + import os + import sys + from subprocess import Popen + import subprocess + import pandas as pd + + This block imports the necessary Python libraries for file handling, subprocess execution, and data manipulation. + +2. **Define Constants and Functions** + :: + + env_name = sys.executable + env_path_placeholder = 'DEFAULT' + predictions_path = '../../fedot/api/predictions.csv' + + def change_python_path(file_name, old, new): + """ Function for changing env path in .bat for users settings""" + with open(file_name, "r+") as file: + text = file.read() + file.seek(0) + text = text.replace(old, new) + file.write(text) + file.truncate() + + def run_console(bat_name): + """ Function for running .bat files with returning prediction as df""" + try: + os.remove(predictions_path) + except Exception: + pass + change_python_path(bat_name, env_path_placeholder, env_name) + process = Popen(bat_name, creationflags=subprocess.CREATE_NEW_CONSOLE) + process.wait() + change_python_path(bat_name, env_name, env_path_placeholder) + print(f"\nPrediction saved at {predictions_path}") + df = pd.read_csv(predictions_path) + return df + + - `env_name` stores the path to the Python executable. + - `env_path_placeholder` is a placeholder used in the .bat files. + - `predictions_path` is the path where predictions are saved. 
+ - `change_python_path` function replaces the Python environment path in a .bat file. + - `run_console` function runs a .bat file, waits for it to finish, and returns the predictions as a DataFrame. + +3. **Run Specific CLI Tasks** + :: + + def run_cli_ts_forecasting(): + """ Test executing ts_forecasting problem from cli with saving prediction""" + bat_name = 'cli_ts_call.bat' + run_console(bat_name) + + def run_cli_classification(): + """ Test executing classification problem from cli with saving prediction""" + bat_name = 'cli_classification_call.bat' + run_console(bat_name) + + def run_cli_regression(): + """ Test executing regression problem from cli with saving prediction""" + bat_name = 'cli_regression_call.bat' + run_console(bat_name) + + These functions specify the .bat files to be run for each task and call `run_console` to execute them. + +4. **Main Execution Block** + :: + + if __name__ == '__main__': + run_cli_classification() + + This block ensures that the `run_cli_classification` function is called when the script is executed directly. + +Usage +----- + +To use this example, ensure you have the appropriate .bat files (`cli_ts_call.bat`, `cli_classification_call.bat`, `cli_regression_call.bat`) in the correct directory. Modify the `env_name` and `predictions_path` variables if necessary to match your environment and desired output path. + +Run the script, and it will execute the specified .bat file, save the predictions to a CSV file, and print the location of the saved predictions. + +.. note:: + Ensure that Fedot is built as a package in your environment for the .bat files to execute correctly. + +.. seealso:: + For more information on Fedot and its CLI usage, refer to the `Fedot documentation `_. 
\ No newline at end of file diff --git a/docs/source/examples/simple/fitted_values.rst b/docs/source/examples/simple/fitted_values.rst new file mode 100644 index 0000000000..9c535021f0 --- /dev/null +++ b/docs/source/examples/simple/fitted_values.rst @@ -0,0 +1,102 @@ +.. _fitted_time_series_example: + +========================================================================= +Example: Visualizing Fitted Time Series Values +========================================================================= + +This example demonstrates how to use the FEDOT framework to obtain and visualize fitted values of a time series. Fitted values are the predictions made by a model on the training data, which help in understanding how well the model captures the underlying structure of the time series. + +.. code-block:: python + + from matplotlib import pyplot as plt + from fedot.core.data.data import InputData + from fedot.core.pipelines.ts_wrappers import fitted_values, in_sample_fitted_values + from fedot.core.repository.tasks import Task, TaskTypesEnum, TsForecastingParams + from test.unit.pipelines.test_pipeline_ts_wrappers import get_simple_short_lagged_pipeline + + def show_fitted_time_series(len_forecast=24): + """ + Shows an example of how to get fitted values of a time series by any + pipeline created by FEDOT + + fitted values - are the predictions of the pipelines on the training sample. 
+ For time series, these values show how well the model reproduces the time + series structure + """ + task = Task(TaskTypesEnum.ts_forecasting, + TsForecastingParams(forecast_length=len_forecast)) + + ts_input = InputData.from_csv_time_series(file_path='../../../cases/data/time_series/metocean.csv', + task=task, target_column='value') + + pipeline = get_simple_short_lagged_pipeline() + train_predicted = pipeline.fit(ts_input) + + # Get fitted values for every 10th forecast + fitted_ts_10 = fitted_values(ts_input, train_predicted, 10) + # Average for all forecasting horizons + fitted_ts_act = fitted_values(ts_input, train_predicted) + # In-sample forecasting fitted values + in_sample_validated = in_sample_fitted_values(ts_input, train_predicted) + + plt.plot(range(len(ts_input.idx)), ts_input.target, label='Actual time series', alpha=0.8) + plt.plot(fitted_ts_10.idx, fitted_ts_10.predict, label='Fitted values horizon 10', alpha=0.2) + plt.plot(fitted_ts_act.idx, fitted_ts_act.predict, label='Fitted values all', alpha=0.2) + plt.plot(in_sample_validated.idx, in_sample_validated.predict, label='In-sample fitted values') + plt.legend() + plt.grid() + plt.show() + + if __name__ == '__main__': + show_fitted_time_series() + +Step-by-Step Guide +------------------ + +1. **Task Definition**: + The task is defined as time series forecasting with a specified forecast length. + + .. code-block:: python + + task = Task(TaskTypesEnum.ts_forecasting, + TsForecastingParams(forecast_length=len_forecast)) + +2. **Data Loading**: + The time series data is loaded from a CSV file. + + .. code-block:: python + + ts_input = InputData.from_csv_time_series(file_path='../../../cases/data/time_series/metocean.csv', + task=task, target_column='value') + +3. **Pipeline Creation and Training**: + A pipeline is created and trained on the input data. + + .. code-block:: python + + pipeline = get_simple_short_lagged_pipeline() + train_predicted = pipeline.fit(ts_input) + +4. 
**Obtaining Fitted Values**: + Fitted values are calculated for different scenarios (every 10th forecast, all forecasts, and in-sample). + + .. code-block:: python + + fitted_ts_10 = fitted_values(ts_input, train_predicted, 10) + fitted_ts_act = fitted_values(ts_input, train_predicted) + in_sample_validated = in_sample_fitted_values(ts_input, train_predicted) + +5. **Visualization**: + The actual time series and the fitted values are plotted. + + .. code-block:: python + + plt.plot(range(len(ts_input.idx)), ts_input.target, label='Actual time series', alpha=0.8) + plt.plot(fitted_ts_10.idx, fitted_ts_10.predict, label='Fitted values horizon 10', alpha=0.2) + plt.plot(fitted_ts_act.idx, fitted_ts_act.predict, label='Fitted values all', alpha=0.2) + plt.plot(in_sample_validated.idx, in_sample_validated.predict, label='In-sample fitted values') + plt.legend() + plt.grid() + plt.show() + +This example provides a clear demonstration of how to use FEDOT to obtain and visualize fitted values of a time series, which is crucial for assessing the model's performance on the training data. \ No newline at end of file diff --git a/docs/source/examples/simple/image_classification_problem.rst b/docs/source/examples/simple/image_classification_problem.rst new file mode 100644 index 0000000000..0eed3a9092 --- /dev/null +++ b/docs/source/examples/simple/image_classification_problem.rst @@ -0,0 +1,103 @@ + +.. _image_classification_example: + +Image Classification Example +============================ + +This example demonstrates how to use the Fedot framework to solve an image classification problem using a Convolutional Neural Network (CNN) pipeline. The example uses the MNIST dataset, which is a set of 70,000 small images of digits handwritten by high school students and employees of the US Census Bureau. + +Overview +-------- + +The example is structured into several logical blocks: + +1. 
**Module Import and Requirement Check**: The code starts by importing necessary modules and checking if TensorFlow is installed. If not, it warns the user to install it. +2. **Metric Calculation Function**: A function to calculate the ROC AUC score for validation. +3. **Main Function**: A function to run the image classification problem, which includes setting up the task, preparing the datasets, creating and fitting the pipeline, and evaluating the model. +4. **Main Execution Block**: The main block where the training and testing datasets are loaded, and the main function is called. + +Step-by-Step Guide +------------------ + +1. **Module Import and Requirement Check** + + .. code-block:: python + + from golem.utilities.requirements_notificator import warn_requirement + + try: + import tensorflow as tf + except ModuleNotFoundError: + warn_requirement('tensorflow', 'fedot[extra]') + + from sklearn.metrics import roc_auc_score as roc_auc + + from examples.simple.classification.classification_pipelines import cnn_composite_pipeline + from fedot.core.data.data import InputData, OutputData + from fedot.core.repository.tasks import Task, TaskTypesEnum + from fedot.core.utils import set_random_seed + + This block imports the necessary modules and checks for the presence of TensorFlow. If TensorFlow is not found, it notifies the user to install it. + +2. **Metric Calculation Function** + + .. code-block:: python + + def calculate_validation_metric(predicted: OutputData, dataset_to_validate: InputData) -> float: + # the quality assessment for the simulation results + roc_auc_value = roc_auc(y_true=dataset_to_validate.target, + y_score=predicted.predict, + multi_class="ovo") + return roc_auc_value + + This function calculates the ROC AUC score for the validation dataset. It takes the predicted output and the validation dataset as inputs and returns the ROC AUC value. + +3. **Main Function** + + .. 
code-block:: python + + def run_image_classification_problem(train_dataset: tuple, + test_dataset: tuple, + composite_flag: bool = True): + task = Task(TaskTypesEnum.classification) + + x_train, y_train = train_dataset[0], train_dataset[1] + x_test, y_test = test_dataset[0], test_dataset[1] + + dataset_to_train = InputData.from_image(images=x_train, + labels=y_train, + task=task) + dataset_to_validate = InputData.from_image(images=x_test, + labels=y_test, + task=task) + + pipeline = cnn_composite_pipeline(composite_flag) + pipeline.fit(input_data=dataset_to_train) + predictions = pipeline.predict(dataset_to_validate) + roc_auc_on_valid = calculate_validation_metric(predictions, + dataset_to_validate) + return roc_auc_on_valid, dataset_to_train, dataset_to_validate + + This function sets up the classification task, prepares the training and validation datasets, creates a CNN pipeline, fits the model, makes predictions, and calculates the ROC AUC score on the validation set. + +4. **Main Execution Block** + + .. code-block:: python + + if __name__ == '__main__': + set_random_seed(1) + + training_set, testing_set = tf.keras.datasets.mnist.load_data(path='mnist.npz') + roc_auc_on_valid, dataset_to_train, dataset_to_validate = run_image_classification_problem( + train_dataset=training_set, + test_dataset=testing_set) + + In this block, the random seed is set, the MNIST dataset is loaded, and the main function is called with the training and testing datasets. + +Usage +----- + +To use this example, you can copy and paste the provided code into your Python environment. Ensure that you have the required dependencies installed, such as TensorFlow and Fedot with the 'extra' package. You can then run the script to see the ROC AUC score for the image classification task on the MNIST dataset. + +.. note:: + Make sure to have the necessary permissions and paths set correctly to load the MNIST dataset. 
\ No newline at end of file diff --git a/docs/source/examples/simple/index.rst b/docs/source/examples/simple/index.rst new file mode 100644 index 0000000000..bb3e34c32c --- /dev/null +++ b/docs/source/examples/simple/index.rst @@ -0,0 +1,32 @@ +Simple example +====================== + +This section provides basic examples of FEDOT usage: + +.. toctree:: + :glob: + :maxdepth: 2 + + api_classification + api_explain + api_forecasting + api_regression + cgru + classification_with_api_builder + classification_with_tuning + fitted_values + image_classification_problem + multiclass_prediction + multiple_ts_forecasting_tasks + pipeline_and_history_visualization + pipeline_explain + pipeline_import_export + pipeline_log + pipeline_tune + pipeline_tuning_with_iopt + pipeline_visualization + regression_with_tuning + resample_example + ts_pipelines + tuning_pipelines + diff --git a/docs/source/examples/simple/multiclass_prediction.rst b/docs/source/examples/simple/multiclass_prediction.rst new file mode 100644 index 0000000000..10f1099175 --- /dev/null +++ b/docs/source/examples/simple/multiclass_prediction.rst @@ -0,0 +1,156 @@ + +.. _multi_clf_examples_from_excel: + +================================================================================= +Multi-Class Classification Examples from Excel Files +================================================================================= + +This example demonstrates how to use the `FEDOT `_ framework to perform multi-class classification tasks using data stored in Excel files. The example covers the entire process from reading the data, training a model, validating its performance, and applying it to new data. + +Prerequisites +------------- + +Ensure you have the following Python packages installed: + +- `pandas` +- `openpyxl` +- `FEDOT` +- `sklearn` + +You can install the required packages using pip: + +.. 
code-block:: bash + + pip install pandas openpyxl fedot[examples] scikit-learn + +Example Overview +----------------- + +The example is structured into several functions that handle different parts of the machine learning pipeline: + +1. **Data Loading and Preprocessing**: The `create_multi_clf_examples_from_excel` function reads an Excel file, splits the data into training and testing sets, and optionally saves the data to CSV files. + +2. **Model Training**: The `get_model` function trains a model using the training data. It uses a genetic programming-based composer to find the optimal model structure. + +3. **Model Application**: The `apply_model_to_data` function applies the trained model to new data and generates predictions. + +4. **Model Validation**: The `validate_model_quality` function evaluates the model's performance using the ROC AUC metric. + +Step-by-Step Guide +------------------ + +1. **Data Loading and Preprocessing** + + The `create_multi_clf_examples_from_excel` function is responsible for loading data from an Excel file and preparing it for model training. Here's how it works: + + .. code-block:: python + + def create_multi_clf_examples_from_excel(file_path: str, return_df: bool = False): + df = pd.read_excel(file_path, engine='openpyxl') + train, test = split_data(df) + file_dir_name = file_path.replace('.', '/').split('/')[-2] + file_csv_name = f'{file_dir_name}.csv' + directory_names = ['examples', 'data', file_dir_name] + + ensure_directory_exists(directory_names) + if return_df: + path = os.path.join(directory_names[0], directory_names[1], directory_names[2], file_csv_name) + full_file_path = os.path.join(str(fedot_project_root()), path) + save_file_to_csv(df, full_file_path) + return df, full_file_path + else: + full_train_file_path, full_test_file_path = get_split_data_paths(directory_names) + save_file_to_csv(train, full_train_file_path) + save_file_to_csv(test, full_test_file_path) + return full_train_file_path, full_test_file_path + +2. 
**Model Training** + + The `get_model` function trains a model using the training data. It uses a genetic programming-based composer to find the optimal model structure. + + .. code-block:: python + + def get_model(train_file_path: str, cur_lead_time: datetime.timedelta = timedelta(seconds=60)): + task = Task(task_type=TaskTypesEnum.classification) + dataset_to_compose = InputData.from_csv(train_file_path, task=task) + + models_repo = OperationTypesRepository() + available_model_types = models_repo.suitable_operation(task_type=task.task_type, tags=['simple']) + + metric_function = ClassificationMetricsEnum.ROCAUC_penalty + + composer_requirements = PipelineComposerRequirements( + primary=available_model_types, secondary=available_model_types, + timeout=cur_lead_time) + + builder = ComposerBuilder(task).with_requirements(composer_requirements).with_metrics(metric_function) + composer = builder.build() + + pipeline_evo_composed = composer.compose_pipeline(data=dataset_to_compose) + pipeline_evo_composed.fit(input_data=dataset_to_compose) + + return pipeline_evo_composed + +3. **Model Application** + + The `apply_model_to_data` function applies the trained model to new data and generates predictions. + + .. code-block:: python + + def apply_model_to_data(model: Pipeline, data_path: str): + df, file_path = create_multi_clf_examples_from_excel(data_path, return_df=True) + dataset_to_apply = InputData.from_csv(file_path, target_columns=None) + evo_predicted = model.predict(dataset_to_apply) + df['forecast'] = probs_to_labels(evo_predicted.predict) + return df + +4. **Model Validation** + + The `validate_model_quality` function evaluates the model's performance using the ROC AUC metric. + + .. 
code-block:: python + + def validate_model_quality(model: Pipeline, data_path: str): + dataset_to_validate = InputData.from_csv(data_path) + predicted_labels = model.predict(dataset_to_validate).predict + + roc_auc_valid = round(roc_auc(y_true=dataset_to_validate.target, + y_score=predicted_labels, + multi_class='ovo', + average='macro'), 3) + return roc_auc_valid + +Running the Example +------------------- + +To run the example, execute the following code: + +.. code-block:: python + + if __name__ == '__main__': + set_random_seed(1) + + data_path = Path('../../data') + file_path_first = data_path.joinpath('example1.xlsx') + file_path_second = data_path.joinpath('example2.xlsx') + file_path_third = data_path.joinpath('example3.xlsx') + + train_file_path, test_file_path = create_multi_clf_examples_from_excel(file_path_first) + test_data = InputData.from_csv(test_file_path) + + fitted_model = get_model(train_file_path) + + fitted_model.show() + + roc_auc_score = validate_model_quality(fitted_model, test_file_path) + print(f'ROC AUC metric is {roc_auc_score}') + + final_prediction_first = apply_model_to_data(fitted_model, file_path_second) + print(final_prediction_first['forecast']) + + final_prediction_second = apply_model_to_data(fitted_model, file_path_third) + print(final_prediction_second['forecast']) + +This will load data from three Excel files, train a model, validate its performance, and apply it to new data, printing the ROC AUC score and the model's predictions. + +This documentation page provides a comprehensive guide to the example code, ensuring that users can understand and replicate the process for their own purposes. 
\ No newline at end of file diff --git a/docs/source/examples/simple/multiple_ts_forecasting_tasks.rst b/docs/source/examples/simple/multiple_ts_forecasting_tasks.rst new file mode 100644 index 0000000000..13b2e23dde --- /dev/null +++ b/docs/source/examples/simple/multiple_ts_forecasting_tasks.rst @@ -0,0 +1,73 @@ + +Fedot Time Series Forecasting Example +================================================================ + +Overview +-------- + +This example demonstrates how to use the Fedot framework for time series forecasting. It sets up a Fedot model builder with specific configurations, applies it to multiple datasets, and evaluates the performance of the generated models. + +Step-by-Step Guide +------------------ + +1. Importing Necessary Modules +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + from fedot import FedotBuilder + from fedot.core.utils import fedot_project_root + +2. Setting Up the Fedot Builder +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + if __name__ == '__main__': + SEED = 42 + + builder = (FedotBuilder('ts_forecasting') + .setup_composition(preset='fast_train', timeout=0.5, with_tuning=True, seed=SEED) + .setup_evolution(num_of_generations=3) + .setup_pipeline_evaluation(metric='mae')) + + In this block, the Fedot builder is initialized for the time series forecasting problem. The composition is set up with the fast training preset, a timeout of 0.5 minutes (FEDOT measures `timeout` in minutes), tuning enabled, and a fixed seed for reproducibility. The evolution setup specifies 3 generations, and the evaluation metric is set to MAE (Mean Absolute Error). + +3. Defining the Dataset Path +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + datasets_path = fedot_project_root() / 'examples/data/ts' + + This line defines the path to the directory containing the time series datasets. + +4. Processing Each Dataset +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. 
code-block:: python + + resulting_models = {} + for data_path in datasets_path.iterdir(): + if data_path.name == 'ts_sea_level.csv': + continue + fedot = builder.build() + fedot.fit(data_path, target='value') + fedot.predict(features=fedot.train_data, validation_blocks=2) + fedot.plot_prediction() + fedot.current_pipeline.show() + resulting_models[data_path.stem] = fedot + + In this loop, each dataset file (excluding 'ts_sea_level.csv') is processed. For each file: + + - A Fedot model is built using the configured builder. + - The model is trained on the dataset with the target column 'value'. + - Predictions are made using the training data with 2 validation blocks. + - A plot of the prediction is generated. + - The current pipeline is displayed. + - The model is stored in a dictionary with the dataset filename stem as the key. + +Conclusion +---------- + +This example provides a comprehensive guide on using Fedot for time series forecasting. It demonstrates how to configure and use the Fedot builder to process multiple datasets, evaluate models, and visualize predictions. Users can easily adapt this example to their own datasets and forecasting tasks by modifying the dataset paths and configurations as needed. \ No newline at end of file diff --git a/docs/source/examples/simple/pipeline_and_history_visualization.rst b/docs/source/examples/simple/pipeline_and_history_visualization.rst new file mode 100644 index 0000000000..6568712bd1 --- /dev/null +++ b/docs/source/examples/simple/pipeline_and_history_visualization.rst @@ -0,0 +1,111 @@ +.. _pipeline_and_history_visualization: + +========================================================================= +Visualizing Pipeline Composition History Example +========================================================================= + +This example demonstrates how to visualize the composition history of a machine learning pipeline and the best pipeline itself. 
The code provided loads a pipeline optimization history, restores the best pipeline from that history, and then visualizes various aspects of the history and the pipeline. + +.. code-block:: python + + from pathlib import Path + from golem.core.optimisers.opt_history_objects.opt_history import OptHistory + from fedot.core.pipelines.adapters import PipelineAdapter + from fedot.core.utils import fedot_project_root + from fedot.core.visualisation.pipeline_specific_visuals import PipelineHistoryVisualizer + + def run_pipeline_and_history_visualization(): + """ The function runs visualization of the composing history and the best pipeline. """ + # Gather pipeline and history. + history = OptHistory.load(Path(fedot_project_root(), 'examples', 'data', 'histories', 'scoring_case_history.json')) + pipeline = PipelineAdapter().restore(history.individuals[-1][-1].graph) + # Show visualizations. + pipeline.show() + history_visualizer = PipelineHistoryVisualizer(history) + history_visualizer.fitness_line() + history_visualizer.fitness_box(best_fraction=0.5) + history_visualizer.operations_kde() + history_visualizer.operations_animated_bar(save_path='example_animation.gif', show_fitness=True) + history_visualizer.fitness_line_interactive() + + if __name__ == '__main__': + run_pipeline_and_history_visualization() + +Step-by-Step Guide +------------------ + +1. **Importing Necessary Modules** + + .. code-block:: python + + from pathlib import Path + from golem.core.optimisers.opt_history_objects.opt_history import OptHistory + from fedot.core.pipelines.adapters import PipelineAdapter + from fedot.core.utils import fedot_project_root + from fedot.core.visualisation.pipeline_specific_visuals import PipelineHistoryVisualizer + + Here, the necessary modules are imported to handle the pipeline history, restore the pipeline, and visualize the history and pipeline. + +2. **Function Definition** + + .. 
code-block:: python + + def run_pipeline_and_history_visualization(): + """ The function runs visualization of the composing history and the best pipeline. """ + + The function `run_pipeline_and_history_visualization` is defined to encapsulate the logic of loading and visualizing the pipeline and its history. + +3. **Loading the Pipeline History** + + .. code-block:: python + + history = OptHistory.load(Path(fedot_project_root(), 'examples', 'data', 'histories', 'scoring_case_history.json')) + + The pipeline optimization history is loaded from a JSON file located in the specified path. + +4. **Restoring the Best Pipeline** + + .. code-block:: python + + pipeline = PipelineAdapter().restore(history.individuals[-1][-1].graph) + + The best pipeline is restored from the graph representation stored in the last individual of the history. + +5. **Visualizing the Pipeline** + + .. code-block:: python + + pipeline.show() + + The pipeline is visualized using its `show` method. + +6. **Creating a Visualizer for the History** + + .. code-block:: python + + history_visualizer = PipelineHistoryVisualizer(history) + + A visualizer object is created to handle the visualization of the pipeline history. + +7. **Visualizing Different Aspects of the History** + + .. code-block:: python + + history_visualizer.fitness_line() + history_visualizer.fitness_box(best_fraction=0.5) + history_visualizer.operations_kde() + history_visualizer.operations_animated_bar(save_path='example_animation.gif', show_fitness=True) + history_visualizer.fitness_line_interactive() + + Various methods are called on the visualizer to display different visualizations of the history, including a line plot of fitness, a box plot, a kernel density estimation plot for operations, an animated bar chart, and an interactive line plot of fitness. + +8. **Running the Function** + + .. 
code-block:: python + + if __name__ == '__main__': + run_pipeline_and_history_visualization() + + The function is called if the script is run as the main program. + +This documentation page provides a comprehensive guide to understanding and using the provided code example for visualizing a machine learning pipeline and its composition history. Users can copy and paste the code into their environment and adapt it to their own purposes. \ No newline at end of file diff --git a/docs/source/examples/simple/pipeline_explain.rst b/docs/source/examples/simple/pipeline_explain.rst new file mode 100644 index 0000000000..5708cb62ca --- /dev/null +++ b/docs/source/examples/simple/pipeline_explain.rst @@ -0,0 +1,102 @@ + +.. _pipeline_explain_example: + +========================================================================= +Example: Explaining a Classification Pipeline +========================================================================= + +This example demonstrates how to use the `Fedot` framework to explain a classification pipeline. The pipeline is applied to a dataset from a CSV file, and the explanation of the pipeline's decision-making process is visualized. + +Overview +-------- + +The example script performs the following tasks: + +1. Loads training data from a CSV file. +2. Constructs a complex classification pipeline using predefined components. +3. Fits the pipeline to the training data. +4. Explains the pipeline using a surrogate decision tree model. +5. Visualizes the explanation and saves the plot. + +Step-by-Step Guide +------------------ + +1. **Specifying Paths** + + The paths for the training data and the output figure are specified: + + .. code-block:: python + + train_data_path = os.path.join(fedot_project_root(), 'cases', 'data', 'cancer', 'cancer_train.csv') + figure_path = 'pipeline_explain_example.png' + +2. **Feature and Class Names for Visualization** + + The feature and class names are extracted from the training data: + + .. 
code-block:: python + + feature_names = pd.read_csv(train_data_path, index_col=0, nrows=0).columns.tolist() + target_name = feature_names.pop() + target = pd.read_csv(train_data_path, usecols=[target_name])[target_name] + class_names = target.unique().astype(str).tolist() + +3. **Data Load** + + The training data is loaded into an `InputData` object: + + .. code-block:: python + + train_data = InputData.from_csv(train_data_path) + +4. **Pipeline Composition** + + A complex classification pipeline is composed using a predefined function: + + .. code-block:: python + + pipeline = classification_complex_pipeline() + +5. **Pipeline Fitting** + + The pipeline is fitted to the training data: + + .. code-block:: python + + pipeline.fit(train_data) + +6. **Pipeline Explaining** + + The pipeline is explained using a surrogate decision tree model: + + .. code-block:: python + + explainer = explain_pipeline(pipeline, data=train_data, method='surrogate_dt', visualization=True) + +7. **Visualizing Explanation and Saving the Plot** + + The explanation is visualized and the plot is saved: + + .. code-block:: python + + print(f'Built surrogate model: {explainer.surrogate_str}') + explainer.visualize(save_path=figure_path, dpi=200, feature_names=feature_names, class_names=class_names, + precision=6) + +Running the Example +------------------- + +To run this example, execute the following code: + +.. code-block:: python + + if __name__ == '__main__': + run_pipeline_explain() + +This will execute the `run_pipeline_explain` function, which performs all the steps described above. + +.. note:: + Ensure that the required data files and dependencies are available in your environment before running the example. + +.. seealso:: + For more information on the `Fedot` framework and its capabilities, refer to the `official documentation `_. 
diff --git a/docs/source/examples/simple/pipeline_import_export.rst b/docs/source/examples/simple/pipeline_import_export.rst new file mode 100644 index 0000000000..274582594c --- /dev/null +++ b/docs/source/examples/simple/pipeline_import_export.rst @@ -0,0 +1,227 @@ +.. _import_export_example: + +Import and Export of Regression Pipeline Example +=========================================================================== + +This example demonstrates how to import and export a regression pipeline using the Fedot framework. The pipeline is specifically designed for regression tasks and uses the RANSAC algorithm for model fitting. The example covers the creation of a regression dataset, defining the regression task, training the model, and then exporting and importing the pipeline to verify its functionality. + +.. code-block:: python + + import json + import os + + import numpy as np + + from examples.simple.regression.regression_with_tuning import get_regression_dataset + from examples.simple.regression.regression_pipelines import regression_ransac_pipeline + from fedot.core.data.data import InputData + from fedot.core.pipelines.pipeline import Pipeline + from fedot.core.repository.dataset_types import DataTypesEnum + from fedot.core.repository.tasks import Task, TaskTypesEnum + from fedot.core.utils import fedot_project_root + + def create_correct_path(path: str, dirname_flag: bool = False): + """ + Create path with time which was created during the testing process. 
+ """ + # TODO: this function is used in many places, but now is not really needed + last_el = None + for dirname in next(os.walk(os.path.curdir))[1]: + if dirname.endswith(path): + if dirname_flag: + last_el = dirname + else: + file = os.path.join(dirname, path + '.json') + last_el = file + return last_el + + def run_import_export_example(pipeline_path, pipeline): + features_options = {'informative': 1, 'bias': 0.0} + samples_amount = 100 + features_amount = 2 + x_train, y_train, x_test, y_test = get_regression_dataset(features_options, + samples_amount, + features_amount) + + # Define regression task + task = Task(TaskTypesEnum.regression) + + # Prepare data to train the model + train_input = InputData(idx=np.arange(0, len(x_train)), + features=x_train, + target=y_train, + task=task, + data_type=DataTypesEnum.table) + + predict_input = InputData(idx=np.arange(0, len(x_test)), + features=x_test, + target=None, + task=task, + data_type=DataTypesEnum.table) + + # Get pipeline and fit it + pipeline.fit_from_scratch(train_input) + + predicted_output = pipeline.predict(predict_input) + prediction_before_export = np.array(predicted_output.predict) + print(f'Before export {prediction_before_export[:4]}') + + # Export it + path_to_save_and_load = f'{fedot_project_root()}/examples/simple/{pipeline_path}' + pipeline.save(path=path_to_save_and_load, create_subdir=False, is_datetime_in_path=False) + + # Import pipeline + new_pipeline = Pipeline().load(path_to_save_and_load) + + predicted_output_after_export = new_pipeline.predict(predict_input) + prediction_after_export = np.array(predicted_output_after_export.predict) + + print(f'After import {prediction_after_export[:4]}') + + dict_pipeline, dict_fitted_operations = pipeline.save() + dict_pipeline = json.loads(dict_pipeline) + pipeline_from_dict = Pipeline.from_serialized(dict_pipeline, dict_fitted_operations) + + predicted_output = pipeline_from_dict.predict(predict_input) + prediction = np.array(predicted_output.predict) 
+ print(f'Prediction from pipeline loaded from dict {prediction[:4]}') + + if __name__ == '__main__': + run_import_export_example(pipeline_path='import_export', pipeline=regression_ransac_pipeline()) + +Step-by-Step Guide +------------------ + +1. **Import Necessary Libraries and Modules** + + The example starts by importing the libraries and modules required for the regression task and pipeline operations. + + .. code-block:: python + + import json + import os + + import numpy as np + + from examples.simple.regression.regression_with_tuning import get_regression_dataset + from examples.simple.regression.regression_pipelines import regression_ransac_pipeline + from fedot.core.data.data import InputData + from fedot.core.pipelines.pipeline import Pipeline + from fedot.core.repository.dataset_types import DataTypesEnum + from fedot.core.repository.tasks import Task, TaskTypesEnum + from fedot.core.utils import fedot_project_root + +2. **Create a Function to Handle Paths** + + A helper function `create_correct_path` is defined to handle paths, although a TODO comment in its body notes that it is no longer really needed. + + .. code-block:: python + + def create_correct_path(path: str, dirname_flag: bool = False): + # ... + +3. **Define the Main Function for Import and Export** + + The `run_import_export_example` function is defined to handle the entire process of creating a dataset, training a model, and then exporting and importing the pipeline. + + .. code-block:: python + + def run_import_export_example(pipeline_path, pipeline): + # ... + +4. **Create a Regression Dataset** + + The `get_regression_dataset` function is used to generate a dataset with specified parameters. + + .. code-block:: python + + x_train, y_train, x_test, y_test = get_regression_dataset(features_options, + samples_amount, + features_amount) + +5. **Define the Regression Task** + + A regression task is defined using the `Task` class from the Fedot framework. + + .. 
code-block:: python + + task = Task(TaskTypesEnum.regression) + +6. **Prepare Data for Training and Prediction** + + Data is prepared for both training and prediction using the `InputData` class. + + .. code-block:: python + + train_input = InputData(idx=np.arange(0, len(x_train)), + features=x_train, + target=y_train, + task=task, + data_type=DataTypesEnum.table) + + predict_input = InputData(idx=np.arange(0, len(x_test)), + features=x_test, + target=None, + task=task, + data_type=DataTypesEnum.table) + +7. **Train the Pipeline** + + The pipeline is trained using the training data. + + .. code-block:: python + + pipeline.fit_from_scratch(train_input) + +8. **Predict Using the Trained Pipeline** + + Predictions are made on the test data using the trained pipeline. + + .. code-block:: python + + predicted_output = pipeline.predict(predict_input) + +9. **Export the Pipeline** + + The pipeline is exported to a specified path. + + .. code-block:: python + + pipeline.save(path=path_to_save_and_load, create_subdir=False, is_datetime_in_path=False) + +10. **Import the Pipeline** + + The pipeline is imported from the saved path. + + .. code-block:: python + + new_pipeline = Pipeline().load(path_to_save_and_load) + +11. **Predict Using the Imported Pipeline** + + Predictions are made again to verify the functionality of the imported pipeline. + + .. code-block:: python + + predicted_output_after_export = new_pipeline.predict(predict_input) + +12. **Save and Load Pipeline from Dictionary** + + The pipeline is also saved and loaded from a dictionary format to demonstrate an alternative method of serialization. + + .. code-block:: python + + dict_pipeline, dict_fitted_operations = pipeline.save() + dict_pipeline = json.loads(dict_pipeline) + pipeline_from_dict = Pipeline.from_serialized(dict_pipeline, dict_fitted_operations) + +13. **Execute the Main Function** + + The `run_import_export_example` function is executed with a specified pipeline path and the regression pipeline. 
+ + .. code-block:: python + + if __name__ == '__main__': + run_import_export_example(pipeline_path='import_export', pipeline=regression_ransac_pipeline()) + +This documentation page provides a comprehensive guide to understanding and using the import and export functionality of regression pipelines in the Fedot framework. Users can copy and paste the provided code snippets to implement similar functionality in their own projects. \ No newline at end of file diff --git a/docs/source/examples/simple/pipeline_log.rst b/docs/source/examples/simple/pipeline_log.rst new file mode 100644 index 0000000000..c0df99179b --- /dev/null +++ b/docs/source/examples/simple/pipeline_log.rst @@ -0,0 +1,85 @@ +.. _log_example_page: + +Logging Example +=============== + +This example demonstrates how to integrate logging into a machine learning pipeline using the `golem` framework. The main task solved here is the creation and fitting of a complex classification pipeline, with detailed logging of the process. + +Overview +-------- + +The example showcases the use of logging to track the execution of a machine learning pipeline. It includes steps to set up a log file, create a classification pipeline, and fit the pipeline to training data. The logging mechanism is used to record important events and status updates during the execution of the pipeline. + +Step-by-Step Guide +------------------ + +1. **Import Necessary Modules** + + The first block of code imports the necessary modules and functions required for the example. + + .. code-block:: python + + import logging + import pathlib + + from golem.core.log import Log + + from examples.simple.classification.classification_pipelines import classification_complex_pipeline + from examples.simple.pipeline_tune import get_case_train_test_data + +2. **Define the `run_log_example` Function** + + This function takes a `log_file` parameter and performs the following tasks: + + a. 
**Fetch Training Data** + + The function starts by fetching training data using the `get_case_train_test_data` function. + + .. code-block:: python + + def run_log_example(log_file): + train_data, _ = get_case_train_test_data() + + b. **Initialize Logging** + + It then initializes a logger with the specified log file and sets the output logging level to `logging.FATAL`. The logger is configured with a prefix derived from the stem of the current file's path. + + .. code-block:: python + + log = Log(log_file=log_file, output_logging_level=logging.FATAL).get_adapter(prefix=pathlib.Path(__file__).stem) + + c. **Create and Fit the Classification Pipeline** + + The function logs the start of creating the pipeline, creates the pipeline using `classification_complex_pipeline`, and then logs the start of fitting the pipeline. The pipeline is fitted to the training data. + + .. code-block:: python + + log.info('start creating pipeline') + pipeline = classification_complex_pipeline() + + log.info('start fitting pipeline') + pipeline.fit(train_data) + +3. **Run the Example** + + The example is executed by calling the `run_log_example` function with a specified log file. + + .. code-block:: python + + if __name__ == '__main__': + run_log_example(log_file='example_log.log') + +Usage +----- + +To use this example, you can copy and paste the provided code into your Python environment. Ensure that the necessary modules are available in your environment. You can modify the `log_file` parameter to specify a different log file or adjust the logging level as needed. + +.. note:: + This example assumes the availability of certain functions and modules within the `golem` framework. Ensure that these are correctly imported and available in your environment. + +.. seealso:: + For more detailed information on logging and the `golem` framework, refer to the official documentation. + +.. 
_golem_framework: https://golem-framework.readthedocs.io/ + +This documentation page provides a comprehensive guide to understanding and using the logging example within the `golem` framework. It ensures that users can easily follow the steps and adapt the code to their own purposes. \ No newline at end of file diff --git a/docs/source/examples/simple/pipeline_tune.rst b/docs/source/examples/simple/pipeline_tune.rst new file mode 100644 index 0000000000..215da7d64f --- /dev/null +++ b/docs/source/examples/simple/pipeline_tune.rst @@ -0,0 +1,116 @@ + +.. _classification_pipeline_tuning_example: + +Classification Pipeline Tuning Example +================================================================== + +This example demonstrates how to tune a classification pipeline using the SimultaneousTuner from the Fedot framework. The goal is to improve the ROC AUC score of a classification model by iteratively tuning the pipeline and evaluating its performance on a test dataset. + +Overview +-------- + +The example is structured into several logical blocks: + +1. **Data Loading**: The `get_case_train_test_data` function loads training and testing data from CSV files. +2. **Pipeline Initialization**: A classification pipeline is initialized using the `classification_complex_pipeline` function. +3. **Initial Prediction**: The pipeline is fitted on the training data and predictions are made on the test data to obtain an initial ROC AUC score. +4. **Pipeline Tuning**: The `pipeline_tuning` function tunes the pipeline using the SimultaneousTuner and evaluates its performance over multiple iterations. +5. **Results Analysis**: The final ROC AUC scores before and after tuning are compared and displayed. + +Step-by-Step Guide +------------------ + +### Data Loading + +.. 
code-block:: python + + def get_case_train_test_data(): + """ Function for getting data for train and validation """ + train_file_path, test_file_path = get_scoring_case_data_paths() + + train_data = InputData.from_csv(train_file_path) + test_data = InputData.from_csv(test_file_path) + return train_data, test_data + +### Pipeline Initialization + +.. code-block:: python + + # Pipeline composition + pipeline = classification_complex_pipeline() + +### Initial Prediction + +.. code-block:: python + + # Before tuning prediction + pipeline.fit(train_data) + before_tuning_predicted = pipeline.predict(test_data) + bfr_tun_roc_auc = roc_auc(y_true=test_data.target, y_score=before_tuning_predicted.predict) + +### Pipeline Tuning + +.. code-block:: python + + def pipeline_tuning(pipeline: Pipeline, train_data: InputData, test_data: InputData, local_iter: int, tuner_iter_num: int = 30) -> (float, list): + """ Function for tuning pipeline with SimultaneousTuner + + :param pipeline: pipeline to tune + :param train_data: InputData for train + :param test_data: InputData for validation + :param local_iter: amount of tuner launches + :param tuner_iter_num: amount of iterations, which tuner will perform + + :return max_metric: maximal value of ROC AUC metric over all tuner launches + :return several_iter_scores_test: list with metrics + """ + several_iter_scores_test = [] + tuner = TunerBuilder(train_data.task) \ + .with_tuner(SimultaneousTuner) \ + .with_metric(ClassificationMetricsEnum.ROCAUC) \ + .with_iterations(tuner_iter_num) \ + .build(train_data) + for iteration in range(local_iter): + print(f'current local iteration {iteration}') + + # Pipeline tuning + tuned_pipeline = tuner.tune(pipeline) + + # After tuning prediction + tuned_pipeline.fit(train_data) + after_tuning_predicted = tuned_pipeline.predict(test_data) + + # Metrics + aft_tun_roc_auc = roc_auc(y_true=test_data.target, y_score=after_tuning_predicted.predict) + several_iter_scores_test.append(aft_tun_roc_auc) + + max_metric = 
float(np.max(several_iter_scores_test)) + return max_metric, several_iter_scores_test + +### Results Analysis + +.. code-block:: python + + if __name__ == '__main__': + train_data, test_data = get_case_train_test_data() + + # Pipeline composition + pipeline = classification_complex_pipeline() + + # Before tuning prediction + pipeline.fit(train_data) + before_tuning_predicted = pipeline.predict(test_data) + bfr_tun_roc_auc = roc_auc(y_true=test_data.target, y_score=before_tuning_predicted.predict) + + local_iter = 5 + # Pipeline tuning + after_tune_roc_auc, several_iter_scores_test = pipeline_tuning(pipeline=pipeline, train_data=train_data, test_data=test_data, local_iter=local_iter) + + print(f'Several test scores {several_iter_scores_test}') + print(f'Maximal test score over {local_iter} iterations: {after_tune_roc_auc}') + print(f'ROC-AUC before tuning {round(bfr_tun_roc_auc, 3)}') + print(f'ROC-AUC after tuning {round(after_tune_roc_auc, 3)}') + +This documentation page provides a comprehensive understanding of the classification pipeline tuning example. Users can copy and paste the provided code snippets to reproduce the example and adapt it to their own classification tasks. + +This .rst formatted documentation page is structured to guide the user through the example, explaining each logical block and providing the full code for reference. The user should be able to understand the example and apply it to their own purposes. \ No newline at end of file diff --git a/docs/source/examples/simple/pipeline_tuning_with_iopt.rst b/docs/source/examples/simple/pipeline_tuning_with_iopt.rst new file mode 100644 index 0000000000..ee58899892 --- /dev/null +++ b/docs/source/examples/simple/pipeline_tuning_with_iopt.rst @@ -0,0 +1,125 @@ + +.. 
_tune_pipeline_example: + +========================================================================= +Tuning a Machine Learning Pipeline Example +========================================================================= + +This example demonstrates how to tune a machine learning pipeline using the Fedot framework. The pipeline is tuned to improve its performance on a regression task using a dataset from a CSV file. The tuning process involves optimizing the pipeline's hyperparameters to minimize the Mean Squared Error (MSE) on a test dataset. + +.. note:: + This example requires the Fedot framework to be installed. You can install it using pip: + + .. code-block:: bash + + pip install fedot + +Example Overview +================ + +The example is structured into several logical blocks: + +1. **Pipeline Initialization**: A pipeline is created with a decision tree regression model and a KNN regression model. +2. **Data Loading and Splitting**: The dataset is loaded from a CSV file and split into training and testing sets. +3. **Pipeline Tuning**: The pipeline is tuned using an optimization algorithm to find the best hyperparameters. +4. **Evaluation**: The performance of the pipeline before and after tuning is evaluated and compared. + +Step-by-Step Guide +================== + +1. **Pipeline Initialization** + + The pipeline is initialized using the `PipelineBuilder` class. Nodes for decision tree regression ('dtreg') and KNN regression ('knnreg') are added, and their outputs are joined using a random forest regression ('rfr') node. + + .. code-block:: python + + pipeline = (PipelineBuilder() + .add_node('dtreg', 0) + .add_node('knnreg', 1) + .join_branches('rfr') + .build()) + +2. **Data Loading and Splitting** + + The dataset is loaded from a CSV file and converted into an `InputData` object. The data is then split into training and testing sets. + + .. 
code-block:: python + + data_path = f'{fedot_project_root()}/cases/data/cholesterol/cholesterol.csv' + data = InputData.from_csv(data_path, task=Task(TaskTypesEnum.regression)) + train_data, test_data = train_test_data_setup(data) + +3. **Pipeline Tuning** + + The `tune_pipeline` function is defined to perform the tuning process. It takes the pipeline, training data, testing data, and the number of tuning iterations as inputs. + + .. code-block:: python + + def tune_pipeline(pipeline: Pipeline, + train_data: InputData, + test_data: InputData, + tuner_iter_num: int = 100): + ... + + Inside the function, the pipeline is fitted to the training data, and its performance is evaluated on the test data before tuning. Then, a tuner is built using the `TunerBuilder` class, which configures the optimization process. + + .. code-block:: python + + pipeline_tuner = TunerBuilder(task) \ + .with_tuner(IOptTuner) \ + .with_requirements(requirements) \ + .with_metric(metric) \ + .with_iterations(tuner_iter_num) \ + .with_additional_params(eps=0.02, r=1.5, refine_solution=True) \ + .build(train_data) + + The pipeline is then tuned using the tuner, and the tuned pipeline is fitted to the training data. + + .. code-block:: python + + tuned_pipeline = pipeline_tuner.tune(pipeline) + tuned_pipeline.fit(train_data) + +4. **Evaluation** + + The performance of the pipeline after tuning is evaluated on the test data, and the results are printed. + + .. code-block:: python + + after_tuning_predicted = tuned_pipeline.predict(test_data) + metric_after_tuning = MSE().metric(test_data, after_tuning_predicted) + + print(f'\nMetric before tuning: {metric_before_tuning}') + print(f'Metric after tuning: {metric_after_tuning}') + + The tuned pipeline is returned by the function. + +Running the Example +=================== + +To run the example, execute the following code: + +.. 
code-block:: python + + if __name__ == '__main__': + pipeline = (PipelineBuilder() + .add_node('dtreg', 0) + .add_node('knnreg', 1) + .join_branches('rfr') + .build()) + data_path = f'{fedot_project_root()}/cases/data/cholesterol/cholesterol.csv' + + data = InputData.from_csv(data_path, + task=Task(TaskTypesEnum.regression)) + train_data, test_data = train_test_data_setup(data) + tuned_pipeline = tune_pipeline(pipeline, train_data, test_data, tuner_iter_num=200) + +This will load the dataset, create and tune the pipeline, and print the MSE before and after tuning. + +.. note:: + Make sure to adjust the path to the CSV file if it's located in a different directory. + +Conclusion +========== + +This example provides a practical demonstration of how to tune a machine learning pipeline using the Fedot framework. By following this guide, users can understand the process of optimizing a pipeline for better performance on regression tasks. \ No newline at end of file diff --git a/docs/source/examples/simple/pipeline_visualization.rst b/docs/source/examples/simple/pipeline_visualization.rst new file mode 100644 index 0000000000..5881e191e6 --- /dev/null +++ b/docs/source/examples/simple/pipeline_visualization.rst @@ -0,0 +1,144 @@ + +.. _pipeline_visualization_example: + +========================================================================= +Example: Visualizing Machine Learning Pipelines +========================================================================= + +This example demonstrates how to create and visualize a machine learning pipeline using the Fedot framework. The pipeline includes preprocessing steps and multiple models for prediction. The visualization options include default settings, customized styles, and different rendering engines such as networkx, pyvis, and graphviz. + +.. note:: + This example requires the `Fedot` library and optionally `pyvis` and `graphviz` for advanced visualization. + +.. 
contents:: Table of Contents + :depth: 2 + :local: + +Creating the Pipeline +--------------------- + +The first step is to define the structure of the pipeline. This is done by specifying the nodes and their dependencies. + +.. code-block:: python + + from fedot.core.pipelines.node import PipelineNode + from fedot.core.pipelines.pipeline import Pipeline + + def generate_pipeline() -> Pipeline: + node_scaling = PipelineNode('scaling') + node_first = PipelineNode('kmeans', nodes_from=[node_scaling]) + node_second = PipelineNode('rf', nodes_from=[node_scaling]) + node_third = PipelineNode('linear', nodes_from=[node_scaling]) + node_root = PipelineNode('logit', nodes_from=[node_first, node_second, node_third, node_scaling]) + + return Pipeline(node_root) + +Visualizing the Pipeline +------------------------- + +The pipeline can be visualized using different methods and customization options. + +Default Visualization +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + def show_default(pipeline: Pipeline): + """ Show with default properties via networkx. """ + pipeline.show() + +Customized Visualization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + def show_customized(pipeline: Pipeline): + """ Show with adjusted sizes and green nodes. """ + pipeline.show(node_color='green', edge_curvature_scale=1.2, node_size_scale=2.0, font_size_scale=1.4) + +Custom Colors Visualization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + def show_custom_colors(pipeline: Pipeline): + """ Show with colors defined by label-color dictionary. """ + pipeline.show(node_color={'scaling': 'tab:olive', 'rf': '#FF7F50', 'linear': (0, 1, 1), None: 'black'}, + edge_curvature_scale=1.2, node_size_scale=2.0, font_size_scale=1.4) + +Function-Defined Colors Visualization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + def show_complex_colors(pipeline: Pipeline): + """ Show with colors defined by function. 
""" + def nodes_color(labels): + if 'xgboost' in labels: + return {'xgboost': 'tab:orange', None: 'black'} + else: + return {'rf': 'tab:green', None: 'black'} + + pipeline.show(node_color=nodes_color) + +Pyvis Visualization +^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + def show_pyvis(pipeline: Pipeline): + """ Show with pyvis. """ + pipeline.show(engine='pyvis') + + def show_pyvis_custom_colors(pipeline: Pipeline): + """ Show with pyvis with custom colors. """ + pipeline.show(engine='pyvis', + node_color={'scaling': 'tab:olive', 'rf': '#FF7F50', 'linear': (0, 1, 1), None: 'black'}) + +Graphviz Visualization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + def show_graphviz(pipeline: Pipeline): + """ Show with graphviz (requires Graphviz and pygraphviz). """ + pipeline.show(engine='graphviz') + + def show_graphviz_custom_colors(pipeline: Pipeline): + """ Show with graphviz with custom colors (requires Graphviz and pygraphviz). """ + pipeline.show(engine='graphviz', + node_color={'scaling': 'tab:olive', 'rf': '#FF7F50', 'linear': (0, 1, 1), None: 'black'}) + +Running the Example +-------------------- + +The main function orchestrates the creation of the pipeline and its visualization using various methods. + +.. code-block:: python + + def main(): + pipeline = generate_pipeline() + show_default(pipeline) + show_customized(pipeline) + show_custom_colors(pipeline) + show_complex_colors(pipeline) + show_complex_colors(PipelineBuilder(*pipeline.nodes).add_node('xgboost').build()) + show_pyvis(pipeline) + show_pyvis_custom_colors(pipeline) + try: + import graphviz + show_graphviz(pipeline) + show_graphviz_custom_colors(pipeline) + except ImportError: + default_log().info('Either Graphviz or pygraphviz is not installed. Skipping visualizations.') + + if __name__ == '__main__': + main() + +.. note:: + Ensure that you have the necessary libraries installed and configured correctly for the visualization engines you wish to use. + +.. 
seealso:: + - `Fedot Documentation `_ + - `Pyvis Documentation `_ + - `Graphviz Documentation `_ diff --git a/docs/source/examples/simple/regression_with_tuning.rst b/docs/source/examples/simple/regression_with_tuning.rst new file mode 100644 index 0000000000..8864df1438 --- /dev/null +++ b/docs/source/examples/simple/regression_with_tuning.rst @@ -0,0 +1,85 @@ + +.. _regression_with_tuning: + +==================================================================== +Regression Example with Tuning +==================================================================== + +This example demonstrates how to use a regression pipeline with tuning capabilities. The pipeline is tested on different datasets with varying numbers of samples, features, and options. The goal is to showcase the pipeline's ability to handle diverse datasets and to optimize its performance using a tuner. + +Overview +-------- + +The example consists of two main functions: `get_regression_dataset` and `run_experiment`. The `get_regression_dataset` function generates a synthetic regression dataset with specified parameters. The `run_experiment` function uses this dataset to train, predict, and tune a regression model using a pipeline. + +Step-by-Step Guide +------------------ + +1. Importing Necessary Libraries +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. 
code-block:: python + + from datetime import timedelta + import numpy as np + from golem.core.tuning.sequential import SequentialTuner + from sklearn.metrics import mean_absolute_error + from sklearn.model_selection import train_test_split + from examples.simple.regression.regression_pipelines import regression_ransac_pipeline + from fedot.core.data.data import InputData + from fedot.core.pipelines.tuning.tuner_builder import TunerBuilder + from fedot.core.repository.dataset_types import DataTypesEnum + from fedot.core.repository.metrics_repository import RegressionMetricsEnum + from fedot.core.repository.tasks import Task, TaskTypesEnum + from fedot.core.utils import set_random_seed + from fedot.utilities.synth_dataset_generator import regression_dataset + +2. Generating a Regression Dataset +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + def get_regression_dataset(features_options, samples_amount=250, + features_amount=5): + ... + return x_train, y_train, x_test, y_test + +This function generates a synthetic regression dataset with the specified number of samples and features. It also applies a random scaling factor to each feature and splits the data into training and testing sets. + +3. Running the Experiment +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + def run_experiment(pipeline, tuner): + ... + if __name__ == '__main__': + set_random_seed(2020) + run_experiment(regression_ransac_pipeline(), tuner=SequentialTuner) + +The `run_experiment` function iterates over different configurations of samples, features, and options. For each configuration, it generates a dataset, defines a regression task, trains the pipeline, predicts on the test set, and calculates the mean absolute error (MAE). If a tuner is provided, it also tunes the pipeline and reports the MAE after tuning. + +4. Pipeline Tuning +^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: python + + if tuner is not None: + ... 
+ pipeline_tuner = ( + TunerBuilder(task) + .with_tuner(tuner) + .with_metric(RegressionMetricsEnum.MAE) + .with_iterations(50) + .with_timeout(timedelta(seconds=50)) + .build(train_input) + ) + tuned_pipeline = pipeline_tuner.tune(pipeline) + ... + +If a tuner is provided, the pipeline is tuned using the specified metric (MAE), number of iterations, and timeout. The tuned pipeline is then used to predict and calculate the MAE. + +Conclusion +---------- + +This example provides a comprehensive demonstration of how to use a regression pipeline with tuning capabilities. It showcases the pipeline's flexibility in handling different datasets and its ability to improve performance through tuning. Users can easily adapt this example to their own regression tasks by modifying the dataset generation parameters and the tuning configuration. \ No newline at end of file diff --git a/docs/source/examples/simple/resample_example.rst b/docs/source/examples/simple/resample_example.rst new file mode 100644 index 0000000000..674004700c --- /dev/null +++ b/docs/source/examples/simple/resample_example.rst @@ -0,0 +1,130 @@ + +.. _resample_example: + +Resample Example +================ + +This example demonstrates the use of two classification pipelines, one with balancing and one without, and includes an optional tuning process. The example can be run with either synthetic data or a real dataset. + +.. note:: + This example requires the `Fedot` framework and related libraries. + +Step-by-Step Guide +------------------ + +1. **Importing Necessary Libraries** + + The first block of code imports all the necessary libraries and modules required for the example. + + .. 
code-block:: python + + from datetime import timedelta + import numpy as np + import pandas as pd + from golem.core.tuning.simultaneous import SimultaneousTuner + from sklearn.metrics import roc_auc_score as roc_auc + from sklearn.model_selection import train_test_split + from examples.simple.classification.classification_pipelines import classification_pipeline_without_balancing, classification_pipeline_with_balancing + from examples.simple.classification.classification_with_tuning import get_classification_dataset + from fedot.core.data.data import InputData + from fedot.core.pipelines.tuning.tuner_builder import TunerBuilder + from fedot.core.repository.dataset_types import DataTypesEnum + from fedot.core.repository.metrics_repository import RegressionMetricsEnum + from fedot.core.repository.tasks import TaskTypesEnum, Task + from fedot.core.utils import fedot_project_root + +2. **Function Definition** + + The function `run_resample_example` is defined to encapsulate the entire process. It takes two parameters: `path_to_data` (optional, for specifying a path to a real dataset) and `tune` (a boolean indicating whether to perform tuning). + + .. code-block:: python + + def run_resample_example(path_to_data=None, tune=False): + ... + +3. **Data Preparation** + + Depending on whether `path_to_data` is provided, the function either generates synthetic data or loads and processes a real dataset. + + .. code-block:: python + + if path_to_data is None: + ... + else: + ... + +4. **Data Analysis** + + The function prints the counts of each class in the training set to show the class distribution. + + .. code-block:: python + + unique_class, counts_class = np.unique(y_train, return_counts=True) + print(f'Two classes: {unique_class}') + print(f'{unique_class[0]}: {counts_class[0]}') + print(f'{unique_class[1]}: {counts_class[1]}') + +5. **Task and Input Data Setup** + + A classification task is defined, and input data objects are created for training and prediction. + + .. 
code-block:: python + + task = Task(TaskTypesEnum.classification) + train_input = InputData(idx=np.arange(0, len(x_train)), features=x_train, target=y_train, task=task, data_type=DataTypesEnum.table) + predict_input = InputData(idx=np.arange(0, len(x_test)), features=x_test, target=None, task=task, data_type=DataTypesEnum.table) + +6. **Pipeline Execution without Balancing** + + A classification pipeline without balancing is created, fitted, and used to predict the test set. The ROC-AUC score is calculated and printed. + + .. code-block:: python + + print('Begin fit Pipeline without balancing') + pipeline = classification_pipeline_without_balancing() + pipeline.fit_from_scratch(train_input) + predict_labels = pipeline.predict(predict_input) + preds = predict_labels.predict + print(f'ROC-AUC of pipeline without balancing {roc_auc(y_test, preds):.4f}\n') + +7. **Pipeline Execution with Balancing** + + A classification pipeline with balancing is created, fitted, and used to predict the test set. The ROC-AUC score is calculated and printed. + + .. code-block:: python + + print('Begin fit Pipeline with balancing') + pipeline = classification_pipeline_with_balancing() + pipeline.fit(train_input) + predict_labels = pipeline.predict(predict_input) + preds = predict_labels.predict + print(f'ROC-AUC of pipeline with balancing {roc_auc(y_test, preds):.4f}\n') + +8. **Tuning Process (Optional)** + + If `tune` is set to True, the function performs a tuning process on the pipeline with balancing. The tuned pipeline is then fitted and used to predict the test set. The ROC-AUC score of the tuned pipeline is calculated and printed. + + .. code-block:: python + + if tune: + ... + print(f'ROC-AUC of tuned pipeline with balancing - {roc_auc(y_test, preds_tuned):.4f}\n') + +9. **Running the Example** + + The example is run twice: once with synthetic data and once with a real dataset, the latter including the tuning process. + + .. 
code-block:: python + + if __name__ == '__main__': + run_resample_example() + print('=' * 25) + run_resample_example(f'{fedot_project_root()}/examples/data/credit_card_anomaly.csv', tune=True) + +.. note:: + Ensure that the paths to datasets and the `fedot_project_root()` function are correctly configured in your environment. + +.. seealso:: + For more detailed information on the `Fedot` framework and its capabilities, refer to the `official documentation `_. + +This documentation page provides a comprehensive overview of the example, breaking down the code into logical blocks and explaining each step. Users should be able to understand and replicate the example with their own data. \ No newline at end of file diff --git a/docs/source/examples/simple/ts_pipelines.rst b/docs/source/examples/simple/ts_pipelines.rst new file mode 100644 index 0000000000..c31d3dede6 --- /dev/null +++ b/docs/source/examples/simple/ts_pipelines.rst @@ -0,0 +1,398 @@ +.. _ts_pipelines_doc: + +Time Series Pipelines Documentation +=============================================================== + +This documentation provides an overview and detailed explanation of various time series analysis pipelines implemented using the `fedot.core.pipelines.pipeline_builder` module. Each pipeline is designed to handle different aspects of time series data, including preprocessing, feature engineering, and model training. + +.. note:: + Ensure you have the necessary dependencies installed to run these pipelines. + +.. _ts_ets_pipeline: + +1. Exponential Smoothing Pipeline +--------------------------------- + +.. code-block:: python + + def ts_ets_pipeline(): + """ + Return pipeline with the following structure: + + .. 
image:: img_ts_pipelines/ts_ets_pipeline.png + :width: 55% + + Where cut - cut part of dataset and ets - exponential smoothing + """ + pip_builder = PipelineBuilder().add_node('cut').add_node('ets', + params={'error': 'add', + 'trend': 'add', + 'seasonal': 'add', + 'damped_trend': False, + 'seasonal_periods': 20}) + pipeline = pip_builder.build() + return pipeline + +This pipeline starts with a 'cut' operation to select a portion of the dataset, followed by an 'ets' (Exponential Smoothing) node for time series forecasting. The 'ets' node is configured with parameters specifying additive error, trend, and seasonal components. + +.. _ts_ets_ridge_pipeline: + +2. Exponential Smoothing with Ridge Pipeline +-------------------------------------------- + +.. code-block:: python + + def ts_ets_ridge_pipeline(): + """ + Return pipeline with the following structure: + + .. image:: img_ts_pipelines/ts_ets_ridge_pipeline.png + :width: 55% + + Where cut - cut part of dataset, ets - exponential smoothing + """ + pip_builder = PipelineBuilder() \ + .add_sequence(('cut', {'cut_part': 0.5}), + ('ets', {'error': 'add', 'trend': 'add', 'seasonal': 'add', + 'damped_trend': False, 'seasonal_periods': 20}), + branch_idx=0) \ + .add_sequence('lagged', 'ridge', branch_idx=1).join_branches('ridge') + + pipeline = pip_builder.build() + return pipeline + +This pipeline includes a sequence of operations starting with a 'cut' node to reduce the dataset size, followed by an 'ets' node. A separate branch with 'lagged' and 'ridge' nodes is then joined at the 'ridge' node. + +.. _ts_glm_pipeline: + +3. Generalized Linear Model Pipeline +------------------------------------ + +.. code-block:: python + + def ts_glm_pipeline(): + """ + Return pipeline with the following structure: + + .. 
image:: img_ts_pipelines/ts_glm_pipeline.png + :width: 55% + + Where glm - Generalized linear model + """ + pipeline = PipelineBuilder().add_node('glm', params={'family': 'gaussian'}).build() + return pipeline + +This simple pipeline uses a 'glm' (Generalized Linear Model) node with a Gaussian family for modeling. + +.. _ts_glm_ridge_pipeline: + +4. Generalized Linear Model with Ridge Pipeline +----------------------------------------------- + +.. code-block:: python + + def ts_glm_ridge_pipeline(): + """ + Return pipeline with the following structure: + + .. image:: img_ts_pipelines/ts_glm_ridge_pipeline.png + :width: 55% + + Where glm - Generalized linear model + """ + pip_builder = PipelineBuilder() \ + .add_sequence('glm', branch_idx=0) \ + .add_sequence('lagged', 'ridge', branch_idx=1).join_branches('ridge') + + pipeline = pip_builder.build() + return pipeline + +This pipeline includes a 'glm' node in one branch and a sequence of 'lagged' and 'ridge' nodes in another, which are joined at the 'ridge' node. + +.. _ts_polyfit_pipeline: + +5. Polynomial Interpolation Pipeline +------------------------------------ + +.. code-block:: python + + def ts_polyfit_pipeline(degree): + """ + Return pipeline with the following structure: + + .. image:: img_ts_pipelines/ts_polyfit_pipeline.png + :width: 55% + + Where polyfit - Polynomial interpolation + """ + pipeline = PipelineBuilder().add_node('polyfit', params={'degree': degree}).build() + return pipeline + +This pipeline uses a 'polyfit' node for polynomial interpolation, with the degree of the polynomial specified as a parameter. + +.. _ts_polyfit_ridge_pipeline: + +6. Polynomial Interpolation with Ridge Pipeline +----------------------------------------------- + +.. code-block:: python + + def ts_polyfit_ridge_pipeline(degree): + """ + Return pipeline with the following structure: + + .. 
image:: img_ts_pipelines/ts_polyfit_ridge_pipeline.png + :width: 55% + + Where polyfit - Polynomial interpolation + """ + pip_builder = PipelineBuilder() \ + .add_sequence(('polyfit', {'degree': degree}), branch_idx=0) \ + .add_sequence('lagged', 'ridge', branch_idx=1).join_branches('ridge') + + pipeline = pip_builder.build() + return pipeline + +This pipeline includes a 'polyfit' node in one branch and a sequence of 'lagged' and 'ridge' nodes in another, which are joined at the 'ridge' node. + +.. _ts_complex_ridge_pipeline: + +7. Complex Ridge Pipeline +------------------------- + +.. code-block:: python + + def ts_complex_ridge_pipeline(): + """ + Return pipeline with the following structure: + + .. image:: img_ts_pipelines/ts_complex_ridge_pipeline.png + :width: 55% + + """ + pip_builder = PipelineBuilder() \ + .add_sequence('lagged', 'ridge', branch_idx=0) \ + .add_sequence('lagged', 'ridge', branch_idx=1).join_branches('ridge') + + pipeline = pip_builder.build() + return pipeline + +This pipeline consists of two branches, each containing a 'lagged' and 'ridge' node, which are joined at the 'ridge' node. + +.. _ts_complex_ridge_smoothing_pipeline: + +8. Complex Ridge with Smoothing Pipeline +---------------------------------------- + +.. code-block:: python + + def ts_complex_ridge_smoothing_pipeline(): + """ + Pipeline looking like this + + .. image:: img_ts_pipelines/ts_complex_ridge_smoothing_pipeline.png + :width: 55% + + Where smoothing - rolling mean + """ + pip_builder = PipelineBuilder() \ + .add_sequence('smoothing', 'lagged', 'ridge', branch_idx=0) \ + .add_sequence('lagged', 'ridge', branch_idx=1).join_branches('ridge') + + pipeline = pip_builder.build() + return pipeline + +This pipeline includes a 'smoothing' node (rolling mean) followed by 'lagged' and 'ridge' nodes in one branch, and a 'lagged' and 'ridge' sequence in another, which are joined at the 'ridge' node. + +.. _ts_complex_dtreg_pipeline: + +9. 
Complex Decision Tree Regressor Pipeline +------------------------------------------- + +.. code-block:: python + + def ts_complex_dtreg_pipeline(first_node='lagged'): + """ + Return pipeline with the following structure: + + .. image:: img_ts_pipelines/ts_complex_dtreg_pipeline.png + :width: 55% + + Where dtreg = tree regressor, rfr - random forest regressor + """ + pip_builder = PipelineBuilder() \ + .add_sequence(first_node, 'dtreg', branch_idx=0) \ + .add_sequence(first_node, 'dtreg', branch_idx=1).join_branches('rfr') + + pipeline = pip_builder.build() + return pipeline + +This pipeline includes two branches, each starting with the specified 'first_node' followed by a 'dtreg' (Decision Tree Regressor) node, which are joined at the 'rfr' (Random Forest Regressor) node. + +.. _ts_multiple_ets_pipeline: + +10. Multiple Exponential Smoothing Pipeline +------------------------------------------- + +.. code-block:: python + + def ts_multiple_ets_pipeline(): + """ + Return pipeline with the following structure: + + .. image:: img_ts_pipelines/ts_multiple_ets_pipeline.png + :width: 55% + + Where ets - exponential_smoothing + """ + pip_builder = PipelineBuilder() \ + .add_sequence('ets', branch_idx=0) \ + .add_sequence('ets', branch_idx=1) \ + .add_sequence('ets', branch_idx=2) \ + .join_branches('lasso') + + pipeline = pip_builder.build() + return pipeline + +This pipeline includes three 'ets' (Exponential Smoothing) nodes in separate branches, which are joined at the 'lasso' node. + +.. _ts_ar_pipeline: + +11. Auto Regression Pipeline +---------------------------- + +.. code-block:: python + + def ts_ar_pipeline(): + """ + Return pipeline with the following structure: + + .. image:: img_ts_pipelines/ts_ar_pipeline.png + :width: 55% + + Where ar - auto regression + """ + pipeline = PipelineBuilder().add_node('ar').build() + return pipeline + +This simple pipeline uses an 'ar' (Auto Regression) node for time series forecasting. + +.. _ts_arima_pipeline: + +12. 
ARIMA Pipeline +------------------ + +.. code-block:: python + + def ts_arima_pipeline(): + """ + Return pipeline with the following structure: + + .. image:: img_ts_pipelines/ts_arima_pipeline.png + :width: 55% + + """ + pipeline = PipelineBuilder().add_node("arima").build() + return pipeline + +This pipeline uses an 'arima' node for time series forecasting, implementing the AutoRegressive Integrated Moving Average model. + +.. _ts_stl_arima_pipeline: + +13. STL-ARIMA Pipeline +---------------------- + +.. code-block:: python + + def ts_stl_arima_pipeline(): + """ + Return pipeline with the following structure: + + .. image:: img_ts_pipelines/ts_stl_arima_pipeline.png + :width: 55% + + """ + pipeline = PipelineBuilder().add_node("stl_arima").build() + return pipeline + +This pipeline uses an 'stl_arima' node, which combines Seasonal and Trend decomposition using Loess with the ARIMA model for time series forecasting. + +.. _ts_locf_ridge_pipeline: + +14. LOCF Ridge Pipeline +----------------------- + +.. code-block:: python + + def ts_locf_ridge_pipeline(): + """ + Pipeline with naive LOCF (last observation carried forward) model + and lagged features + + .. image:: img_ts_pipelines/ts_locf_ridge_pipeline.png + :width: 55% + + """ + pip_builder = PipelineBuilder() \ + .add_sequence('locf', branch_idx=0) \ + .add_sequence('ar', branch_idx=1) \ + .join_branches('ridge') + + pipeline = pip_builder.build() + return pipeline + +This pipeline has two parallel branches: a 'locf' node, a naive forecasting model that carries the last observation forward, and an 'ar' (auto regression) node. The outputs of both branches are joined at a final 'ridge' node. + +.. _ts_naive_average_ridge_pipeline: + +15. Naive Average Ridge Pipeline +-------------------------------- + +.. code-block:: python + + def ts_naive_average_ridge_pipeline(): + """ + Pipeline with simple forecasting model (the forecast is mean value for known + part) + + .. 
image:: img_ts_pipelines/ts_naive_average_ridge_pipeline.png + :width: 55% + + """ + pip_builder = PipelineBuilder() \ + .add_sequence('ts_naive_average', branch_idx=0) \ + .add_sequence('lagged', branch_idx=1) \ + .join_branches('ridge') + + pipeline = pip_builder.build() + return pipeline + +This pipeline has two parallel branches: a 'ts_naive_average' node, which forecasts the mean value of the known part of the series, and a 'lagged' node. The outputs of both branches are joined at a final 'ridge' node. + +.. _cgru_pipeline: + +16. Convolutional GRU Pipeline +------------------------------ + +.. code-block:: python + + def cgru_pipeline(window_size=200): + """ + Return pipeline with the following structure: + + .. image:: img_ts_pipelines/cgru_pipeline.png + :width: 55% + + Where cgru - convolutional gated recurrent unit model + """ + pip_builder = PipelineBuilder() \ + .add_sequence('lagged', 'ridge', branch_idx=0) \ + .add_sequence(('lagged', {'window_size': window_size}), 'cgru', branch_idx=1) \ + .join_branches('ridge') + + pipeline = pip_builder.build() + return pipeline + +This pipeline includes a 'lagged' node with a specified window size followed by a 'cgru' (Convolutional GRU) node in one branch, and a 'lagged' and 'ridge' sequence in another, which are joined at the 'ridge' node. + +This documentation provides a comprehensive guide to the various time series analysis pipelines available, each tailored to specific needs and scenarios. Users can copy and adapt these pipelines for their own projects, ensuring they understand the underlying logic and configuration of each node. \ No newline at end of file diff --git a/docs/source/examples/simple/tuning_pipelines.rst b/docs/source/examples/simple/tuning_pipelines.rst new file mode 100644 index 0000000000..e1cbce642f --- /dev/null +++ b/docs/source/examples/simple/tuning_pipelines.rst @@ -0,0 +1,132 @@ + +.. 
_time_series_forecasting_example: + +Time Series Forecasting Example +=========================================================== + +This example demonstrates the use of custom pipelines for time series forecasting, including optional tuning and visualization. The example runs the predefined `ts_locf_ridge_pipeline` on the `m4_monthly` dataset and evaluates the forecasts using root mean squared error (RMSE) and mean absolute error (MAE). + +Overview +-------- + +The example is structured to perform the following tasks: + +1. Load and prepare the dataset. +2. Apply a predefined pipeline to the dataset. +3. Evaluate the performance without tuning. +4. Optionally, tune the pipeline and re-evaluate the performance. +5. Visualize the results if required. + +Step-by-Step Guide +------------------ + +1. **Import Necessary Libraries** + + The example starts by importing the necessary libraries and modules required for the forecasting task. + + .. code-block:: python + + import numpy as np + from golem.core.tuning.simultaneous import SimultaneousTuner + from sklearn.metrics import mean_squared_error, mean_absolute_error + from examples.advanced.time_series_forecasting.composing_pipelines import visualise, get_border_line_info + from examples.simple.time_series_forecasting.api_forecasting import get_ts_data + from examples.simple.time_series_forecasting.ts_pipelines import ts_locf_ridge_pipeline + from fedot.core.pipelines.pipeline import Pipeline + from fedot.core.pipelines.tuning.tuner_builder import TunerBuilder + from fedot.core.repository.metrics_repository import RegressionMetricsEnum + +2. **Define the Experiment Function** + + The `run_experiment` function is defined to encapsulate the entire process of forecasting. It takes parameters for the dataset, pipeline, forecast length, and options for tuning and visualization. + + .. 
code-block:: python + + def run_experiment(dataset: str, pipeline: Pipeline, len_forecast=250, tuning=True, visualisalion=False): + """ Example of ts forecasting using custom pipelines with optional tuning + :param dataset: name of dataset + :param pipeline: pipeline to use + :param len_forecast: forecast length + :param tuning: is tuning needed + """ + # show initial pipeline + pipeline.print_structure() + +3. **Load and Prepare the Dataset** + + The dataset is loaded and split into training and testing sets. The target variable for the test set is also prepared. + + .. code-block:: python + + train_data, test_data, label = get_ts_data(dataset, len_forecast) + test_target = np.ravel(test_data.target) + +4. **Fit the Pipeline and Make Predictions** + + The pipeline is fitted on the training data and used to make predictions on the test data. + + .. code-block:: python + + pipeline.fit(train_data) + prediction = pipeline.predict(test_data) + predict = np.ravel(np.array(prediction.predict)) + +5. **Evaluate Performance Without Tuning** + + The performance of the pipeline without tuning is evaluated using RMSE and MAE. + + .. code-block:: python + + rmse = mean_squared_error(test_target, predict, squared=False) + mae = mean_absolute_error(test_target, predict) + metrics_info['Metrics without tuning'] = {'RMSE': round(rmse, 3), + 'MAE': round(mae, 3)} + +6. **Optionally Tune the Pipeline** + + If tuning is enabled, the pipeline is tuned using a tuner and the performance is re-evaluated. + + .. 
code-block:: python + + if tuning: + tuner = TunerBuilder(train_data.task) \ + .with_tuner(SimultaneousTuner) \ + .with_metric(RegressionMetricsEnum.MSE) \ + .with_iterations(300) \ + .build(train_data) + pipeline = tuner.tune(pipeline) + pipeline.fit(train_data) + prediction_after = pipeline.predict(test_data) + predict_after = np.ravel(np.array(prediction_after.predict)) + + rmse = mean_squared_error(test_target, predict_after, squared=False) + mae = mean_absolute_error(test_target, predict_after) + metrics_info['Metrics after tuning'] = {'RMSE': round(rmse, 3), + 'MAE': round(mae, 3)} + +7. **Visualize the Results** + + If visualization is enabled, the results are plotted. + + .. code-block:: python + + if visualisalion: + visualise(plot_info) + pipeline.print_structure() + +8. **Run the Experiment** + + The experiment is run with specific parameters. + + .. code-block:: python + + if __name__ == '__main__': + run_experiment('m4_monthly', ts_locf_ridge_pipeline(), len_forecast=10, tuning=True, visualisalion=True) + +Usage +----- + +To use this example, you can copy and paste the code into your Python environment. Ensure that you have the required libraries installed and that the dataset and pipeline are compatible with your use case. Adjust the parameters as needed to fit your specific forecasting task. + +.. note:: + This example assumes that the necessary modules and datasets are available in the specified paths. Make sure to set up your environment accordingly. 
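The evaluation snippets in steps 5 and 6 write into a `metrics_info` dictionary that the excerpts never create, and they call `mean_squared_error(..., squared=False)`, an argument that newer scikit-learn releases appear to have deprecated. As a minimal, dependency-free sketch of that evaluation step (not the example's own code: `rmse`, `mae`, and `collect_metrics` are hypothetical helper names, and `metrics_info` is assumed to be a plain dict), the two metrics could be computed like this:

```python
import math


def rmse(actual, predicted):
    """Root mean squared error of two equal-length, non-empty sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError('actual and predicted must be non-empty and of equal length')
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mae(actual, predicted):
    """Mean absolute error of two equal-length, non-empty sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError('actual and predicted must be non-empty and of equal length')
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def collect_metrics(actual, predicted, label, metrics_info=None):
    """Store rounded RMSE and MAE under `label`, creating the dict if needed."""
    # Assumption: metrics_info is a plain dict keyed by a description string,
    # mirroring how the snippets in steps 5 and 6 use it.
    metrics_info = {} if metrics_info is None else metrics_info
    metrics_info[label] = {'RMSE': round(rmse(actual, predicted), 3),
                           'MAE': round(mae(actual, predicted), 3)}
    return metrics_info


if __name__ == '__main__':
    # Toy values standing in for test_target and predict from the example.
    test_target = [3.0, 5.0, 2.0, 7.0]
    predict = [2.0, 5.0, 4.0, 7.0]
    print(collect_metrics(test_target, predict, 'Metrics without tuning'))
```

Passing the same dictionary back in through `metrics_info` for the tuned pipeline keeps both sets of metrics side by side, which is how the example reports results before and after tuning.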
\ No newline at end of file From 1351d45d32ad1ca0bcc9a8a31d2b6e36d17e7991 Mon Sep 17 00:00:00 2001 From: valer1435 Date: Sat, 8 Jun 2024 21:32:28 +0300 Subject: [PATCH 2/6] fix --- docs/source/examples/simple/cli_call_example.rst | 4 ++-- docs/source/examples/simple/index.rst | 1 + 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/source/examples/simple/cli_call_example.rst b/docs/source/examples/simple/cli_call_example.rst index ccb7f3195b..2f8d2a39ba 100644 --- a/docs/source/examples/simple/cli_call_example.rst +++ b/docs/source/examples/simple/cli_call_example.rst @@ -1,6 +1,6 @@ - +=========================== Fedot CLI Execution Example -========================== +=========================== This example demonstrates how to execute Fedot tasks (time series forecasting, classification, and regression) from the command line interface (CLI) using .bat files. The code provided manipulates the Python environment path in the .bat files to ensure correct execution and saves the predictions to a CSV file. 
diff --git a/docs/source/examples/simple/index.rst b/docs/source/examples/simple/index.rst index bb3e34c32c..f122d459e6 100644 --- a/docs/source/examples/simple/index.rst +++ b/docs/source/examples/simple/index.rst @@ -17,6 +17,7 @@ This section provides basic examples of FEDOT usage: fitted_values image_classification_problem multiclass_prediction + cli_call_example multiple_ts_forecasting_tasks pipeline_and_history_visualization pipeline_explain From ec10d75fe4c2ae8148a57198b4ab2222cd077e0f Mon Sep 17 00:00:00 2001 From: valer1435 Date: Sat, 8 Jun 2024 21:57:15 +0300 Subject: [PATCH 3/6] change structure --- .../classification_with_api_builder.rst | 0 .../examples/simple/api_builder/index.rst | 13 + .../multiple_ts_forecasting_tasks.rst | 0 .../api_classification.rst | 0 .../classification_with_tuning.rst | 0 .../image_classification_problem.rst | 0 .../examples/simple/classification/index.rst | 18 + .../multiclass_prediction.rst | 0 .../{ => classification}/resample_example.rst | 0 .../cli_call_example.rst | 0 .../examples/simple/cli_application/index.rst | 10 + docs/source/examples/simple/index.rst | 24 +- .../{ => interpretable}/api_explain.rst | 0 .../examples/simple/interpretable/index.rst | 12 + .../{ => interpretable}/pipeline_explain.rst | 0 .../{ => regression}/api_regression.rst | 0 .../examples/simple/regression/index.rst | 12 + .../regression_with_tuning.rst | 0 .../api_forecasting.rst | 0 .../{ => time_series_forecasting}/cgru.rst | 0 .../fitted_values.rst | 0 .../simple/time_series_forecasting/index.rst | 14 + .../tuning_pipelines.rst | 0 docs/source/examples/simple/ts_pipelines.rst | 398 ------------------ 24 files changed, 85 insertions(+), 416 deletions(-) rename docs/source/examples/simple/{ => api_builder}/classification_with_api_builder.rst (100%) create mode 100644 docs/source/examples/simple/api_builder/index.rst rename docs/source/examples/simple/{ => api_builder}/multiple_ts_forecasting_tasks.rst (100%) rename docs/source/examples/simple/{ 
=> classification}/api_classification.rst (100%) rename docs/source/examples/simple/{ => classification}/classification_with_tuning.rst (100%) rename docs/source/examples/simple/{ => classification}/image_classification_problem.rst (100%) create mode 100644 docs/source/examples/simple/classification/index.rst rename docs/source/examples/simple/{ => classification}/multiclass_prediction.rst (100%) rename docs/source/examples/simple/{ => classification}/resample_example.rst (100%) rename docs/source/examples/simple/{ => cli_application}/cli_call_example.rst (100%) create mode 100644 docs/source/examples/simple/cli_application/index.rst rename docs/source/examples/simple/{ => interpretable}/api_explain.rst (100%) create mode 100644 docs/source/examples/simple/interpretable/index.rst rename docs/source/examples/simple/{ => interpretable}/pipeline_explain.rst (100%) rename docs/source/examples/simple/{ => regression}/api_regression.rst (100%) create mode 100644 docs/source/examples/simple/regression/index.rst rename docs/source/examples/simple/{ => regression}/regression_with_tuning.rst (100%) rename docs/source/examples/simple/{ => time_series_forecasting}/api_forecasting.rst (100%) rename docs/source/examples/simple/{ => time_series_forecasting}/cgru.rst (100%) rename docs/source/examples/simple/{ => time_series_forecasting}/fitted_values.rst (100%) create mode 100644 docs/source/examples/simple/time_series_forecasting/index.rst rename docs/source/examples/simple/{ => time_series_forecasting}/tuning_pipelines.rst (100%) delete mode 100644 docs/source/examples/simple/ts_pipelines.rst diff --git a/docs/source/examples/simple/classification_with_api_builder.rst b/docs/source/examples/simple/api_builder/classification_with_api_builder.rst similarity index 100% rename from docs/source/examples/simple/classification_with_api_builder.rst rename to docs/source/examples/simple/api_builder/classification_with_api_builder.rst diff --git 
a/docs/source/examples/simple/api_builder/index.rst b/docs/source/examples/simple/api_builder/index.rst new file mode 100644 index 0000000000..14ec3d52ff --- /dev/null +++ b/docs/source/examples/simple/api_builder/index.rst @@ -0,0 +1,13 @@ +Simple example +====================== + +This section provides basic examples of FEDOT usage: + +.. toctree:: + :glob: + :maxdepth: 1 + + classification_with_api_builder + multiple_ts_forecasting_tasks + + diff --git a/docs/source/examples/simple/multiple_ts_forecasting_tasks.rst b/docs/source/examples/simple/api_builder/multiple_ts_forecasting_tasks.rst similarity index 100% rename from docs/source/examples/simple/multiple_ts_forecasting_tasks.rst rename to docs/source/examples/simple/api_builder/multiple_ts_forecasting_tasks.rst diff --git a/docs/source/examples/simple/api_classification.rst b/docs/source/examples/simple/classification/api_classification.rst similarity index 100% rename from docs/source/examples/simple/api_classification.rst rename to docs/source/examples/simple/classification/api_classification.rst diff --git a/docs/source/examples/simple/classification_with_tuning.rst b/docs/source/examples/simple/classification/classification_with_tuning.rst similarity index 100% rename from docs/source/examples/simple/classification_with_tuning.rst rename to docs/source/examples/simple/classification/classification_with_tuning.rst diff --git a/docs/source/examples/simple/image_classification_problem.rst b/docs/source/examples/simple/classification/image_classification_problem.rst similarity index 100% rename from docs/source/examples/simple/image_classification_problem.rst rename to docs/source/examples/simple/classification/image_classification_problem.rst diff --git a/docs/source/examples/simple/classification/index.rst b/docs/source/examples/simple/classification/index.rst new file mode 100644 index 0000000000..c2b912747e --- /dev/null +++ b/docs/source/examples/simple/classification/index.rst @@ -0,0 +1,18 @@ +Simple 
example +====================== + +This section provides basic examples of FEDOT usage: + +.. toctree:: + :glob: + :maxdepth: 1 + + api_classification + + classification_with_api_builder + classification_with_tuning + image_classification_problem + multiclass_prediction + resample_example + ts_pipelines + diff --git a/docs/source/examples/simple/multiclass_prediction.rst b/docs/source/examples/simple/classification/multiclass_prediction.rst similarity index 100% rename from docs/source/examples/simple/multiclass_prediction.rst rename to docs/source/examples/simple/classification/multiclass_prediction.rst diff --git a/docs/source/examples/simple/resample_example.rst b/docs/source/examples/simple/classification/resample_example.rst similarity index 100% rename from docs/source/examples/simple/resample_example.rst rename to docs/source/examples/simple/classification/resample_example.rst diff --git a/docs/source/examples/simple/cli_call_example.rst b/docs/source/examples/simple/cli_application/cli_call_example.rst similarity index 100% rename from docs/source/examples/simple/cli_call_example.rst rename to docs/source/examples/simple/cli_application/cli_call_example.rst diff --git a/docs/source/examples/simple/cli_application/index.rst b/docs/source/examples/simple/cli_application/index.rst new file mode 100644 index 0000000000..fb26ba9658 --- /dev/null +++ b/docs/source/examples/simple/cli_application/index.rst @@ -0,0 +1,10 @@ +Simple example +====================== + +This section provides basic examples of FEDOT usage: + +.. toctree:: + :glob: + :maxdepth: 1 + + cli_call_example diff --git a/docs/source/examples/simple/index.rst b/docs/source/examples/simple/index.rst index f122d459e6..c8f6bc6ec1 100644 --- a/docs/source/examples/simple/index.rst +++ b/docs/source/examples/simple/index.rst @@ -5,29 +5,17 @@ This section provides basic examples of FEDOT usage: .. 
toctree:: :glob: - :maxdepth: 2 + :maxdepth: 1 - api_classification - api_explain - api_forecasting - api_regression - cgru - classification_with_api_builder - classification_with_tuning - fitted_values - image_classification_problem - multiclass_prediction - cli_call_example - multiple_ts_forecasting_tasks + api_builder/index + classification/index + cli_application/index + interpretable/index + regression/index pipeline_and_history_visualization - pipeline_explain pipeline_import_export pipeline_log pipeline_tune pipeline_tuning_with_iopt pipeline_visualization - regression_with_tuning - resample_example - ts_pipelines - tuning_pipelines diff --git a/docs/source/examples/simple/api_explain.rst b/docs/source/examples/simple/interpretable/api_explain.rst similarity index 100% rename from docs/source/examples/simple/api_explain.rst rename to docs/source/examples/simple/interpretable/api_explain.rst diff --git a/docs/source/examples/simple/interpretable/index.rst b/docs/source/examples/simple/interpretable/index.rst new file mode 100644 index 0000000000..d9a82ac88f --- /dev/null +++ b/docs/source/examples/simple/interpretable/index.rst @@ -0,0 +1,12 @@ +Simple example +====================== + +This section provides basic examples of FEDOT usage: + +.. 
toctree:: + :glob: + :maxdepth: 1 + + api_explain + pipeline_explain + diff --git a/docs/source/examples/simple/pipeline_explain.rst b/docs/source/examples/simple/interpretable/pipeline_explain.rst similarity index 100% rename from docs/source/examples/simple/pipeline_explain.rst rename to docs/source/examples/simple/interpretable/pipeline_explain.rst diff --git a/docs/source/examples/simple/api_regression.rst b/docs/source/examples/simple/regression/api_regression.rst similarity index 100% rename from docs/source/examples/simple/api_regression.rst rename to docs/source/examples/simple/regression/api_regression.rst diff --git a/docs/source/examples/simple/regression/index.rst b/docs/source/examples/simple/regression/index.rst new file mode 100644 index 0000000000..ce6ac2f5fb --- /dev/null +++ b/docs/source/examples/simple/regression/index.rst @@ -0,0 +1,12 @@ +Simple example +====================== + +This section provides basic examples of FEDOT usage: + +.. toctree:: + :glob: + :maxdepth: 1 + + api_regression + regression_with_tuning + diff --git a/docs/source/examples/simple/regression_with_tuning.rst b/docs/source/examples/simple/regression/regression_with_tuning.rst similarity index 100% rename from docs/source/examples/simple/regression_with_tuning.rst rename to docs/source/examples/simple/regression/regression_with_tuning.rst diff --git a/docs/source/examples/simple/api_forecasting.rst b/docs/source/examples/simple/time_series_forecasting/api_forecasting.rst similarity index 100% rename from docs/source/examples/simple/api_forecasting.rst rename to docs/source/examples/simple/time_series_forecasting/api_forecasting.rst diff --git a/docs/source/examples/simple/cgru.rst b/docs/source/examples/simple/time_series_forecasting/cgru.rst similarity index 100% rename from docs/source/examples/simple/cgru.rst rename to docs/source/examples/simple/time_series_forecasting/cgru.rst diff --git a/docs/source/examples/simple/fitted_values.rst 
b/docs/source/examples/simple/time_series_forecasting/fitted_values.rst similarity index 100% rename from docs/source/examples/simple/fitted_values.rst rename to docs/source/examples/simple/time_series_forecasting/fitted_values.rst diff --git a/docs/source/examples/simple/time_series_forecasting/index.rst b/docs/source/examples/simple/time_series_forecasting/index.rst new file mode 100644 index 0000000000..1010b987de --- /dev/null +++ b/docs/source/examples/simple/time_series_forecasting/index.rst @@ -0,0 +1,14 @@ +Simple example +====================== + +This section provides basic examples of FEDOT usage: + +.. toctree:: + :glob: + :maxdepth: 1 + + api_forecasting + cgru + fitted_values + tuning_pipelines + diff --git a/docs/source/examples/simple/tuning_pipelines.rst b/docs/source/examples/simple/time_series_forecasting/tuning_pipelines.rst similarity index 100% rename from docs/source/examples/simple/tuning_pipelines.rst rename to docs/source/examples/simple/time_series_forecasting/tuning_pipelines.rst diff --git a/docs/source/examples/simple/ts_pipelines.rst b/docs/source/examples/simple/ts_pipelines.rst deleted file mode 100644 index c31d3dede6..0000000000 --- a/docs/source/examples/simple/ts_pipelines.rst +++ /dev/null @@ -1,398 +0,0 @@ -.. _ts_pipelines_doc: - -Time Series Pipelines Documentation -=============================================================== - -This documentation provides an overview and detailed explanation of various time series analysis pipelines implemented using the `fedot.core.pipelines.pipeline_builder` module. Each pipeline is designed to handle different aspects of time series data, including preprocessing, feature engineering, and model training. - -.. note:: - Ensure you have the necessary dependencies installed to run these pipelines. - -.. _ts_ets_pipeline: - -1. Exponential Smoothing Pipeline ---------------------------------- - -.. 
code-block:: python - - def ts_ets_pipeline(): - """ - Return pipeline with the following structure: - - .. image:: img_ts_pipelines/ts_ets_pipeline.png - :width: 55% - - Where cut - cut part of dataset and ets - exponential smoothing - """ - pip_builder = PipelineBuilder().add_node('cut').add_node('ets', - params={'error': 'add', - 'trend': 'add', - 'seasonal': 'add', - 'damped_trend': False, - 'seasonal_periods': 20}) - pipeline = pip_builder.build() - return pipeline - -This pipeline starts with a 'cut' operation to select a portion of the dataset, followed by an 'ets' (Exponential Smoothing) node for time series forecasting. The 'ets' node is configured with parameters specifying additive error, trend, and seasonal components. - -.. _ts_ets_ridge_pipeline: - -2. Exponential Smoothing with Ridge Pipeline --------------------------------------------- - -.. code-block:: python - - def ts_ets_ridge_pipeline(): - """ - Return pipeline with the following structure: - - .. image:: img_ts_pipelines/ts_ets_ridge_pipeline.png - :width: 55% - - Where cut - cut part of dataset, ets - exponential smoothing - """ - pip_builder = PipelineBuilder() \ - .add_sequence(('cut', {'cut_part': 0.5}), - ('ets', {'error': 'add', 'trend': 'add', 'seasonal': 'add', - 'damped_trend': False, 'seasonal_periods': 20}), - branch_idx=0) \ - .add_sequence('lagged', 'ridge', branch_idx=1).join_branches('ridge') - - pipeline = pip_builder.build() - return pipeline - -This pipeline includes a sequence of operations starting with a 'cut' node to reduce the dataset size, followed by an 'ets' node. A separate branch with 'lagged' and 'ridge' nodes is then joined at the 'ridge' node. - -.. _ts_glm_pipeline: - -3. Generalized Linear Model Pipeline ------------------------------------- - -.. code-block:: python - - def ts_glm_pipeline(): - """ - Return pipeline with the following structure: - - .. 
image:: img_ts_pipelines/ts_glm_pipeline.png - :width: 55% - - Where glm - Generalized linear model - """ - pipeline = PipelineBuilder().add_node('glm', params={'family': 'gaussian'}).build() - return pipeline - -This simple pipeline uses a 'glm' (Generalized Linear Model) node with a Gaussian family for modeling. - -.. _ts_glm_ridge_pipeline: - -4. Generalized Linear Model with Ridge Pipeline ------------------------------------------------ - -.. code-block:: python - - def ts_glm_ridge_pipeline(): - """ - Return pipeline with the following structure: - - .. image:: img_ts_pipelines/ts_glm_ridge_pipeline.png - :width: 55% - - Where glm - Generalized linear model - """ - pip_builder = PipelineBuilder() \ - .add_sequence('glm', branch_idx=0) \ - .add_sequence('lagged', 'ridge', branch_idx=1).join_branches('ridge') - - pipeline = pip_builder.build() - return pipeline - -This pipeline includes a 'glm' node in one branch and a sequence of 'lagged' and 'ridge' nodes in another, which are joined at the 'ridge' node. - -.. _ts_polyfit_pipeline: - -5. Polynomial Interpolation Pipeline ------------------------------------- - -.. code-block:: python - - def ts_polyfit_pipeline(degree): - """ - Return pipeline with the following structure: - - .. image:: img_ts_pipelines/ts_polyfit_pipeline.png - :width: 55% - - Where polyfit - Polynomial interpolation - """ - pipeline = PipelineBuilder().add_node('polyfit', params={'degree': degree}).build() - return pipeline - -This pipeline uses a 'polyfit' node for polynomial interpolation, with the degree of the polynomial specified as a parameter. - -.. _ts_polyfit_ridge_pipeline: - -6. Polynomial Interpolation with Ridge Pipeline ------------------------------------------------ - -.. code-block:: python - - def ts_polyfit_ridge_pipeline(degree): - """ - Return pipeline with the following structure: - - .. 
image:: img_ts_pipelines/ts_polyfit_ridge_pipeline.png - :width: 55% - - Where polyfit - Polynomial interpolation - """ - pip_builder = PipelineBuilder() \ - .add_sequence(('polyfit', {'degree': degree}), branch_idx=0) \ - .add_sequence('lagged', 'ridge', branch_idx=1).join_branches('ridge') - - pipeline = pip_builder.build() - return pipeline - -This pipeline includes a 'polyfit' node in one branch and a sequence of 'lagged' and 'ridge' nodes in another, which are joined at the 'ridge' node. - -.. _ts_complex_ridge_pipeline: - -7. Complex Ridge Pipeline -------------------------- - -.. code-block:: python - - def ts_complex_ridge_pipeline(): - """ - Return pipeline with the following structure: - - .. image:: img_ts_pipelines/ts_complex_ridge_pipeline.png - :width: 55% - - """ - pip_builder = PipelineBuilder() \ - .add_sequence('lagged', 'ridge', branch_idx=0) \ - .add_sequence('lagged', 'ridge', branch_idx=1).join_branches('ridge') - - pipeline = pip_builder.build() - return pipeline - -This pipeline consists of two branches, each containing a 'lagged' and 'ridge' node, which are joined at the 'ridge' node. - -.. _ts_complex_ridge_smoothing_pipeline: - -8. Complex Ridge with Smoothing Pipeline ----------------------------------------- - -.. code-block:: python - - def ts_complex_ridge_smoothing_pipeline(): - """ - Pipeline looking like this - - .. image:: img_ts_pipelines/ts_complex_ridge_smoothing_pipeline.png - :width: 55% - - Where smoothing - rolling mean - """ - pip_builder = PipelineBuilder() \ - .add_sequence('smoothing', 'lagged', 'ridge', branch_idx=0) \ - .add_sequence('lagged', 'ridge', branch_idx=1).join_branches('ridge') - - pipeline = pip_builder.build() - return pipeline - -This pipeline includes a 'smoothing' node (rolling mean) followed by 'lagged' and 'ridge' nodes in one branch, and a 'lagged' and 'ridge' sequence in another, which are joined at the 'ridge' node. - -.. _ts_complex_dtreg_pipeline: - -9. 
Complex Decision Tree Regressor Pipeline -------------------------------------------- - -.. code-block:: python - - def ts_complex_dtreg_pipeline(first_node='lagged'): - """ - Return pipeline with the following structure: - - .. image:: img_ts_pipelines/ts_complex_dtreg_pipeline.png - :width: 55% - - Where dtreg = tree regressor, rfr - random forest regressor - """ - pip_builder = PipelineBuilder() \ - .add_sequence(first_node, 'dtreg', branch_idx=0) \ - .add_sequence(first_node, 'dtreg', branch_idx=1).join_branches('rfr') - - pipeline = pip_builder.build() - return pipeline - -This pipeline includes two branches, each starting with the specified 'first_node' followed by a 'dtreg' (Decision Tree Regressor) node, which are joined at the 'rfr' (Random Forest Regressor) node. - -.. _ts_multiple_ets_pipeline: - -10. Multiple Exponential Smoothing Pipeline -------------------------------------------- - -.. code-block:: python - - def ts_multiple_ets_pipeline(): - """ - Return pipeline with the following structure: - - .. image:: img_ts_pipelines/ts_multiple_ets_pipeline.png - :width: 55% - - Where ets - exponential_smoothing - """ - pip_builder = PipelineBuilder() \ - .add_sequence('ets', branch_idx=0) \ - .add_sequence('ets', branch_idx=1) \ - .add_sequence('ets', branch_idx=2) \ - .join_branches('lasso') - - pipeline = pip_builder.build() - return pipeline - -This pipeline includes three 'ets' (Exponential Smoothing) nodes in separate branches, which are joined at the 'lasso' node. - -.. _ts_ar_pipeline: - -11. Auto Regression Pipeline ----------------------------- - -.. code-block:: python - - def ts_ar_pipeline(): - """ - Return pipeline with the following structure: - - .. image:: img_ts_pipelines/ts_ar_pipeline.png - :width: 55% - - Where ar - auto regression - """ - pipeline = PipelineBuilder().add_node('ar').build() - return pipeline - -This simple pipeline uses an 'ar' (Auto Regression) node for time series forecasting. - -.. _ts_arima_pipeline: - -12. 
ARIMA Pipeline ------------------- - -.. code-block:: python - - def ts_arima_pipeline(): - """ - Return pipeline with the following structure: - - .. image:: img_ts_pipelines/ts_arima_pipeline.png - :width: 55% - - """ - pipeline = PipelineBuilder().add_node("arima").build() - return pipeline - -This pipeline uses an 'arima' node for time series forecasting, implementing the AutoRegressive Integrated Moving Average model. - -.. _ts_stl_arima_pipeline: - -13. STL-ARIMA Pipeline ----------------------- - -.. code-block:: python - - def ts_stl_arima_pipeline(): - """ - Return pipeline with the following structure: - - .. image:: img_ts_pipelines/ts_stl_arima_pipeline.png - :width: 55% - - """ - pipeline = PipelineBuilder().add_node("stl_arima").build() - return pipeline - -This pipeline uses an 'stl_arima' node, which combines Seasonal and Trend decomposition using Loess with the ARIMA model for time series forecasting. - -.. _ts_locf_ridge_pipeline: - -14. LOCF Ridge Pipeline ------------------------ - -.. code-block:: python - - def ts_locf_ridge_pipeline(): - """ - Pipeline with naive LOCF (last observation carried forward) model - and lagged features - - .. image:: img_ts_pipelines/ts_locf_ridge_pipeline.png - :width: 55% - - """ - pip_builder = PipelineBuilder() \ - .add_sequence('locf', branch_idx=0) \ - .add_sequence('ar', branch_idx=1) \ - .join_branches('ridge') - - pipeline = pip_builder.build() - return pipeline - -This pipeline includes a 'locf' node for handling missing values using the Last Observation Carried Forward method, followed by an 'ar' node for auto regression, which is then joined with a 'ridge' node. - -.. _ts_naive_average_ridge_pipeline: - -15. Naive Average Ridge Pipeline --------------------------------- - -.. code-block:: python - - def ts_naive_average_ridge_pipeline(): - """ - Pipeline with simple forecasting model (the forecast is mean value for known - part) - - .. 
image:: img_ts_pipelines/ts_naive_average_ridge_pipeline.png - :width: 55% - - """ - pip_builder = PipelineBuilder() \ - .add_sequence('ts_naive_average', branch_idx=0) \ - .add_sequence('lagged', branch_idx=1) \ - .join_branches('ridge') - - pipeline = pip_builder.build() - return pipeline - -This pipeline starts with a 'ts_naive_average' node for simple forecasting based on the mean value of known data, followed by a 'lagged' node, which is then joined with a 'ridge' node. - -.. _cgru_pipeline: - -16. Convolutional GRU Pipeline ------------------------------- - -.. code-block:: python - - def cgru_pipeline(window_size=200): - """ - Return pipeline with the following structure: - - .. image:: img_ts_pipelines/cgru_pipeline.png - :width: 55% - - Where cgru - convolutional long short-term memory model - """ - pip_builder = PipelineBuilder() \ - .add_sequence('lagged', 'ridge', branch_idx=0) \ - .add_sequence(('lagged', {'window_size': window_size}), 'cgru', branch_idx=1) \ - .join_branches('ridge') - - pipeline = pip_builder.build() - return pipeline - -This pipeline includes a 'lagged' node with a specified window size followed by a 'cgru' (Convolutional GRU) node in one branch, and a 'lagged' and 'ridge' sequence in another, which are joined at the 'ridge' node. - -This documentation provides a comprehensive guide to the various time series analysis pipelines available, each tailored to specific needs and scenarios. Users can copy and adapt these pipelines for their own projects, ensuring they understand the underlying logic and configuration of each node. 
\ No newline at end of file From afd1f2a38be2c551629e1799dd8f4318a3a9e43e Mon Sep 17 00:00:00 2001 From: valer1435 Date: Sat, 8 Jun 2024 22:08:58 +0300 Subject: [PATCH 4/6] fix --- docs/source/examples/simple/classification/index.rst | 3 --- 1 file changed, 3 deletions(-) diff --git a/docs/source/examples/simple/classification/index.rst b/docs/source/examples/simple/classification/index.rst index c2b912747e..26cd3f9344 100644 --- a/docs/source/examples/simple/classification/index.rst +++ b/docs/source/examples/simple/classification/index.rst @@ -8,11 +8,8 @@ This section provides basic examples of FEDOT usage: :maxdepth: 1 api_classification - - classification_with_api_builder classification_with_tuning image_classification_problem multiclass_prediction resample_example - ts_pipelines From 9afe8b99f7cdccf44dbd01898a7698d81745ea0f Mon Sep 17 00:00:00 2001 From: valer1435 Date: Sat, 8 Jun 2024 22:18:01 +0300 Subject: [PATCH 5/6] fix --- docs/source/examples/simple/index.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/source/examples/simple/index.rst b/docs/source/examples/simple/index.rst index c8f6bc6ec1..6e4dd7ab92 100644 --- a/docs/source/examples/simple/index.rst +++ b/docs/source/examples/simple/index.rst @@ -11,6 +11,7 @@ This section provides basic examples of FEDOT usage: classification/index cli_application/index interpretable/index + time_series_forecasting/index regression/index pipeline_and_history_visualization pipeline_import_export From 7ff925b7770234773c63a8dbd300f91395b83ff0 Mon Sep 17 00:00:00 2001 From: valer1435 Date: Sat, 8 Jun 2024 22:30:14 +0300 Subject: [PATCH 6/6] fix --- docs/source/examples/simple/api_builder/index.rst | 2 +- docs/source/examples/simple/classification/index.rst | 2 +- docs/source/examples/simple/cli_application/index.rst | 2 +- docs/source/examples/simple/interpretable/index.rst | 2 +- docs/source/examples/simple/regression/index.rst | 2 +- docs/source/examples/simple/time_series_forecasting/index.rst | 4 ++-- 6 
files changed, 7 insertions(+), 7 deletions(-)

diff --git a/docs/source/examples/simple/api_builder/index.rst b/docs/source/examples/simple/api_builder/index.rst
index 14ec3d52ff..876ceb63f3 100644
--- a/docs/source/examples/simple/api_builder/index.rst
+++ b/docs/source/examples/simple/api_builder/index.rst
@@ -1,4 +1,4 @@
-Simple example
+Api builder
 ======================
 
 This section provides basic examples of FEDOT usage:
diff --git a/docs/source/examples/simple/classification/index.rst b/docs/source/examples/simple/classification/index.rst
index 26cd3f9344..a13adf345b 100644
--- a/docs/source/examples/simple/classification/index.rst
+++ b/docs/source/examples/simple/classification/index.rst
@@ -1,4 +1,4 @@
-Simple example
+Classification
 ======================
 
 This section provides basic examples of FEDOT usage:
diff --git a/docs/source/examples/simple/cli_application/index.rst b/docs/source/examples/simple/cli_application/index.rst
index fb26ba9658..2508a6580e 100644
--- a/docs/source/examples/simple/cli_application/index.rst
+++ b/docs/source/examples/simple/cli_application/index.rst
@@ -1,4 +1,4 @@
-Simple example
+Cli application
 ======================
 
 This section provides basic examples of FEDOT usage:
diff --git a/docs/source/examples/simple/interpretable/index.rst b/docs/source/examples/simple/interpretable/index.rst
index d9a82ac88f..cd04c8fb18 100644
--- a/docs/source/examples/simple/interpretable/index.rst
+++ b/docs/source/examples/simple/interpretable/index.rst
@@ -1,4 +1,4 @@
-Simple example
+Interpretability
 ======================
 
 This section provides basic examples of FEDOT usage:
diff --git a/docs/source/examples/simple/regression/index.rst b/docs/source/examples/simple/regression/index.rst
index ce6ac2f5fb..ab2397f014 100644
--- a/docs/source/examples/simple/regression/index.rst
+++ b/docs/source/examples/simple/regression/index.rst
@@ -1,4 +1,4 @@
-Simple example
+Regression
 ======================
 
 This section provides basic examples of FEDOT usage:
diff --git a/docs/source/examples/simple/time_series_forecasting/index.rst b/docs/source/examples/simple/time_series_forecasting/index.rst
index 1010b987de..00a0a00684 100644
--- a/docs/source/examples/simple/time_series_forecasting/index.rst
+++ b/docs/source/examples/simple/time_series_forecasting/index.rst
@@ -1,5 +1,5 @@
-Simple example
-======================
+Time series forecasting
+=========================
 
 This section provides basic examples of FEDOT usage:
 