Merge pull request #638 from GitHubber17/300645-c
Freshness - Machine Learning how-to and concepts
v-dirichards authored Oct 7, 2024
2 parents 34f671d + 1c247d8 commit eb04bbc
Showing 8 changed files with 286 additions and 293 deletions.
@@ -28,7 +28,7 @@ The many models [components](concept-component.md) in AutoML enable you to train

The many models training component applies AutoML's [model sweeping and selection](concept-automl-forecasting-sweeping.md) independently to each store in this example. This model independence aids scalability and can benefit model accuracy especially when the stores have diverging sales dynamics. However, a single model approach may yield more accurate forecasts when there are common sales dynamics. See the [distributed DNN training](#distributed-dnn-training-preview) section for more details on that case.

-You can configure the data partitioning, the [AutoML settings](how-to-auto-train-forecast.md#configure-experiment) for the models, and the degree of parallelism for many models training jobs. For examples, see our guide section on [many models components](how-to-auto-train-forecast.md#forecasting-at-scale-many-models).
+You can configure the data partitioning, the [AutoML settings](how-to-auto-train-forecast.md#configure-experiment) for the models, and the degree of parallelism for many models training jobs. For examples, see our guide section on [many models components](how-to-auto-train-forecast.md#forecast-at-scale-many-models).
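
As a rough sketch, the following Python (SDK v2, `azure-ai-ml`) shows the kind of per-series AutoML forecasting configuration that the many models setup builds on. The compute name, data path, and column names are placeholders; the many models components additionally take the partition column names and parallelism settings as inputs, as described in the linked guide.

```python
# Illustrative sketch only: placeholder compute, data path, and column names.
# The per-series AutoML settings below are what the many models training
# component applies independently to each data partition (for example, each store).
from azure.ai.ml import automl, Input

forecasting_job = automl.forecasting(
    compute="forecasting-cluster",                        # placeholder compute target
    experiment_name="many-models-sales",
    training_data=Input(type="mltable", path="./train_mltable"),
    target_column_name="sales",
    primary_metric="normalized_root_mean_squared_error",
    n_cross_validations="auto",
)

# Time series settings applied to every partition.
forecasting_job.set_forecast_settings(
    time_column_name="date",
    forecast_horizon=14,
    time_series_id_column_names=["store_id"],
)

# Per-partition limits; the many models component controls the overall degree of parallelism.
forecasting_job.set_limits(timeout_minutes=120, trial_timeout_minutes=30, max_trials=10)
```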

## Hierarchical time series forecasting

@@ -49,7 +49,7 @@ AutoML supports the following features for hierarchical time series (HTS):
* **Retrieving quantile/probabilistic forecasts for levels at or "below" the training level**. Current modeling capabilities support disaggregation of probabilistic forecasts.

HTS components in AutoML are built on top of [many models](#many-models), so HTS shares the scalable properties of many models.
-For examples, see our guide section on [HTS components](how-to-auto-train-forecast.md#forecasting-at-scale-hierarchical-time-series).
+For examples, see our guide section on [HTS components](how-to-auto-train-forecast.md#forecast-at-scale-hierarchical-time-series).
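
To make the hierarchy idea concrete, here's a small pandas sketch (not AutoML code) that aggregates a sales table to a chosen training level of a hypothetical state > store > SKU hierarchy. HTS training fits one model per series at that level and then allocates forecasts to the lower levels.

```python
# Conceptual illustration only; the column names and hierarchy are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "date":  pd.to_datetime(["2024-01-01"] * 4 + ["2024-01-02"] * 4),
    "state": ["WA", "WA", "WA", "WA"] * 2,
    "store": ["S1", "S1", "S2", "S2"] * 2,
    "sku":   ["A", "B", "A", "B"] * 2,
    "sales": [10, 4, 7, 3, 12, 5, 6, 2],
})

# Hierarchy from most to least aggregate: state > store > sku.
hierarchy = ["state", "store", "sku"]
training_level = "store"  # one model per series at this level

# Aggregate leaf-level data up to the training level before model fitting.
group_cols = hierarchy[: hierarchy.index(training_level) + 1]
train_df = df.groupby(group_cols + ["date"], as_index=False)["sales"].sum()
print(train_df)
```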

## Distributed DNN training (preview)

@@ -78,7 +78,7 @@ In the table, $n_{\text{input}} = n_{\text{features}} + 1$, the number of predic

## TCNForecaster in AutoML

-TCNForecaster is an optional model in AutoML. To learn how to use it, see [enable deep learning](./how-to-auto-train-forecast.md#enable-deep-learning).
+TCNForecaster is an optional model in AutoML. To learn how to use it, see [enable deep learning](./how-to-auto-train-forecast.md#enable-learning-for-deep-neural-networks).
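
As a minimal sketch, and assuming a forecasting job object created with the SDK v2 as in the linked guide, deep learning models such as TCNForecaster are switched on through the training settings:

```python
# Sketch: enable DNN models (including TCNForecaster) on an AutoML forecasting job.
# Assumes `forecasting_job` was created with automl.forecasting(...) as in the guide.
forecasting_job.set_training(enable_dnn_training=True)

# Optionally restrict the model search to TCNForecaster only.
# forecasting_job.set_training(
#     enable_dnn_training=True,
#     allowed_training_algorithms=["TCNForecaster"],
# )
```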

In this section, we describe how AutoML builds TCNForecaster models with your data, including explanations of data preprocessing, training, and model search.

@@ -90,9 +90,9 @@ AutoML executes several preprocessing steps on your data to prepare for model tr
|--|--|
|Fill missing data|[Impute missing values and observation gaps](./concept-automl-forecasting-methods.md#missing-data-handling) and optionally [pad or drop short time series](./how-to-auto-train-forecast.md#short-series-handling)|
|Create calendar features|Augment the input data with [features derived from the calendar](./concept-automl-forecasting-calendar-features.md) like day of the week and, optionally, holidays for a specific country/region.|
-|Encode categorical data|[Label encode](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html) strings and other categorical types; this includes all [time series ID columns](./how-to-auto-train-forecast.md#forecasting-job-settings).|
+|Encode categorical data|[Label encode](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html) strings and other categorical types; this includes all [time series ID columns](./how-to-auto-train-forecast.md#forecast-job-settings).|
|Target transform|Optionally apply the natural logarithm function to the target depending on the results of certain statistical tests.|
-|Normalization|[Z-score normalize](https://en.wikipedia.org/wiki/Standard_score) all numeric data; normalization is performed per feature and per time series group, as defined by the [time series ID columns](./how-to-auto-train-forecast.md#forecasting-job-settings).|
+|Normalization|[Z-score normalize](https://en.wikipedia.org/wiki/Standard_score) all numeric data; normalization is performed per feature and per time series group, as defined by the [time series ID columns](./how-to-auto-train-forecast.md#forecast-job-settings).|

These steps are included in AutoML's transform pipelines, so they're automatically applied when needed at inference time. In some cases, the inverse operation to a step is included in the inference pipeline. For example, if AutoML applied a $\log$ transform to the target during training, the raw forecasts are exponentiated in the inference pipeline.
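
The following standalone sketch (not AutoML's internal code) illustrates two of these steps: per-series z-score normalization and the log transform with its inverse applied to forecasts.

```python
# Conceptual illustration of per-series normalization and the log/exp round trip.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "store_id": ["S1"] * 3 + ["S2"] * 3,
    "sales": [10.0, 12.0, 11.0, 100.0, 90.0, 110.0],
})

# Z-score normalize the target separately for each time series group.
grouped = df.groupby("store_id")["sales"]
df["sales_normalized"] = (df["sales"] - grouped.transform("mean")) / grouped.transform("std")

# If a log transform is applied to the target during training ...
y_train_transformed = np.log(df["sales"])

# ... then raw model outputs on the log scale are exponentiated at inference time.
raw_forecast_log_scale = np.array([2.5, 4.6])
forecast = np.exp(raw_forecast_log_scale)
print(df, forecast, sep="\n")
```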

@@ -104,7 +104,7 @@ The following table lists and describes input settings and parameters for TCNFor

|Training input|Description|Value|
|--|--|--|
-|Validation data|A portion of data that is held out from training to guide the network optimization and mitigate overfitting.| [Provided by the user](./how-to-auto-train-forecast.md#training-and-validation-data) or automatically created from training data if not provided.|
+|Validation data|A portion of data that is held out from training to guide the network optimization and mitigate overfitting.| [Provided by the user](./how-to-auto-train-forecast.md#prepare-training-and-validation-data) or automatically created from training data if not provided.|
|Primary metric|Metric computed from median-value forecasts on the validation data at the end of each training epoch; used for early stopping and model selection.|[Chosen by the user](./how-to-auto-train-forecast.md#configure-experiment); normalized root mean squared error or normalized mean absolute error.|
|Training epochs|Maximum number of epochs to run for network weight optimization.|100; automated early stopping logic may terminate training at a smaller number of epochs.|
|Early stopping patience|Number of epochs to wait for primary metric improvement before training is stopped.|20|
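
As a rough illustration of how the early stopping settings in this table interact (a sketch, not the actual AutoML training loop):

```python
# Illustration of epoch-based early stopping with a patience counter.
def train_with_early_stopping(validation_metric_per_epoch, max_epochs=100, patience=20):
    """Stop when the primary metric hasn't improved for `patience` consecutive epochs."""
    best_metric = float("inf")        # lower is better, e.g. normalized RMSE
    epochs_without_improvement = 0
    for epoch in range(max_epochs):
        metric = validation_metric_per_epoch(epoch)   # computed on the validation data
        if metric < best_metric:
            best_metric = metric
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            break                     # early stop before reaching max_epochs
    return best_metric

# Example: a metric that stops improving after a few epochs triggers early stopping.
print(train_with_early_stopping(lambda e: max(1.0 - 0.1 * e, 0.5)))
```
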
@@ -19,7 +19,7 @@ show_latex: true

This article introduces concepts related to model inference and evaluation in forecasting tasks. For instructions and examples for training forecasting models in AutoML, see [Set up AutoML to train a time-series forecasting model with SDK and CLI](./how-to-auto-train-forecast.md).

-After you use AutoML to train and select a best model, the next step is to generate forecasts. Then, if possible, evaluate their accuracy on a test set held out from the training data. To see how to set up and run forecasting model evaluation in automated machine learning, see [Orchestrating training, inference, and evaluation](how-to-auto-train-forecast.md#orchestrating-training-inference-and-evaluation-with-components-and-pipelines).
+After you use AutoML to train and select a best model, the next step is to generate forecasts. Then, if possible, evaluate their accuracy on a test set held out from the training data. To see how to set up and run forecasting model evaluation in automated machine learning, see [Orchestrating training, inference, and evaluation](how-to-auto-train-forecast.md#orchestrate-training-inference-and-evaluation-with-components-and-pipelines).
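
For example, once you have forecasts and actuals for a held-out test set, accuracy metrics can be computed with standard tooling (the values below are illustrative):

```python
# Illustrative evaluation of forecasts against a held-out test set.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

actuals = np.array([112.0, 130.0, 126.0, 140.0])     # observed target in the test set
forecasts = np.array([108.0, 127.0, 131.0, 135.0])   # model predictions for the same dates

mae = mean_absolute_error(actuals, forecasts)
rmse = np.sqrt(mean_squared_error(actuals, forecasts))
# Normalized RMSE, as used in AutoML's primary metrics, divides by the target range.
nrmse = rmse / (actuals.max() - actuals.min())
print(mae, rmse, nrmse)
```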

## Inference scenarios

@@ -58,7 +58,7 @@ Suppose that after you train a model, you want to use it to make predictions fro

:::image type="content" source="media/concept-automl-forecasting-evaluation/forecasting-with-gap-diagram.png" alt-text="Diagram demonstrating a forecast with a gap between the training and inference periods.":::

-AutoML supports this inference scenario, but you need to provide the context data in the gap period, as shown in the diagram. The prediction data passed to the [inference component](how-to-auto-train-forecast.md#orchestrating-training-inference-and-evaluation-with-components-and-pipelines) needs values for features and observed target values in the gap and missing values or `NaN` values for the target in the inference period. The following table shows an example of this pattern:
+AutoML supports this inference scenario, but you need to provide the context data in the gap period, as shown in the diagram. The prediction data passed to the [inference component](how-to-auto-train-forecast.md#orchestrate-training-inference-and-evaluation-with-components-and-pipelines) needs values for features and observed target values in the gap and missing values or `NaN` values for the target in the inference period. The following table shows an example of this pattern:

:::image type="content" source="media/concept-automl-forecasting-evaluation/forecasting-with-gap-table.png" alt-text="Table showing an example of prediction data when there's a gap between the training and inference periods.":::
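
A minimal sketch of such a prediction frame (hypothetical columns, with a two-day gap) looks like this: feature values are supplied for every row, the target is observed in the gap rows, and the target is `NaN` for the dates to forecast.

```python
# Illustrative prediction data for forecasting with a gap (column names are hypothetical).
import numpy as np
import pandas as pd

prediction_data = pd.DataFrame({
    "date": pd.date_range("2024-06-01", periods=5, freq="D"),
    "store_id": ["S1"] * 5,
    "price": [2.5, 2.5, 2.7, 2.7, 2.7],              # feature values required for all rows
    # Observed target in the two-day gap, NaN for the dates to be forecast.
    "sales": [15.0, 17.0, np.nan, np.nan, np.nan],
})
print(prediction_data)
```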

@@ -86,7 +86,7 @@ The context advances along with the forecasting window. Actual values from the t

:::image type="content" source="media/concept-automl-forecasting-evaluation/rolling-evaluation-table.png" alt-text="Diagram shows example output table from a rolling forecast.":::

-With a table like this, you can visualize the forecasts versus the actuals and compute desired evaluation metrics. AutoML pipelines can generate rolling forecasts on a test set with an [inference component](how-to-auto-train-forecast.md#orchestrating-training-inference-and-evaluation-with-components-and-pipelines).
+With a table like this, you can visualize the forecasts versus the actuals and compute desired evaluation metrics. AutoML pipelines can generate rolling forecasts on a test set with an [inference component](how-to-auto-train-forecast.md#orchestrate-training-inference-and-evaluation-with-components-and-pipelines).

> [!NOTE]
> When the test period is the same length as the forecast horizon, a rolling forecast gives a single window of forecasts up to the horizon.
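
The following sketch shows the rolling pattern in plain Python (an illustration only; the AutoML inference component does this for you): the forecast origin advances by a chosen step, and test-set actuals behind the origin are added to the context.

```python
# Illustration of a rolling forecast over a test set (not AutoML's implementation).
def rolling_forecast(model, train_series, test_series, horizon, step=1):
    """Generate forecasts of length `horizon`, advancing the origin by `step` each time."""
    results = []                       # list of (origin, forecast) pairs
    history = list(train_series)
    for origin in range(0, len(test_series) - horizon + 1, step):
        context = history + list(test_series[:origin])   # actuals up to the forecast origin
        forecast = model.predict(context, horizon)        # assumed model interface
        results.append((origin, forecast))
    return results
```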