m3mentomor1/Automated-Model-Training_with_Azure-ML-Studio

Automated Model Training with Azure ML Studio

🧐 I. Overview

This is a no-code tutorial project that involves training a machine learning model using Automated ML/AutoML in Azure Machine Learning Studio.

For demonstration purposes, this tutorial will train a simple classification model to predict whether a client will subscribe to a fixed-term deposit with a financial institution or not.

📋 II. Prerequisites

  • Azure account & subscription.
  • Familiarity with basic Azure concepts such as resource groups, subscriptions, Azure Storage, Azure Compute, & some familiarity with Azure Machine Learning Studio is recommended.
  • Basic understanding of machine learning concepts such as supervised learning, classification, regression, & evaluation metrics.

🎓 III. Tutorial

Contents:

1. Create a workspace
2. Upload a dataset as a data asset
3. Create an automated machine learning job
4. Explore trained model/s
5. Deploy & test trained model/s
6. Clean up used resources

1. Create a workspace

Option 1: From the Azure Machine Learning Studio

  • Go to https://ml.azure.com.

  • In the left navigation pane, go to Workspaces, then click + New.

  • In the Name section, enter a name for your workspace. (Note: You can choose any name.)

  • In the Subscription section, select the subscription you want to use for the workspace. (Note: You can choose any subscription you have.)

  • In the Resource group section, select an existing resource group from your Azure account or create a new one by clicking Create new. This resource group will store the instance for your Azure ML Studio workspace.

  • In the Region section, choose the region where you want your workspace to be deployed. (Note: Select a region based on accessibility & availability. For this project, it will be deployed in "East US 2" due to its high availability.)

  • Click Create.

Option 2: From the Azure Portal

  • Go to https://portal.azure.com.

  • Under Azure services, click Create a resource.

  • Search for "Azure Machine Learning" & click on the Azure Machine Learning service, then click Create.

  • In the Subscription section, select the subscription you want to use for the workspace. (Note: You can choose any subscription you have.)

  • In the Resource group section, select an existing resource group from your Azure account or create a new one by clicking Create new. This resource group will store the instance for your Azure ML Studio workspace.

  • In the Name section, enter a name for your workspace. (Note: You can choose any name.)

  • In the Region section, choose the region where you want your workspace to be deployed. (Note: Select a region based on accessibility & availability. For this project, it will be deployed in "East US 2" due to its high availability.)

  • Click Review + create.

  • After passing the validation, click Create.

  • After successful deployment, go to https://ml.azure.com.

  • In the left navigation pane, go to Workspaces.

2. Upload a dataset as a data asset

  • Go to the workspace you created.

  • In the left navigation pane, go to Data.

  • In the Data page, click + Create.

  • In the Name section, enter a name for your data asset (Note: You can choose any name, but for this project, it will be named "bankmarketing".). Next, in the Type section, select the type of data stored in your dataset. (Note: For this project, the type of data we'll use is "Tabular".). After that, click Next.

  • Select the source from which your dataset will be imported by choosing a source for your data asset. (Note: For this project, we'll select "From local files".). After that, click Next.

  • In the Datastore type section, select the type of storage where your dataset will be stored. (Note: For this project, we'll use the default option, which is "Azure Blob Storage".)

  • Select a datastore from the list of existing datastores or create a new one by clicking Create new datastore. (Note: For this project, we'll use the default option, which is "workspaceblobstore".). After that, click Next.

  • Select "Upload files" from the Upload files or folder drop-down, then upload your dataset file. (Note: For this project, I recommend downloading & using this dataset, as this is what we'll use.) After the dataset has been uploaded, click Next.

  • In the Data preview section, verify that the data in the dataset is populated as follows.

  • Verify that the data is properly formatted. After you verify that the data is populated & properly formatted, click Next. (Note: For this project, keep the data format as it is: File format set to "Delimited", Delimiter set to "Comma", Column headers set to "All files have same headers", Encoding set to "UTF-8", Skip rows set to "None", & the Dataset contains multi-line data option left unticked.)

  • On Schema, ensure that the data Type of each column in the dataset is correct & modify the columns you want to include. After everything is verified, click Next.

  • On Review, ensure that all information matches what was previously configured for your data asset. Once everything is verified, click Create.
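Before uploading, it can help to confirm locally that the file really matches the format settings used above (comma-delimited, one header row). The following is a minimal sketch using only Python's standard library; the function name `check_delimited` and the sample columns are hypothetical and are not part of the tutorial's dataset.

```python
import csv
import io


def check_delimited(text, delimiter=","):
    """Return (header, row_count) if every data row has as many fields as the header."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader, None)
    if header is None:
        raise ValueError("file is empty: no header row found")
    row_count = 0
    for row_no, row in enumerate(reader, start=1):
        if len(row) != len(header):
            raise ValueError(
                f"data row {row_no}: expected {len(header)} fields, got {len(row)}"
            )
        row_count += 1
    return header, row_count


# Hypothetical three-column sample; the real dataset has its own columns.
sample = "age,job,y\n30,admin.,no\n45,technician,yes\n"
header, rows = check_delimited(sample)
print(header, rows)
```

For a real file, read it with `open(path, encoding="utf-8", newline="")` and pass the contents to the function; a `ValueError` points to a row that would not line up with the header in the Data preview.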

3. Create an automated machine learning job

  • In the left navigation pane, go to Automated ML, then click + New Automated ML job.

  • In the Job name section, enter a name for your training job. (Note: You can choose any name, but for this project, you can simply name it "Deposit-Subscription-Prediction".)

  • In the Experiment name section, select "Create new". (Note: If this is your first time creating an experiment/job or you don't have any existing experiments, this section might be grayed out & defaulted to "Create new". If this is the case, leave it as is.)

  • In the New experiment name section, enter a name for your experiment. (Note: You can choose any name, but for this project, you can name it "Binary-Classification" since the model we will train is a binary classification model.)

  • In the Description section, you can also put a description about your experiment. (Optional)

  • Click Next.

  • Select "Classification" from the Select task type drop-down. Then, in the Select data section, select the "bankmarketing" dataset we uploaded. After that, click Next.

  • In the Target column section, select "y (string)" as this column indicates whether the client subscribed to a term deposit or not, which corresponds to what we want our model to predict.

  • Select View additional configuration settings & ensure the following configurations to better control the training job: set Primary metric to "AUCWeighted", enable Explain best model & Use all supported models, ensure no models are checked in the Blocked models section, & leave the Positive class label section blank. After configuring these settings, click Save.

  • In the Limits dropdown, ensure the following configuration: set Max nodes to "6" & set Metric score threshold to "0.8" (equivalent to 80%), since the model we're training is only a baseline model.

  • In the Validation type dropdown, select "k-fold cross-validation", then enter "2" as your Number of cross validations. (Note: This section is optional & may be left as is, but for this project, a k-fold cross-validation will be implemented.)

  • Click Next.

  • In the Select compute type dropdown, select "Compute cluster".

  • In the Select Azure ML compute cluster section, either create a new compute cluster or select an existing one. If no compute cluster is available, you can create a new one by clicking + New.

  • If you decide to create one, select "East US 2" in the Location dropdown. (Note: Select a region based on accessibility & availability. For this project, it will be deployed in "East US 2" due to its high availability.)

  • In the Virtual machine tier section, select "Dedicated", then choose "CPU" as your Virtual machine type. (Note: You can opt for a "GPU" for faster training, but this may result in higher costs & quicker consumption of your Azure credits.)

  • In the Virtual machine size section, select "Select from all options", then find & choose "Standard_DS12_v2" from the options. (Note: You may choose other virtual machine sizes based on your requirements, but for this tutorial, we will use "Standard_DS12_v2" as it offers a balanced combination of CPU, memory, & storage for most workloads.)

  • Click Next.

  • In the Compute name section, enter a name for your compute & leave the other configurations as they are. (Note: You can choose any name, but for this project, you can simply name it "automl-compute".)

  • Click Create.

  • Once you have successfully created a new compute cluster, select the newly created cluster in the Select Azure ML compute cluster dropdown & click Next.

  • Click Submit training job.

  • To monitor the training progress, navigate to the Jobs tab under Assets in the left navigation pane, then click on "Deposit-Subscription-Prediction". You can check the training status under the Status section. If it says "Completed", your model/s have finished training. (Note: With our current training job setup & dataset, the training process could take anywhere from 15 minutes to 1 hour under typical conditions. However, these are general estimates, & the actual time may vary.)
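To give an idea of what the "k-fold cross-validation" setting chosen earlier means, the sketch below splits row indices into k folds so that each fold serves once as the validation set while the remaining rows are used for training. This is a generic illustration, not Azure's implementation: how Automated ML actually shuffles or stratifies the rows is not covered here, and the helper name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_rows, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and the number of rows")
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_rows))
        yield train, validation
        start += size


# With 2 folds, as in this tutorial, each half is used once for validation.
for train, validation in k_fold_indices(6, 2):
    print("train:", train, "validation:", validation)
```

With only 2 folds, every model is trained twice on half of the data, which keeps the job short at the cost of a noisier score estimate.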

4. Explore trained model/s

  • To explore the models you've trained, go to the Jobs tab in the Assets section of the left navigation pane, then select the "Deposit-Subscription-Prediction" job.

  • First, we can check the evaluation metrics of the model/s in the Models & Child Jobs tab. Then, click on the name of the model's algorithm, which is "MaxAbsScaler, LightGBM".

  • Under Model Summary > AUC Weighted, you can see the score that represents the model's overall performance in distinguishing between positive & negative classes. Since "AUC Weighted" is the primary metric used in this job, a higher AUC value indicates better model performance.

  • Aside from AUC Weighted, you can also view other metrics by clicking View all other metrics or by going to the Metrics tab. (Note: If you don't see any metric in the Metrics tab, just click 🗘 Refresh.)

  • In the Metrics tab, you can filter & view only the metrics you want by using the Select Metrics panel on the left side. (Note: Click the double-arrow button pointing to the right next to the "Select Metrics" text to access the filtering options.)

  • Back in the Model tab, you can also view the hyperparameters used by the model/s under Model Summary > AUC Weighted > View hyperparameters.

  • You can also view an explanation of the model/s & see which data features (raw or engineered) influenced a particular model's predictions in the Explanations (preview) tab.

  • Test predictions can also be performed in the Test results (preview) tab. In this tab, click Test model (preview) & configure the following settings: set Select compute type to "Compute cluster," set Select Azure ML compute cluster to "automl-compute," & choose the dataset you want to use under Select a dataset, then click Test. (Note: Ensure you have a test dataset available for making predictions. It is recommended to use a different dataset than the one used to train the model.)

  • You can monitor the progress of the testing in the table within the Test results (preview) tab. If the table is empty, simply click 🗘 Refresh to update the view. Once the testing is complete, you will see the testing job's AUC score result in the "AUC Weighted" column, & the status will indicate "Completed".
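To make the AUC score discussed above more concrete, here is a minimal sketch of the plain binary AUC: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. It does not reproduce the per-class weighting implied by the name "AUC Weighted", and the function name, labels & scores are made up for illustration.

```python
def binary_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = subscribed, 0 = did not subscribe.
labels = [1, 0, 1, 0]
scores = [0.9, 0.3, 0.4, 0.6]
print(binary_auc(labels, scores))  # 3 of 4 pairs are ranked correctly -> 0.75
```

A value of 0.5 corresponds to random ranking & 1.0 to a perfect ranking, which is why the 0.8 threshold set earlier is a reasonable bar for a baseline model.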

5. Deploy & test trained model/s

  • Using the Automated ML/AutoML interface, you can also deploy the models you trained by clicking ▷ Deploy. (Note: Ensure you are still in Assets > Jobs > "Deposit-Subscription-Prediction" > Models and Child Jobs > "MaxAbsScaler, LightGBM" to see the ▷ Deploy dropdown button.)

  • In the ▷ Deploy dropdown button, select "Real-time endpoint". (Note: For this tutorial, we will select the "Real-time endpoint" option to enable individual real-time predictions.)

  • If you don't have an existing endpoint, select the "New" option & leave the configured settings as they are, then click Deploy. If you have an existing endpoint, select the "Existing" option & choose the desired endpoint from the Endpoint name dropdown.

  • To check the endpoint's deployment status, navigate to Assets > Endpoints & click the name of the endpoint you just created. If it is successfully deployed, the Provisioning state under the Endpoint attributes section will indicate "Succeeded".

  • After the deployment of the endpoint, you can also check the model's deployment status by navigating to Assets > Endpoints & clicking the name of the endpoint you just created. If it is successfully deployed, the Provisioning state under the Deployment <deployment_name> section will indicate "Succeeded".

  • To test the deployed model & predict whether a client will subscribe to a fixed-term deposit with a financial institution or not, navigate to the Test tab & input data in JSON format in the Sample inference > Input editor. (Note: For your convenience, you can use this example input data. This also includes an explanation of each column to give you an idea on why each input data is used.)

  • To view the prediction results, scroll down to the bottom & see the results in the jsonOutput section.
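If you prefer to generate the JSON for the Input editor rather than typing it by hand, the sketch below turns a list of records into a JSON string. Everything specific in it is an assumption: the `input_data` wrapper with `columns`, `index` & `data` keys is only one plausible layout, and the column names & values are hypothetical. Use the example input data mentioned above as the source of truth for the real schema.

```python
import json


def build_payload(records):
    """Serialize a list of dict records into a JSON request body (assumed layout)."""
    if not records:
        raise ValueError("at least one record is required")
    columns = list(records[0].keys())
    # Keep every row in the same column order as the first record.
    data = [[record[column] for column in columns] for record in records]
    body = {
        "input_data": {  # assumed wrapper key; adjust to the endpoint's real schema
            "columns": columns,
            "index": list(range(len(records))),
            "data": data,
        }
    }
    return json.dumps(body, indent=2)


# Hypothetical client record; the real model expects the dataset's full column set.
print(build_payload([{"age": 30, "job": "admin.", "marital": "married"}]))
```

The printed string can be pasted into the Input editor once the keys & columns have been adjusted to match the deployed model.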

6. Clean up used resources

Option 1: Delete only the deployment instance & keep the resource group and workspace

  • Go to https://ml.azure.com.

  • In the left navigation pane, go to Assets > Endpoints, select the deployment instance you created for this tutorial, & then click Delete.

  • When asked to confirm, click Delete.

Option 2: Delete all resources used in this tutorial

  • Go to https://portal.azure.com.

  • Under Azure services, select Resource groups.

  • In Resource groups, click the resource group you created for this tutorial.

  • Select Delete resource group.

  • Enter the name of the resource group in the Enter resource group name to confirm deletion field & click Delete.

πŸ” IV. References/Source Materials
