Setup and Running the code
Rajdeep Biswas edited this page Jun 11, 2020
- Create a free Azure account (refer: Azure Account), or use an existing subscription.
- Create a storage account and a container. Refer: Create Blob Storage and Create Blob Container. Note: you need to change the Sink Blob Account Name and Sink Blob Container Name in the SparkR notebook [Step01a_Setup](https://github.com/microsoft/A-TALE-OF-THREE-CITIES/blob/master/dbc/Step01a_Setup.dbc) in Step 9.
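The sink account and container names you set in Step01a_Setup end up in a WASB path that Spark's Azure Blob connector reads from. A minimal sketch of that path shape (the account and container names below are placeholders, not the notebook's actual values):

```python
# Sketch of how a sink blob account and container map to a wasbs:// path,
# as read by Spark's Azure Blob (WASB) connector. The names passed in
# below are placeholders; substitute the account/container you created.
def wasbs_path(account_name: str, container_name: str, relative_path: str = "") -> str:
    """Build a wasbs:// URI for an Azure Blob Storage container."""
    base = f"wasbs://{container_name}@{account_name}.blob.core.windows.net"
    return f"{base}/{relative_path}" if relative_path else base

print(wasbs_path("mysinkaccount", "mysinkcontainer", "311/raw"))
# wasbs://mysinkcontainer@mysinkaccount.blob.core.windows.net/311/raw
```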
- Create a Shared Access Signature and copy the query string. Refer to the steps below. More information: Create SAS token.
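A copied SAS query string is a URL query string whose fields include at least a signed version (`sv`) and a signature (`sig`). A quick, hedged sanity check you can run on the string you copied (the token below is a dummy, not a real secret):

```python
# Sanity-check a copied SAS query string by parsing it into its fields.
# A well-formed token carries at least "sv" (signed version) and "sig"
# (signature). The token used here is a dummy placeholder.
from urllib.parse import parse_qs

def sas_fields(sas_token: str) -> dict:
    """Parse a SAS query string (with or without a leading '?') into a dict."""
    return {k: v[0] for k, v in parse_qs(sas_token.lstrip("?")).items()}

dummy = "?sv=2019-12-12&ss=b&srt=sco&sp=rl&se=2020-12-31T00:00:00Z&sig=FAKESIG"
fields = sas_fields(dummy)
assert "sv" in fields and "sig" in fields  # minimal shape of a valid token
```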
- From the Azure portal, create a key vault, then create a secret containing the SAS token retrieved in the previous step. Refer: Create Azure KeyVault.
- Create an Azure Databricks workspace and a Spark cluster.
Refer: Create Azure Databricks workspace and cluster
- Create an Azure Key Vault-backed secret scope (note that you need Contributor access on the Key Vault instance).
Refer: Azure Key Vault backed secret scope
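Inside a notebook, a secret in a Key Vault-backed scope is read through `dbutils.secrets.get`. A hedged sketch (the scope and secret names are placeholders, and the environment-variable fallback is an assumption added so the function also runs outside Databricks):

```python
# Read the SAS token from the Key Vault-backed secret scope.
# "my-scope" and "my-sas-secret" are placeholder names; use the scope
# and secret you actually created. Outside Databricks, where dbutils is
# undefined, fall back to a local environment variable.
import os

def get_sas_token(scope: str = "my-scope", key: str = "my-sas-secret") -> str:
    try:
        return dbutils.secrets.get(scope=scope, key=key)  # Databricks only
    except NameError:
        # Not running in Databricks; read the token from the environment.
        return os.environ.get("SAS_TOKEN", "")
```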
- Load the requisite libraries on the Azure Databricks Spark cluster.
Refer: Install Libraries
The list of libraries is shown in the image below:
- Import the dbc archive using the link https://github.com/microsoft/A-TALE-OF-THREE-CITIES/blob/master/dbc/all_dbc_archive/311_Analytics_OpenSource.dbc
Refer: Import notebook
- Update and validate the Sink configuration section (lines 8 to 12 in the Cmd 3 section), and copy-paste the value of the source SAS token from line 6 of Step01a_Setup in your Azure Databricks workspace.
- Start running the sample from Step02a_Data_Wrangling in your Azure Databricks workspace.