
# Setup and Running the code

  1. Create a free Azure account (refer: Azure Account) or use an existing subscription.
  2. Create a storage account and a container. Refer: Create Blob Storage and Create Blob Container. Note: you need to change the Sink Blob Account Name and Sink Blob Container Name in the SparkR notebook [Step01a_Setup](https://github.com/microsoft/A-TALE-OF-THREE-CITIES/blob/master/dbc/Step01a_Setup.dbc), as described in step 9. (See the container-creation sketch after this list.)
  3. Create a Shared Access Signature and copy the query string. More information: Create SAS token. (Screenshot: sas_setup.) See the SAS-generation sketch after this list.
  4. From the Azure portal, create a key vault, then create a secret holding the SAS token from the previous step. Refer: Create Azure Key Vault. (See the Key Vault sketch after this list.)
  5. Create an Azure Databricks workspace and a Spark cluster. Refer: Create Azure Databricks workspace and cluster. (Screenshot: Cluster_configuration.)
  6. Create an Azure Key Vault backed secret scope (note that you need contributor access on the Key Vault instance). Refer: Azure Key Vault backed secret scope. (Screenshot: secret_scope.) See the secret-lookup sketch after this list.
  7. Install the requisite libraries on the Azure Databricks Spark cluster. Refer: Install Libraries. The list of libraries is shown in the screenshot: Libraries_List.
  8. Import the DBC archive from https://github.com/microsoft/A-TALE-OF-THREE-CITIES/blob/master/dbc/all_dbc_archive/311_Analytics_OpenSource.dbc. Refer: Import notebook. (Screenshots: all_dbc_import, bulk_dbc.) See the import sketch after this list.
  9. Update and validate the Sink configuration section (lines 8 to 12 in the Cmd 3 section) of Step01a_Setup in your Azure Databricks workspace, and paste the value of the source SAS token into line 6. (See the Spark configuration sketch after this list.)
  10. Start running the sample from Step02a_Data_Wrangling in your Azure Databricks workspace.
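
The sketches below illustrate a few of the steps above programmatically. They are minimal Python examples, not the project's own tooling, and every account, container, scope, and secret name in them is an assumed placeholder. For step 2, the sink container can also be created with the azure-storage-blob (v12) SDK instead of the portal:

```python
# Minimal sketch: create the sink container with the azure-storage-blob v12 SDK.
# The connection string and names are placeholders, not values from the repo.
from azure.storage.blob import BlobServiceClient

conn_str = (
    "DefaultEndpointsProtocol=https;AccountName=<sink-account>;"
    "AccountKey=<account-key>;EndpointSuffix=core.windows.net"
)

service = BlobServiceClient.from_connection_string(conn_str)
service.create_container("<sink-container>")  # must match the name used in Step01a_Setup
```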
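For step 3, a SAS query string like the one the portal produces can be generated with `generate_account_sas`; the permissions and expiry below are illustrative assumptions:

```python
# Minimal sketch: generate an account-level SAS token (placeholder names/permissions).
from datetime import datetime, timedelta
from azure.storage.blob import AccountSasPermissions, ResourceTypes, generate_account_sas

sas_token = generate_account_sas(
    account_name="<sink-account>",
    account_key="<account-key>",
    resource_types=ResourceTypes(container=True, object=True),
    permission=AccountSasPermissions(read=True, write=True, list=True, create=True),
    expiry=datetime.utcnow() + timedelta(days=30),  # choose an expiry that suits you
)
print(sas_token)  # this query string is what goes into Key Vault in step 4
```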
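For step 4, the secret can also be written with the azure-keyvault-secrets SDK; the vault URL and secret name are placeholders:

```python
# Minimal sketch: store the SAS token as a Key Vault secret.
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

client = SecretClient(
    vault_url="https://<your-vault>.vault.azure.net",
    credential=DefaultAzureCredential(),  # picks up your Azure CLI / environment login
)
client.set_secret("<sas-secret-name>", sas_token)  # sas_token from the previous sketch
```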
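Once the Key Vault backed secret scope from step 6 exists, a notebook can read the SAS token without hard-coding it; the scope and key names are assumptions:

```python
# Minimal sketch (run inside a Databricks notebook): read the SAS token
# through the Key Vault backed secret scope created in step 6.
sas_token = dbutils.secrets.get(scope="<your-scope>", key="<sas-secret-name>")
```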
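For step 8, the archive can alternatively be imported with the Databricks Workspace Import REST API instead of the UI; the instance URL, token, and target path are placeholders:

```python
# Minimal sketch: import the downloaded DBC archive via the Workspace API 2.0.
import base64

import requests

host = "https://<databricks-instance>.azuredatabricks.net"
token = "<personal-access-token>"

with open("311_Analytics_OpenSource.dbc", "rb") as f:
    content = base64.b64encode(f.read()).decode()

resp = requests.post(
    f"{host}/api/2.0/workspace/import",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "path": "/Users/<you>/311_Analytics_OpenSource",
        "format": "DBC",
        "content": content,
    },
)
resp.raise_for_status()
```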
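The Sink configuration in step 9 boils down to handing Spark the SAS token for the sink account. The repo's notebooks are SparkR; the equivalent in Python (PySpark), with placeholder names, looks like this:

```python
# Minimal sketch: let Spark access the sink blob container via the SAS token.
# The fs.azure.sas.<container>.<account>... key is the standard WASB SAS setting.
spark.conf.set(
    "fs.azure.sas.<sink-container>.<sink-account>.blob.core.windows.net",
    sas_token,
)
```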