Note: Azure ML has an updated method of consuming Delta files: https://learn.microsoft.com/en-us/azure/databricks/mlflow/tracking-ex-delta
With the recent announcement from Databricks releasing Delta Lake for standalone compute, we can now easily integrate an AML compute instance/cluster with the Delta file format generated and saved from Spark:
https://delta.io/news/delta-lake-1-0-0-released/
1) Use Databricks to import the notebook: Databricks_Delta_Load.ipynb
The Databricks notebook demonstrates how to load the sample safe_driver data and save it as a Spark DataFrame in the Delta file format. It then registers/creates a datastore, uploads the Delta files, and creates a file dataset referencing that datastore (see the sketch below).
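A minimal sketch of these steps, assuming the classic azureml-core (SDK v1) Datastore/Dataset APIs; the source file, DBFS paths, and datastore/dataset names are illustrative placeholders, not the exact values used in Databricks_Delta_Load.ipynb:

# Runs inside Databricks; `spark` is the session provided by the cluster.
from azureml.core import Workspace, Dataset
from azureml.core.datastore import Datastore

# Load the sample safe_driver data and write it out in Delta format
df = spark.read.csv("dbfs:/FileStore/safe_driver.csv", header=True, inferSchema=True)
df.write.format("delta").mode("overwrite").save("dbfs:/tmp/delta_driver")

# Connect to the AML workspace (subscription/resource group values are placeholders)
ws = Workspace.get(name="<workspace>", subscription_id="<subscription-id>", resource_group="<resource-group>")

# Register a blob datastore and upload the Delta files (parquet data files + _delta_log)
datastore = Datastore.register_azure_blob_container(
    workspace=ws,
    datastore_name="delta_datastore",
    container_name="<container>",
    account_name="<storage-account>",
    account_key="<account-key>",
)
datastore.upload(src_dir="/dbfs/tmp/delta_driver", target_path="delta_driver", overwrite=True)

# Create and register a FileDataset referencing the uploaded Delta folder
dataset = Dataset.File.from_files(path=(datastore, "delta_driver/**"))
dataset.register(workspace=ws, name="delta_driver_dataset", create_new_version=True)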
2) Verify that the Datastore and Dataset have been registered in AML; you should see a list of parquet files plus the JSON transaction log (_delta_log) for the Delta table:
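One way to check this from code rather than the portal, assuming the dataset name from the sketch above:

from azureml.core import Workspace, Dataset

ws = Workspace.from_config()
dataset = Dataset.get_by_name(ws, name="delta_driver_dataset")

# Should list the .parquet data files and the _delta_log/*.json transaction log
for path in dataset.to_path():
    print(path)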
3) Use an AML Notebook to import the notebook Delta_AML_Read_Demo.ipynb and run it on an AML compute instance
This notebook shows how to install the deltalake package and then use the AML Datastore and Dataset to download the Delta table and convert it to a pandas DataFrame:
from deltalake import DeltaTable

# Open the Delta table that was downloaded to the compute instance
dt = DeltaTable("/mnt/batch/tasks/shared/LS_root/mounts/clusters/deltademocpu/code/delta_driver/")
# Read the Delta table into a PyArrow Table
table = dt.to_pyarrow_table()
# Convert back to pandas
df_pandas = table.to_pandas()
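For completeness, a sketch of the download step that precedes the read above, assuming the dataset name used earlier (install the package first with: pip install deltalake):

from azureml.core import Workspace, Dataset
from deltalake import DeltaTable

ws = Workspace.from_config()
dataset = Dataset.get_by_name(ws, name="delta_driver_dataset")

# Download the Delta files (parquet + _delta_log) to a local folder on the compute instance
local_path = "./delta_driver"
dataset.download(target_path=local_path, overwrite=True)

# DeltaTable must point at the folder that contains _delta_log; adjust the path if the
# download preserves extra folder levels from the datastore
dt = DeltaTable(local_path)
df_pandas = dt.to_pyarrow_table().to_pandas()
print(df_pandas.head())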