Commit

Update databricks.md
sonalgoyal authored Sep 19, 2023
1 parent 288b435 commit c6e5405
Showing 1 changed file with 4 additions and 5 deletions.
9 changes: 4 additions & 5 deletions docs/running/databricks.md
@@ -3,12 +3,12 @@ title: Running on Databricks
parent: Running Zingg on Cloud
nav_order: 6
---
-There are several ways to run Zingg on Databricks
+There are several ways to run Zingg on Databricks. All [file formats and data sources and sinks](../dataSourcesAndSinks) are supported within Databricks.

-# Running directly within Databricks using the notebook interface
+# Running directly within Databricks using the Databricks notebook interface
This uses the Zingg Python API and an [example notebook is available here](https://github.com/zinggAI/zingg/blob/main/examples/databricks/FebrlExample.ipynb)

-# Running using Databricks Connect
+# Running using Databricks Connect from your local machine
1. Configure Databricks Connect 11.3 and create the corresponding workspace/cluster as per the [Databricks docs](https://docs.databricks.com/dev-tools/databricks-connect-legacy.html). Please make sure that you run `databricks-connect configure`.

2. Set the environment variable `ZINGG_HOME` to the directory that contains the latest Zingg release jar, e.g. the location of zingg-0.4.0-SNAPSHOT.jar
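
A quick pre-flight check for step 2 can catch a missing or mistyped `ZINGG_HOME` before any job is launched. The snippet below is only a sketch and is not part of Zingg: the helper names `find_zingg_jars` and `check_zingg_home` are hypothetical, and the `zingg-*.jar` file pattern is an assumption based on the example jar name above.

```python
import glob
import os


def find_zingg_jars(zingg_home):
    """Return sorted paths of files matching zingg-*.jar directly under zingg_home.

    The zingg-*.jar pattern is an assumption inferred from the example
    name zingg-0.4.0-SNAPSHOT.jar; adjust it if your release is named differently.
    """
    return sorted(glob.glob(os.path.join(zingg_home, "zingg-*.jar")))


def check_zingg_home(env=None):
    """Validate that ZINGG_HOME is set and contains at least one matching jar.

    env defaults to os.environ; passing a plain dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    zingg_home = env.get("ZINGG_HOME")
    if not zingg_home:
        raise RuntimeError("ZINGG_HOME is not set")
    jars = find_zingg_jars(zingg_home)
    if not jars:
        raise RuntimeError("No zingg-*.jar found in " + zingg_home)
    return jars
```

Calling `check_zingg_home()` with no arguments reads the real environment and returns the list of matching jar paths, or raises `RuntimeError` describing what is missing.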
@@ -24,8 +24,7 @@ Please refer to the [different options](https://docs.zingg.ai/zingg/stepbystep/z


# Running on Databricks using Spark Submit Jobs
-
-The cloud environment does not have the system console for the labeler to work. Zingg is run as a Spark Submit Job along with a python notebook-based labeler specially created to run within the Databricks cloud.
+Zingg is run as a Spark Submit Job along with a python notebook-based labeler specially created to run within the Databricks cloud since the cloud environment does not have the system console for the labeler to work.

Please refer to the [Databricks Zingg tutorial](https://medium.com/@sonalgoyal/identity-resolution-on-databricks-for-customer-360-591661bcafce) for a detailed tutorial.
