Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Databricks E2E] Changes to zip REST API and Integrate with ADF. #1041

Merged
merged 42 commits into from
Feb 18, 2025

Conversation

bsherwin
Copy link
Contributor

Type of PR

  • Documentation changes
  • Code changes
  • Test changes

Purpose

This PR is to integrate the REST API into the Azure Data Factory process. We removed the binary ZIP package for deployment and build that as part of the deployment process.

In addition to modifying the Linked Services to use the new REST endpoints, we also modified the Databricks activity to use the ddo_cluster that was set up during the deployment. This allows the cluster to be shared rather than spinning up a cluster for each databricks task to "standardize" and "transform" the data.

Author pre-publish checklist

  • No PII in logs
  • Made corresponding changes to the documentation

Validation steps

  • Run a deployment from the devcontainer
  • In ADF, navigate to the Linked Services and validate that the Databricks linked service correctly references the ddo_cluster
  • Validate that the ADF pipeline "Copy Data" activities correctly point to the correct REST Endpoints
  • Validate that the ADF pipeline "Databricks" activities correctly point to the correct binaries and are correctly set to a Wheel file.

Issues Closed or Referenced

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
@bsherwin
Copy link
Contributor Author

@ydaponte pointed out that since we are using the existing cluster, we don't need the wheel as an appended library in the ADF Pipeline. I removed it as unnecessary.

Copy link
Collaborator

@ydaponte ydaponte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggesting some changes - there are 2 things that won't work. The shebang is missing on your app service bash script and the CI pipeline deploy-adf-job is going to fail because is referring to an old field of the databricks linked service definition. Thanks for the great work so far!

@ydaponte ydaponte removed the request for review from thesqlpro January 23, 2025 13:28
Copy link
Contributor

@elenaterenzi elenaterenzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in order to approve I need to finish testing, but I have some comments and figured it was worth sharing now in case you have time.
Some are not that important or are just questions, others would be changes that I would request you implement in order for me to approve.

Copy link
Contributor

@elenaterenzi elenaterenzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

found issues with idempotency

Copy link
Collaborator

@ydaponte ydaponte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting a couple of changes

Copy link
Collaborator

@ydaponte ydaponte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor changes.

bsherwin and others added 7 commits February 14, 2025 13:03
Co-authored-by: Yennifer Santos <[email protected]>
Co-authored-by: Yennifer Santos <[email protected]>
Co-authored-by: Yennifer Santos <[email protected]>
Co-authored-by: Yennifer Santos <[email protected]>
Copy link
Collaborator

@ydaponte ydaponte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great Job! Thanks for the patience and effort!

@bsherwin bsherwin merged commit d73017b into e2e/databricks/parking-sensors-V1 Feb 18, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants