Python models Dataproc Serverless setup with packages #5920
Add a description of how to set up Dataproc Serverless with a custom image in order to use third-party packages.
Hello!👋 Thanks for contributing to the dbt product documentation and opening this pull request! ✨
@matthewshaver is attempting to deploy a commit to the dbt-labs Team on Vercel. A member of the Team first needs to authorize it.
Thank you @LouisAuneau! Just waiting for an SME on our side to review. Hope to have that shortly.
Add a description of how to set up Python models with Dataproc Serverless using a custom image in order to use third-party packages.
What are you changing in this pull request and why?
In the context of running Python models in Spark using Dataproc, the documentation (python-models.md) says:
After some digging, I found that Python models can use third-party packages on Dataproc Serverless. This requires a custom Docker image, which is well documented on GCP's side. We currently run this setup in production without any issues. I added this to the documentation. Let me know if you need more details on how to set this up.
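For reviewers who want a concrete picture, here is a rough sketch of the profile fragment this setup relies on, written as a small Python helper. It assumes that dbt-bigquery forwards the `dataproc_batch` mapping to the Dataproc Batch resource, whose runtime configuration has a `container_image` field. The helper name `dataproc_batch_config` and the image URI are hypothetical placeholders, not part of dbt or of this pull request.

```python
import json
from typing import Any


def dataproc_batch_config(container_image: str) -> dict[str, Any]:
    """Build a profile fragment that points Dataproc Serverless at a custom image.

    Hypothetical helper: the key names mirror what is assumed to be the
    `dataproc_batch` profile setting, not a documented dbt API.
    """
    if not container_image.strip():
        raise ValueError("container_image must be a non-empty image URI")
    return {
        # Assumed profile key selecting serverless submission
        "submission_method": "serverless",
        "dataproc_batch": {
            # Assumed to map onto the Batch resource's runtime configuration
            "runtime_config": {"container_image": container_image.strip()},
        },
    }


if __name__ == "__main__":
    # Placeholder image URI; substitute the image you built with your packages.
    fragment = dataproc_batch_config("HOSTNAME/PROJECT_ID/IMAGE:TAG")
    print(json.dumps(fragment, indent=2))
```

The resulting mapping would be merged into the BigQuery target in `profiles.yml`; the third-party packages themselves are installed in the custom image rather than declared in the model.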
Checklist
Adding or removing pages (delete if not applicable):
N/A