Add retry factory to consolidate retry strategies across dbt-bigquery #1395
Merged
43 commits
2b01804 fix imports (mikealfare)
8b45594 create a retry factory and move relevant objects from connections (mikealfare)
391099d add on_error method for deadline retries (mikealfare)
7872a58 remove dependency on retry_and_handle from cancel_open (mikealfare)
42a8869 remove dependencies on retry_and_handle (mikealfare)
900dcac remove timeout methods from connection manager (mikealfare)
81bfa0c add retry to get_bq_table (mikealfare)
3e32872 fix mocks in unit tests (mikealfare)
89e2a50 rebase on main (mikealfare)
3f79642 reorder this tuple to make the pr review easier to understand (mikealfare)
f300080 move client factory to credentials module so that on_error can be mov… (mikealfare)
c3065e5 move on_error factory to retry module (mikealfare)
ad74114 move client factories from python_submissions module to credentials m… (mikealfare)
9029c49 create a clients module (mikealfare)
bc0fbea retry all client factories by default (mikealfare)
9a9f87e move polling from manual check in python_submissions module into retr… (mikealfare)
136ea77 move load_dataframe logic from adapter to connection manager, use the… (mikealfare)
90d5308 move upload_file logic from adapter to connection manager, use the bu… (mikealfare)
9211e1c move the retry to polling for done instead of create (mikealfare)
e90c24d fix broken import in tests from code migration (mikealfare)
a2db35b align new retries with original methods, simplify retry factory (mikealfare)
b8408c2 fix seed load result (mikealfare)
5b896ee create a method for the dataproc endpoint (mikealfare)
43c10f1 add some readability updates (mikealfare)
4256682 add some readability updates (mikealfare)
5644509 add some readability updates, simplify submit methods (mikealfare)
df2971b make imports explicit, remove unused constant (mikealfare)
0beaac6 changelog (mikealfare)
6e2f4b4 add community member who contributed a solution and research to the c… (mikealfare)
b560554 Merge branch 'main' into add-retry-factory (mikealfare)
6354483 Merge branch 'main' into add-retry-factory (colin-rogers-dbt)
f72da43 update names in clients.py to follow the naming convention (mikealfare)
9fb25bc update names in connections.py to follow the naming convention (mikealfare)
e99d857 update names in credentials.py to follow the naming convention (mikealfare)
f8ad953 update names in python_submissions.py to follow the naming convention (mikealfare)
5f3a456 update names in retry.py to follow the naming convention (mikealfare)
7c4388f run linter and update unit test mocks (mikealfare)
5928098 update types on retry factory (mikealfare)
02385bb update inputs on retry factory (mikealfare)
51cc87f update predicate class name (mikealfare)
eaab976 add retry strategy back to copy table (mikealfare)
a81289f linting and fix unit test for new argument (mikealfare)
76d6979 fix whitespace (mikealfare)
@@ -0,0 +1,6 @@
kind: Under the Hood
body: Create a retry factory to simplify retry strategies across dbt-bigquery
time: 2024-11-07T14:38:56.210445-05:00
custom:
  Author: mikealfare osalama
  Issue: "1395"
@@ -0,0 +1,69 @@
from google.api_core.client_info import ClientInfo
from google.api_core.client_options import ClientOptions
from google.api_core.retry import Retry
from google.auth.exceptions import DefaultCredentialsError
from google.cloud.bigquery import Client as BigQueryClient
from google.cloud.dataproc_v1 import BatchControllerClient, JobControllerClient
from google.cloud.storage import Client as StorageClient

from dbt.adapters.events.logging import AdapterLogger

import dbt.adapters.bigquery.__version__ as dbt_version
from dbt.adapters.bigquery.credentials import (
    BigQueryCredentials,
    create_google_credentials,
    set_default_credentials,
)


_logger = AdapterLogger("BigQuery")


def create_bigquery_client(credentials: BigQueryCredentials) -> BigQueryClient:
    try:
        return _create_bigquery_client(credentials)
    except DefaultCredentialsError:
        _logger.info("Please log into GCP to continue")
        set_default_credentials()
        return _create_bigquery_client(credentials)


@Retry()  # google decorator. retries on transient errors with exponential backoff
def create_gcs_client(credentials: BigQueryCredentials) -> StorageClient:
    return StorageClient(
        project=credentials.execution_project,
        credentials=create_google_credentials(credentials),
    )


@Retry()  # google decorator. retries on transient errors with exponential backoff
def create_dataproc_job_controller_client(credentials: BigQueryCredentials) -> JobControllerClient:
    return JobControllerClient(
        credentials=create_google_credentials(credentials),
        client_options=ClientOptions(api_endpoint=_dataproc_endpoint(credentials)),
    )


@Retry()  # google decorator. retries on transient errors with exponential backoff
def create_dataproc_batch_controller_client(
    credentials: BigQueryCredentials,
) -> BatchControllerClient:
    return BatchControllerClient(
        credentials=create_google_credentials(credentials),
        client_options=ClientOptions(api_endpoint=_dataproc_endpoint(credentials)),
    )


@Retry()  # google decorator. retries on transient errors with exponential backoff
def _create_bigquery_client(credentials: BigQueryCredentials) -> BigQueryClient:
    return BigQueryClient(
        credentials.execution_project,
        create_google_credentials(credentials),
        location=getattr(credentials, "location", None),
        client_info=ClientInfo(user_agent=f"dbt-bigquery-{dbt_version.version}"),
        client_options=ClientOptions(quota_project_id=credentials.quota_project),
    )


def _dataproc_endpoint(credentials: BigQueryCredentials) -> str:
    return f"{credentials.dataproc_region}-dataproc.googleapis.com:443"
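For readers unfamiliar with the pattern behind the `@Retry()` decorator used on the factories above, here is a minimal stdlib-only sketch of retrying a callable with exponential backoff. It is not google.api_core's implementation: `retry_with_backoff`, `TransientError`, and `create_client` are hypothetical names chosen for illustration, and this sketch bounds retries by an attempt count, whereas the real `Retry` class, as far as I understand it, also adds jitter and bounds retries by an overall timeout.

```python
import time
from functools import wraps
from typing import Callable, Tuple, Type


def retry_with_backoff(
    exceptions: Tuple[Type[BaseException], ...],
    initial: float = 1.0,
    multiplier: float = 2.0,
    maximum: float = 60.0,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
):
    """Hypothetical stand-in for a retry decorator; not google.api_core's code."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            delay = initial
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    sleep(delay)
                    # grow the wait geometrically, capped at `maximum`
                    delay = min(delay * multiplier, maximum)

        return wrapper

    return decorator


class TransientError(Exception):
    """Hypothetical error type standing in for a transient API failure."""


delays = []  # records the waits instead of actually sleeping
calls = {"count": 0}


@retry_with_backoff((TransientError,), initial=1.0, sleep=delays.append)
def create_client() -> str:
    # Fails twice, then succeeds, to exercise the retry loop.
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("temporarily unavailable")
    return "client"


result = create_client()
```

Errors that are not listed in `exceptions` pass straight through the decorator, which is the behavior the outer `create_bigquery_client` wrapper relies on when it catches `DefaultCredentialsError` itself.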
These client methods used to live in BigQueryConnectionManager and the python_submissions module. Centralizing them here reduced the interface for credentials and removed noise from those other classes, making troubleshooting easier.
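One effect of centralizing the factories is that the "log in and try once more" fallback in `create_bigquery_client` lives in a single place. The sketch below isolates that pattern with stdlib code only. All names here (`MissingCredentialsError`, `create_with_login_fallback`, `fake_create`, `fake_login`) are hypothetical stand-ins, not part of dbt-bigquery or the Google libraries.

```python
from typing import Callable, List


class MissingCredentialsError(Exception):
    """Hypothetical stand-in for google.auth.exceptions.DefaultCredentialsError."""


def create_with_login_fallback(
    create: Callable[[], str],
    login: Callable[[], None],
) -> str:
    """Try to build a client; on missing credentials, log in and try once more."""
    try:
        return create()
    except MissingCredentialsError:
        login()  # e.g. prompt the user to authenticate
        return create()  # a second failure propagates to the caller


events: List[str] = []  # records the order of calls for inspection
state = {"logged_in": False}


def fake_create() -> str:
    events.append("create")
    if not state["logged_in"]:
        raise MissingCredentialsError("no default credentials found")
    return "client"


def fake_login() -> None:
    events.append("login")
    state["logged_in"] = True


client = create_with_login_fallback(fake_create, fake_login)
```

The fallback runs at most once, so a persistent credentials problem still surfaces as an exception instead of looping.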