Behavior: Get column info from information_schema Part I #808
@@ -116,26 +119,6 @@ class DatabricksConfig(AdapterConfig):
    merge_with_schema_evolution: Optional[bool] = None


def check_not_found_error(errmsg: str) -> bool:
Moved to prevent circular dependency.
@@ -175,6 +158,19 @@ class DatabricksAdapter(SparkAdapter):
        }
    )

    get_column_behavior: GetColumnsBehavior

    def __init__(self, config: Any, mp_context: SpawnContext) -> None:
I don't control how __init__ is invoked, so I can't pass the behaviors in, but resolving the behavior during __init__ ensures we check the behavior flag only once. I tried to make this a little more functional but couldn't figure out how to override a function definition inherited from a parent class, hence the goofy class-based strategy.
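For readers unfamiliar with the pattern, here is a minimal sketch of a class-based strategy chosen once at construction time. Only the names GetColumnsBehavior and get_column_behavior come from the diff; the two subclasses, the SketchAdapter class, the boolean flag argument, and the method names are hypothetical stand-ins, not the adapter's actual code.

```python
from abc import ABC, abstractmethod


class GetColumnsBehavior(ABC):
    """Strategy interface: one way of looking up column metadata."""

    @abstractmethod
    def metadata_source(self) -> str:
        """Name the metadata source this strategy reads from."""


class DescribeBehavior(GetColumnsBehavior):  # hypothetical name
    def metadata_source(self) -> str:
        return "describe extended"


class InformationSchemaBehavior(GetColumnsBehavior):  # hypothetical name
    def metadata_source(self) -> str:
        return "information_schema"


class SketchAdapter:  # hypothetical stand-in for the real adapter
    get_column_behavior: GetColumnsBehavior

    def __init__(self, use_information_schema: bool) -> None:
        # The flag is inspected exactly once, here. Every later call
        # delegates to whichever strategy object was selected.
        if use_information_schema:
            self.get_column_behavior = InformationSchemaBehavior()
        else:
            self.get_column_behavior = DescribeBehavior()

    def column_metadata_source(self) -> str:
        # No flag check on the hot path, only delegation.
        return self.get_column_behavior.metadata_source()
```

Because the choice is stored on the instance, a subclass or a test could swap in a different GetColumnsBehavior without touching the inherited method that delegates to it.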
@@ -28,6 +28,25 @@
  {% do return(load_result('get_columns_comments').table) %}
{% endmacro %}

{% macro get_columns_comments_via_information_schema(relation) -%}
  {% call statement('repair_table', fetch_result=False) -%}
    REPAIR TABLE {{ relation|lower }} SYNC METADATA
Ensure information_schema is up to date prior to using this method.
Partial fix for #779
Description
This is my second attempt at addressing the issue that describe extended can truncate complex types. With the release of dbt-core 1.8.7, we can now process behavior flags; this PR introduces the option of using information_schema to fetch column information in get_columns_for_relation. I'm putting this behind a behavior flag because, given the current state of UC information_schema, we have to run a repair before we can trust that the columns of a recently created or altered table are present in information_schema, and that adds overhead. Furthermore, this trick only works for Delta tables at this time. Hopefully the sync issue with information_schema will be solved in time; in the meantime, users can enable this flag when they have complex types that describe extended truncates.
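The repair-then-query sequence described above could look roughly like the sketch below. The REPAIR TABLE ... SYNC METADATA statement matches the macro in the diff; the SELECT is an assumption about how the lookup might be written (the column names column_name, full_data_type, comment, and ordinal_position are assumed from the Unity Catalog information_schema.columns view), and the function name and identifier check are hypothetical.

```python
import re
from typing import List

# Accept only simple identifiers so nothing needs escaping in this sketch.
_SIMPLE_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def information_schema_column_statements(
    catalog: str, schema: str, table: str
) -> List[str]:
    """Return the SQL to run, in order: sync metadata, then read columns."""
    for part in (catalog, schema, table):
        if not _SIMPLE_IDENTIFIER.fullmatch(part):
            raise ValueError(f"unsupported identifier: {part!r}")

    # Lowercased to mirror `relation|lower` in the macro.
    catalog, schema, table = catalog.lower(), schema.lower(), table.lower()

    # Step 1: make sure information_schema reflects recent DDL.
    repair = f"REPAIR TABLE `{catalog}`.`{schema}`.`{table}` SYNC METADATA"

    # Step 2 (assumed query shape): read the untruncated type of each column.
    select = (
        "SELECT column_name, full_data_type, comment "
        f"FROM `{catalog}`.information_schema.columns "
        f"WHERE table_schema = '{schema}' AND table_name = '{table}' "
        "ORDER BY ordinal_position"
    )
    return [repair, select]
```

The extra REPAIR TABLE round trip on every lookup is the overhead the description refers to, which is why the path stays opt-in behind the flag.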
Checklist
I have updated CHANGELOG.md and added information about my change to the "dbt-databricks next" section.