Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add column comments for DLT tables using dbt schema #794

Closed

Conversation

Amin-Siddique
Copy link

@Amin-Siddique Amin-Siddique commented Sep 17, 2024

Resolves #793

Description

Add support for column comments in Delta Live Tables (ST/MV) from dbt model schema descriptions

Checklist

  • [] I have run this code in development and it appears to resolve the stated issue
  • [] This PR includes tests, or tests are not required/relevant for this PR
  • [] I have updated the CHANGELOG.md and added information about my change to the "dbt-databricks next" section.

@Amin-Siddique Amin-Siddique marked this pull request as ready for review September 17, 2024 16:04
{%- if model.columns -%}
{%- set column_definitions = [] -%}
{%- for column_name, column in model.columns.items() -%}
{%- set data_type = column.data_type or 'STRING' -%}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this actually work? In general we don't know the column data type, and I'm concerned that specifying as string will break when the columns are string type.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you're right, it breaks! Unfortunately, for this to work it we need to explicitly define data_types in the schema file of the respective model.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benc-db I have added a workaround where we initially create a temp view to infer the schema and utilize that to get the data type if undefined. Please let me know your thoughts! 😄 I have tested it in my local setup (1. with schema file and defined data_types 2. with schema file without defining any data types 3. without schema file) and works for both ST/MV

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting approach. I want to talk to dbt Labs about this to get their perspective, and I definitely will want to put behind a behavior flag (new tech recently introduced into dbt) because this will add some latency, but might be worth the trade off.

@benc-db
Copy link
Collaborator

benc-db commented Sep 17, 2024

Please add or edit the functional tests to demonstrate that this functionality works. I'm fairly suspicious of setting STRING as type when the column type is unknown.

@Amin-Siddique Amin-Siddique deleted the add_column_properties branch October 17, 2024 13:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add persist_docs for MV/ST
2 participants