Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Spark-compatible tmp_table_suffix for incremental materializations #759

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

chenp-eleos
Copy link

Description

This PR addresses the issue related to unique_tmp_table_suffix when working with Spark. While the proposed changes in issue #668 worked for part of the scenarios, they did not account for Spark’s restrictions on table and view names. Specifically:
• Spark does not allow hyphens (-) in table/view names, causing failures when creating temporary tables.
• This PR modifies the logic to replace hyphens in the generated suffix with underscores (_), ensuring Spark compatibility.

Additionally:
• Ensures unique_tmp_table_suffix is respected and functional when set to True for all table types and strategies.
• Supports concurrent runs of the same model by avoiding naming conflicts.

Models used to test - Optional

•	incremental materialization tests with Spark, Hive, and Iceberg configurations.
•	Test cases included concurrent runs of the same model.

Checklist

  • You followed contributing section
  • You kept your Pull Request small and focused on a single feature or bug fix.
  • You added unit testing when necessary
  • You added functional testing when necessary

Generate Spark-compatible tmp_table_suffix for incremental materializations with iceberg
Generate a unique tmp table suffix if required for incremental materializations.
Generate Spark-compatible tmp_table_suffix for incremental materializations.
@chenp-eleos chenp-eleos requested a review from a team as a code owner November 21, 2024 19:25
@chenp-eleos chenp-eleos changed the title Patch 1 Fix: Spark-compatible tmp_table_suffix for incremental materializations Nov 21, 2024
{% if unique_tmp_table_suffix == True and strategy == 'insert_overwrite' and table_type == 'hive' %}
{% set tmp_table_suffix = adapter.generate_unique_temporary_table_suffix() %}
-- Generate a unique tmp table suffix if required
{% if unique_tmp_table_suffix == True %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you remove the check on strategy == 'insert_overwrite' and table_type == 'hive'?

for sure we can simplify the statement checks and do something like this:

{% if unique_tmp_table_suffix == True and ((strategy == 'insert_overwrite' and table_type == 'hive') or table_type == 'iceberg') %}

But I will avoid to don't check the strategy type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've some doubts about my comment, your idea might not be bad overall, but to guarantee that works as expected, consider to add some Integrations tests for different scenarios.

-- Generate a unique tmp table suffix if required
{% if unique_tmp_table_suffix == True %}
{% set raw_suffix = adapter.generate_unique_temporary_table_suffix() %}
{% set tmp_table_suffix = raw_suffix.replace('-', '_') %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please move this inside the generate_unique_temporary_table_suffix.
replacing - with _ make sense also for an athena/trino SQL prospective.

Also, remember to adapt the functional tests, otherwise they will fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#754 was merged, therefore please remove this, as not needed anymore

@nicor88
Copy link
Contributor

nicor88 commented Nov 22, 2024

This PR: #754 aims to address the same issue, but there is still some problem with it.
Have a look at my comment please, and if you are fast enough to address them, this PR can replace #754.

@nicor88
Copy link
Contributor

nicor88 commented Nov 23, 2024

#754 was merged, please readapt this PR based on my comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants