-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support s3_data_dir and s3_data_naming #4
Conversation
@thenaturalist @nicor88 what is preventing this from behind merged? My team has been waiting for this for a very long time ;) |
if creds.s3_data_dir is not None: | ||
return creds.s3_data_dir | ||
else: | ||
return f"{creds.s3_staging_dir}tables/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If s3_staging_dir
has a trailing slash or it miss a / we can have issues. We can do simply:
os.path.join(creds.s3_staging_dir, 'tables/')
@mattiamatrix we wanted to full test the feature with all supported cases, the implementation looks actually good to me, I left a tiny comment. Also the README needs to be updated a bit more to make the usage a bit smoother. |
We might also want to ping the people (@chronitis @fbrueck) who were initially involved if they are still interested. |
I originally opened this, and we used it in production for maybe 6 months before we moved away from Athena. So I'm not in a position to test it further now, but if it's useful to anyone else, please feel free to merge. It was workable and having predictable S3 paths made organising the resulting data much easier, although one issue encountered was that (at the time, I haven't checked if this has now changed) the temporary table used when building an incremental table always had a fixed name (table + |
nb, one other feature I wrote which might be of interest: using the Glue API for listing tables, schemas, columns instead of INFORMATION_SCHEMA queries, which were generally a lot slower. I don't think I ever opened a PR for it, but again, you're welcome to pull it if it looks useful: northvolt/dbt-athena@49c7111 |
Regarding the not unique path, I also add a workaround in the Iceberg implementation, I think that needs to be revisit a bit, but was similar to what you suggested. Regarding #4 (comment) definitely worth to add it. @mattiamatrix and @chronitis I'm wondering if you both want to be added as maintainers to help us out to deliver feature faster, we fork the repo to enable faster merge, but also me and @thenaturalist are quite busy. Feel free to reach out to me in Slack dbt. |
@nicor88 could you please provide a hint to why this PR has been closed? |
@rumbin just a mistake when we relesed the 1.0.4, the branch was deleted and the pr closed, we will re-open again. @thenaturalist in case you have time, you can re-open it. |
Hey @nicor88, what is the plan for this PR? |
@mattiamatrix feel free to re-open it. |
Ported PR from Tomme/dbt-athena#50 by @chronitis