-
Notifications
You must be signed in to change notification settings - Fork 186
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Synapse #708
Conversation
❌ Deploy Preview for dlt-hub-docs failed.
|
resolves 'str' mapping error
… to pull from base class
|
||
|
||
@contextmanager | ||
def begin_transaction(self) -> Iterator[DBTransaction]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what exactly happens when we use the same tx code we have in the mssql implementation? why can't we use that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shifted back to mssql implementation. Current approach helped with error handling; however, we aren't facing transaction errors anymore and can switch back.
original_sql = sql | ||
|
||
# Check if it's a multi-row insert | ||
is_multi_row_insert = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should implement a custom InsertValuesLoadJob that does these conversions, and not hack the execute_sql.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the feedback, @sh-rp. How about a subclass of InsertValuesLoadJob tailored for Microsoft Synapse that lives in synapse.py?
Given that DLT handles multiple destinations and insert_job_client.py provides a foundation for all of them, I can isolate Synapse-specific conversions and avoid execute_sql
SynapseInsertValuesLoadJob
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sh-rp circling back to this, please let me know if this is in right direction. generate_insert_query() now lives in synapse.py
Instead of taking a SQL string (vulnerable to injection) it now takes comparable params to InsertValuesLoadJob
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @eryanRM, this PR is looking much better now. I have some requests:
- Please Make sure the branch is up to date with devel, there are some small changes in the way the loader resolved which tables are needed on staging etc. You can probably just use the settings from the mssql client for that
- Please make sure the tests are actually executed so I can see what passes and what does not. This includes removing debug/test code and making sure the poetry synapse extra works.
- Generally speaking it seems like there is a lot of code shared between mssql and synapse. I would be better to have common baseclasses for the two of them instead of having the same code in separate places.
…est incremental, test inference, test utils
…e_sql TODO: customValuesubclass
note: temporarily updating insert_job_client to use Synapse insert job. Need to add handling
Hi @sh-rp, I appreciate the feedback. This has been an illuminating process
|
Refreshing Synapse branch w/ 'devel'. Continuing as new PR |
Description
Initial PR bringing Synapse functions into subclass
Note: new errors (ex. 'str' mapping) popped up when running updated tests
ex 1. resolved: 'str' mapping errors in several test_job_client.py
ex 2. tests/load/test_job_client.py::test_load_with_all_types now failing due to column count not matching table
Related Issues
Additional Context
This PR is for David's review to confirm we are on correct path