Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cross-database macros where needed to ensure SQL queries work on all OHDSI-supported database platforms #39

Open
katy-sadowski opened this issue Jun 11, 2024 · 6 comments
Assignees

Comments

@katy-sadowski
Copy link
Collaborator

katy-sadowski commented Jun 11, 2024

Cross-database macros are offered by dbt to compile sql depending on the target DBMS: https://docs.getdbt.com/reference/dbt-jinja-functions/cross-database-macros

The goal of this issue is to assess what SQL statements in the project will not translate across all DBMS, and replace them with a cross-DB macro (if possible).

OHDSI supported platforms:
https://ohdsi.github.io/Hades/supportedPlatforms.html

@katy-sadowski katy-sadowski self-assigned this Jun 16, 2024
@lawrenceadams
Copy link
Collaborator

I've done a bit of playing to discover issues we may run into when not using duckdb/postgres. Here are some of my findings:

System Finding
SQLite Does not support multiple schemas (treats as separate physical databases) Differences in how views are required caused issues
Snowflake Unable to get very far as START is a protected keyword (used in a lot of synthea tables) and so issues thrown when using seed. → will probably need a custom data loader to get around this. More work needed.
SQL Server Found fiddly to get setup locally. Not yet successful - from memory due to keyword utilization but may be wrong.

Probably all easily overcome with a few hours - but not as straightforward as I would have thought! DuckDb is almost too friendly in comparison! 😆

@vvcb
Copy link
Collaborator

vvcb commented Oct 4, 2024

Getting dbt to work across database engines is not worth the hassle and you will quickly hit a tech debt ceiling.

Cross-database Synthea-OMOP transformation is already mostly solved here - https://github.com/lsc-sde/sqlmesh_synthea

@lawrenceadams , please take a look at https://sqlmesh.com/ and https://sqlglot.com/sqlglot.html

We are using SQLMesh for all new OMOP projects for this reason and many others. We build locally on DuckDb, and deploy to on-prem SQL Server and Azure Databricks because we are 'migrating' and are likely to be migrating for the next decade :-D. So, important for us to be able to manage a single codebase for multiple database engines and for several people to work on the same data warehouse/ lakehouse at the same time.

@lawrenceadams
Copy link
Collaborator

Yeah came to the same conclusion! Was an interesting bit of discovery!

Yep well aware of SQLMesh/Glot, is a nice tool!

@vvcb
Copy link
Collaborator

vvcb commented Oct 4, 2024

And, if we want to continue to use dbt with some cross-platform support, then targeting the engines that are as close to being ANSI-compliant as possible and refactoring any existing SQL to ANSI-SQL will be a good start.

One way to do this would be to take all the compiled code that dbt generates currently and transpiling it to ANSI using SQLGlot and looking at the diffs.

@katy-sadowski
Copy link
Collaborator Author

One way to do this would be to take all the compiled code that dbt generates currently and transpiling it to ANSI using SQLGlot and looking at the diffs.

I like the idea, would be a cool part of this exploration, and a great learning opportunity!

@lawrenceadams regarding Snowflake and the START keyword, would this help: https://docs.getdbt.com/reference/resource-configs/quote_columns

@lawrenceadams
Copy link
Collaborator

@lawrenceadams regarding Snowflake and the START keyword, would this help: https://docs.getdbt.com/reference/resource-configs/quote_columns

Excellent shout - this resolved that issue! I'll have more of a play now that's unblocked!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants