feat: Add support for Microsoft Fabric Warehouse #4751
Thanks for creating this PR draft, so I can try it out 😃 I tried creating the models, but got a ProgrammingError:
('42000', '[42000] [Microsoft][ODBC Driver 18 for SQL Server][SQL Server]An object or column name is missing or empty. For SELECT INTO statements, verify each column has a name. For other statements, look for empty alias names. Aliases defined as "" or [] are not allowed. Change the alias to a valid name. (1038) (SQLExecDirectW)')
The log shows some interesting stuff:
This in particular looks suspect:
Here's my model:
MODEL (
  name data_according_to_business.hook.frame__northwind__customers,
  kind FULL
);
SELECT
  *
FROM data_according_to_business.dbo.raw__northwind__customers
And the rendered SQL works just fine when evaluating:
$ uv run sqlmesh evaluate hook.frame__northwind__customers
customer_id company_name contact_name contact_title ... fax _dlt_load_id _dlt_id region
0 ALFKI Alfreds Futterkiste Maria Anders Sales Representative ... 030-0076545 1750078533.6436024 xpfDb7mcWB5ijQ None
1 ANATR Ana Trujillo Emparedados y helados Ana Trujillo Owner ... (5) 555-3745 1750078533.6436024 Pr3sRmDpwu66mA None
2 ANTON Antonio Moreno Taquería Antonio Moreno Owner ... None 1750078533.6436024 X206DXOYfUMhMA None
3 AROUT Around the Horn Thomas Hardy Sales Representative ... (171) 555-6750 1750078533.6436024 UvMQUiuIwfPMVw None
4 BERGS Berglunds snabbköp Christina Berglund Order Administrator ... 0921-12 34 67 1750078533.6436024 sPupxoT/AS8XXA None
.. ... ... ... ... ... ... ... ... ...
86 WARTH Wartian Herkku Pirkko Koskitalo Accounting Manager ... 981-443655 1750078533.6436024 sbnEuPm0vmJbTw None
87 WELLI Wellington Importadora Paula Parente Sales Manager ... None 1750078533.6436024 JUEwhfkd5hbtYQ SP
88 WHITC White Clover Markets Karl Jablonski Owner ... (206) 555-4115 1750078533.6436024 iwjZC43nTrqgKg WA
89 WILMK Wilman Kala Matti Karttunen Owner/Marketing Assistant ... 90-224 8858 1750078533.6436024 LTCR7N1bsPyuhw None
90 WOLZA Wolski Zajazd Zbyszek Piestrzeniewicz Owner ... (26) 642-7012 1750078533.6436024 fDzC3tFHAgLfPQ None
[91 rows x 13 columns] |
Thanks. I will investigate later - perhaps we need |
I've made some progress... it fails later now:
But I can't find where the failing part is actually generated. |
@mattiasthalen the information schema query is generated in SQLGlot (source code) for "create if not exists" expressions, which are constructed in SQLMesh when trying to materialize the model (create physical table, etc). |
@georgesittas, would you say that most of these changes would be more suitable in sqlglot, as a fabric-tsql dialect, if you will? There seem to be more differences between tsql and the version in Fabric. |
Could you summarize what the differences are? I thought fabric used t-sql under the hood, but if the two diverge then what you say is reasonable. I'd start with this information schema example and then see if there are more examples besides that. |
@mattiasthalen yeah that's the conclusion I came to when I first started investigating this. Like all abstractions, the Fabric TSQL abstraction is leaky enough to be subtly different from the TSQL supported by SQL Server and not a drop-in replacement. @fresioAS thanks for giving this a go! The general process for adding new engines to SQLMesh is:
I know this is an early draft, but rather than implementing two separate adapters for The connection config could take a Note that the lakehouse side can just throw |
@erindru, I don't think there should be any separation between Warehouse and Lakehouse. Both use the same type of SQL endpoint, the "Fabric-flavored T-SQL". The only difference I can think of is whether or not the Lakehouse supports schemas. As of now, you get the option to activate schemas when creating a Lakehouse, and that comes with its own issues, e.g., a weaker API. This might merit a parameter to tell whether the catalog/database is a Lakehouse with or without schemas, or a Warehouse. But I agree that a different engine is overkill. With that said, the current code in this PR can actually query a Lakehouse. The host/endpoint used is the same for LH & WH, and you specify which one by the catalog/database. The same thing happens with the SQL database object: they share the host/endpoint and you select the object by setting the database. |
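To illustrate the point about a shared endpoint, here is a minimal sketch of how one host could serve both a Warehouse and a Lakehouse, with only the database part of the ODBC connection string changing. The helper name `build_odbc_connection_string`, the host name, and the database names are all hypothetical placeholders, and authentication settings are deliberately left out; this is not the connection code from this PR.

```python
def build_odbc_connection_string(
    host: str,
    database: str,
    driver: str = "ODBC Driver 18 for SQL Server",
) -> str:
    """Assemble a bare ODBC connection string (hypothetical helper, no auth)."""
    parts = {
        "Driver": "{" + driver + "}",
        "Server": host,
        "Database": database,  # selects the Warehouse or Lakehouse on that host
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


# Placeholder endpoint; a real workspace has its own host name.
HOST = "example-workspace.datawarehouse.fabric.microsoft.com"

warehouse_conn = build_odbc_connection_string(HOST, "my_warehouse")
lakehouse_conn = build_odbc_connection_string(HOST, "my_lakehouse")

print(warehouse_conn)
print(lakehouse_conn)
```

The two strings differ only in the `Database` value, which is why a single engine with a catalog setting may be enough here.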
In that case, a coherent MS has probably improved this since I last looked, but isn't Lakehouse based on Spark SQL and Warehouse based on the "Polaris" flavour of TSQL? |
Well, yeah. Spark SQL is used in a Lakehouse to create tables. But you can use the SQL endpoint to query it, and I think you can create views with it. The warehouse can use both tsql and spark. |
The latest commit, including the dialect found in this sqlglot fork, allows me to reference lakehouse external data. There are most likely some overlaps between the engine and the dialect now, and there is also a good amount of generated code that is probably irrelevant. Try it out @mattiasthalen and check whether we get a bit closer to the goal |
You're fast! I haven't even fired up a codespace for sqlglot yet. Did a quick test, but all I got was that there is no fabric dialect. Not sure if the error comes from sqlmesh or sqlglot. |
Did my own attempt at creating a fabric dialect (https://github.com/mattiasthalen/sqlglot/tree/add-fabric-tsql-dialect), so far only ensures
|
Test it on datetime2 - I had to make some changes there to get it to work |
@georgesittas / @erindru, what's needed for the config to be available for Python configs? |
Do you mean Nothing - if you're using Python config, you're just manually instantiating the same classes that Yaml config would automatically instantiate |
You should also add |
I got it all working here: https://github.com/mattiasthalen/sqlmesh/tree/fabric It requires the SQLGlot from main, hopefully that will be released soon. |
Makes the codebase simpler - probably a standalone PR to change the mssql adapter |
Submitted #4795 |
@fresioAS & @georgesittas I'm curious, how is the fabric dialect from sqlglot included here? I thought sqlglot needed a new tag and the version bumped in sqlmesh. |
Looks great!
I have a pending PR in SQLGlot that will sort the datetime2 errors
Add Microsoft Fabric Engine Support
Overview
This PR adds support for Microsoft Fabric as a new execution engine in SQLMesh. Users can now connect to and execute queries on Microsoft Fabric Data Warehouses.
Changes
Documentation:
- docs/integrations/engines/fabric.md with Fabric connection options, installation, and configuration instructions.
- docs/guides/configuration.md and docs/integrations/overview.md.
- mkdocs.yml to include the new Fabric documentation page.
Core Implementation:
- FabricConnectionConfig, inheriting from MSSQLConnectionConfig, with Fabric-specific defaults and validation.
- (FabricAdapter) in the registry.
- sqlmesh/core/engine_adapter/fabric.py with Fabric-specific logic, including the use of DELETE/INSERT for overwrite operations.
Testing:
- tests/core/engine_adapter/test_fabric.py for adapter logic, table checks, insert/overwrite, and replace query tests.
- tests/core/test_connection_config.py for config validation and ODBC connection string generation.
Configuration:
- pyproject.toml to add a fabric test marker.
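For readers unfamiliar with the DELETE/INSERT overwrite pattern mentioned under Core Implementation, the following is a rough sketch of the idea only. It uses Python's built-in sqlite3 module as a stand-in database, and the function name `overwrite_table` and the `customers` table are hypothetical; the adapter's actual code is not shown in this description and may differ.

```python
import sqlite3


def overwrite_table(conn: sqlite3.Connection, rows: list[tuple[int, str]]) -> None:
    """Replace the contents of `customers` via DELETE followed by INSERT."""
    # `with conn` commits if both statements succeed and rolls back otherwise,
    # so readers never observe a half-overwritten table.
    with conn:
        conn.execute("DELETE FROM customers")
        conn.executemany("INSERT INTO customers (id, name) VALUES (?, ?)", rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "old row"), (2, "stale row")],
)
conn.commit()

# Overwrite: the previous rows are removed and only the new ones remain.
overwrite_table(conn, [(10, "new row")])
result = conn.execute("SELECT id, name FROM customers").fetchall()
print(result)
```

The table definition itself is left untouched; only the rows are replaced.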