Can't read a Delta table from Azure Unity Catalog #1628
Comments
😆 me too. The Unity support in delta-rs is young, I would say. I have access to a Unity environment but not an Azure-specific Databricks+Unity environment. I'm honestly not sure how to start here. I assume the URL that was spit out to you is at a legitimate hostname that might otherwise respond to connections from wherever you are running this Python code?
It looks like the current implementation works for storage location retrieval, but requires additional credentials for data access (in addition to Azure, I also tried AWS, with a similar result). I suspect this could work if the application is running on a cloud VM with certain rights, but I didn't test that (That

In addition to being a metadata provider, Unity on Databricks also acts as an access token provider so it can enforce ACLs, etc. Using the same pattern on local/non-Databricks compute would provide a similar experience, but I don't know if it's achievable at the moment (or ever will be).

A possible quick fix could be providing additional credentials that allow access to the storage managed by UC. For example, when I specify

I guess this workaround may also work in Azure with the right secret/key/token/...
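A minimal sketch of that workaround for Azure, untested against a real Unity Catalog workspace: explicit storage credentials are collected into a `storage_options` dict and passed next to the table URI. The environment variable names and the helper functions are hypothetical, and the option keys are assumptions based on the Azure configuration keys of the `object_store` crate that delta-rs uses.

```python
import os


def azure_storage_options(env=os.environ):
    """Collect explicit Azure storage credentials from environment variables."""
    # Assumed mapping: hypothetical env var name -> storage option key
    # (keys as understood from object_store's Azure config; not verified here).
    mapping = {
        "AZURE_STORAGE_ACCOUNT_NAME": "azure_storage_account_name",
        "AZURE_STORAGE_SAS_TOKEN": "azure_storage_sas_token",
    }
    options = {}
    for env_name, option_key in mapping.items():
        value = env.get(env_name)
        if value:  # skip unset or empty variables
            options[option_key] = value
    return options


def read_table(table_uri, storage_options):
    """Open a Delta table with explicit credentials (requires deltalake)."""
    # Imported lazily so the helper above can be used without deltalake installed.
    from deltalake import DeltaTable

    return DeltaTable(table_uri, storage_options=storage_options)
```

With credentials supplied this way, the client should not need to fall back to ambient cloud credentials, but it also bypasses the ACLs that Unity Catalog would otherwise enforce.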
Actually, this looks like expected behavior, as mentioned in #1331 (comment)
@rtyler maybe you could include me in those future conversations, given I work for Databricks atm :grin:
The Unity Catalog in my org is becoming a huge roadblock to using delta-rs broadly beyond internal team use. No one wants to provide read credentials to the storage anymore, which rules out delta-rs in this context. Besides the possible vendor lock-in 😄, it makes interoperability with Databricks less than ideal; currently we fall back to the databricks-sql connector for any data reads.
I have the same problem:
OSError: Generic MicrosoftAzure error: Error performing token request: response error "request error", after 10 retries: error sending request for url (http://169.254.169.254/metadata/identity/oauth2/token?api-version=2019-08-01&resource=https%3A%2F%2Fstorage.azure.com): error trying to connect: tcp connect error: A socket operation was attempted to an unreachable network. (os error 10051)
The address 169.254.169.254 is used to retrieve the authentication token. But I don't understand why this is needed, as the Databricks documentation says we need to get a short-lived token and a signed URL:
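For context, the URL in that error can be taken apart with the standard library. The sketch below only inspects the string from the error message; the function name is hypothetical. The host is a link-local address (the Azure Instance Metadata Service, which issues managed identity tokens and is reachable only from inside an Azure VM), which would explain the unreachable-network error on a local machine. Presumably the client fell back to managed identity because no explicit storage credentials were supplied.

```python
import ipaddress
from urllib.parse import parse_qs, urlsplit

# URL copied from the error message above.
TOKEN_URL = (
    "http://169.254.169.254/metadata/identity/oauth2/token"
    "?api-version=2019-08-01&resource=https%3A%2F%2Fstorage.azure.com"
)


def describe_token_request(url):
    """Break a token request URL into the parts relevant for debugging."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)  # also percent-decodes the values
    host = ipaddress.ip_address(parts.hostname)
    return {
        "host": str(host),
        "link_local": host.is_link_local,  # 169.254.0.0/16 is not routable
        "path": parts.path,
        "resource": query["resource"][0],
    }


info = describe_token_request(TOKEN_URL)
print(info)
```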
Interesting, so UC by design gives a token to read the data from storage. Then this token should just be returned when you query the Databricks REST API to get the table.
Hi @ion-elgreco, is this a good time to address this again, now that Unity Catalog OSS version 0.2.0 is released with credential vending support? Does this make it easier/clearer to implement? https://github.com/unitycatalog/unitycatalog/releases/tag/v0.2.0
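To make the idea of credential vending concrete, here is a rough sketch of what such a request might look like. It only builds the HTTP request and does not send it. The endpoint path and the body fields (`table_id`, `operation`) are assumptions based on my reading of the Unity Catalog REST API and should be checked against the API reference; the function name, host, token, and table ID are hypothetical placeholders.

```python
import json
import urllib.request


def build_temporary_credentials_request(host, token, table_id):
    """Build (but do not send) a request for short-lived table credentials."""
    # Assumed request body: the table's ID and the kind of access wanted.
    body = json.dumps({"table_id": table_id, "operation": "READ"}).encode("utf-8")
    return urllib.request.Request(
        # Assumed endpoint path; verify against the Unity Catalog API docs.
        url=f"{host.rstrip('/')}/api/2.1/unity-catalog/temporary-table-credentials",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only.
request = build_temporary_credentials_request(
    "https://example-workspace.invalid/", "dummy-token", "0000-table-id"
)
print(request.get_method(), request.full_url)
```

If the catalog responds with a short-lived storage token, a client could pass it on as storage options instead of requiring long-lived storage credentials.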
Sure, feel free to take a stab at it
Well, this could be a good excuse for me to learn Rust indeed. I might do that.
Environment
Bug
What happened:
I am trying to replicate this example from the documentation to read a Delta Table from Databricks Unity Catalog:
but I get the following error:
Stacktrace:
What you expected to happen:
I expected to be able to read the Delta table.
More details: