Extract sheet names using pyspark #856
Comments
Does this help? #196 (comment)
Oh, I tested it on a "legacy" Databricks cluster and it works. My code:
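Something along these lines (a sketch, not official API: `PythonUtils.toScalaMap` is a Spark-internal helper, and `WorkbookReader.apply` is the companion-object entry point rather than a public constructor):

```python
jvm = spark._jvm

# There is no py4j-callable constructor, so go through the companion
# object's apply(parameters, hadoopConfiguration) entry point instead.
reader = jvm.com.crealytics.spark.excel.WorkbookReader.apply(
    # Spark-internal helper: converts the java.util.HashMap that py4j
    # builds from the Python dict into the Scala Map that apply expects.
    jvm.PythonUtils.toScalaMap({"path": "Worktime.xlsx"}),
    spark.sparkContext._jsc.hadoopConfiguration(),
)

# sheetNames() returns a Scala Seq; mkString renders it as one string.
print(reader.sheetNames().mkString(", "))
```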
In a Unity Catalog environment I'm getting an error (it's directly related to the cluster access mode, which cannot be changed in my case).
Is there any other way to get the sheet names, without the WorkbookReader constructor? I'd rather not mix crealytics spark code with pandas or any other library.
We are having the same issue with our Scala code in Unity Catalog (DBR 14.3 LTS). As per this documentation: https://learn.microsoft.com/en-us/azure/databricks/compute/access-mode-limitations#spark-api-limitations-and-requirements-for-unity-catalog-shared-access-mode, sparkContext (and therefore hadoopConfiguration) can't be accessed in DBR 14.0 and newer. So even if there's a workaround for 13.3 for now, newer runtimes won't be able to support it.
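For illustration, the blocked piece is the very first step of the py4j workaround above (a sketch; the exact error text depends on the runtime):

```python
# On a Unity Catalog cluster in shared access mode (DBR 14.0+), sparkContext
# access is blocked by the py4j allowlist, so this line already fails before
# WorkbookReader is ever reached:
hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()
```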
Hmm, that would require a bigger refactoring then, because we also need a HadoopConfiguration in the standard use case (even without reading sheet names).
Am I using the newest version of the library?
Is there an existing issue for this?
Current Behavior
I have a problem with the WorkbookReader class. The code in Python looks like:
```python
reader = spark._jvm.com.crealytics.spark.excel.WorkbookReader(
    {"path": "Worktime.xlsx"},
    spark.sparkContext._jsc.hadoopConfiguration()
)
sheetnames = reader.sheetNames()
```
My problem:

```
py4j.Py4JException: Constructor com.crealytics.spark.excel.WorkbookReader([class java.util.HashMap]) does not exist
```

As far as I can tell, py4j converts the Python dict into a java.util.HashMap, and WorkbookReader has no constructor with that signature.
In PR #196 there's a discussion about using the apply method, but I don't know how to call it from Python.
Has anyone gotten this working in PySpark? I can't use Scala, because it's blocked by the administrator in my environment.
Expected Behavior
No response
Steps To Reproduce
No response
Environment
Anything else?
No response