Incompatible jackson-core version with Spark 3.0 #277

Closed
qxzzxq opened this issue Aug 5, 2020 · 4 comments

@qxzzxq

qxzzxq commented Aug 5, 2020

Hi 👋

Recently I was trying to use this library with Spark 3.0 but I got the following exception:

Exception in thread "main" java.lang.NoClassDefFoundError: com/fasterxml/jackson/core/exc/InputCoercionException
	at com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.<init>(ScalaNumberDeserializersModule.scala:48)
	at com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.<clinit>(ScalaNumberDeserializersModule.scala)
	at com.fasterxml.jackson.module.scala.deser.ScalaNumberDeserializersModule.$init$(ScalaNumberDeserializersModule.scala:60)
	at com.fasterxml.jackson.module.scala.DefaultScalaModule.<init>(DefaultScalaModule.scala:18)
	at com.fasterxml.jackson.module.scala.DefaultScalaModule$.<init>(DefaultScalaModule.scala:36)
	at com.fasterxml.jackson.module.scala.DefaultScalaModule$.<clinit>(DefaultScalaModule.scala)
	at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:82)
	at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
	at org.apache.spark.SparkContext.withScope(SparkContext.scala:751)
	at org.apache.spark.SparkContext.parallelize(SparkContext.scala:768)
	at com.crealytics.spark.excel.ExcelRelation.parallelize(ExcelRelation.scala:98)
	at com.crealytics.spark.excel.ExcelRelation.$anonfun$buildScan$3(ExcelRelation.scala:78)
	at com.crealytics.spark.excel.WorkbookReader.withWorkbook(WorkbookReader.scala:15)
	at com.crealytics.spark.excel.WorkbookReader.withWorkbook$(WorkbookReader.scala:13)
	at com.crealytics.spark.excel.DefaultWorkbookReader.withWorkbook(WorkbookReader.scala:46)
	at com.crealytics.spark.excel.ExcelRelation.buildScan(ExcelRelation.scala:62)
	at org.apache.spark.sql.execution.datasources.DataSourceStrategy.$anonfun$apply$6(DataSourceStrategy.scala:305)
        ...

In Spark 3.0, the jackson-core library was upgraded from 2.6.7 to 2.10.0. I think this may cause a compatibility problem with version 2.8.8, which is used by spark-excel.
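
To check which jar each Jackson class is actually loaded from, a small diagnostic along these lines may help. It is only a sketch: the object name `JacksonClasspathCheck` and the helper `locationOf` are made up for illustration, and it assumes it runs with the same classpath as the failing job. The class names are taken from the stack trace above.

```scala
object JacksonClasspathCheck {
  // Hypothetical helper: returns the jar (or directory) a class would be
  // loaded from, or None if the class cannot be loaded from this classpath.
  def locationOf(className: String): Option[String] =
    try {
      // initialize = false: only locate and load the class, do not run static initializers
      val cls = Class.forName(className, false, getClass.getClassLoader)
      Option(cls.getProtectionDomain.getCodeSource)
        .flatMap(cs => Option(cs.getLocation))
        .map(_.toString)
    } catch {
      case _: ClassNotFoundException | _: LinkageError => None
    }

  def main(args: Array[String]): Unit = {
    // Class names as they appear in the stack trace
    val classes = Seq(
      "com.fasterxml.jackson.core.JsonFactory",
      "com.fasterxml.jackson.core.exc.InputCoercionException",
      "com.fasterxml.jackson.module.scala.DefaultScalaModule"
    )
    classes.foreach { name =>
      println(s"$name -> ${locationOf(name).getOrElse("not found on classpath")}")
    }
  }
}
```

If `JsonFactory` resolves to an older jackson-core jar while `InputCoercionException` is reported as not found, that would be consistent with the `NoClassDefFoundError` above.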

I've also tried excluding jackson-core from spark-excel, but that doesn't work either. I got this exception:

java.lang.RuntimeException: Error while encoding: java.lang.RuntimeException: scala.Some is not a valid external type for schema of string

Possible Solution

Upgrade the jackson library to the same version as Spark

Steps to Reproduce (for bugs)

pom:

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.12</artifactId>
            <version>3.0.0</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.12</artifactId>
            <version>3.0.0</version>
            <scope>provided</scope>
        </dependency>
        <!-- https://mvnrepository.com/artifact/com.crealytics/spark-excel -->
        <dependency>
            <groupId>com.crealytics</groupId>
            <artifactId>spark-excel_2.12</artifactId>
            <version>0.13.1</version>
        </dependency>

Code:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local").getOrCreate()

println(spark.sparkContext.version)

spark.read
  .format("com.crealytics.spark.excel")
  .option("header", "true")
  .load("path/to/excel")
  .show()

Your Environment

  • Spark version and language (Scala, Java, Python, R, ...): Scala 2.12, Spark 3.0.0
  • Spark-Excel version: 0.13.1
  • Operating System and version, cluster environment, ...: Mac OS
@nightscape
Owner

Hi @qxzzxq, thanks for the detailed bug report!
We're actually shading Jackson in order to prevent such problems.
I think your first problem is due to this bug in the SBT plugin we're using for shading: hammerlab/sbt-parent#32
The scala.Some bug is due to this bug in spark-excel: #181
It was fixed a while ago, but due to a misconfiguration in our GitHub Actions the release process didn't work.
I have hopefully just fixed the release process, and 0.13.4, which includes the bug fix, should be on its way.
You still need to use your first approach of excluding jackson-core.
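
For readers following along, the exclusion mentioned above might look roughly like this in a Maven pom. This is a sketch only: the thread does not show the reporter's actual exclusion, and it assumes version 0.13.4 and that excluding the `com.fasterxml.jackson.core:jackson-core` artifact is sufficient.

```xml
<!-- Sketch (assumption): spark-excel 0.13.4 with jackson-core excluded,
     so that the jackson-core version provided by Spark 3.0 is used instead -->
<dependency>
    <groupId>com.crealytics</groupId>
    <artifactId>spark-excel_2.12</artifactId>
    <version>0.13.4</version>
    <exclusions>
        <exclusion>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-core</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```

Running `mvn dependency:tree` afterwards shows which jackson-core version ends up on the classpath.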

@qxzzxq
Author

qxzzxq commented Aug 5, 2020

Thank you @nightscape for your response! Can't wait to test 0.13.4 😄

@nightscape
Owner

It seems to be published properly, although the GitHub Action is still running. Go give it a try! 😃

@qxzzxq
Author

qxzzxq commented Aug 6, 2020

It works in 0.13.4 👍

@qxzzxq qxzzxq closed this as completed Aug 6, 2020