[BUG] spark 3.5.0 shim spark-shell is broken in spark-rapids 23.10 and 23.12 #9498
Comments
This issue seems to be isolated to spark-shell, so if we had something like this in place: #9497 we could have seen this before the user. |
There is a workaround for this (disabling parallel worlds) but I see log messages in the executor log that would lead me to be concerned as a user. So I filed this: #9499 |
Can you add an exact command/config for the repro, @abellina, please? The following works for me:
JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64 \
~/dist/spark-3.5.0-bin-hadoop3/bin/spark-shell \
--jars rapids-4-spark_2.12-23.10.0-cuda11.jar \
--conf spark.plugins=com.nvidia.spark.SQLPlugin \
--conf spark.rapids.sql.explain=ALL |
Sure, sorry, I should have mentioned this was for standalone. Your case is local mode, which I assume has its own host of issues/differences:
|
This was caused by apache/spark@1486835 in 3.5.0. It changed the package of the ExecutorClassLoader, and our ShimLoader explicitly looks for the old package name:
We need to also check for the new classname of |
After making the change described above, it works and finds the mutable classloader from the ExecutorClassLoader.
|
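For illustration, the kind of check described above can be sketched as follows. This is a minimal sketch and not the actual ShimLoader code: `ReplLoaderCheck` and its methods are hypothetical names, and both fully qualified class names are assumptions about where ExecutorClassLoader lives before and after the 3.5.0 package change, so verify them against the Spark sources you build against. The loader is matched by class name rather than by type so the check does not need either Spark class on the compile-time classpath.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of spark-rapids.
final class ReplLoaderCheck {
    // Assumption: the class name used before Spark 3.5.0 and the one used from 3.5.0 on.
    static final Set<String> EXECUTOR_LOADER_NAMES = Collections.unmodifiableSet(
        new HashSet<>(Arrays.asList(
            "org.apache.spark.repl.ExecutorClassLoader",
            "org.apache.spark.executor.ExecutorClassLoader")));

    // Pure string check, so it can be tested without Spark on the classpath.
    static boolean isExecutorClassLoaderName(String className) {
        return className != null && EXECUTOR_LOADER_NAMES.contains(className);
    }

    // Walks the parent chain from `start` and returns the first loader whose
    // class name matches one of the accepted names, or null if none matches.
    static ClassLoader findExecutorClassLoader(ClassLoader start) {
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            if (isExecutorClassLoaderName(cl.getClass().getName())) {
                return cl;
            }
        }
        return null;
    }
}
```

Accepting a set of names keeps the older shims working while also recognizing the relocated class. Outside of a Spark executor, `findExecutorClassLoader(Thread.currentThread().getContextClassLoader())` is expected to return null.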
The minimal repro that can be used for the test is:
~/dist/spark-3.5.0-bin-hadoop3/bin/spark-shell \
--master local-cluster[1,1,1024] \
--jars rapids-4-spark_2.12-23.10.0-cuda11.jar \
--conf spark.plugins=com.nvidia.spark.SQLPlugin --conf spark.rapids.sql.explain=ALL |
Fixed by #9500 |
A user reported an issue when trying to launch spark-shell with Spark 3.5.0 and 23.12. I have reproduced the issue and confirmed that it also occurs with 23.10. User report: NVIDIA/spark-rapids-ml#453 (comment)
Here are tests I ran:
My local repro. I used JDK 17 like the user, but running with JDK 8 also reproduces it: