Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump Spark 3.5.4 RC2 #11731

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from
Draft

Bump Spark 3.5.4 RC2 #11731

wants to merge 4 commits into from

Conversation

pan3793
Copy link
Member

@pan3793 pan3793 commented Dec 9, 2024

@github-actions github-actions bot added the build label Dec 9, 2024
@pan3793
Copy link
Member Author

pan3793 commented Dec 10, 2024

the upgrading causes a core dump in iceberg-spark-3.5's tests

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f8529a34563, pid=83775, tid=83793
#
# JRE version: OpenJDK Runtime Environment Zulu17.54+21-CRaC-CA (17.0.13+11) (build 17.0.13+11-LTS)
# Java VM: OpenJDK 64-Bit Server VM Zulu17.54+21-CRaC-CA (17.0.13+11-LTS, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0xc34563]  ProtectionDomainEntry::object_no_keepalive()+0x3
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /home/runner/work/iceberg/iceberg/spark/v3.5/spark/core.83775)
#
# An error report file with more information is saved as:
# /home/runner/work/iceberg/iceberg/spark/v3.5/spark/hs_err_pid83775.log
#
# Compiler replay data is saved as:
# /home/runner/work/iceberg/iceberg/spark/v3.5/spark/replay_pid83775.log
#
# If you would like to submit a bug report, please visit:
#   http://www.azul.com/support/
#
> Task :iceberg-spark:iceberg-spark-3.5_2.12:test FAILED

@pan3793
Copy link
Member Author

pan3793 commented Dec 10, 2024

the failure seems related to arrow

minimal reproduce command

./gradlew -DsparkVersions=3.5 -DscalaVersion=2.12 -DhiveVersions= -DflinkVersions= -DkafkaVersions= \
    :iceberg-spark:iceberg-spark-3.5_2.12:test --tests=TestRewriteDataFilesAction \
    -Pquick=true -x javadoc

@pan3793
Copy link
Member Author

pan3793 commented Dec 10, 2024

update: the failure is related to SPARK-50235, and the test passed on Spark 3.5.4 RC1 with reverting that patch

@viirya
Copy link
Member

viirya commented Dec 10, 2024

Please check for some discussion there apache/spark#49131 (comment)

@LuciferYang
Copy link

Please check for some discussion there apache/spark#49131 (comment)

Yes, I tested it locally and found that after adding a no-op closeIfFreeable method to the IcebergArrowColumnVector.java in version 3.5, all the test cases in TestRewriteDataFilesAction were able to pass. We can further test it after the release of 3.5.4 RC2.

@pan3793
Copy link
Member Author

pan3793 commented Dec 11, 2024

@viirya @LuciferYang thank you for the guidance, let me try

@github-actions github-actions bot added the spark label Dec 11, 2024
// If a column vector is writable or constant, it should override this method and do nothing.
// See more details at SPARK-50235, SPARK-50463 (Fixed in Spark 3.5.4)
@Override
public void closeIfNotWritable() {}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will rename to closeIfFreeable after RC2 available

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a column vector is writable or constant, it should override this method and do nothing.

we have ConstantColumnVector as well, should we do it vector classes which absolutely requires it ?

Copy link
Member Author

@pan3793 pan3793 Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the codebase, other vectors overwrite void close() {}, so they are not affected

Copy link
Contributor

@singhpk234 singhpk234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch @pan3793 !

// If a column vector is writable or constant, it should override this method and do nothing.
// See more details at SPARK-50235, SPARK-50463 (Fixed in Spark 3.5.4)
@Override
public void closeIfNotWritable() {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a column vector is writable or constant, it should override this method and do nothing.

we have ConstantColumnVector as well, should we do it vector classes which absolutely requires it ?

@pan3793
Copy link
Member Author

pan3793 commented Dec 11, 2024

FYI @LuciferYang @viirya @dongjoon-hyun @wForget

CI is green now, and I think the current Spark branch-3.5 is in good shape for Iceberg.

@dongjoon-hyun
Copy link
Member

Thank you for checking Apache Iceberg side, @pan3793 .

@pan3793 pan3793 changed the title Bump Spark 3.5.4 RC1 Bump Spark 3.5.4 RC2 Dec 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants