fix: Fix Spark SQL AQE exchange reuse test failures #1811

coderfender · 2025-05-29T00:04:30Z

Which issue does this PR close?

Rationale for this change

The AQE feature in Spark relies on a cache (where the key is the canonicalized version of a plan and the value is the actual SparkPlan object) to avoid recomputing stages. However, because the canonicalized versions of Comet and Spark plans are the same, AQE can incorrectly fetch a Comet plan instead of the native Spark one, causing class type exceptions as mentioned in the related GitHub issue. This change prevents that by handling the AQE use case at the final handoff between Comet and Spark. Although this change solves the AQE use case, the broader goal is to arrive at a better-suited strategy, which includes but is not limited to changing the canonicalization logic of Comet plans so that Spark can differentiate them.
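
To illustrate the collision described above, here is a minimal, self-contained Scala sketch. It does not use any Spark or Comet classes: `ToyPlan`, `ToySparkExchange`, `ToyCometExchange`, and `ToyStageCache` are hypothetical names made up for this example, and the cache is only loosely modelled on the AQE reuse cache, not a copy of its implementation.

```scala
import scala.collection.mutable

// Toy stand-ins for physical plan nodes. These are NOT Spark or Comet classes.
sealed trait ToyPlan {
  // Simplified "canonical form" used as the cache key.
  def canonicalKey: String
}

final case class ToySparkExchange(partitioning: String) extends ToyPlan {
  def canonicalKey: String = s"exchange($partitioning)"
}

final case class ToyCometExchange(partitioning: String) extends ToyPlan {
  // Same canonical form as the Spark node: this is the problematic condition.
  def canonicalKey: String = s"exchange($partitioning)"
}

// Hypothetical cache keyed by canonical form (assumption: a plain mutable map).
final class ToyStageCache {
  private val cache = mutable.Map.empty[String, ToyPlan]

  // Returns the previously registered plan for this key, or registers this one.
  def reuseOrRegister(plan: ToyPlan): ToyPlan =
    cache.getOrElseUpdate(plan.canonicalKey, plan)
}

object CanonicalCollisionDemo {
  def main(args: Array[String]): Unit = {
    val cache = new ToyStageCache
    // The Comet node is registered first.
    cache.reuseOrRegister(ToyCometExchange("hash(id)"))
    // A later lookup with a Spark node hits the same key.
    val reused = cache.reuseOrRegister(ToySparkExchange("hash(id)"))
    reused match {
      case _: ToySparkExchange => println("got the Spark node")
      case _: ToyCometExchange => println("got the Comet node instead")
    }
  }
}
```

Running `CanonicalCollisionDemo` prints `got the Comet node instead`: the caller asked with a Spark node but received the cached Comet node, which is the kind of type mismatch that surfaces as a class type exception if the caller then casts the result.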

What changes are included in this PR?

Changes to return the right object in EliminateRedundantTransitions.scala and to make sure that the doExecute method invokes Comet's method directly.

How are these changes tested?

Local testing: ran all AQE unit tests.
Integration testing: workflow to run the Spark, OS-level, and TPC-H tests.

codecov-commenter · 2025-05-29T00:59:47Z

Codecov Report

Attention: Patch coverage is 0% with 4 lines in your changes missing coverage. Please review.

Project coverage is 59.40%. Comparing base (f09f8af) to head (63e93ef).
Report is 234 commits behind head on main.

Files with missing lines	Patch %	Lines
...he/comet/rules/EliminateRedundantTransitions.scala	0.00%	1 Missing and 2 partials ⚠️
...t/execution/shuffle/CometShuffleExchangeExec.scala	0.00%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #1811      +/-   ##
============================================
+ Coverage     56.12%   59.40%   +3.28%     
- Complexity      976     1151     +175     
============================================
  Files           119      129      +10     
  Lines         11743    12631     +888     
  Branches       2251     2368     +117     
============================================
+ Hits           6591     7504     +913     
+ Misses         4012     3919      -93     
- Partials       1140     1208      +68

andygrove · 2025-05-30T00:04:16Z

Can we enable some of the Spark SQL tests as part of this PR by updating the diff (at least for Spark 3.5.5)?

coderfender · 2025-06-02T11:12:42Z

Updated diff file to unignore AQE tests in Spark (v3.5.5)

andygrove

Thanks @coderfender!

coderfender added 3 commits May 28, 2025 17:04

fix_aqe_tests

050302a

fix_aqe_tests

52d9379

fix_aqe_tests

2cfee07

fix_aqe_tests

0a4be24

andygrove changed the title ~~Spark : Fix AQE Tests~~ fix: Fix Spark SQL AQE Tests May 29, 2025

coderfender marked this pull request as ready for review May 29, 2025 22:26

coderfender force-pushed the fix_breaking_tests_spark branch from a3e6a35 to 0a4be24 Compare June 2, 2025 01:14

enable_support_AQE_spark

63e93ef

coderfender mentioned this pull request Jun 2, 2025

[EPIC] Spark SQL test failures when Comet JVM shuffle is used #1254

Open

andygrove approved these changes Jun 2, 2025

andygrove changed the title ~~fix: Fix Spark SQL AQE Tests~~ fix: Fix Spark SQL AQE exchange reuse test failures Jun 2, 2025

andygrove merged commit db7d59a into apache:main Jun 2, 2025
79 checks passed

andygrove mentioned this pull request Jun 3, 2025

Enabling Test "Runtime bloom filter join: do not add bloom filter if dpp filter exists on the same column" fails with IllegalStateException in AdaptiveSparkPlanExec.newQueryStage #1831

Closed
