Skip to content

Conversation

anlowee
Copy link
Member

@anlowee anlowee commented Oct 7, 2025

Description

This PR must be merged after #78 gets merged.

This PR refactored the unit test framework for the pushdown. Previously we skipped other plan optmizers and directly used our own plan optimizers, which caused the expression like from_unittime(1234) which is supposed to be a constant but turns out to be a call expression when ClpFilterToKqlConverter handles it, thus it will be skipped due to that it is not considered as a constant. This PR uses the ClpQueryRunner which is already used in end-to-end testing to test the KQL generation, which makes more sense and can translate all call expressions which are actually constants before the execution enters ClpFilterToKqlConverter.

Most test cases in TestClpPushDown are identical with the TestClpFilterToKql, but there are some minor changes in some of them. For example:

testPushDown("NOT CLP_WILDCARD_FLOAT_COLUMN() > 0", "NOT * > 0.0", null);

changed to:

testPushDown("NOT CLP_WILDCARD_FLOAT_COLUMN() > 0", "* <= 0.0", null);

These changes are essentially just different representations for the same logical expression. This difference is caused by now we apply all optimizers not just our own. So the actual expression node handled by ClpFilterToKqlConverter is a bit different from before.

There will be another PR to adapt the ClpQueryRunner to add the split-filter config, so that we can move the split-filter related test cases to TestClpPushDown as well, then we can get rid of TestClpFilterToKql class.

Finally when CLP is refactored to only store nanoseconds for TIMESTAMP, we can merge this PR: #76.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Passed the CI.

Summary by CodeRabbit

  • Tests

    • Added comprehensive CLP plugin tests covering compute pushdown scenarios (string/substring, numeric comparisons, BETWEEN, AND/OR/NOT, IN, IS NULL, complex predicates, wildcard UDF) and a new test session-context helper to improve reliability and prevent regressions.
  • Chores

    • Added a test-scoped dependency to support the new CLP test scenarios; no production behaviour changes.

Copy link

coderabbitai bot commented Oct 7, 2025

Walkthrough

Adds a test-scoped dependency on com.facebook.presto:presto-main to presto-clp. Adds TestingClpSessionContext, a SessionContext wrapper delegating to Session with defaults. Adds TestClpComputePushDown, a new test class exercising CLP pushdown planning, KQL fragment extraction, and predicate verification.

Changes

Cohort / File(s) Summary
Build/Test Dependency
presto-clp/pom.xml
Adds test-scoped dependency com.facebook.presto:presto-main.
Testing utility
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestingClpSessionContext.java
New class TestingClpSessionContext implementing SessionContext, delegating to a Session, providing null-safe defaults for catalog/schema and exposing session properties.
CLP pushdown tests
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java
New test class TestClpComputePushDown that sets up a distributed query runner and mock metadata, registers table/column metadata, plans queries, extracts CLP table layout handles, obtains KQL pushdown fragments and remaining predicates, normalizes expressions, and asserts expected results across multiple pushdown scenarios.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Test as TestClpComputePushDown
  participant Runner as DistributedQueryRunner
  participant Planner as QueryPlanner
  participant Plan as PlanNodes
  participant Extractor as KqlExtractor

  Test->>Runner: create and configure runner, register mock metadata
  Test->>Runner: submit query for planning
  Runner->>Planner: build plan
  Planner-->>Plan: return PlanNodes
  Test->>Extractor: scan PlanNodes for CLP layout handles
  Extractor-->>Test: return KQL pushdown fragment + remaining predicate
  Test->>Test: normalize expressions and assert expectations
  Test-->>Runner: cleanup / cancel queries
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The PR description does not follow the required template: it uses incorrect heading levels and omits key sections such as Motivation and Context, Impact, Test Plan, and Release Notes, which makes it difficult to assess the change rationale and testing approach. Please update the PR description to match the repository template by adding the Motivation and Context, Impact, Test Plan, and Release Notes sections (or indicate “NO RELEASE NOTE”) and ensure all headings use the correct “##” level formatting.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (1 passed)
Check name Status Explanation
Title Check ✅ Passed The title clearly and concisely reflects the primary change of refactoring the TestClpFilterToKql unit tests to use the new pushdown framework while excluding split-filter pushdown, matching the PR’s objectives.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@anlowee anlowee requested a review from wraymo October 7, 2025 20:20
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bea15e7 and a929cd6.

📒 Files selected for processing (4)
  • presto-clp/pom.xml (1 hunks)
  • presto-clp/src/main/java/com/facebook/presto/plugin/clp/optimization/ClpFilterToKqlConverter.java (9 hunks)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpPushDown.java (1 hunks)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestingClpSessionContext.java (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpPushDown.java (5)
presto-main-base/src/main/java/com/facebook/presto/dispatcher/DispatchManager.java (1)
  • DispatchManager (70-472)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (1)
  • ClpMockMetadataDatabase (49-288)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1)
  • ColumnMetadataTableRows (30-63)
presto-tests/src/main/java/com/facebook/presto/tests/DistributedQueryRunner.java (1)
  • DistributedQueryRunner (108-1224)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/ClpQueryRunner.java (1)
  • ClpQueryRunner (28-101)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestingClpSessionContext.java (1)
presto-main-base/src/main/java/com/facebook/presto/Session.java (1)
  • Session (76-950)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-plan-determinism)
  • GitHub Check: test (17.0.13, :presto-main-base)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-aggregation-queries)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-local-queries)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-queries)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-non-hash-gen)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-tpch-distributed-queries)
  • GitHub Check: test (17.0.13, :presto-tests -P presto-tests-general)
  • GitHub Check: test (8.0.442, :presto-main-base)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-aggregation-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-plan-determinism)
  • GitHub Check: test (8.0.442, :presto-tests -P presto-tests-execution-memory)
  • GitHub Check: test (8.0.442, :presto-main)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-non-hash-gen)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-local-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-tpch-distributed-queries)
  • GitHub Check: maven-checks (17.0.13)
  • GitHub Check: maven-checks (8.0.442)
  • GitHub Check: prestissimo-worker-images-build

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about renaming it to TestClpComputePushDown

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

@anlowee anlowee marked this pull request as draft October 7, 2025 20:39
@anlowee anlowee force-pushed the xwei/ensure-ts-nanoseconds-part2 branch from 45398b1 to 8cc4449 Compare October 7, 2025 20:44
@anlowee anlowee marked this pull request as ready for review October 8, 2025 15:26
@anlowee anlowee requested a review from wraymo October 8, 2025 15:26
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java (1)

107-121: Close the query runner during teardown

The teardown still omits queryRunner.close(), so the embedded cluster keeps threads and HTTP servers alive after the suite finishes, leaking resources into later tests. Please close the runner (guarded by a null check and ideally in a finally block) once the wait loop completes, and keep the metadata teardown afterwards.

     @AfterClass
     public void tearDown()
             throws InterruptedException
     {
-        queryRunner.cancelAllQueries();
+        if (queryRunner == null) {
+            return;
+        }
+
+        try {
+            queryRunner.cancelAllQueries();
+            long maxCleanUpTime = 5 * 1000L;    // 5 seconds
+            long currentCleanUpTime = 0L;
+            while (!queryManager.getQueries().isEmpty() && currentCleanUpTime < maxCleanUpTime) {
+                Thread.sleep(1000L);
+                currentCleanUpTime += 1000L;
+            }
+        }
+        finally {
+            try {
+                queryRunner.close();
+            }
+            catch (Exception e) {
+                fail(e.getMessage());
+            }
+        }
-        long maxCleanUpTime = 5 * 1000L;    // 5 seconds
-        long currentCleanUpTime = 0L;
-        while (!queryManager.getQueries().isEmpty() && currentCleanUpTime < maxCleanUpTime) {
-            Thread.sleep(1000L);
-            currentCleanUpTime += 1000L;
-        }
         if (null != mockMetadataDatabase) {
             mockMetadataDatabase.teardown();
         }
     }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 58f885e and c9d78d1.

📒 Files selected for processing (1)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java (5)
presto-main-base/src/main/java/com/facebook/presto/dispatcher/DispatchManager.java (1)
  • DispatchManager (70-472)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (1)
  • ClpMockMetadataDatabase (49-288)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1)
  • ColumnMetadataTableRows (30-63)
presto-tests/src/main/java/com/facebook/presto/tests/DistributedQueryRunner.java (1)
  • DistributedQueryRunner (108-1224)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/ClpQueryRunner.java (1)
  • ClpQueryRunner (28-101)

public class TestClpComputePushDown
{
private static final String TABLE_NAME = "test_pushdown";
private static final Pattern BASE_PTR =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean?

Copy link
Member Author

@anlowee anlowee Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to ensure the row expression can be compared even the Slice's pointers are different. For example, the string constant in the LIKE will generate a Slice object, so that even the actualRemainingExpression and expectedRemainingExpression have the same value, they will create two Slice instances and they have different pointers. This is to use regex to find out all Slice's pointers and replace them with IGNORE so we can compare them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find a better way to do this, because the Slice's equal() function is in the third party library so we can't make any changes. Actually, in our current unit test, we didn't compare the remaining expression at all if the expected KQL pushdow is null. And those can be compared don't have string contants so they don't have any Slice instance and can be compared.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we directly compare two RowExpressions without converting them into a string?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Directly comparing the RowExpression will eventually compare the Slice object (if you check how the RowExpression's equal works), and there is no way to modify this process afaik. So the best way to "ignore" the Slice object maybe converting to String then manually ignore it by some String operations.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java (2)

107-121: Close the query runner to prevent resource leaks.

The queryRunner is never closed, leaving embedded cluster threads and server resources running after tests complete. This can cause interference with subsequent tests and leave the JVM in a dirty state.

This issue was previously flagged. Please apply the suggested fix to call queryRunner.close() in a finally block or after the wait loop.


385-392: Handle queries that reach terminal states immediately.

The wait loop only exits when state == RUNNING, but simple queries (e.g., SELECT ... LIMIT 1) may transition directly to FINISHED, causing the test to timeout after 60 seconds. This makes tests flaky.

This issue was previously flagged. Please apply the suggested fix to break the loop when the query reaches RUNNING or any terminal state (FINISHED, FAILED, CANCELED).

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c9d78d1 and c99be8f.

📒 Files selected for processing (1)
  • presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java (1 hunks)
🔇 Additional comments (2)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java (2)

338-359: LGTM! Proper assertion for null remaining filter.

The code now correctly asserts actualRemainingExpression is null when expectedRemaining is null (lines 357-359), ensuring the test fails if a filter unexpectedly remains after pushdown. This addresses the concern raised in previous reviews.


123-314: LGTM! Comprehensive pushdown test coverage.

The test suite thoroughly exercises CLP pushdown scenarios including:

  • String operations (exact match, LIKE, substring)
  • Numeric comparisons and BETWEEN
  • Boolean logic (AND, OR, NOT, IN, IS NULL)
  • Complex nested conditions
  • Wildcard UDF functions

Test expectations are clear and properly validate both pushed-down KQL fragments and remaining filters.

Comment on lines +60 to +61
private static final Pattern BASE_PTR =
Pattern.compile("(?<=base=)\\[B@[0-9a-fA-F]+");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Document the purpose of BASE_PTR pattern.

The Pattern matches base pointer addresses (e.g., [B@a1b2c3d4) in byte array toString() output. When comparing RowExpression objects via their string representations, memory addresses are non-deterministic and would cause spurious test failures. The pattern is used in equalsIgnoreBase() to normalize these addresses before comparison.

Consider adding a comment to clarify this:

-    private static final Pattern BASE_PTR =
-            Pattern.compile("(?<=base=)\\[B@[0-9a-fA-F]+");
+    // Pattern to strip non-deterministic byte-array base pointers from RowExpression.toString()
+    // output (e.g., "base=[B@a1b2c3d4" → "base=[B@IGNORED") for reliable test assertions
+    private static final Pattern BASE_PTR =
+            Pattern.compile("(?<=base=)\\[B@[0-9a-fA-F]+");
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
private static final Pattern BASE_PTR =
Pattern.compile("(?<=base=)\\[B@[0-9a-fA-F]+");
// Pattern to strip non-deterministic byte-array base pointers from RowExpression.toString()
// output (e.g., "base=[B@a1b2c3d4" → "base=[B@IGNORED") for reliable test assertions
private static final Pattern BASE_PTR =
Pattern.compile("(?<=base=)\\[B@[0-9a-fA-F]+");
🤖 Prompt for AI Agents
In
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java
around lines 60-61, add a concise comment above the BASE_PTR Pattern explaining
that it matches base pointer addresses from byte[] toString() output (e.g.,
“[B@a1b2c3d4”) so equalsIgnoreBase() can normalize non-deterministic memory
addresses in RowExpression string comparisons to avoid spurious test failures;
include an example of the matched string and a short note that this is solely
for deterministic test comparisons.

@anlowee anlowee requested a review from wraymo October 14, 2025 13:44
public class TestClpComputePushDown
{
private static final String TABLE_NAME = "test_pushdown";
private static final Pattern BASE_PTR =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we directly compare two RowExpressions without converting them into a string?

Comment on lines +406 to +425
private ClpTableLayoutHandle tryGetClpTableLayoutHandleFromFilterNode(PlanNode node)
{
if (!(node instanceof FilterNode)) {
return null;
}
return (tryGetClpTableLayoutHandleFromTableScanNode(((FilterNode) node).getSource()));
}

private ClpTableLayoutHandle tryGetClpTableLayoutHandleFromTableScanNode(PlanNode node)
{
if (!(node instanceof TableScanNode)) {
return null;
}
ConnectorTableLayoutHandle tableLayoutHandle = ((TableScanNode) node).getTable().getLayout().orElse(null);
if (!(tableLayoutHandle instanceof ClpTableLayoutHandle)) {
return null;
}
return (ClpTableLayoutHandle) tableLayoutHandle;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private ClpTableLayoutHandle tryGetClpTableLayoutHandleFromFilterNode(PlanNode node)
{
if (!(node instanceof FilterNode)) {
return null;
}
return (tryGetClpTableLayoutHandleFromTableScanNode(((FilterNode) node).getSource()));
}
private ClpTableLayoutHandle tryGetClpTableLayoutHandleFromTableScanNode(PlanNode node)
{
if (!(node instanceof TableScanNode)) {
return null;
}
ConnectorTableLayoutHandle tableLayoutHandle = ((TableScanNode) node).getTable().getLayout().orElse(null);
if (!(tableLayoutHandle instanceof ClpTableLayoutHandle)) {
return null;
}
return (ClpTableLayoutHandle) tableLayoutHandle;
}
}
private ClpTableLayoutHandle tryGetClpTableLayoutHandle(PlanNode node)
{
PlanNode cursor = node;
if (cursor instanceof FilterNode) {
cursor = ((FilterNode) cursor).getSource();
}
if (!(cursor instanceof TableScanNode)) {
return null;
}
ConnectorTableLayoutHandle handle =
((TableScanNode) cursor).getTable().getLayout().orElse(null);
return (handle instanceof ClpTableLayoutHandle)
? (ClpTableLayoutHandle) handle
: null;
}
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually these two functions cannot be merged. At L350, here we only want to capture the FilterNode because the expectedRemainingExpression is fetched by getPredicate. After merging these two functions, the new function will also capture the TableScanNode, which is not desired here.

@kirkrodrigues kirkrodrigues requested a review from wraymo October 15, 2025 19:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants