-
Notifications
You must be signed in to change notification settings - Fork 2
refactor: Rework TestClpFilterToKql
unit tests related to pushdown, excluding split-filter pushdown.
#77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: release-0.293-clp-connector
Are you sure you want to change the base?
Conversation
… TestClpFilterToKql to TestClpPushDown
WalkthroughAdds a test-scoped dependency on com.facebook.presto:presto-main to presto-clp. Adds TestingClpSessionContext, a SessionContext wrapper delegating to Session with defaults. Adds TestClpComputePushDown, a new test class exercising CLP pushdown planning, KQL fragment extraction, and predicate verification. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Test as TestClpComputePushDown
participant Runner as DistributedQueryRunner
participant Planner as QueryPlanner
participant Plan as PlanNodes
participant Extractor as KqlExtractor
Test->>Runner: create and configure runner, register mock metadata
Test->>Runner: submit query for planning
Runner->>Planner: build plan
Planner-->>Plan: return PlanNodes
Test->>Extractor: scan PlanNodes for CLP layout handles
Extractor-->>Test: return KQL pushdown fragment + remaining predicate
Test->>Test: normalize expressions and assert expectations
Test-->>Runner: cleanup / cancel queries
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
presto-clp/pom.xml
(1 hunks)presto-clp/src/main/java/com/facebook/presto/plugin/clp/optimization/ClpFilterToKqlConverter.java
(9 hunks)presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpPushDown.java
(1 hunks)presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestingClpSessionContext.java
(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpPushDown.java (5)
presto-main-base/src/main/java/com/facebook/presto/dispatcher/DispatchManager.java (1)
DispatchManager
(70-472)presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (1)
ClpMockMetadataDatabase
(49-288)presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1)
ColumnMetadataTableRows
(30-63)presto-tests/src/main/java/com/facebook/presto/tests/DistributedQueryRunner.java (1)
DistributedQueryRunner
(108-1224)presto-clp/src/test/java/com/facebook/presto/plugin/clp/ClpQueryRunner.java (1)
ClpQueryRunner
(28-101)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestingClpSessionContext.java (1)
presto-main-base/src/main/java/com/facebook/presto/Session.java (1)
Session
(76-950)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-plan-determinism)
- GitHub Check: test (17.0.13, :presto-main-base)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-aggregation-queries)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-local-queries)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-queries)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-non-hash-gen)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-tpch-distributed-queries)
- GitHub Check: test (17.0.13, :presto-tests -P presto-tests-general)
- GitHub Check: test (8.0.442, :presto-main-base)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-aggregation-queries)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-plan-determinism)
- GitHub Check: test (8.0.442, :presto-tests -P presto-tests-execution-memory)
- GitHub Check: test (8.0.442, :presto-main)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-queries)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-non-hash-gen)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-local-queries)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-tpch-distributed-queries)
- GitHub Check: maven-checks (17.0.13)
- GitHub Check: maven-checks (8.0.442)
- GitHub Check: prestissimo-worker-images-build
...o-clp/src/main/java/com/facebook/presto/plugin/clp/optimization/ClpFilterToKqlConverter.java
Outdated
Show resolved
Hide resolved
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about renaming it to TestClpComputePushDown
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
45398b1
to
8cc4449
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java (1)
107-121
: Close the query runner during teardownThe teardown still omits
queryRunner.close()
, so the embedded cluster keeps threads and HTTP servers alive after the suite finishes, leaking resources into later tests. Please close the runner (guarded by a null check and ideally in afinally
block) once the wait loop completes, and keep the metadata teardown afterwards.@AfterClass public void tearDown() throws InterruptedException { - queryRunner.cancelAllQueries(); + if (queryRunner == null) { + return; + } + + try { + queryRunner.cancelAllQueries(); + long maxCleanUpTime = 5 * 1000L; // 5 seconds + long currentCleanUpTime = 0L; + while (!queryManager.getQueries().isEmpty() && currentCleanUpTime < maxCleanUpTime) { + Thread.sleep(1000L); + currentCleanUpTime += 1000L; + } + } + finally { + try { + queryRunner.close(); + } + catch (Exception e) { + fail(e.getMessage()); + } + } - long maxCleanUpTime = 5 * 1000L; // 5 seconds - long currentCleanUpTime = 0L; - while (!queryManager.getQueries().isEmpty() && currentCleanUpTime < maxCleanUpTime) { - Thread.sleep(1000L); - currentCleanUpTime += 1000L; - } if (null != mockMetadataDatabase) { mockMetadataDatabase.teardown(); } }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java
(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java (5)
presto-main-base/src/main/java/com/facebook/presto/dispatcher/DispatchManager.java (1)
DispatchManager
(70-472)presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/ClpMockMetadataDatabase.java (1)
ClpMockMetadataDatabase
(49-288)presto-clp/src/test/java/com/facebook/presto/plugin/clp/mockdb/table/ColumnMetadataTableRows.java (1)
ColumnMetadataTableRows
(30-63)presto-tests/src/main/java/com/facebook/presto/tests/DistributedQueryRunner.java (1)
DistributedQueryRunner
(108-1224)presto-clp/src/test/java/com/facebook/presto/plugin/clp/ClpQueryRunner.java (1)
ClpQueryRunner
(28-101)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java
Show resolved
Hide resolved
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java
Show resolved
Hide resolved
public class TestClpComputePushDown | ||
{ | ||
private static final String TABLE_NAME = "test_pushdown"; | ||
private static final Pattern BASE_PTR = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to ensure the row expression can be compared even the Slice
's pointers are different. For example, the string constant in the LIKE
will generate a Slice
object, so that even the actualRemainingExpression
and expectedRemainingExpression
have the same value, they will create two Slice
instances and they have different pointers. This is to use regex to find out all Slice
's pointers and replace them with IGNORE
so we can compare them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't find a better way to do this, because the Slice
's equal()
function is in the third party library so we can't make any changes. Actually, in our current unit test, we didn't compare the remaining expression at all if the expected KQL pushdow is null. And those can be compared don't have string contants so they don't have any Slice
instance and can be compared.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we directly compare two RowExpression
s without converting them into a string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Directly comparing the RowExpression will eventually compare the Slice object (if you check how the RowExpression's equal works), and there is no way to modify this process afaik. So the best way to "ignore" the Slice object maybe converting to String then manually ignore it by some String operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (2)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java (2)
107-121
: Close the query runner to prevent resource leaks.The
queryRunner
is never closed, leaving embedded cluster threads and server resources running after tests complete. This can cause interference with subsequent tests and leave the JVM in a dirty state.This issue was previously flagged. Please apply the suggested fix to call
queryRunner.close()
in a finally block or after the wait loop.
385-392
: Handle queries that reach terminal states immediately.The wait loop only exits when
state == RUNNING
, but simple queries (e.g.,SELECT ... LIMIT 1
) may transition directly toFINISHED
, causing the test to timeout after 60 seconds. This makes tests flaky.This issue was previously flagged. Please apply the suggested fix to break the loop when the query reaches
RUNNING
or any terminal state (FINISHED
,FAILED
,CANCELED
).
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java
(1 hunks)
🔇 Additional comments (2)
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java (2)
338-359
: LGTM! Proper assertion for null remaining filter.The code now correctly asserts
actualRemainingExpression
is null whenexpectedRemaining
is null (lines 357-359), ensuring the test fails if a filter unexpectedly remains after pushdown. This addresses the concern raised in previous reviews.
123-314
: LGTM! Comprehensive pushdown test coverage.The test suite thoroughly exercises CLP pushdown scenarios including:
- String operations (exact match, LIKE, substring)
- Numeric comparisons and BETWEEN
- Boolean logic (AND, OR, NOT, IN, IS NULL)
- Complex nested conditions
- Wildcard UDF functions
Test expectations are clear and properly validate both pushed-down KQL fragments and remaining filters.
private static final Pattern BASE_PTR = | ||
Pattern.compile("(?<=base=)\\[B@[0-9a-fA-F]+"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick | 🔵 Trivial
Document the purpose of BASE_PTR pattern.
The Pattern matches base pointer addresses (e.g., [B@a1b2c3d4
) in byte array toString()
output. When comparing RowExpression
objects via their string representations, memory addresses are non-deterministic and would cause spurious test failures. The pattern is used in equalsIgnoreBase()
to normalize these addresses before comparison.
Consider adding a comment to clarify this:
- private static final Pattern BASE_PTR =
- Pattern.compile("(?<=base=)\\[B@[0-9a-fA-F]+");
+ // Pattern to strip non-deterministic byte-array base pointers from RowExpression.toString()
+ // output (e.g., "base=[B@a1b2c3d4" → "base=[B@IGNORED") for reliable test assertions
+ private static final Pattern BASE_PTR =
+ Pattern.compile("(?<=base=)\\[B@[0-9a-fA-F]+");
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
private static final Pattern BASE_PTR = | |
Pattern.compile("(?<=base=)\\[B@[0-9a-fA-F]+"); | |
// Pattern to strip non-deterministic byte-array base pointers from RowExpression.toString() | |
// output (e.g., "base=[B@a1b2c3d4" → "base=[B@IGNORED") for reliable test assertions | |
private static final Pattern BASE_PTR = | |
Pattern.compile("(?<=base=)\\[B@[0-9a-fA-F]+"); |
🤖 Prompt for AI Agents
In
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java
around lines 60-61, add a concise comment above the BASE_PTR Pattern explaining
that it matches base pointer addresses from byte[] toString() output (e.g.,
“[B@a1b2c3d4”) so equalsIgnoreBase() can normalize non-deterministic memory
addresses in RowExpression string comparisons to avoid spurious test failures;
include an example of the matched string and a short note that this is solely
for deterministic test comparisons.
presto-clp/src/test/java/com/facebook/presto/plugin/clp/TestClpComputePushDown.java
Show resolved
Hide resolved
public class TestClpComputePushDown | ||
{ | ||
private static final String TABLE_NAME = "test_pushdown"; | ||
private static final Pattern BASE_PTR = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we directly compare two RowExpression
s without converting them into a string?
private ClpTableLayoutHandle tryGetClpTableLayoutHandleFromFilterNode(PlanNode node) | ||
{ | ||
if (!(node instanceof FilterNode)) { | ||
return null; | ||
} | ||
return (tryGetClpTableLayoutHandleFromTableScanNode(((FilterNode) node).getSource())); | ||
} | ||
|
||
private ClpTableLayoutHandle tryGetClpTableLayoutHandleFromTableScanNode(PlanNode node) | ||
{ | ||
if (!(node instanceof TableScanNode)) { | ||
return null; | ||
} | ||
ConnectorTableLayoutHandle tableLayoutHandle = ((TableScanNode) node).getTable().getLayout().orElse(null); | ||
if (!(tableLayoutHandle instanceof ClpTableLayoutHandle)) { | ||
return null; | ||
} | ||
return (ClpTableLayoutHandle) tableLayoutHandle; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private ClpTableLayoutHandle tryGetClpTableLayoutHandleFromFilterNode(PlanNode node) | |
{ | |
if (!(node instanceof FilterNode)) { | |
return null; | |
} | |
return (tryGetClpTableLayoutHandleFromTableScanNode(((FilterNode) node).getSource())); | |
} | |
private ClpTableLayoutHandle tryGetClpTableLayoutHandleFromTableScanNode(PlanNode node) | |
{ | |
if (!(node instanceof TableScanNode)) { | |
return null; | |
} | |
ConnectorTableLayoutHandle tableLayoutHandle = ((TableScanNode) node).getTable().getLayout().orElse(null); | |
if (!(tableLayoutHandle instanceof ClpTableLayoutHandle)) { | |
return null; | |
} | |
return (ClpTableLayoutHandle) tableLayoutHandle; | |
} | |
} | |
private ClpTableLayoutHandle tryGetClpTableLayoutHandle(PlanNode node) | |
{ | |
PlanNode cursor = node; | |
if (cursor instanceof FilterNode) { | |
cursor = ((FilterNode) cursor).getSource(); | |
} | |
if (!(cursor instanceof TableScanNode)) { | |
return null; | |
} | |
ConnectorTableLayoutHandle handle = | |
((TableScanNode) cursor).getTable().getLayout().orElse(null); | |
return (handle instanceof ClpTableLayoutHandle) | |
? (ClpTableLayoutHandle) handle | |
: null; | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually these two functions cannot be merged. At L350, here we only want to capture the FilterNode
because the expectedRemainingExpression
is fetched by getPredicate
. After merging these two functions, the new function will also capture the TableScanNode
, which is not desired here.
Description
This PR must be merged after #78 gets merged.
This PR refactored the unit test framework for the pushdown. Previously we skipped other plan optmizers and directly used our own plan optimizers, which caused the expression like
from_unittime(1234)
which is supposed to be a constant but turns out to be a call expression whenClpFilterToKqlConverter
handles it, thus it will be skipped due to that it is not considered as a constant. This PR uses theClpQueryRunner
which is already used in end-to-end testing to test the KQL generation, which makes more sense and can translate all call expressions which are actually constants before the execution entersClpFilterToKqlConverter
.Most test cases in
TestClpPushDown
are identical with theTestClpFilterToKql
, but there are some minor changes in some of them. For example:changed to:
These changes are essentially just different representations for the same logical expression. This difference is caused by now we apply all optimizers not just our own. So the actual expression node handled by
ClpFilterToKqlConverter
is a bit different from before.There will be another PR to adapt the
ClpQueryRunner
to add the split-filter config, so that we can move the split-filter related test cases toTestClpPushDown
as well, then we can get rid ofTestClpFilterToKql
class.Finally when CLP is refactored to only store nanoseconds for TIMESTAMP, we can merge this PR: #76.
Checklist
breaking change.
Validation performed
Passed the CI.
Summary by CodeRabbit
Tests
Chores