Skip to content

Commit

Permalink
[Substrait] Run CSE in emit deduplication patterns. (#838)
Browse files Browse the repository at this point in the history
This commits adds calls to two patterns that may create common
subexpressions during emit deduplication. By deduplicating their
regions, field references to duplicated fields end up referring to the
same field, thus creating common subexpressions. The patterns, thus,
call CSE right away.

This can, in the case of `project`, create duplicate *output* fields,
which the remainder of the emit duplication pass can pick up. If CSE
were not run as part of emit deduplication but as alternating passes,
then each of the two passes could unlock optimization potential for the
other, such that we'd need as many iterations of them as the longest
use-def chain is long.

Another design choice consists in calling CSE from an *existing*
pattern. An alternative would be to call CSE in one or several decicated
patterns but that would either require (1) to test for duplicate field
references before application or (2) run CSE blindly and return
`success()` iff that made changes.

Signed-off-by: Ingo Müller <[email protected]>
  • Loading branch information
ingomueller-net authored Jul 23, 2024
1 parent d9138c6 commit 5acf08e
Show file tree
Hide file tree
Showing 3 changed files with 19 additions and 5 deletions.
1 change: 1 addition & 0 deletions lib/Dialect/Substrait/Transforms/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -9,5 +9,6 @@ add_mlir_dialect_library(MLIRSubstraitTransforms
MLIRPass
MLIRRewrite
MLIRSubstraitDialect
MLIRTransforms
MLIRTransformUtils
)
16 changes: 16 additions & 0 deletions lib/Dialect/Substrait/Transforms/EmitDeduplication.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,9 @@

#include "structured/Dialect/Substrait/Transforms/Passes.h"

#include "mlir/IR/Dominance.h"
#include "mlir/IR/PatternMatch.h"
#include "mlir/Transforms/CSE.h"
#include "mlir/Transforms/GreedyPatternRewriteDriver.h"
#include "structured/Dialect/Substrait/IR/Substrait.h"

Expand Down Expand Up @@ -257,6 +259,13 @@ struct PushDuplicatesThroughFilterPattern : public OpRewritePattern<FilterOp> {
deduplicateRegionArgs(newOp.getCondition(), emitOp.getMapping(),
newInput.getType(), rewriter);

// Deduplicating block args may create common subexpressions. Eliminate
// them immediately.
{
DominanceInfo domInfo;
mlir::eliminateCommonSubExpressions(rewriter, domInfo, newOp);
}

// Replace the old `filter` op with a new `emit` op that maps back to the
// original emit order.
ArrayAttr reverseMappingAttr = rewriter.getI64ArrayAttr(reverseMapping);
Expand Down Expand Up @@ -318,6 +327,13 @@ struct PushDuplicatesThroughProjectPattern
deduplicateRegionArgs(newOp.getExpressions(), emitOp.getMapping(),
newInput.getType(), rewriter);

// Deduplicating block args may create common subexpressions. Eliminate
// them immediately.
{
DominanceInfo domInfo;
mlir::eliminateCommonSubExpressions(rewriter, domInfo, newOp);
}

// Compute output indices for the expressions added by the region.
int64_t numTotalIndices = numDedupIndices + terminator->getNumOperands();
append_range(reverseMapping, seq(numDedupIndices, numTotalIndices));
Expand Down
7 changes: 2 additions & 5 deletions test/Transforms/Substrait/emit-deduplication.mlir
Original file line number Diff line number Diff line change
Expand Up @@ -161,13 +161,11 @@ substrait.plan version 0 : 42 : 1 {
// CHECK-NEXT: %[[V2:.*]] = filter %[[V1]] : {{.*}} {
// CHECK-NEXT: ^{{.*}}(%[[ARG0:.*]]: [[TYPE:.*]]):
// CHECK-NEXT: %[[V3:.*]] = field_reference %[[ARG0]]{{\[}}[0]] : [[TYPE]]
// CHECK-NEXT: %[[V4:.*]] = field_reference %[[ARG0]]{{\[}}[0]] : [[TYPE]]
// CHECK-NEXT: %[[V5:.*]] = field_reference %[[ARG0]]{{\[}}[1, 0]] : [[TYPE]]
// CHECK-NEXT: %[[V6:.*]] = field_reference %[[ARG0]]{{\[}}[1]] : [[TYPE]]
// CHECK-NEXT: %[[V7:.*]] = field_reference %[[V6]]{{\[}}[1]] :
// CHECK-NEXT: %[[V8:.*]] = field_reference %[[ARG0]]{{\[}}[0]] : [[TYPE]]
// CHECK-NEXT: %[[V9:.*]] = field_reference %[[ARG0]]{{\[}}[2]] : [[TYPE]]
// CHECK-NEXT: %[[Va:.*]] = func.call @f(%[[V3]], %[[V4]], %[[V5]], %[[V7]], %[[V8]], %[[V9]])
// CHECK-NEXT: %[[Va:.*]] = func.call @f(%[[V3]], %[[V3]], %[[V5]], %[[V7]], %[[V3]], %[[V9]])
// CHECK-NEXT: yield %[[Va]] : si1
// CHECK-NEXT: }
// CHECK-NEXT: %[[Vb:.*]] = emit [0, 0, 1, 0, 2] from %[[V2]]
Expand Down Expand Up @@ -209,8 +207,7 @@ substrait.plan version 0 : 42 : 1 {
// CHECK-NEXT: %[[V2:.*]] = project %[[V1]] : tuple<si32> -> tuple<si32, si1> {
// CHECK-NEXT: ^{{.*}}(%[[ARG0:.*]]: [[TYPE:.*]]):
// CHECK-NEXT: %[[V3:.*]] = field_reference %[[ARG0]]{{\[}}[0]] : [[TYPE]]
// CHECK-NEXT: %[[V4:.*]] = field_reference %[[ARG0]]{{\[}}[0]] : [[TYPE]]
// CHECK-NEXT: %[[V5:.*]] = func.call @f(%[[V3]], %[[V4]]) :
// CHECK-NEXT: %[[V5:.*]] = func.call @f(%[[V3]], %[[V3]]) :
// CHECK-NEXT: yield %[[V5]] : si1
// CHECK-NEXT: }
// CHECK-NEXT: %[[V6:.*]] = emit [0, 0, 1] from %[[V2]]
Expand Down

0 comments on commit 5acf08e

Please sign in to comment.