-
Notifications
You must be signed in to change notification settings - Fork 173
[CIR][ThroughMLIR] Lower WhileOp with break. #1942
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Add mlir Vec support to elementTypeIfVector
This commit un-xfails some tests that were affected by upstream GEP changes. llvm#1497 The changes are likely introduced by inference of `inbounds` and `nuw` flags llvm/llvm-project@10f315d. It seems to allow LLVM to enable some constant foldings, whose effects include re-calculating the address using i8 and a single offset. (This change seems to arise from https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699/39)
Reapplying llvm@e66b670 which was reverted during rebase (after fixing some conflicts). Un-xfails the test `cc1.cir` (llvm#1497). --------- Co-authored-by: Nathan Lanza <[email protected]>
This commit attempts to un-xfail `globals-ref-globals` test (llvm#1497), which was causing a crash in addition to GEP changes. The crash was caused by incorrect offsets being calculated for global view indices. The original calculation did not take paddings into consideration, hence triggering a crash. This commit adds type alignment when calculating the offset, which will take care of paddings. Further modification to `globals-ref-globals` test includes fixing some struct names that got changed, as well as replacing the expects with constant folded GEP.
Previously, when emitting a global for a string literal, we were creating a GlobalOp, building a GlobalView attr for it, and looking up the global from the symbol associated with the attr. This change splits out the function that creates the global so that the global is returned directly and the GlobalView attribute is only created in the case where it is needed. This also updates the mechanism used for uniquing the global name used for the strings so that if different base names are used the uniquing numbers each base name separately. The mangling of the global used for strings is not implemented, but the uniquing was happening prior to the mangling. This change drops the uniquing below the placeholder for mangling.
This change corrects the alignment of store operations and fixes a related problem with calculation of member offsets (we weren't accounting for the alignment of the field whose offset we were calculating. Many tests are affected by this, but most just needed a wildcard match to ignore the explicit alignment which wasn't present before. In cases where I updated a check for a specific alignment value, I compared against classic codegen to verify that we are now producing the same alignment. Two new tests are added align-store.c and alignment.cpp. The second of these partially copies a test of the same name from clang/test/CodeGen. It's testing globals and isn't directly related to the code changes here, but we didn't seem to have a test for this. I put the store alignment tests in a different file because inconsistency between CIR and LLVM IR in placement of globals would have made a combined test difficult to follow. This addresses llvm#1204
Currently the ForOp handling ignores everything except load, store and arithmetic in the step region. It does not detect whether the step and induction variable has already been assigned, either. That might result to wrong behaviour: ```cpp // Ignores printf for (int i = 0; i < n; i++, printf("\n")); // Only increments once for (int i = 0; i < n; i++, i++); ``` I choose to rewrite the detection and do an exact match of the instruction sequence for `i++` and `i += n`. It doesn't seem easy to detect a more general pattern without phi nodes. The new test case is xfailed, because ForOp hits an unreachable when it meets a non-canonicalized loop. We can implement that functionality later. Co-authored-by: Yue Huang <[email protected]>
…m#1638) This reverts commit 0c944f9. Reverting due to build failure reported in llvm#1635 (comment)
This adds an explicit alignment to load instructions where needed in order to make CIR load alignment consistent with classic codegen. As with aligned stores, I have updated tests that were failing with wildcard checks for cir.load, except where alignment was being specifically checked. Where tests failed because of changed alignment, I verified that the new alignment matches what is produced by classic codegen. The new test for proper alignment behavior is align-load.c.
…valuate unconditionally (llvm#1642) This came up during the review of llvm/llvm-project#138156 During codegen we check whether the LHS and RHS of the conditional operator are cheap enough to evaluate uncondionally. Unlike classic codegen we still emit `TernaryOp` instead of `SelectOp` and defer that optimization to cir-simplify. This patch changes codegen to directly emit `SelectOp` for `cond ? constant : constant` expressions.
The verification used to require a runtime type query of `ConditionOp` which is not available outside the interface package; whereas the dialect package depends on the interface package, causing a circular dependency. This commit adds a singleton interface for `ConditionOp` to break off the circular dependency, and replaces the runtime type query using the newly added interface. Tested locally with a build with shared library enabled and verified that the build succeeds.
…m#1643) Backport improvements in ShiftOp for vectors from llvm/llvm-project#141111.
The transformation functions are all named `transferToXXXOp`. Are those typos? Co-authored-by: Yue Huang <[email protected]>
…llvm#1645) - Uses `getI<bitwidth>IntegerAttr` builder method instead of explicit attribute and its type creation. - Adds few helper functions `getAlignmentAttr` to build alignment representing `mlir::IntegerAttr`. - Removes duplicit type parameters, that are inferred from `mlir::IntegerAttr`.
…lvm#1646) Currently, the following code snippet crashes during flattening, before lowering to llvm: ``` struct S { int a, b; }; void foo() { try { S s{1, 2}; } catch (...) { } } ``` Command to reproduce: ``` clang tmp.cpp -Xclang -fclangir -Xclang -emit-cir-flat -S -o - ``` The crash happens when flattening a TryOp with an empty catch region and [building catchers](https://github.com/llvm/clangir/blob/791c327da623e4cb1c193422f4b7a555f572b70a/clang/lib/CIR/Dialect/Transforms/FlattenCFG.cpp#L423). Something like: ``` "cir.try"() ({ }, { }) : () -> () ``` the crash happens at [`tryOp.isCatchAllOnly()`](https://github.com/llvm/clangir/blob/791c327da623e4cb1c193422f4b7a555f572b70a/clang/lib/CIR/Dialect/Transforms/FlattenCFG.cpp#L441C39-L441C61) to be specific, because the catch types attribute list is empty. The fix is simple - adding a check for an empty/non-existent catch region before building the catch clauses. This PR adds this fix and one test. **Side-note:** This enables `push_back` for `std::vector` to be lowered to llvm, for example: ``` #include <vector> void foo() { std::vector<int> v; v.push_back(1); } ```
The current implementation incorrectly uses `mlir::IntegerAttr::get` with the unsupported type `cir::IntType`, which is not compatible and was never tested. As discussed in PR llvm#1645, this is to be marked as NYI until a proper implementation is provided.
Backport the calculation of maskbits in the lowering from `N - 1` to `NextPowerOf2(numElements - 1) - 1`, similar to Clang CG. Backport from [#141411](llvm/llvm-project#141411)
…lvm#1651) We used to insert a continue Block at the end of a flattened ternary op that only contained a branch to the remaing operation of the remaining Block. This patch removes that continue block and changes the true/false blocks to directly jump to the remaining ops. With this patch the CIR now generates exactly the same LLVM IR as the original codegen.
This PR fixes [Issue#1647](llvm#1647). It just takes the implementation from [`emitRethrow`](https://github.com/llvm/clangir/blob/105d898b9898d224f0baca4b161a84bdcf817617/clang/lib/CIR/CodeGen/CIRGenItaniumCXXABI.cpp#L2273C1-L2276C77) and extends the same logic to `emitThrow`. The only nitpick about the fix is same as before - we have this [redundant ScopeOp](https://github.com/llvm/clangir/blob/105d898b9898d224f0baca4b161a84bdcf817617/clang/lib/CIR/CodeGen/CIRGenItaniumCXXABI.cpp#L2298C1-L2303C37) which acts as a placeholder, so there are some redundant yield blocks in some cases. Aside that, I believe this fix is okay for now. I have added the tests from the issue to confirm everything works as intended. cc: @mmha, @bcardosolopes.
From the [comment](llvm#1844 (comment)) on PR review llvm#1844, it seems like we're missing the flags for GEP. I'm opening the PR to add the flags. The first commit is just a prototype to gather opinions and reviews to see if I'm heading to the right direction with this.
…vm#1900) After llvm#1878, we introduced a dependency from the LoopOpInterface on BreakOp. While here add the BreakOp handling, which will be tests by a pass coming soon. Co-authored-by: Tommy McMichen <[email protected]>
Backport fixes in VisitAbstractConditionalOperator to handle OpaqueValueExpr from the upstream llvm/llvm-project#157331
…vm#1905) This PR adds to the implementation of `maybeEmitThunks` in `clang/lib/CIR/CodeGen/CIRGenVTables.cpp` . Newly declared/defined functions are ported from OG. Some missings are `Type::canLosslesslyBitCastTo` and `setDLLStorageClass`. No tests are added since the implementation is not finished yet.
This PR adds support for the new `BlockAddressOp`, used for GCC labels as values. Support for indirect `goto` and `ConstantLValueEmitter::VisitAddrLabelExpr` will be added in a future PR.
This PR implements some missing blocks that allow us to effectively allow us to launch kernels from the host. All of the tests stated in this [commit](llvm@69f2099) are now resolved. I spent half a day figuring the following: I tried experiementing performing host compilation(`-fcuda-is-device`) with target triple: `nvptx64-nvidia-cuda` but was getting a module verification error that, to keep it simple looked like: `error: 'cir.call' op calling convention mismatch: expected ptx_kernel, but provided c`. I thought that was expected given that we're essentially using the device to compile on the host, which doesn't make a lot of sense. until I tried to replicate the same in OG and didn't really run into any problem in that regard. Are the calling conventions enforced in CIR much more strict as compared to OG? Or is that simply a bug from OG?
Fixes llvm#1818 - Implement createVecCompare, getCIRIntOrFloatBitWidth, getVectorFCmpIR helper for VecCmp op creation. - Add clang/test/CIR/CodeGen/builtin-fcmp-sse.c test. in OG, there is a sext from bool to int before casting to float vector since fcmp's result in llvm ir is boolean-like, while VecCmpOp in CIR returns int in the form of 0 or -1. There is also a boolean `shouldInvert` in CIR since CIR doesn't contain optimized unordered comparison, for example: OLE is the inverse predicate of UGT. So if we need UGT, we have to pass in OLE and `shouldInvert = true`
Backport using ArrayOf constraints from the upstream and the test file for invalid type info
This PR adds lowering of `BlockAddressOp`. It uses two maps, `blockInfoToTagOp` and `unresolvedBlockAddressOp`, to defer matching `mlir::LLVM::BlockAddressOp` to its corresponding `mlir::LLVM::BlockTagOp` in cases where the matching label has not yet been emitted. If the `BlockTagOp` is not emitted, a placeholder value `std::numeric_limits<uint32_t>::max()` is used, which will later be resolved in `resolveBlockAddressOp`. Support for indirect goto and label differences will be added in a future PR.
This PR adds supports for __builtin_ia32_cmpnltps/cmpnltpd. Depends on llvm#1893.
`optimizeOnCertainBreak`.
// CHECK: memref.alloca_scope { | ||
// CHECK: %[[IV:.+]] = memref.load %alloca[] | ||
// CHECK: %[[HUNDRED:.+]] = arith.constant 100 | ||
// CHECK: %[[_:.+]] = arith.cmpi slt, %[[IV]], %[[HUNDRED]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was looking at the generated IR and I'm not sure this works right, the loop is gone but you still need to increment i
, what am I missing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the 2nd i++
(operations after break
) that got removed, the 1st i++
is kept.
while (i < 100) {
i++;
break;
i++;
}
The PR1735 contains too many conflicts with the main branch.
Got it re-patched to the current main branch.