[WIP][Unity][TVMScript] return from IfNode #14176
Conversation
This PR implements a flexible register-based VM to execute relax programs with dynamic shape and control flow. Design: https://github.com/tlc-pack/relax/wiki/Relax-VM-Design. Co-Authored-by: Ziheng Jiang <[email protected]> Co-Authored-by: Ruihang Lai <[email protected]> Co-Authored-by: Sunghyun Park <[email protected]> Co-Authored-by: Junru Shao <[email protected]> Co-Authored-by: Prakalp Srivastava <[email protected]> Co-Authored-by: Yong Wu <[email protected]> Co-Authored-by: Steven S. Lyubomirsky <[email protected]> Co-Authored-by: Tianqi Chen <[email protected]> Co-Authored-by: Hongyi Jin <[email protected]>
* [Unity][IR] First-class StructInfo. Relax tracks structural information (such as tensor shape) about its values via `StructInfo`. * Fix rust build --------- Co-authored-by: Junru Shao <[email protected]>
…pache#13910) This PR sets up a unity-specific Jenkins with a minimal Jenkinsfile, without sharding and with most tests disabled to reduce overall cost. We can add tests for the unity branch by configuring the specific groovy file.
[Unity] Basic StructInfo Analysis and Expr construction. This PR adds struct info analysis and expr support. These provide the logic to construct IR nodes and perform struct-info-related analysis. Test cases are added to cover IR node construction and the related struct info analysis checks. Co-authored-by: Tianqi Chen <[email protected]> Co-authored-by: Altan Haan <[email protected]> Co-authored-by: Andrew Liu <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Jiawei Liu <[email protected]> Co-authored-by: Junru Shao <[email protected]> Co-authored-by: Lesheng Jin <[email protected]> Co-authored-by: masahi <[email protected]> Co-authored-by: Prakalp Srivastava <[email protected]> Co-authored-by: Ruihang Lai <[email protected]> Co-authored-by: Siyuan Feng <[email protected]> Co-authored-by: Steven S. Lyubomirsky <[email protected]> Co-authored-by: Sunghyun Park <[email protected]> Co-authored-by: Yixin Dong <[email protected]> Co-authored-by: Yong Wu <[email protected]> Co-authored-by: Ziheng Jiang <[email protected]>
This PR adds BlockBuilder: the core data structure to construct Relax AST, and ExprMutator: performs AST mutation for implementing transformation passes. Co-Authored-by: Tianqi Chen <[email protected]> Co-Authored-by: Altan Haan <[email protected]> Co-Authored-by: Andrew Liu <[email protected]> Co-Authored-by: Hongyi Jin <[email protected]> Co-Authored-by: Jiawei Liu <[email protected]> Co-Authored-by: Junru Shao <[email protected]> Co-Authored-by: Lesheng Jin <[email protected]> Co-Authored-by: masahi <[email protected]> Co-Authored-by: Prakalp Srivastava <[email protected]> Co-Authored-by: Ruihang Lai <[email protected]> Co-Authored-by: Siyuan Feng <[email protected]> Co-Authored-by: Steven S. <Lyubomirsky [email protected]> Co-Authored-by: Sunghyun Park <[email protected]> Co-Authored-by: Yixin Dong <[email protected]> Co-Authored-by: Yong Wu <[email protected]> Co-Authored-by: Ziheng Jiang <[email protected]>
This PR adds the TVMScript parser/ir_builder support based on the blockbuilder. Co-authored-by: Ruihang Lai <[email protected]> Co-authored-by: Junru Shao <[email protected]> Co-authored-by: Tianqi Chen <[email protected]> Co-authored-by: Yuchen Jin <[email protected]> Co-authored-by: Steven S. Lyubomirsky <[email protected]> Co-authored-by: Yong Wu <[email protected]>
This PR introduces Relax as a dialect supported by the TVMScript Printer. Some caveats: - Needs to rebase to mainline before merging. - Some tests are skipped because some operators are not upstreamed to the unity branch yet. Co-authored-by: Tianqi Chen <[email protected]> Co-authored-by: Yuchen Jin <[email protected]> Co-authored-by: Steven S. Lyubomirsky <[email protected]> Co-authored-by: Yong Wu <[email protected]> Co-authored-by: Prakalp Srivastava <[email protected]> Co-authored-by: Sunghyun Park <[email protected]> Co-authored-by: Ruihang Lai <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Bohan Hou <[email protected]> Co-authored-by: Siyuan Feng <[email protected]>
This PR introduces Relax `FunctionPass` and `DataflowBlockPass` API, and the `VMShapeLower` pass to lower the shape expression in Relax to TIR functions and VM shape heap builtin functions. Co-Authored-by: Ziheng Jiang <[email protected]> Co-Authored-by: Lesheng Jin <[email protected]> Co-Authored-by: Altan Haan <[email protected]> Co-Authored-by: Junru Shao <[email protected]> Co-Authored-by: Prakalp Srivastava <[email protected]> Co-Authored-by: Ruihang Lai <[email protected]> Co-Authored-by: Siyuan Feng <[email protected]> Co-Authored-by: Steven S. <Lyubomirsky [email protected]> Co-Authored-by: Sunghyun Park <[email protected]> Co-Authored-by: Tianqi Chen <[email protected]> Co-Authored-by: Yong Wu <[email protected]>
This PR introduces the e2e Relax lowering flow (`relax.vm.build`). Tests for each pass in the flow are added. Co-Authored-by: Altan Haan <[email protected]> Co-Authored-by: Andrew Liu <[email protected]> Co-Authored-by: Hongyi Jin <[email protected]> Co-Authored-by: Jiawei Liu <[email protected]> Co-Authored-by: Junru Shao <[email protected]> Co-Authored-by: Prakalp Srivastava <[email protected]> Co-Authored-by: Ruihang Lai <[email protected]> Co-Authored-by: Siyuan Feng <[email protected]> Co-Authored-by: Steven S. <Lyubomirsky [email protected]> Co-Authored-by: Sunghyun Park <[email protected]> Co-Authored-by: Tianqi Chen <[email protected]> Co-Authored-by: Yong Wu <[email protected]> Co-Authored-by: Ziheng Jiang <[email protected]>
As we've introduced `arg_sinfo` in CallNode, the implicit shape constructor is no longer widely used in TVMScript. This PR removes the implicit shape constructor since it may cause confusion between shapes and tuples.
This PR is about the high-level tensor computation operators in Relax. This PR includes the tensor indexing operators.
This PR is about the high-level tensor computation operators in Relax. This PR includes the set operators. Co-authored-by: Prakalp Srivastava <[email protected]>
This PR is about the high-level tensor computation operators in Relax. This PR includes the image operators.
This PR is about the high-level tensor computation operators in Relax. This PR includes the unary, binary and ternary arithmetic and comparison operators. Co-authored-by: Siyuan Feng <[email protected]> Co-authored-by: Chaofan Lin <[email protected]>
This PR is about the high-level tensor computation operators in Relax. This PR includes the statistical operators.
This PR is about the high-level tensor computation operators in Relax. This PR includes the neural network operators.
This PR is about the high-level tensor computation operators in Relax. This PR includes the tensor creation operators.
This PR is about the high-level tensor computation operators in Relax. This PR includes the linear algebra operators. Co-authored-by: Siyuan Feng <[email protected]>
This PR is about the high-level tensor computation operators in Relax. This PR includes the search operators.
This PR is about the high-level tensor computation operators in Relax. This PR includes the tensor manipulation operators. Co-authored-by: Prakalp Srivastava <[email protected]>
This PR introduces NestedMsg to robustly handle nested-tuple analysis. Relax supports nested tuple structures in the IR. Nested tuple structure is important for supporting advanced groupings in cases such as gradient calculation and other scenarios. The possible presence of nested tuples means that we need to robustly handle analyses that involve nested tuple structures in a dataflow graph. This PR introduces a NestedMsg<T> class that corresponds to a possibly nested message tuple for a given leaf message class T. We also introduce various helper functions to compose and decompose messages. Co-authored-by: Bohan Hou <[email protected]> Co-authored-by: Yixin Dong <[email protected]> Co-authored-by: Ruihang Lai <[email protected]>
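The NestedMsg idea can be sketched in plain Python. The names `map_nested` and `combine_nested` below are illustrative stand-ins for the C++ helper functions, not the actual TVM API:

```python
# Hypothetical sketch of NestedMsg<T>: a leaf message or a possibly nested
# tuple of messages, with helpers that preserve the tuple structure.
from typing import Callable, Tuple, TypeVar, Union

T = TypeVar("T")
NestedMsg = Union[T, Tuple["NestedMsg", ...]]

def map_nested(msg: NestedMsg, f: Callable) -> NestedMsg:
    """Apply f to every leaf message, keeping the nesting intact."""
    if isinstance(msg, tuple):
        return tuple(map_nested(m, f) for m in msg)
    return f(msg)

def combine_nested(a: NestedMsg, b: NestedMsg, f: Callable) -> NestedMsg:
    """Combine two structurally identical nested messages leaf-by-leaf."""
    if isinstance(a, tuple) and isinstance(b, tuple):
        return tuple(combine_nested(x, y, f) for x, y in zip(a, b))
    return f(a, b)
```

This mirrors the compose/decompose helpers the commit describes: analyses over nested tuples recurse on structure and only touch leaves.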
[Unity][Pass] Operator fusion passes This PR introduces three passes for operator fusion: 1. AnnotateTIROpPattern: analyzes the operator pattern kind from a PrimFunc. 2. FuseOps: fuses operators for Relax functions, adding a new fused Relax primitive function. 3. FuseTIR: fuses the corresponding TIR PrimFuncs for the fused Relax functions.
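As a rough illustration of the grouping step, here is a toy greedy grouper over pattern kinds. The string labels stand in for the pattern-kind annotations; the real pass works on Relax/TIR nodes, so this is only a sketch:

```python
# Toy sketch: fusable kinds accumulate into one group; an "opaque" op
# always forms its own group and starts a fresh one after it.
def greedy_fuse(kinds):
    groups, cur = [], []
    for k in kinds:
        if k == "opaque":
            if cur:
                groups.append(cur)   # close the running fusable group
            groups.append([k])       # opaque op stands alone
            cur = []
        else:
            cur.append(k)            # keep accumulating fusable ops
    if cur:
        groups.append(cur)
    return groups
```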
[VM] Supporting "compiled" exec mode. This PR adds support for a "compiled" mode to the VM. The compiled mode translates the relax function into a TIR function and drives execution through that TIR function. It is different from the micro AOT codegen, which generates TIR code targeting the micro C runtime environment and is useful for resource-limited settings with a smaller feature set. Both leverage the low-level TIR build that is also shared with TensorIR. The current implementation targets the full TVM (VM) runtime, which comes with PackedFunc, object, tuple, closure, and all kinds of rich structure support. This also means that we can leverage the full runtime support to handle things like allocation, dynamic shape, easy plugins, and Python interaction, which are not available in more limited runtimes. Users use the same API to load the generated code regardless of compiled or bytecode mode, and just need to change one line ```python ex = relax.vm.build(mod, target, exec_mode="compiled") ``` The simplicity is thanks to the TVM runtime architecture that allows us to compose things together as objects. The only difference is how the high-level driving PackedFunc is provided: in the bytecode case it is normal interpretation, and in the compiled case it is TIR. This is a complete implementation; unit test cases are added, and all codegen build tests are updated to cover both exec modes and have passed locally. Co-authored-by: Junru Shao <[email protected]>
This PR introduces FoldConstant/BindParam passes. Co-authored-by: Yong Wu <[email protected]> Co-Authored-by: Hongyi Jin <[email protected]> Co-Authored-by: Siyuan Feng <[email protected]>
…pache#14014) Add TuningAPI and MetaSchedule tuning pass
This PR implements a Relay to Relax translator, which allows us to import Relay workloads to Relax for benchmarking and development purposes (tests and examples are added).
This PR introduces two high-level operators log_softmax and cross_entropy_with_logits, which are important when we are calculating CrossEntropyLoss (in torch). Co-authored-by: Yixin Dong <[email protected]>
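For reference, the usual composition these two operators enable can be written in plain Python. The function names mirror the operator names, but this is a numerical sketch of the standard formulas, not the Relax implementation:

```python
import math

def log_softmax(xs):
    """log(softmax(x)) computed stably by subtracting the max first."""
    m = max(xs)
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]

def cross_entropy_with_logits(logits, label):
    """Negative log-likelihood of the true class, from raw logits
    (one common composition behind torch's CrossEntropyLoss)."""
    return -log_softmax(logits)[label]
```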
…ock (apache#14075) This PR supports TIR pattern kind analysis for TIR blocks that write to multiple buffers, which is helpful for normalization operators like layernorm, groupnorm, etc. Prior to this PR, the analyzer did not support blocks that write to multiple buffers: on seeing such a block, it simply set the analysis result to "opaque". With this PR, on seeing a block that writes multiple buffers, the analyzer checks whether all the BufferStores have the same indices, and only sets the result to "opaque" when the BufferStores have different indices. This makes the analysis work for common cases where a block writes to multiple buffers, like layernorm or groupnorm. Besides the unit test for the analysis itself, this PR also adds a unit test for the FuseOps pass, making sure that a "layernorm + relu" pattern can be fused together.
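The decision rule can be sketched in a few lines of Python. `classify_block` and the string labels are hypothetical; the real analysis inspects TIR BufferStore nodes:

```python
# Sketch of the multi-buffer-write rule: a block remains fusable only if
# every store in it uses the same indices; otherwise it is opaque.
def classify_block(stores):
    """stores: list of (buffer_name, indices_tuple) in one block."""
    distinct_indices = {idx for _, idx in stores}
    return "opaque" if len(distinct_indices) > 1 else "fusable"
```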
The current FoldConstant pass does not support removing unused bindings in the post-folding function. Therefore, for large real-world models, the built executable becomes oversized because of redundant unused constants. This PR removes the redundant unused constant bindings in FoldConstant by using the analysis function "RemoveAllUnused". Note that "RemoveAllUnused" only works at the dataflow block level, so FoldConstant will not remove unused bindings outside of dataflow blocks.
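The essence of such dead-binding elimination — a backward sweep keeping only bindings reachable from the block output — can be sketched as follows (function name and data shapes are hypothetical, not the actual RemoveAllUnused API):

```python
# Sketch: walk bindings in reverse, keeping a binding only if its var is
# live (used by the output or by an already-kept binding).
def remove_unused(bindings, output_vars):
    """bindings: ordered list of (var, vars_it_uses); returns live subset."""
    live = set(output_vars)
    kept = []
    for var, uses in reversed(bindings):
        if var in live:
            kept.append((var, uses))
            live |= set(uses)      # everything this binding reads stays live
    return list(reversed(kept))
```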
This PR refactors the Relax build so it is exposed at the opt-level. We also introduce an explicit JIT functionality to handle live loading of compiled artifacts from CUTLASS, and move the Relax VM to runtime so it can be clearly isolated from the rest of the compiler stack.
Set shape function to be host func.
This PR lowers shape_of op to a Relax VM builtin, and changes a utility function to take StructInfo as input. Co-authored-by: Steven S. Lyubomirsky <[email protected]>
This PR adds TVMScript local recursive function support. It also updates the lambda lifting pass and removes CalledGlobalVars, which was no longer used. It also updates the well-formedness pass to allow not-yet-defined vars for recursive calls.
…he#14066) * [Layout] Add layout transformation analysis for PrimFunc. This change adds a PrimFunc level analysis to suggest layout transformations to block and buffers in the PrimFunc based on the layout transformations to PrimFunc outputs. * Add support for multiple blocks such as split op. * Add negative tests and increase coverage. * fix warning message * fix lint * remove unused header * Address comments. Moved some utility functions to support/array.h improve doc * fix deprecation warn T.var("int64") to T.int64() * address comments
…4101) Remove the attributes of the operators assert, print, and unique; use PrimValue as a substitute. Co-authored-by: Steven S. Lyubomirsky <[email protected]> Co-authored-by: Prakalp Srivastava <[email protected]>
* Add relax backend pattern registry * Add doc
…e#14115) * finished * fix * rollback merge_composite_functions
Currently `MergeCompositeFunctions` modifies the map while iterating over it, which makes tests/python/relax/test_transform_merge_composite_functions.py fail. This PR fixes this bug.
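The underlying bug pattern, and the standard fix of iterating over a snapshot, can be shown in a minimal Python form. The merge logic here is illustrative, not the actual pass:

```python
# Mutating a dict while iterating over it is an error in Python (and
# undefined behavior for C++ map iterators). The fix: snapshot the items
# first, then mutate the underlying map freely.
def merge_groups(groups, merge_into):
    """groups: {group: [members]}; merge_into: {group: target_group}."""
    for key, members in list(groups.items()):   # iterate over a snapshot
        target = merge_into.get(key)
        if target is not None:
            groups.setdefault(target, []).extend(members)
            del groups[key]                      # safe: not iterating groups
    return groups
```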
…utput` (apache#14126) This PR adds a `name_hint` argument for the `emit` and `emit_output` APIs of the Relax blockbuilder. The argument exists on the C++ side but was not exposed to the Python side (so users calling the Python `bb.emit` get `name_hint` as `""` by default). Co-authored-by: Yixin Dong <[email protected]>
This PR brings initial Relax VM support to the web runtime.
…ache#14128) Add transposed matmul support for Relax CUTLASS
This PR adds the `R.emit_te` meta-programming mechanism to emit a TOPI operator from TVMScript.
…ache#14139) * [Unity][BYOC] Assign group to unused bindings and ignore PrimFunc * Update fuse_ops.cc
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment. Generated by tvm-bot
The happy path tests are actually working well in my code. I am closing this PR since it should be fine to ask users to return at the end of the function for now; I can reopen and keep working on this if needed.
I am trying to add return support from the true/false branches of `IfNode`, as we discussed in tlc-pack/relax#446. I introduced a `RelaxReturnGlobalInfo` (similar to `DummyGlobalInfo`), which is used to save all the `Expr`s (the `body` of a `SeqExpr`) that need to be returned. But this means it only works at the IRModule level; it fails for a Relax function that is not explicitly defined in an IRModule. Options I have at this moment:

1. Add `global_info` into `FunctionNode`.
2. Add `return_body` into `SeqExpr` (we don't want to change the IR at this stage; listing it here for future consideration).

Any comments are truly appreciated! cc: @slyubomirsky @tqchen @junrushao @Hzfengsy @YuchenJin
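A hypothetical Python model of the `RelaxReturnGlobalInfo` bookkeeping described above. The real node would be a C++ `GlobalInfo` subclass; all names and methods here are illustrative:

```python
# Hypothetical model: module-level record of which Exprs (SeqExpr bodies)
# act as early returns from If branches.
class RelaxReturnGlobalInfo:
    def __init__(self):
        self.return_exprs = []

    def add(self, expr):
        # Register an Expr that needs to be returned (deduplicated).
        if expr not in self.return_exprs:
            self.return_exprs.append(expr)

    def replace(self, old, new):
        # A pass that rebinds a return expr (e.g. binds a CallNode to a
        # fresh var) must update this module-level record to stay in sync.
        self.return_exprs = [new if e == old else e for e in self.return_exprs]
```

The need for `replace` is exactly the maintenance burden the update note raises: any pass that rewrites a tracked return expr must also rewrite the global record.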
Update: if a pass modifies the return Expr (for example, when the original return expr is a CallNode and the pass creates a var bound to that CallNode, so the var becomes the new return expr), then the globally maintained return exprs must be updated in that pass as well. This looks like it burdens pass writing.