DRAFT: Unplaced Reconfigurable GEMM, Python Code Refactor, and Add Throughput Metric #1
The current GEMM is placed, while the rest of the examples are unplaced. Moreover, it's hard to determine how the `prio-accuracy` feature affects the execution of GEMM. Lastly, no throughput metric is saved for GEMM. This PR tries to address the first and third points and improve the second.

Let me know if I should split this up into multiple PRs to make it easier to review.
Added
- Flag in `CMakeLists.txt` to enable/disable `prio-accuracy` in GEMM, with the appropriate kernel used based on whether this flag is enabled or disabled
- Throughput calculation and metric added to CI in GFLOP/s

Changed
- Placed reconfigurable GEMM design is now unplaced reconfigurable GEMM
- Clearer blocking regarding how the `prio-accuracy` flag changes the design

Removed
- Placed reconfigurable GEMM design
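
For reviewers, the GFLOP/s metric can be sketched as below. This is a minimal illustration, not the code in this PR: it assumes the conventional GEMM operation count of 2·M·N·K (one multiply and one add per inner-product term), and the function name `gemm_gflops` and its parameters are hypothetical.

```python
def gemm_gflops(m: int, n: int, k: int, seconds: float) -> float:
    """Throughput of an (m x k) @ (k x n) GEMM in GFLOP/s.

    Assumes the conventional count of 2*m*n*k floating-point operations;
    the name and signature are illustrative, not taken from this PR.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    return flops / seconds / 1e9  # scale FLOP/s to GFLOP/s


if __name__ == "__main__":
    # Example: a 1024 x 1024 x 1024 GEMM that took half a second.
    print(f"{gemm_gflops(1024, 1024, 1024, 0.5):.3f} GFLOP/s")
```

The count ignores any alpha/beta scaling or accumulation into the output matrix, which is lower-order for large K.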
PR Merge Checklist
`devel` commit and pointing to `devel`.