How is the Max Performance estimated? #400

Open
LorcaQAQ opened this issue Mar 4, 2025 · 2 comments

LorcaQAQ commented Mar 4, 2025

I am reading your manuscript on Ara2 and have a question about Table 2: how do you estimate the Max Performance? Could you provide some examples to illustrate the calculation? Thanks!

Collaborator

mp-17 commented Mar 4, 2025

Hello @LorcaQAQ,

I think you are referring to the max. performance we calculate to evaluate the performance ideality on Ara2, i.e., how close the measured throughput is to the maximum achievable by that kernel on the vector processor. So, max. performance is the maximum number of operations per cycle achievable, limited by the number of FPUs if the kernel is compute bound on Ara2, or limited by the memory BW if the kernel is memory bound on Ara2.
Also, since we talk about operations, FMACC counts as 2 operations while FADD counts as 1 operation. So, if there are only FMACC instructions and the kernel is compute bound, the maximum throughput is 2 op/cycle. If there are only FADD instructions, the maximum is 1 op/cycle. In an intermediate situation, we need to calculate a weighted average. Let's assume a single lane for the following examples:

Max 1 OP/cycle

loop:
 vfadd
 vfadd

Max 1.5 OP/cycle

loop:
 vfmacc
 vfadd

Max 2 OP/cycle

loop:
 vfmacc
 vfmacc
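
As a rough illustration of the weighted average in the three compute-bound examples above, here is a minimal Python sketch. The names `OPS_PER_ELEMENT` and `compute_bound_max` are hypothetical helpers made up for this explanation, not part of Ara. The sketch assumes that every instruction in the loop body processes the same number of elements, that each lane retires one element per cycle, and that the bound scales linearly with the number of lanes.

```python
# Operations counted per element for each instruction (from the comment above):
# FMACC counts as 2 operations, FADD counts as 1.
OPS_PER_ELEMENT = {"vfadd": 1, "vfmacc": 2}


def compute_bound_max(loop_body, lanes=1):
    """Upper bound in OP/cycle for a compute-bound loop body.

    Assumption: each instruction occupies the FPU for the same number of
    cycles, so the bound is the average operation count per instruction,
    multiplied by the number of lanes.
    """
    total_ops = sum(OPS_PER_ELEMENT[instr] for instr in loop_body)
    return lanes * total_ops / len(loop_body)


print(compute_bound_max(["vfadd", "vfadd"]))    # 1.0
print(compute_bound_max(["vfmacc", "vfadd"]))   # 1.5
print(compute_bound_max(["vfmacc", "vfmacc"]))  # 2.0
```

The three calls correspond to the three loops listed above, with a single lane.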

When the kernel is memory bound, the max. performance used for performance ideality is limited by the memory bandwidth instead: in Ara, the memory BW is 32L bit/cycle, while the compute units work at 64L bit/cycle, with L the number of lanes.

Max 0.25 OP/cycle

loop:
 vld
 vld
 vadd
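
One way to read the 0.25 OP/cycle figure, sketched in Python below. The function `memory_bound_max` and its parameter names are hypothetical, and the 64-bit element width is an assumption on my side (it matches the 64L bit/cycle compute width mentioned above). With one lane, the memory delivers 32 bit/cycle, i.e. half a 64-bit element per cycle, and each `vadd` result needs two loaded elements.

```python
def memory_bound_max(ops_per_element, mem_accesses_per_element,
                     lanes=1, element_bits=64, mem_bits_per_lane_cycle=32):
    """Upper bound in OP/cycle for a memory-bound loop body.

    Assumptions: the memory moves `mem_bits_per_lane_cycle` bits per cycle
    per lane (32 in Ara, per the comment above) and elements are
    `element_bits` wide (64 bits is an assumption of this sketch).
    """
    # Elements the memory can transfer per cycle.
    elements_per_cycle = lanes * mem_bits_per_lane_cycle / element_bits
    # Each output element needs `mem_accesses_per_element` transfers.
    return ops_per_element * elements_per_cycle / mem_accesses_per_element


# vld, vld, vadd: 1 operation per element, 2 loads per element, 1 lane.
print(memory_bound_max(ops_per_element=1, mem_accesses_per_element=2))  # 0.25
```

The same loop would be bounded at 1 OP/cycle by the FPU alone, so the memory limit is the tighter one here.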

Hope it helps!

Author

LorcaQAQ commented Mar 4, 2025

@mp-17 Thanks! I think I now understand the four examples. But I still have a question. In Table 2, under the first row for "matmul," what does "1 ×" represent in the expression "1 × 2.0 × L"? Additionally, in the "dropout" row, what does "2 ×" stand for?
