unroll 8x8 matmul by hand #201

Merged
merged 1 commit into from
Apr 2, 2024

superlopuh
Collaborator

No description provided.

@superlopuh superlopuh self-assigned this Mar 25, 2024

| kernels | baseline | linalg | scf_xdsl | snitch_stream | snrt | min_llvm_mlir |
|---|---|---|---|---|---|---|
| conv2d_d1_s1_3x3 1x1x8x8xf64 | 1566 | 1491 | | 1515 | 1493 | 1491 |
| ddot 128xf64 | 949 | 949 | 958 | 568 | 561 | 949 |
| dense 8x8xf64 | 3234 | 3249 | | 2742 | 2722 | 3234 |
| dsum 8x16xf32 | 1198 | 786 | 2030 | 191 | 172 | 786 |
| fill 16x16xf64 | 371 | 370 | 2664 | 298 | 299 | 370 |
| matmul 8x8xf64 | 2522 | 2827 | | 2358 | 2337 | 2522 |
| pooling_nchw_max_d1_s2_3x3 1x1x16x16xf64 | 1432 | 1185 | | 1115 | 1106 | 1185 |
| pooling_nchw_sum_d1_s2_3x3 1x1x16x16xf64 | 1932 | 1932 | | 1997 | 1988 | 1932 |
| relu 16x16xf64 | 1081 | 1076 | 3469 | 327 | 340 | 1076 |
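
The table above can be read as per-kernel ratios. The sketch below assumes the values are cycle counts (lower is better) and that rows with only five values have no `scf_xdsl` entry, as the CSV in the diff suggests. The `speedup` helper and the dictionary layout are illustrative only and are not part of the repository.

```python
# Values copied from three rows of the table above.
# Assumption: these are cycle counts, so lower is better.
CYCLES = {
    "ddot 128xf64": {"snitch_stream": 568, "snrt": 561, "min_llvm_mlir": 949},
    "dense 8x8xf64": {"snitch_stream": 2742, "snrt": 2722, "min_llvm_mlir": 3234},
    "matmul 8x8xf64": {"snitch_stream": 2358, "snrt": 2337, "min_llvm_mlir": 2522},
}


def speedup(row, column, reference="min_llvm_mlir"):
    """Return reference cycles divided by `column` cycles (hypothetical helper)."""
    return row[reference] / row[column]


for kernel, row in CYCLES.items():
    ratio = speedup(row, "snitch_stream")
    print(f"{kernel}: snitch_stream is {ratio:.2f}x vs min_llvm_mlir")
```

A ratio above 1 means the `snitch_stream` column needs fewer cycles than the `min_llvm_mlir` column for that kernel.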

| kernels | baseline | linalg | scf_xdsl | snitch_stream | snrt |
|---|---|---|---|---|---|
| conv2d_d1_s1_3x3 1x1x8x8xf64 | 0.23 | 0.22 | | 0.21 | 0.22 |
| ddot 128xf64 | 0.14 | 0.14 | 0.13 | 0.23 | 0.23 |
| dense 8x8xf64 | 0.20 | 0.20 | | 0.23 | 0.24 |
| dsum 8x16xf32 | 0.11 | 0.16 | 0.06 | 0.68 | 0.75 |
| fill 16x16xf64 | 0.00 | 0.00 | 0.00 | 0.87 | 0.86 |
| matmul 8x8xf64 | 0.20 | 0.18 | | 0.27 | 0.27 |
| pooling_nchw_max_d1_s2_3x3 1x1x16x16xf64 | 0.31 | 0.37 | | 0.40 | 0.40 |
| pooling_nchw_sum_d1_s2_3x3 1x1x16x16xf64 | 0.23 | 0.23 | | 0.22 | 0.22 |
| relu 16x16xf64 | 0.24 | 0.24 | 0.07 | 0.79 | 0.76 |

@@ -4,7 +4,7 @@ ddot 128xf64,949,949,958,568,561,949
 dense 8x8xf64,3234,3249,,2742,2722,3234
 dsum 8x16xf32,1198,786,2030,191,172,786
 fill 16x16xf64,371,370,2664,298,299,370
-matmul 8x8xf64,2522,2827,,2339,2337,2522
+matmul 8x8xf64,2522,2827,,2358,2337,2522
Collaborator Author

I'm confused by this

Contributor

I as well

Collaborator Author

OK, with new counters this makes more sense.

Contributor

@AntonLydike AntonLydike left a comment

Noice!

Base automatically changed from nazavode/fpu-fence to main March 27, 2024 10:52
@superlopuh superlopuh changed the base branch from main to sasha/ipc March 27, 2024 11:37

| kernels | baseline | linalg | scf_xdsl | snitch_stream | snrt | min_llvm_mlir |
|---|---|---|---|---|---|---|
| conv2d_d1_s1_3x3 1x1x8x8xf64 | 1569 | 1491 | | 1504 | 1488 | 1491 |
| ddot 128xf64 | 964 | 956 | 941 | 560 | 568 | 956 |
| dense 8x8xf64 | 3240 | 3257 | | 2729 | 2737 | 3240 |
| dsum 8x16xf32 | 1224 | 809 | 2023 | 177 | 198 | 809 |
| fill 16x16xf64 | 349 | 349 | 2647 | 291 | 289 | 349 |
| matmul 8x8xf64 | 2517 | 2829 | | 2347 | 2325 | 2517 |
| pooling_nchw_max_d1_s2_3x3 1x1x16x16xf64 | 1447 | 1194 | | 1114 | 1099 | 1194 |
| pooling_nchw_sum_d1_s2_3x3 1x1x16x16xf64 | 1940 | 1940 | | 1996 | 1981 | 1940 |
| relu 16x16xf64 | 1096 | 1086 | 3462 | 335 | 324 | 1086 |

| kernels | baseline | linalg | scf_xdsl | snitch_stream | snrt |
|---|---|---|---|---|---|
| conv2d_d1_s1_3x3 1x1x8x8xf64 | 0.23 | 0.22 | | 0.22 | 0.22 |
| ddot 128xf64 | 0.13 | 0.13 | 0.14 | 0.23 | 0.23 |
| dense 8x8xf64 | 0.20 | 0.20 | | 0.24 | 0.23 |
| dsum 8x16xf32 | 0.11 | 0.16 | 0.06 | 0.73 | 0.65 |
| fill 16x16xf64 | 0.00 | 0.00 | 0.00 | 0.89 | 0.89 |
| matmul 8x8xf64 | 0.20 | 0.18 | | 0.27 | 0.28 |
| pooling_nchw_max_d1_s2_3x3 1x1x16x16xf64 | 0.31 | 0.37 | | 0.40 | 0.40 |
| pooling_nchw_sum_d1_s2_3x3 1x1x16x16xf64 | 0.23 | 0.23 | | 0.22 | 0.22 |
| relu 16x16xf64 | 0.24 | 0.24 | 0.07 | 0.77 | 0.80 |

Base automatically changed from sasha/ipc to main March 27, 2024 18:46

| kernels | baseline | linalg | scf_xdsl | snitch_stream | snrt | min_llvm_mlir |
|---|---|---|---|---|---|---|
| conv2d_d1_s1_3x3 1x1x8x8xf64 | 1569 | 1491 | | 1504 | 1488 | 1491 |
| ddot 128xf64 | 964 | 956 | 2248 | 560 | 568 | 956 |
| dense 8x8xf64 | 3240 | 3257 | | 2729 | 2737 | 3240 |
| dsum 8x16xf32 | 1224 | 809 | 5480 | 177 | 198 | 809 |
| fill 16x16xf64 | 349 | 349 | 2647 | 291 | 289 | 349 |
| matmul 8x8xf64 | 2517 | 2829 | | 2347 | 2325 | 2517 |
| pooling_nchw_max_d1_s2_3x3 1x1x16x16xf64 | 1447 | 1194 | | 1114 | 1099 | 1194 |
| pooling_nchw_sum_d1_s2_3x3 1x1x16x16xf64 | 1940 | 1940 | | 1996 | 1981 | 1940 |
| relu 16x16xf64 | 1096 | 1086 | 7556 | 335 | 324 | 1086 |
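
Compared with the previous results comment in this thread, the only values that differ in this table are in the `scf_xdsl` column. A minimal sketch for spotting such changes between two runs follows; it assumes the values are cycle counts, and the `changed` helper is hypothetical rather than anything from the repository.

```python
# scf_xdsl values copied from the two most recent cycle tables in this thread.
# Assumption: these are cycle counts, so a ratio above 1 means the later run is slower.
previous = {
    "ddot 128xf64": 941,
    "dsum 8x16xf32": 2023,
    "fill 16x16xf64": 2647,
    "relu 16x16xf64": 3462,
}
latest = {
    "ddot 128xf64": 2248,
    "dsum 8x16xf32": 5480,
    "fill 16x16xf64": 2647,
    "relu 16x16xf64": 7556,
}


def changed(before, after):
    """Return {kernel: after / before} for kernels whose value differs (hypothetical helper)."""
    return {k: after[k] / before[k] for k in before if after[k] != before[k]}


for kernel, ratio in changed(previous, latest).items():
    print(f"{kernel}: scf_xdsl changed by {ratio:.2f}x")
```

Kernels whose value is identical in both runs, such as `fill`, are left out of the result.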

| kernels | baseline | linalg | scf_xdsl | snitch_stream | snrt |
|---|---|---|---|---|---|
| conv2d_d1_s1_3x3 1x1x8x8xf64 | 0.23 | 0.22 | | 0.22 | 0.22 |
| ddot 128xf64 | 0.13 | 0.13 | 0.06 | 0.23 | 0.23 |
| dense 8x8xf64 | 0.20 | 0.20 | | 0.24 | 0.23 |
| dsum 8x16xf32 | 0.11 | 0.16 | 0.02 | 0.73 | 0.65 |
| fill 16x16xf64 | 0.00 | 0.00 | 0.00 | 0.89 | 0.89 |
| matmul 8x8xf64 | 0.20 | 0.18 | | 0.27 | 0.28 |
| pooling_nchw_max_d1_s2_3x3 1x1x16x16xf64 | 0.31 | 0.37 | | 0.40 | 0.40 |
| pooling_nchw_sum_d1_s2_3x3 1x1x16x16xf64 | 0.23 | 0.23 | | 0.22 | 0.22 |
| relu 16x16xf64 | 0.24 | 0.24 | 0.03 | 0.77 | 0.80 |

@superlopuh superlopuh merged commit 8f3d31e into main Apr 2, 2024
2 checks passed
@superlopuh superlopuh deleted the sasha/8x8-unroll branch April 2, 2024 20:52