[quidditch_snitch] Add padding capabilities to start_tensor_copy
#116
We occasionally encounter shapes that are challenging to tile due to the prime factors involved. Attempting to distribute these (e.g. to compute cores or vector lanes) when the number of required tiles is not a factor of the dimension leads to dynamic dimensions that the microkernel compilation is unable to deal with. Similarly, once we are on `f32`, we are required to vectorize the kernel and are restricted to tile sizes that are a multiple of e.g. 4 or 8 for a matvec.

This PR therefore introduces optional padding to the `start_dma_transfer` op that can be added at the end of each tensor dimension. When tiling, the padding can be chosen to guarantee that a tensor is always of a given static shape, solving the issue noted above. For now, the value used for padding is always zero, which works for any matmul, elementwise operation, and convolution.

Note that the padding option is not yet used in the pipeline; `tensor.pad` operations will be lowered to it in a future PR.
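As a rough illustration of the intent, a sketch of what a padded copy might look like (the assembly syntax, operand names, and pad notation below are assumptions for illustration, not the op's actual format):

```mlir
// Hypothetical syntax: copy a 100x35xf32 tensor to L1 while padding
// the end of each dimension with zeros. Padding 100 -> 104 makes the
// first dimension divisible by 8 tiles; padding 35 -> 36 makes the
// second divisible by a vector width of 4, so every tile stays static.
%padded, %token = quidditch_snitch.start_dma_transfer %input pad by [4, 1]
    : tensor<100x35xf32> -> tensor<104x36xf32>
```

Since the padded elements are zero, consumers such as matmuls, elementwise operations, and convolutions compute the same results on the valid region.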