NEGEMMLowpMatrixMultiplyCore: set_pretranspose_A & set_pretranspose_B support #1127

Open

eshoguli opened this issue Jul 19, 2024 · 1 comment

eshoguli commented Jul 19, 2024

Model:

graph TD;
    Input1["Input
    src1: fp32"]
    Quantise1["NEQuantizationLayer
    q_src1: QASYMM8_SIGNED"]
    Input2["Input
    src2: fp32"]
    Quantise2["NEQuantizationLayer
    q_src2: QASYMM8_SIGNED"]
    MatMul["NEGEMMLowpMatrixMultiplyCore
    q_res: S8"]

    Input1-->Quantise1;
    Input2-->Quantise2;
    Quantise1-->MatMul;
    Quantise2-->MatMul;
    MatMul-->Result;

Could you please confirm that NEGEMMLowpMatrixMultiplyCore does not support transposed matrices, and fix the validation so that it matches the implementation (see Experiment 2 below)?

Experiment 1 (reference): without transposing the input tensors, everything works as expected:

size_t n = 1;
size_t c = 1;
// A matrix: a1 x a2
size_t a1 = 6;
size_t a2 = 3;
// B matrix: b1 x b2
size_t b1 = 3;
size_t b2 = 6;

// Allocate input tensors
src1.allocator()->init(TensorInfo(TensorShape(a1, a2, c, n), 1, DataType::F32));
src2.allocator()->init(TensorInfo(TensorShape(b1, b2, c, n), 1, DataType::F32));

// Allocate & fill matrices
...

// We now have the quantisation info and can configure the quantised tensors
q_src1.allocator()->init(TensorInfo(TensorShape(a1, a2, c, n), 1, DataType::QASYMM8_SIGNED, src1_qinfo));
q_src2.allocator()->init(TensorInfo(TensorShape(b1, b2, c, n), 1, DataType::QASYMM8_SIGNED, src2_qinfo));

// Configure low precision gemm and initialise result tensor
NEGEMMLowpMatrixMultiplyCore qgemm;
q_res.allocator()->init(TensorInfo(TensorShape(a2, b1, c, n), 1, DataType::S32));
qgemm.configure(&q_src1, &q_src2, nullptr, &q_res);

// Allocate all tensors & run
...
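
For completeness, here is a rough sketch of what the elided steps might look like (computing src1_qinfo/src2_qinfo and quantising the inputs); the scale and zero-point values below are placeholders chosen for illustration, not values from the actual test:

// Placeholder quantisation parameters (scale, zero-point); in practice these
// would be derived from the observed fp32 value ranges of src1/src2.
const QuantizationInfo src1_qinfo(0.05f, 0);
const QuantizationInfo src2_qinfo(0.05f, 0);

// After q_src1/q_src2 have been initialised with these infos (as above),
// quantise the fp32 inputs into the QASYMM8_SIGNED tensors.
NEQuantizationLayer quantise1, quantise2;
quantise1.configure(&src1, &q_src1);
quantise2.configure(&src2, &q_src2);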

Experiment 2: the input tensor dimensions were NOT updated, but pretranspose of the B matrix was requested: gemm_info.set_pretranspose_B(true);.
Expectation: failure, because the matrix dimensions are no longer compatible.
Result: behaves exactly like the reference: no validation error, no failure, and the same results as the reference.

...
// Configure low precision gemm and initialise result tensor
NEGEMMLowpMatrixMultiplyCore qgemm;
q_res.allocator()->init(TensorInfo(TensorShape(a2, b1, c, n), 1, DataType::S32));
GEMMInfo gemm_info; // <= new line
gemm_info.set_pretranspose_B(true); // <= new line
qgemm.configure(&q_src1, &q_src2, nullptr, &q_res, gemm_info);
...
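
As a side note, the same gap shows up through the static validate() method, which accepts the same configuration (a sketch, with error handling simplified):

// validate() returns an OK status here even though pretranspose_B is set,
// which is exactly the missing check this issue is about.
const Status status = NEGEMMLowpMatrixMultiplyCore::validate(
    q_src1.info(), q_src2.info(), nullptr, q_res.info(), gemm_info);
if (status.error_code() != ErrorCode::OK)
{
    std::cerr << status.error_description() << std::endl;
}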

Experiment 3: the input tensor dimensions were updated and pretranspose of the B matrix was requested: gemm_info.set_pretranspose_B(true);.
Expectation: works like the reference and produces the same results.
Result: failure, with the error message: validation fail: terminating due to uncaught exception of type std::runtime_error: in validate src/cpu/operators/CpuGemmLowpMatrixMultiplyCore.cpp:351: The product AB is defined only if the number of columns in A is equal to the number of rows in B.

size_t n = 1;
size_t c = 1;
// A matrix: a1 x a2
size_t a1 = 6;
size_t a2 = 3;
// B matrix: b1 x b2
size_t b1 = 6; // <= updated here: previous value is 3
size_t b2 = 3; // <= updated here: previous value is 6
...
// Configure low precision gemm and initialise result tensor
NEGEMMLowpMatrixMultiplyCore qgemm;
q_res.allocator()->init(TensorInfo(TensorShape(a2, b1, c, n), 1, DataType::S32));
GEMMInfo gemm_info; // <= new line
gemm_info.set_pretranspose_B(true); // <= new line
qgemm.configure(&q_src1, &q_src2, nullptr, &q_res, gemm_info);
...
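
My reading of why Experiment 3 fails (assuming ACL's TensorShape(width, height, ...) convention, where the first dimension is the number of columns): src1 = TensorShape(6, 3, ...) gives A 3 rows and 6 columns, src2 = TensorShape(6, 3, ...) gives B 3 rows and 6 columns, and validate() compares columns(A) = 6 against rows(B) = 3 without taking pretranspose_B into account, which triggers the error above.
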
morgolock commented

Hi @eshoguli

This patch changes the ::validate() method to return an error when either of the pretranspose options is set to true. The implementation of NEGEMMLowpMatrixMultiplyCore does not support pretranspose.
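
For reference, the kind of guard such a patch would add might look roughly like this (my sketch, not the actual patch):

// Inside CpuGemmLowpMatrixMultiplyCore::validate(): reject configurations
// that request pretranspose, since the lowp implementation does not support it.
ARM_COMPUTE_RETURN_ERROR_ON_MSG(gemm_info.pretranspose_A(),
                                "pretranspose_A is not supported");
ARM_COMPUTE_RETURN_ERROR_ON_MSG(gemm_info.pretranspose_B(),
                                "pretranspose_B is not supported");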

Hope this helps
