NEGEMMLowpMatrixMultiplyCore: set_pretranspose_A & set_pretranspose_B support #1127

Open

eshoguli opened this issue Jul 19, 2024 · 1 comment

eshoguli commented Jul 19, 2024

Model:

graph TD;
    Input1["Input
    src1: fp32"]
    Quantise1["NEQuantizationLayer
    q_src1: QASYMM8_SIGNED"]
    Input2["Input
    src2: fp32"]
    Quantise2["NEQuantizationLayer
    q_src2: QASYMM8_SIGNED"]
    MatMul["NEGEMMLowpMatrixMultiplyCore
    q_res: S8"]

    Input1-->Quantise1;
    Input2-->Quantise2;
    Quantise1-->MatMul;
    Quantise2-->MatMul;
    MatMul-->Result;

Could you please confirm that NEGEMMLowpMatrixMultiplyCore does not support transposed matrices, and fix the validation so that it matches the implementation (see Experiment 2 below)?

Experiment 1 (reference): without transposing the input tensors, everything works as expected:

size_t n = 1;
size_t c = 1;
// A matrix: a1 x a2
size_t a1 = 6;
size_t a2 = 3;
// B matrix: b1 x b2
size_t b1 = 3;
size_t b2 = 6;

// Allocate input tensors
src1.allocator()->init(TensorInfo(TensorShape(a1, a2, c, n), 1, DataType::F32));
src2.allocator()->init(TensorInfo(TensorShape(b1, b2, c, n), 1, DataType::F32));

// Allocate & fill matrices
...

// We now have the quantisation info and can configure the quantised tensors
q_src1.allocator()->init(TensorInfo(TensorShape(a1, a2, c, n), 1, DataType::QASYMM8_SIGNED, src1_qinfo));
q_src2.allocator()->init(TensorInfo(TensorShape(b1, b2, c, n), 1, DataType::QASYMM8_SIGNED, src2_qinfo));

// Configure low precision gemm and initialise result tensor
NEGEMMLowpMatrixMultiplyCore qgemm;
q_res.allocator()->init(TensorInfo(TensorShape(a2, b1, c, n), 1, DataType::S32));
qgemm.configure(&q_src1, &q_src2, nullptr, &q_res);

// Allocate all tensors & run
...
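
For completeness, here is a rough sketch of what the elided steps might look like (computing src1_qinfo/src2_qinfo and quantising the inputs); the scale and zero-point values below are placeholders chosen for illustration, not values from the actual test:

// Placeholder quantisation parameters (scale, zero-point); in practice these
// would be derived from the observed fp32 value ranges of src1/src2.
const QuantizationInfo src1_qinfo(0.05f, 0);
const QuantizationInfo src2_qinfo(0.05f, 0);

// After q_src1/q_src2 have been initialised with these infos (as above),
// quantise the fp32 inputs into the QASYMM8_SIGNED tensors.
NEQuantizationLayer quantise1, quantise2;
quantise1.configure(&src1, &q_src1);
quantise2.configure(&src2, &q_src2);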

Experiment 2: the input tensor dimensions were NOT updated, but pretranspose of the B matrix was requested: gemm_info.set_pretranspose_B(true);.
Expectation: failure, because the matrix dimensions are no longer compatible.
Result: behaves exactly like the reference: no validation error, no failure, and the same results as the reference.

...
// Configure low precision gemm and initialise result tensor
NEGEMMLowpMatrixMultiplyCore qgemm;
q_res.allocator()->init(TensorInfo(TensorShape(a2, b1, c, n), 1, DataType::S32));
GEMMInfo gemm_info; // <= new line
gemm_info.set_pretranspose_B(true); // <= new line
qgemm.configure(&q_src1, &q_src2, nullptr, &q_res, gemm_info);
...
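
As a side note, the same gap shows up through the static validate() method, which accepts the same configuration (a sketch, with error handling simplified):

// validate() returns an OK status here even though pretranspose_B is set,
// which is exactly the missing check this issue is about.
const Status status = NEGEMMLowpMatrixMultiplyCore::validate(
    q_src1.info(), q_src2.info(), nullptr, q_res.info(), gemm_info);
if (status.error_code() != ErrorCode::OK)
{
    std::cerr << status.error_description() << std::endl;
}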

Experiment 3: the input tensor dimensions were updated and pretranspose of the B matrix was requested: gemm_info.set_pretranspose_B(true);.
Expectation: works like the reference and produces the same results.
Result: failure, with the error message: validation fail: terminating due to uncaught exception of type std::runtime_error: in validate src/cpu/operators/CpuGemmLowpMatrixMultiplyCore.cpp:351: The product AB is defined only if the number of columns in A is equal to the number of rows in B.

size_t n = 1;
size_t c = 1;
// A matrix: a1 x a2
size_t a1 = 6;
size_t a2 = 3;
// B matrix: b1 x b2
size_t b1 = 6; // <= updated here: previous value is 3
size_t b2 = 3; // <= updated here: previous value is 6
...
// Configure low precision gemm and initialise result tensor
NEGEMMLowpMatrixMultiplyCore qgemm;
q_res.allocator()->init(TensorInfo(TensorShape(a2, b1, c, n), 1, DataType::S32));
GEMMInfo gemm_info; // <= new line
gemm_info.set_pretranspose_B(true); // <= new line
qgemm.configure(&q_src1, &q_src2, nullptr, &q_res, gemm_info);
...
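
My reading of why Experiment 3 fails (assuming ACL's TensorShape(width, height, ...) convention, where the first dimension is the number of columns): src1 = TensorShape(6, 3, ...) gives A 3 rows and 6 columns, src2 = TensorShape(6, 3, ...) gives B 3 rows and 6 columns, and validate() compares columns(A) = 6 against rows(B) = 3 without taking pretranspose_B into account, which triggers the error above.
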
morgolock commented

Hi @eshoguli

This patch changes the ::validate() method to return an error when either of the pretranspose options is set to true. The implementation of NEGEMMLowpMatrixMultiplyCore does not support pretranspose.
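
For reference, the kind of guard such a patch would add might look roughly like this (my sketch, not the actual patch):

// Inside CpuGemmLowpMatrixMultiplyCore::validate(): reject configurations
// that request pretranspose, since the lowp implementation does not support it.
ARM_COMPUTE_RETURN_ERROR_ON_MSG(gemm_info.pretranspose_A(),
                                "pretranspose_A is not supported");
ARM_COMPUTE_RETURN_ERROR_ON_MSG(gemm_info.pretranspose_B(),
                                "pretranspose_B is not supported");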

Hope this helps
