Add operators support for Ascend NPU (CANN backend) #3552

hipudding · 2023-08-17T08:40:28Z

CANN (Compute Architecture of Neural Networks), developed by Huawei, is a heterogeneous computing architecture for AI. OpenCV DNN already supports the CANN backend (#22634).

More and more users are working with Ascend NPUs and programming with CANN, and the number is still growing rapidly. AI training and inference are inseparable from data preprocessing. When users combine OpenCV with the CANN backend, data preprocessing can currently run only on CPUs, which is inefficient.

The purpose of this PR is to enable OpenCV operators on the CANN backend. We also completed an end-to-end test on Ascend 310 for the newly added operators; see the test results.

Setup is the same as for OpenCV DNN. Please refer to the OpenCV DNN CANN backend manual.

The CANN backend is used in a similar way to CUDA:

Object	CANN	CUDA
Namespace	cv::cann	cv::cuda
Matrix	AscendMat	GpuMat
Stream	AscendStream	Stream
Event	AscendEvent	Event

The current PR provides the operator support framework for the CANN backend. To keep the code easy to review, only some basic interfaces are implemented. All implemented operators are tested, and their results are compared with the CPU backend.

More operators will be implemented in separate follow-up PRs.

OpenCVFindCANN.cmake is modified in opencv#24488.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
N/A There is a reference to the original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

Co-authored-by: CaoMengqing [email protected]

hipudding · 2023-08-30T08:26:47Z

Performance Test Result

VM from Huawei Cloud

CPU: Intel(R) Xeon(R) Gold 6278C CPU @ 2.60GHz
Memory: 32G
NPU: Ascend 310 (driver version: 22.0.4)
CANN: 6.3.RC2.alpha003

operator	size	type	NPU time (ms)	CPU time (ms)	Efficiency improvement (cpu_time/npu_time)
add	1920x1080	CV_32S	234	219	0.94
add	1920x1080	CV_32SC3	545	663	1.22
add	2048x2048	CV_32S	395	448	1.13
add	2048x2048	CV_32SC3	984	1564	1.59
add	3840x2160	CV_32S	672	1021	1.52
add	3840x2160	CV_32SC3	1867	3122	1.67
add	7680x4320	CV_32S	2454	4148	1.69
add	7680x4320	CV_32SC3	7196	12382	1.72

subtract	1920x1080	CV_32S	227	221	0.97
subtract	1920x1080	CV_32SC3	525	763	1.45
subtract	2048x2048	CV_32S	392	700	1.79
subtract	2048x2048	CV_32SC3	1020	1567	1.54
subtract	3840x2160	CV_32S	706	1024	1.45
subtract	3840x2160	CV_32SC3	1902	3738	1.97
subtract	7680x4320	CV_32S	2492	4164	1.67
subtract	7680x4320	CV_32SC3	7308	12478	1.71

multiply	1920x1080	CV_32S	236	220	0.93
multiply	1920x1080	CV_32SC3	539	667	1.24
multiply	2048x2048	CV_32S	392	448	1.14
multiply	2048x2048	CV_32SC3	983	1567	1.59
multiply	3840x2160	CV_32S	678	1020	1.50
multiply	3840x2160	CV_32SC3	1869	3129	1.67
multiply	7680x4320	CV_32S	2461	4172	1.70
multiply	7680x4320	CV_32SC3	7170	12497	1.74

divide	1920x1080	CV_32S	232	460	1.98
divide	1920x1080	CV_32SC3	530	1588	3.00
divide	2048x2048	CV_32S	394	929	2.36
divide	2048x2048	CV_32SC3	994	3046	3.06
divide	3840x2160	CV_32S	678	1988	2.93
divide	3840x2160	CV_32SC3	1912	5614	2.94
divide	7680x4320	CV_32S	2512	7488	2.98
divide	7680x4320	CV_32SC3	7290	22479	3.08

bitwise_and	1920x1080	CV_32S	223	218	0.98
bitwise_and	1920x1080	CV_32SC3	524	756	1.44
bitwise_and	2048x2048	CV_32S	389	520	1.34
bitwise_and	2048x2048	CV_32SC3	1022	1551	1.52
bitwise_and	3840x2160	CV_32S	676	1021	1.51
bitwise_and	3840x2160	CV_32SC3	1882	3103	1.65
bitwise_and	7680x4320	CV_32S	2471	4125	1.67
bitwise_and	7680x4320	CV_32SC3	7187	12366	1.72

bitwise_or	1920x1080	CV_32S	234	221	0.94
bitwise_or	1920x1080	CV_32SC3	564	667	1.18
bitwise_or	2048x2048	CV_32S	407	449	1.10
bitwise_or	2048x2048	CV_32SC3	1019	1573	1.54
bitwise_or	3840x2160	CV_32S	669	1027	1.54
bitwise_or	3840x2160	CV_32SC3	1847	3141	1.70
bitwise_or	7680x4320	CV_32S	2653	4176	1.57
bitwise_or	7680x4320	CV_32SC3	7181	12495	1.74

bitwise_xor	1920x1080	CV_32S	218	218	1.00
bitwise_xor	1920x1080	CV_32SC3	521	755	1.45
bitwise_xor	2048x2048	CV_32S	387	518	1.34
bitwise_xor	2048x2048	CV_32SC3	1022	1552	1.52
bitwise_xor	3840x2160	CV_32S	675	1013	1.50
bitwise_xor	3840x2160	CV_32SC3	1905	3098	1.63
bitwise_xor	7680x4320	CV_32S	2487	4117	1.66
bitwise_xor	7680x4320	CV_32SC3	7193	12321	1.71
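
The last column of the table is the CPU time divided by the NPU time, so values below 1.0 mean the CPU was faster for that case. A minimal sketch of that arithmetic is shown here; the function names `speedup` and `roundTo2` are illustrative and do not come from the PR's benchmark code.

```cpp
#include <cmath>

// Ratio reported in the "Efficiency improvement" column:
// CPU time divided by NPU time (both in milliseconds).
// A value above 1.0 means the NPU run was faster.
double speedup(double cpuTimeMs, double npuTimeMs)
{
    return cpuTimeMs / npuTimeMs;
}

// Round to two decimals, the precision used in the table.
double roundTo2(double value)
{
    return std::round(value * 100.0) / 100.0;
}
```

For the first row (`add`, 1920x1080, CV_32S), `roundTo2(speedup(219, 234))` reproduces the 0.94 shown in the table.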

hipudding · 2023-08-31T06:38:30Z

Hi @vpisarev, could you please help review this PR or assign someone else to do it? This PR mainly enables Ascend NPU (CANN backend) as an acceleration backend for OpenCV and implements several simple arithmetic operators. These operators do seem to provide some acceleration.

In addition, I have two questions and would like your advice:

1. I have considered two implementation approaches: using a separate namespace, as this PR does, or replacing the HAL. Each has its own pros and cons. Given that we want to add complete support for a new backend, do you have any suggestions on which approach to take?
2. With the namespace approach, I had to implement a new Mat class, which duplicates a lot of code from Mat and GpuMat. Moreover, I need to modify InputArray and OutputArray to adapt to the new Mat class, and I need to modify the Python binding code generator. Is there a better way to achieve this with little or no modification to OpenCV's core module? Modifying this code for every new backend is not ideal.

Thanks.

opencv-alalek · 2023-09-10T13:22:22Z

modules/cannarithm/include/opencv2/cann.hpp

+// It is subject to the license terms in the LICENSE file found in the top-level directory
+// of this distribution and at http://opencv.org/license.html.
+
+#ifndef OPENCV_CANN_HPP


Identifier should have the following format: OPENCV_<module>_<header_underscore_subpath>_HPP

Thanks for the code review, will fix them in next commit.

Done. Changed all include-guard identifiers in the .hpp files.
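
To illustrate the naming format requested in this thread, here is a small sketch that derives a guard identifier from a module name and a header subpath. The helper `makeIncludeGuard` is hypothetical and not part of OpenCV, and the exact mapping (dropping a trailing `.hpp`, uppercasing, replacing every non-alphanumeric character with an underscore) is an assumption about how the format is meant to be applied.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of OpenCV): builds an include-guard
// identifier of the form OPENCV_<module>_<header_subpath>_HPP.
// `headerSubpath` is the header path relative to the module's include
// directory. A trailing ".hpp" is dropped, letters are uppercased and
// every non-alphanumeric character becomes an underscore.
std::string makeIncludeGuard(const std::string& module, std::string headerSubpath)
{
    const std::string ext = ".hpp";
    if (headerSubpath.size() >= ext.size() &&
        headerSubpath.compare(headerSubpath.size() - ext.size(), ext.size(), ext) == 0)
    {
        headerSubpath.erase(headerSubpath.size() - ext.size());
    }

    std::string guard = "OPENCV_" + module + "_" + headerSubpath + "_HPP";
    for (char& c : guard)
    {
        const unsigned char uc = static_cast<unsigned char>(c);
        c = std::isalnum(uc) ? static_cast<char>(std::toupper(uc)) : '_';
    }
    return guard;
}
```

Under these assumptions, `makeIncludeGuard("cann", "stream_accessor.hpp")` yields `OPENCV_CANN_STREAM_ACCESSOR_HPP`, the same shape as the guard that appears later in this review.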

opencv-alalek · 2023-09-10T13:25:58Z

modules/cannarithm/include/opencv2/cann_arithm.hpp

+CV_EXPORTS_W void subtract(InputArray src1, InputArray src2, OutputArray dst,
+                           InputArray mask = noArray(), int dtype = -1,
+                           AclStream& stream = AclStream::Null());
+#ifdef NEVER_DEFINED


NEVER_DEFINED

What is that?

We should not have platform-specific conditional compilation in OpenCV public headers.

Also OpenCV bindings generators can't properly handle that.

NEVER_DEFINED means this part of the code will never be compiled. I did this only to trick the Python bindings generator into generating these two interfaces. This is the only way I found to support subtracting a scalar from a Mat; otherwise, only Mat is accepted.

To be honest, it's not a very good idea. Do you have any suggestions?

opencv-alalek · 2023-09-10T13:27:19Z

modules/cannarithm/include/opencv2/acl_stream_accessor.hpp

+#ifndef OPENCV_CANN_STREAM_ACCESSOR_HPP
+#define OPENCV_CANN_STREAM_ACCESSOR_HPP
+
+#include <acl/acl.h>


BTW, this is platform-specific include in public header.

Some assumptions should be applied on the User side (like properly configured build environment, including headers paths and defines).

I did some code refactoring: <acl/acl.h> is now included only in cann_call.hpp, and other files can't call ACL functions directly.
OpenCVFindCANN.cmake was already introduced in opencv/opencv#22634, and CMake sets the header and library paths correctly.

opencv-alalek · 2023-09-10T13:30:57Z

modules/cannarithm/include/opencv2/cann_call.hpp

+void aclTwoInputs(const AclMat& src1, const AclMat& src2, AclMat& dst, const char* op,
+                  AclStream& stream = AclStream::Null());
+
+void transNCHWToNHWC(const AclMat& src, AclMat& dst, AclStream& stream = AclStream::Null());


Do we need a reverse operation?

Yes, we do. This function is now renamed to transData, and it supports transforming between all kinds of image formats.
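
For readers unfamiliar with the two layouts discussed here, the following CPU-only sketch shows the forward and reverse index mapping for a single image (N = 1). It is not the PR's `transData` implementation, which runs on the NPU; the function names `chwToHwc` and `hwcToChw` are illustrative, and the input is assumed to hold exactly `channels * height * width` elements.

```cpp
#include <cstddef>
#include <vector>

// Planar (CHW) to interleaved (HWC) for one image.
// Assumes src.size() == channels * height * width.
std::vector<int> chwToHwc(const std::vector<int>& src, std::size_t channels,
                          std::size_t height, std::size_t width)
{
    std::vector<int> dst(src.size());
    for (std::size_t c = 0; c < channels; ++c)
        for (std::size_t h = 0; h < height; ++h)
            for (std::size_t w = 0; w < width; ++w)
                dst[(h * width + w) * channels + c] = src[(c * height + h) * width + w];
    return dst;
}

// Reverse operation: interleaved (HWC) back to planar (CHW).
std::vector<int> hwcToChw(const std::vector<int>& src, std::size_t channels,
                          std::size_t height, std::size_t width)
{
    std::vector<int> dst(src.size());
    for (std::size_t c = 0; c < channels; ++c)
        for (std::size_t h = 0; h < height; ++h)
            for (std::size_t w = 0; w < width; ++w)
                dst[(c * height + h) * width + w] = src[(h * width + w) * channels + c];
    return dst;
}
```

Applying one function and then the other returns the original buffer, which is a convenient property to check in a unit test.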

opencv-alalek · 2023-09-10T13:33:38Z

modules/cannarithm/misc/python/pyopencv_cann.hpp

+
+#include "opencv2/cann.hpp"
+
+typedef std::vector<cann::AclMat> vector_AclMat;


Any python test?

misc/python/test/test_cann.py

A python interface testcase is added.

opencv-alalek · 2023-09-10T13:35:27Z

modules/cannarithm/samples/sample.cpp

+    cv::cann::initAcl();
+    cv::cann::setDevice(0);
+
+    cv::cann::AclMat aclMat = cv::cann::AclMat();


cv::cann::AclMat aclMat = cv::cann::AclMat();

Just:

cv::cann::AclMat aclMat;

opencv-alalek · 2023-09-10T13:38:10Z

modules/cannarithm/test/test_element_operation.cpp

+        Mat cpuMat1 = randomMat(10, 10, CV_32SC3);           \
+        Mat cpuMat2 = randomMat(10, 10, CV_32SC3);           \
+        Mat cpuDst;                                          \
+        cv::op(cpuMat1, cpuMat2, cpuDst, __VA_ARGS__);       \


It is better to avoid multi-line macros.
Especially in tests, as we can't debug the code line by line (a macro expands to a single complex line of code).

Use templates instead.

Fixed. But some multi-line macros still exist. I will change them in the next code refactoring (maybe in a separate PR).
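
As an illustration of the "templates instead of macros" suggestion, the sketch below replaces a multi-line test macro with a function template that takes the reference operation and the operation under test as callables. It is not the PR's test code: `resultsMatch`, `referenceAdd` and `backendAdd` are hypothetical names, and plain `std::vector<int>` stands in for `Mat` and the device matrix so the example stays self-contained.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Runs a reference operation and the operation under test on the same
// inputs and reports whether the outputs are identical. Being a normal
// function, it can be stepped through line by line in a debugger.
template <typename RefOp, typename TestedOp>
bool resultsMatch(RefOp refOp, TestedOp testedOp,
                  const std::vector<int>& a, const std::vector<int>& b)
{
    const std::vector<int> expected = refOp(a, b);
    const std::vector<int> actual = testedOp(a, b);
    return expected == actual;
}

// Stand-in for the CPU reference. Assumes a.size() == b.size().
std::vector<int> referenceAdd(const std::vector<int>& a, const std::vector<int>& b)
{
    std::vector<int> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        out[i] = a[i] + b[i];
    return out;
}

// Stand-in for the backend under test; here it is just another CPU
// implementation so the sketch can run without an NPU.
std::vector<int> backendAdd(const std::vector<int>& a, const std::vector<int>& b)
{
    std::vector<int> out(a.size());
    std::transform(a.begin(), a.end(), b.begin(), out.begin(), std::plus<int>());
    return out;
}
```

A test body then reduces to one call per operator, for example `resultsMatch(referenceAdd, backendAdd, {1, 2, 3}, {4, 5, 6})`, and a failing comparison can be debugged inside the template like any other function.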

hipudding · 2023-09-15T08:48:33Z

Hi @opencv-alalek, thanks for your review. I addressed the review comments and did some code refactoring. Could you please review it again? Thanks.

fengyuentau · 2023-10-26T07:47:33Z

Hello @hipudding , does this PR build against opencv/opencv#24277?

hipudding · 2023-10-28T01:25:43Z

Hello @hipudding , does this PR build against opencv/opencv#24277?

Yes, it does. This PR needs some forward declarations, which have to be placed in the OpenCV main repo.

hipudding · 2023-11-03T09:04:36Z

@fengyuentau Have a good weekend. This PR no longer depends on OpenCV's main repo (except for OpenCVFindCANN.cmake). Please review it again. Thanks.

CANN (Compute Architecture of Neural Networks), developed by Huawei, is a heterogeneous computing architecture for AI. OpenCV DNN already supports the CANN backend [#22634](opencv/opencv#22634).

More and more users are working with [Ascend NPU](https://www.hiascend.com/) and programming with CANN, and the number is still growing rapidly. AI training and inference are inseparable from data preprocessing. When users combine OpenCV with the CANN backend, data preprocessing can currently run only on CPUs, which is inefficient.

The purpose of this commit is to enable OpenCV operators on the CANN backend.

Setup is the same as for OpenCV DNN. Please refer to the OpenCV DNN [CANN backend manual](https://gist.github.com/fengyuentau/083f7f339592545c1f1d2c1fde6a53dc#file-a_ocv_cann-md):

1. [Install dependencies](https://gist.github.com/fengyuentau/083f7f339592545c1f1d2c1fde6a53dc#install-dependencies)
2. [Install CANN](https://gist.github.com/fengyuentau/083f7f339592545c1f1d2c1fde6a53dc#install-cann)
3. [Compile OpenCV with CANN](https://gist.github.com/fengyuentau/083f7f339592545c1f1d2c1fde6a53dc#build-opencv-with-cann)

The CANN backend is used in a similar way to CUDA:

| Object | CANN | CUDA |
| --------- | ------------ | -------- |
| Namespace | cv::cann | cv::cuda |
| Matrix | AscendMat | GpuMat |
| Stream | AscendStream | Stream |
| Event | AscendEvent | Event |

The current commit provides the operator support framework for the CANN backend. To keep the code easy to review, only a few basic interfaces are implemented. All implemented operators are tested, and their results are compared with the CPU backend.

More operators will be implemented in separate follow-up commits.

Co-authored-by: CaoMengqing <[email protected]>

diandianliu · 2024-03-07T07:46:22Z

@hipudding @fengyuentau Hi, when will the cv::Canny and cv::GaussianBlur operators be supported with CANN?

hipudding · 2024-03-07T08:28:44Z

@hipudding @fengyuentau Hi, when will the cv::Canny and cv::GaussianBlur operators be supported with CANN?

Thank you for your interest in OpenCV's CANN support. Unfortunately, we do not have any plans to support these operators. OpenCV's CANN support mainly provides the ability to call CANN built-in operators and run AscendC kernels (see also #3614).

Can we discuss the technical details via email? ([email protected])

Add additional image processing operators for Ascend NPU by utilizing DVPP #3608

The user base for [Ascend NPU](https://www.hiascend.com/en/) and CANN programming is growing rapidly, with more users joining each day. To make things easier for these users, this PR provides more support for Ascend backend operators.

All operators in this PR use DVPP as the computational unit. Digital Vision Pre-Processing (DVPP) is an image processing unit built into the Ascend AI processor. Its main functions include image and video encoding/decoding, as well as image cropping and scaling.

The high-frequency operators with NPU as the backend and the basic data structure AscendMat were provided in #3552, but many image processing operators are still missing. Moreover, only two interpolation algorithms for the resize operator are supported in #3552. In this PR, the bilinear interpolation algorithm and nearest neighbour interpolation algorithm are implemented for the resize operator, as well as the Ascend implementation of the copyMakeBorder operator.

In addition, the serialization of image processing operations is widely used in the preprocessing and post-processing stages of computer vision deep learning methods. Providing integrated operators therefore makes OpenCV more convenient for users who work with both OpenCV and deep learning. For example, torchvision also provides similar operators: [RESIZED_CROP](https://pytorch.org/vision/stable/generated/torchvision.transforms.functional.resized_crop.html?highlight=resizedcrop). Thus, this PR also provides two serialization processing operators: cropResize and cropResizeMakeBorder.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [N/A] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
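
The description of #3608 mentions a fused cropResize operator. As a rough CPU-only illustration of what fusing a crop with a resize means, here is a nearest-neighbour sketch that reads directly from the region of interest and never materialises the cropped image. It is not the DVPP implementation: the `Image` and `Rect` types and the function `cropResizeNearest` are hypothetical, the image is single-channel and row-major, the rounding rule (integer floor of the scaled coordinate) is one common choice, and the region is assumed to lie inside the source with non-zero sizes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical single-channel, row-major image.
struct Image
{
    std::size_t rows;
    std::size_t cols;
    std::vector<int> data;
};

// Hypothetical region of interest in pixel coordinates.
struct Rect
{
    std::size_t x;
    std::size_t y;
    std::size_t width;
    std::size_t height;
};

// Crop `roi` from `src` and resize it to dstRows x dstCols in one pass
// using nearest-neighbour sampling. Assumes roi lies inside src and that
// dstRows, dstCols, roi.width and roi.height are all non-zero.
Image cropResizeNearest(const Image& src, const Rect& roi,
                        std::size_t dstRows, std::size_t dstCols)
{
    Image dst{dstRows, dstCols, std::vector<int>(dstRows * dstCols)};
    for (std::size_t r = 0; r < dstRows; ++r)
    {
        // Map the destination row back into the region of interest.
        const std::size_t srcY = roi.y + r * roi.height / dstRows;
        for (std::size_t c = 0; c < dstCols; ++c)
        {
            const std::size_t srcX = roi.x + c * roi.width / dstCols;
            dst.data[r * dstCols + c] = src.data[srcY * src.cols + srcX];
        }
    }
    return dst;
}
```

The point of the fused form is that each output pixel is computed from the source in a single step, which avoids the intermediate buffer that separate crop and resize calls would need.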


hipudding force-pushed the npu_support branch 5 times, most recently from fe53efd to 0a751c3 Compare August 22, 2023 02:41

hipudding changed the title ~~Support Ascend NPU~~ Support operators to execute on CANN backend Aug 22, 2023

hipudding force-pushed the npu_support branch from 0a751c3 to 66b30f0 Compare August 22, 2023 02:43

hipudding force-pushed the npu_support branch 2 times, most recently from 4176ecd to dc927ce Compare August 30, 2023 08:04

hipudding changed the title ~~Support operators to execute on CANN backend~~ Add operators support for Ascend NPU (CANN backend) Aug 31, 2023

hipudding force-pushed the npu_support branch 2 times, most recently from 381d853 to b07acec Compare August 31, 2023 03:15

opencv-alalek reviewed Sep 10, 2023

hipudding force-pushed the npu_support branch from dbceb99 to c5770bc Compare September 15, 2023 08:46

hipudding force-pushed the npu_support branch from c5770bc to 981baad Compare September 15, 2023 08:53

hipudding mentioned this pull request Sep 15, 2023

Suport Ascend NPU opencv/opencv#24277

Closed

5 tasks

hipudding requested a review from opencv-alalek September 15, 2023 09:09

hipudding force-pushed the npu_support branch 9 times, most recently from 3df0507 to 2820b30 Compare September 22, 2023 07:42

hipudding force-pushed the npu_support branch from 26650b0 to d9dc1e5 Compare November 3, 2023 08:34

hipudding mentioned this pull request Nov 3, 2023

Link lib_acl_op_compiler when compile with CANN opencv/opencv#24488

Merged

5 tasks

hipudding force-pushed the npu_support branch 4 times, most recently from 2be0761 to cb3809a Compare November 6, 2023 09:03

fengyuentau mentioned this pull request Nov 10, 2023

Added CI pipeline with openEuler22.03.SP2 and Ascend310 opencv/ci-gha-workflow#120

Merged

hipudding force-pushed the npu_support branch 2 times, most recently from aedad52 to 310a96f Compare November 16, 2023 07:28

fengyuentau added the category: cann label Nov 16, 2023

hipudding force-pushed the npu_support branch from 310a96f to 2b98295 Compare November 16, 2023 07:55

hipudding force-pushed the npu_support branch from 2b98295 to ebfbef1 Compare November 16, 2023 08:09

vpisarev merged commit 3c5635e into opencv:4.x Nov 21, 2023
10 checks passed

opencv-alalek removed their request for review November 27, 2023 11:14

MengqingCao mentioned this pull request Dec 13, 2023

Add additional image processing operators for Ascend NPU by utilizing DVPP #3608

Merged

5 tasks

vpisarev mentioned this pull request Feb 14, 2024

Introducing non-CPU HAL for OpenCV 5+ opencv/opencv#25025

Open
