The advent of deep learning has yielded remarkable results across various computer vision tasks. In a bid to enhance performance, several traditional computer vision techniques have been integrated into deep learning frameworks. This repository presents a comparative study examining the performance of deep neural networks when augmented with traditional computer vision algorithms: Warping, SIFT, Edge Detection, and Gabor Filters.
- SIFT: We used `SIFT` provided by OpenCV in 3 modes: `SIFT-Default`, `SIFT-Circle`, and `SIFT-CircleBlur`, as shown in the following images. The parameters `n_features=200, contrastThreshold=0.04, edgeThreshold=10, sigma=1.6` are fixed for all experiments. The fourth image below depicts `SIFT-CircleBlur`, where blurring is applied to the colored area in the third image.
- Gabor Filters: We used `gabor_kernel` provided by `skimage.filters`. We concatenated 8 Gabor-filtered images (orientations of 0, 45, 90, and 135 degrees at frequencies 0.3 and 0.5) to the input. We also replaced the first layer of ResNet18 with a Gabor layer, allowing the parameters of the Gabor filters to be trained. Sample images illustrate the output of the 1st layer with 64 channels and their corresponding Gabor filter weights.
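The fixed Gabor filter bank described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the helper names `gabor_channels` and `append_gabor_channels`, the mean-based grayscale conversion, the use of the real part of each kernel, and appending the responses to the RGB channels (giving 11 channels) are all assumptions.

```python
import numpy as np
from scipy.ndimage import convolve
from skimage.filters import gabor_kernel

THETAS_DEG = (0, 45, 90, 135)   # orientations listed in the text
FREQUENCIES = (0.3, 0.5)        # frequencies listed in the text


def gabor_channels(gray):
    """Return an (H, W, 8) stack of Gabor responses for a 2-D grayscale image."""
    responses = []
    for frequency in FREQUENCIES:
        for theta_deg in THETAS_DEG:
            # gabor_kernel returns a complex kernel; using only the real
            # part is an assumption made for this sketch.
            kernel = np.real(gabor_kernel(frequency, theta=np.deg2rad(theta_deg)))
            responses.append(convolve(gray, kernel, mode="reflect"))
    return np.stack(responses, axis=-1)


def append_gabor_channels(rgb):
    """Concatenate the 8 Gabor responses to an (H, W, 3) image (hypothetical helper)."""
    rgb = rgb.astype(np.float32)
    gray = rgb.mean(axis=-1)  # simple grayscale conversion (assumption)
    return np.concatenate([rgb, gabor_channels(gray)], axis=-1)


# Random stand-in for a CIFAR-sized image.
rng = np.random.default_rng(0)
image = rng.random((32, 32, 3)).astype(np.float32)
augmented = append_gabor_channels(image)
```

With a 3-channel input, `augmented` has 3 + 8 = 11 channels, which is the number of input channels the first convolutional layer would then need to accept.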
We used ResNet18 and modified its first convolutional layer to match the number of input image channels. We experimented on CIFAR10 and CIFAR100, as well as on high-resolution datasets such as the Oxford 102 Flower Dataset and the Large Scale Fish Dataset, to see whether the added features improve performance.
Method \ Dataset | Flower102 | CIFAR10 | CIFAR100 | FISH |
---|---|---|---|---|
Baseline | 0.4004 | 0.7169 | 0.4141 | 0.7558 |
SIFT Default | 0.4093 | 0.7071 | 0.3774 | 0.7674 |
SIFT Circle | 0.4046 | 0.7119 | 0.4016 | 0.7558 |
SIFT Blur Circle | 0.4061 | 0.7042 | 0.3936 | 0.6279 |
Perspective | 0.3980 | 0.6947 | 0.3781 | 0.7326 |
Edge (150, 500) | 0.298 | 0.7173 | 0.3843 | 0.7442 |
Gabor | 0.2885 | 0.7183 | 0.3999 | 0.7442 |
Gabor Net | 0.2132 | 0.6776 | 0.3572 | 0.6279 |
We found that traditional computer vision algorithms tend to help deep neural networks more on high-resolution images than on low-resolution ones. On low-resolution images, the algorithms do not appear to extract significant features; instead, they seem to interfere with the feature extraction carried out by the deep learning model.
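To compare each method with the baseline per dataset, the table can be transcribed and differenced in a few lines of Python. The scores are copied from the table as-is; the names `SCORES`, `delta_vs_baseline`, and `best_method` are illustrative only.

```python
DATASETS = ("Flower102", "CIFAR10", "CIFAR100", "FISH")

# Scores copied from the results table, in DATASETS order.
SCORES = {
    "Baseline":         (0.4004, 0.7169, 0.4141, 0.7558),
    "SIFT Default":     (0.4093, 0.7071, 0.3774, 0.7674),
    "SIFT Circle":      (0.4046, 0.7119, 0.4016, 0.7558),
    "SIFT Blur Circle": (0.4061, 0.7042, 0.3936, 0.6279),
    "Perspective":      (0.3980, 0.6947, 0.3781, 0.7326),
    "Edge (150, 500)":  (0.298,  0.7173, 0.3843, 0.7442),
    "Gabor":            (0.2885, 0.7183, 0.3999, 0.7442),
    "Gabor Net":        (0.2132, 0.6776, 0.3572, 0.6279),
}


def delta_vs_baseline(method, dataset):
    """Score difference between `method` and the baseline on `dataset`."""
    i = DATASETS.index(dataset)
    return round(SCORES[method][i] - SCORES["Baseline"][i], 4)


def best_method(dataset):
    """Name of the highest-scoring method on `dataset`."""
    i = DATASETS.index(dataset)
    return max(SCORES, key=lambda method: SCORES[method][i])


for dataset in DATASETS:
    best = best_method(dataset)
    print(f"{dataset}: best = {best} ({delta_vs_baseline(best, dataset):+.4f} vs baseline)")
```

On these numbers, SIFT Default scores highest on Flower102 and FISH, Gabor on CIFAR10, and the baseline on CIFAR100.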
Warping was intended to help extract high-level information by letting the network examine existing images from various angles. However, the expected improvement did not materialize. When the first convolution layer extracts spatial information, the additional algorithm appears to hinder rather than help.
Of the four traditional computer vision algorithms tested (warping, SIFT, edge detection, and Gabor), SIFT applied to high-resolution images gave the most significant performance improvement. Simply adding features generated by classical computer vision techniques as extra input channels did not yield satisfactory results. Replacing feature extraction with classical computer vision, or passing the extracted features to a subsequent layer, might have produced better outcomes.