In this work, we evaluate the performance of different machine learning strategies for determining the crop type of a set of pixels in a multiband spectral image. The image was acquired by the airborne AVIRIS (Airborne Visible/Infrared Imaging Spectrometer) sensor and covers a region of Indiana, United States.
First, in terms of linear classifiers, the performance of the linear Support Vector Machine (SVM), Logistic Regression (LR), and Linear Discriminant Analysis (LDA) algorithms is studied. After that, non-linear classifiers such as k-Nearest Neighbours (k-NN), polynomial and Gaussian kernel SVM, and Binary Decision Tree are analyzed. Finally, ensemble classifiers based on Bagging, including Random Forest (RF), and on Boosting are studied. The optimal parameters of these classifiers are determined by cross-validation.
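As an illustration of how cross-validation can select a classifier parameter, the following minimal sketch chooses the number of neighbours for k-NN. It is not the code used in this work: the synthetic data, the helper names (`make_toy_pixels`, `knn_predict`, `cross_val_accuracy`, `select_k`), the five folds, and the candidate values of k are all assumptions made for the example.

```python
import numpy as np

def make_toy_pixels(n_per_class=40, n_bands=10, seed=0):
    """Hypothetical stand-in for labelled pixels: 3 classes, one row per pixel."""
    rng = np.random.default_rng(seed)
    X = np.vstack([rng.normal(loc=c, scale=1.0, size=(n_per_class, n_bands))
                   for c in range(3)])
    y = np.repeat(np.arange(3), n_per_class)
    return X, y

def knn_predict(X_train, y_train, X_test, k):
    """Label each test row by majority vote among its k nearest training rows."""
    # Squared Euclidean distances, shape (n_test, n_train).
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

def cross_val_accuracy(X, y, k, n_folds=5, seed=0):
    """Mean accuracy of k-NN over n_folds held-out folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    scores = []
    for i in range(n_folds):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        pred = knn_predict(X[train], y[train], X[test], k)
        scores.append(np.mean(pred == y[test]))
    return float(np.mean(scores))

def select_k(X, y, candidates=(1, 3, 5, 7)):
    """Return the candidate k with the highest cross-validated accuracy."""
    scores = {k: cross_val_accuracy(X, y, k) for k in candidates}
    best = max(scores, key=scores.get)
    return best, scores

X, y = make_toy_pixels()
best_k, cv_scores = select_k(X, y)
print(best_k, cv_scores)
```

The same loop applies to any of the classifiers above by replacing `knn_predict` and the candidate grid with the corresponding model and parameter values.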
On the feature extraction side, several linear and non-linear, supervised and unsupervised feature extraction techniques are used. First, in terms of linear multivariate analysis methods, the performance of the Principal Component Analysis (PCA), Partial Least Squares (PLS), Canonical Correlation Analysis (CCA), and Linear Discriminant Analysis (LDA) algorithms is studied. After that, non-linear feature extraction techniques based on kernel methods, such as Kernel Principal Component Analysis (KPCA), Kernel Partial Least Squares (KPLS), and Kernel Canonical Correlation Analysis (KCCA), are analyzed.
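To make the simplest of these techniques concrete, the sketch below computes PCA through a singular value decomposition of the centred data matrix. It is an illustrative example under stated assumptions, not the implementation used in this work: the random data, the function names `pca_fit` and `pca_transform`, and the choice of three components are hypothetical.

```python
import numpy as np

def pca_fit(X, n_components):
    """Fit PCA on X (rows = pixels, columns = spectral bands) via SVD."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = (S[:n_components] ** 2) / (len(X) - 1)
    return mean, components, explained_variance

def pca_transform(X, mean, components):
    """Project X onto the retained principal directions."""
    return (X - mean) @ components.T

# Hypothetical data: 50 pixels, 8 bands with increasing spread per band.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8)) * np.arange(1, 9)
mean, components, explained_variance = pca_fit(X, n_components=3)
Z = pca_transform(X, mean, components)
print(Z.shape, explained_variance)
```

PCA is unsupervised, so the labels play no role here. The supervised methods (PLS, CCA, LDA) additionally use the class labels when choosing the projection directions.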
Finally, several feature selection techniques are analyzed. First, in terms of Filter methods, the performance of the ANOVA F-Test, Mutual Information, Random Forest, and Hilbert-Schmidt Independence Criterion (HSIC) algorithms is studied. Then, the Minimum Redundancy Maximal Relevance (mRMR) algorithm, which is a Search method, is studied. After that, a Wrapper feature selection technique is analyzed, using the Recursive Feature Elimination (RFE) method. We also study the performance of two Embedded methods: L1-regularized SVM and L1-regularized Logistic Regression.
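As an example of a Filter method, the sketch below ranks spectral bands by their one-way ANOVA F-statistic and keeps the highest-scoring ones. It is a minimal illustration, not the code used in this work: the synthetic data, the function names `anova_f_scores` and `select_top_bands`, and the number of selected bands are assumptions.

```python
import numpy as np

def anova_f_scores(X, y):
    """One-way ANOVA F-statistic of each column of X with respect to labels y."""
    classes = np.unique(y)
    n, n_classes = len(y), len(classes)
    grand_mean = X.mean(axis=0)
    ss_between = np.zeros(X.shape[1])
    ss_within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        ss_between += len(Xc) * (class_mean - grand_mean) ** 2
        ss_within += ((Xc - class_mean) ** 2).sum(axis=0)
    ms_between = ss_between / (n_classes - 1)
    ms_within = ss_within / (n - n_classes)
    return ms_between / ms_within

def select_top_bands(X, y, n_selected):
    """Indices of the n_selected bands with the largest F-statistic."""
    scores = anova_f_scores(X, y)
    return np.argsort(scores)[::-1][:n_selected]

# Hypothetical data: 6 bands, of which only bands 1 and 4 depend on the class.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 30)
X = rng.normal(size=(90, 6))
X[:, 1] += 3.0 * y
X[:, 4] += 3.0 * y
top = select_top_bands(X, y, n_selected=2)
print(sorted(top.tolist()))
```

A filter of this kind scores each band independently of the classifier. Wrapper methods such as RFE and the Embedded L1 methods instead rely on a fitted model to decide which bands to keep.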
Data availability: contact the author.
University Carlos III of Madrid, Machine Learning Applications.