
Review Article

Mapping Tree Species Using Advanced Remote Sensing Technologies: A State-of-the-Art Review and Perspective

Table 5

A summary of methods/algorithms frequently used in tree species (TS) classification.

For each algorithm/model, the entry below gives its characteristic and description, its advantage and limitation, the major factor in applying it, and example references.

Spectral mixture analysis

Linear spectral mixture model (LSM)
Characteristic and description: LSM models each pixel spectrum as a linear combination of the spectral signatures of surface materials, with their corresponding areal proportions as weighting factors. The endmembers (EMs, e.g., TS) in LSM are the same for every pixel in an image.
Advantage and limitation: Simple, and provides a physically meaningful measure of TS abundance within mixed pixels. However, LSM cannot account for subtle spectral differences between TS, and the maximum number of EMs is limited by the number of bands.
Major factor: Defining and extracting representative EM training spectra (TS).
Examples: [153, 154]
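
As a sketch of the unmixing step, the snippet below solves the linear mixture equation for one pixel by least squares. The endmember matrix and spectra are illustrative values, not from the cited studies, and operational unmixing usually adds non-negativity and sum-to-one constraints.

```python
# Minimal linear spectral unmixing sketch; E and y are illustrative.
import numpy as np

def unmix_pixel(E, y):
    """Estimate endmember fractions f such that y ~ E @ f by unconstrained
    least squares. Real workflows typically constrain f to be non-negative
    (e.g., scipy.optimize.nnls) and to sum to one."""
    f, *_ = np.linalg.lstsq(E, y, rcond=None)
    return f

# Toy case: 4 bands, 2 endmembers (columns of E are endmember spectra).
E = np.array([[0.10, 0.40],
              [0.15, 0.45],
              [0.60, 0.30],
              [0.70, 0.25]])
true_f = np.array([0.3, 0.7])      # areal proportions summing to 1
y = E @ true_f                     # noise-free mixed-pixel spectrum
print(unmix_pixel(E, y))           # ~ [0.3, 0.7]
```
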
Multiple endmember spectral mixture analysis (MESMA)
Characteristic and description: The number and types of EMs (TS) are not limited by the number of bands and are allowed to vary across pixels. A series of candidate 2-/3-EM models is evaluated for each pixel, and an optimal model is adopted based on selection criteria.
Advantage and limitation: Overcomes the limitations of LSM: the number and types of EMs (TS) are not limited by the number of bands in the image. However, it takes time to define and evaluate a large number of 2-/3-EM models.
Major factor: Defining and extracting training spectra (TS); defining a large number of 2-/3-EM models.
Examples: [155, 156]
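
The snippet below illustrates the per-pixel model selection idea with 2-EM models only: every candidate endmember pair is fit, and the pair with the lowest reconstruction RMSE wins. The spectral library is hypothetical, and operational MESMA additionally applies fraction and RMSE thresholds (and typically includes a shade endmember).

```python
# MESMA-style 2-endmember model selection sketch; the library is illustrative.
import itertools
import numpy as np

def select_2em_model(library, y):
    """library: dict name -> spectrum (bands,); y: mixed-pixel spectrum.
    Returns (best_pair, fractions, rmse) over all 2-EM candidates."""
    best = (None, None, np.inf)
    for a, b in itertools.combinations(library, 2):
        E = np.column_stack([library[a], library[b]])
        f, *_ = np.linalg.lstsq(E, y, rcond=None)
        rmse = np.sqrt(np.mean((y - E @ f) ** 2))
        if rmse < best[2]:
            best = ((a, b), f, rmse)
    return best

lib = {"oak":  np.array([0.10, 0.15, 0.60, 0.70]),
       "pine": np.array([0.40, 0.45, 0.30, 0.25]),
       "soil": np.array([0.30, 0.35, 0.40, 0.45])}
y = 0.6 * lib["oak"] + 0.4 * lib["pine"]
print(select_2em_model(lib, y))    # selects the ("oak", "pine") model
```
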
Traditional classification method

Maximum likelihood classifier (MLC)
Characteristic and description: A standard parametric classifier that assumes a normal distribution of the training data, which are used to compute a probability density function for each class. The decision rule is based on probability.
Advantage and limitation: Unbiased in the presence of large sample sizes, but biased for small samples and sensitive to the number of input variables.
Major factor: Meeting the required minimum training sample size for each class (TS).
Examples: [38, 60, 70]
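
A minimal Gaussian MLC can be written directly from per-class means and covariances, as sketched below; the training arrays are synthetic stand-ins for real spectra.

```python
# Gaussian maximum likelihood classifier sketch; X and labels are synthetic.
import numpy as np
from scipy.stats import multivariate_normal

def fit_mlc(X, labels):
    """One Gaussian density per class, estimated from the training data."""
    return {c: multivariate_normal(X[labels == c].mean(axis=0),
                                   np.cov(X[labels == c], rowvar=False))
            for c in np.unique(labels)}

def predict_mlc(models, X):
    """Assign each sample to the class with the highest density."""
    dens = np.column_stack([m.pdf(X) for m in models.values()])
    return np.array(list(models))[np.argmax(dens, axis=1)]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.2, 0.05, (50, 4)),    # class 0 "spectra"
               rng.normal(0.6, 0.05, (50, 4))])   # class 1 "spectra"
labels = np.repeat([0, 1], 50)
print(predict_mlc(fit_mlc(X, labels), X[:3]))     # expect class 0
```
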
Linear discriminant analysis (LDA)
Characteristic and description: A parametric classifier that assumes a normal distribution of the training data. The output is a linear combination of the input variables, with decision boundaries in feature space based on multispectral distance measurements among the training classes.
Advantage and limitation: Easier interpretation of between-class differences and of each input variable's contribution to the output classes; less sensitive to ill-posed problems, but with limited ability to deal with multicollinearity and outliers.
Major factor: Determining and collecting training samples.
Examples: [30, 39, 118]
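
The sketch below fits scikit-learn's LDA to hypothetical spectra; the fitted coefficients expose each input variable's contribution to the class separation, which is the interpretability advantage noted above.

```python
# LDA sketch; the feature matrix and species labels are hypothetical.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.3, 0.05, (40, 6)),
               rng.normal(0.5, 0.05, (40, 6))])
y = np.array(["spruce"] * 40 + ["birch"] * 40)

lda = LinearDiscriminantAnalysis().fit(X, y)
print(lda.predict(X[:3]))    # expect "spruce"
print(lda.coef_)             # per-variable contribution to the discriminant
```
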
Logistic regression (LR)
Characteristic and description: LR uses maximum likelihood estimation to evaluate the probability of categorical samples. It defines the odds of an event as the ratio of the probability of occurrence to that of nonoccurrence.
Advantage and limitation: No specific assumptions about the distribution of the training data, and it accepts both qualitative and quantitative input variables. However, it is sensitive to outliers and has overfitting problems with unbalanced input data.
Major factor: Large and balanced input data are needed.
Examples: [55, 109, 129]
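
The snippet below is a minimal LR sketch on deliberately unbalanced synthetic data; class_weight="balanced" is one common mitigation for the unbalanced-data sensitivity noted above, and the last line prints the odds as defined in the description.

```python
# Logistic regression sketch; X and y are synthetic and deliberately unbalanced.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.3, 0.1, (90, 5)),
               rng.normal(0.7, 0.1, (10, 5))])
y = np.array([0] * 90 + [1] * 10)

clf = LogisticRegression(class_weight="balanced").fit(X, y)
proba = clf.predict_proba(X[:2])
print(proba)                      # P(nonoccurrence), P(occurrence)
print(proba[:, 1] / proba[:, 0])  # odds: P(occurrence) / P(nonoccurrence)
```
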
Spectral angle mapper (SAM)
Characteristic and description: Calculates the spectral similarity between a reference and a test spectrum as the "angle" between the two, treating both spectra as vectors in the band space of an MS or HS image.
Advantage and limitation: This spectral similarity measure is insensitive to gain factors (illumination). However, SAM does not consider the variation within a pixel.
Major factor: The image data must have been reduced to "apparent reflectance."
Examples: [14, 82, 158]
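
Since the angle depends only on spectral shape, a uniformly dimmed pixel still matches its reference, as the sketch below shows with illustrative reference spectra.

```python
# Spectral angle mapper sketch; the reference spectra are illustrative.
import numpy as np

def spectral_angle(r, t):
    """Angle (radians) between reference r and test t spectrum vectors."""
    cos = np.dot(r, t) / (np.linalg.norm(r) * np.linalg.norm(t))
    return np.arccos(np.clip(cos, -1.0, 1.0))

refs = {"oak":  np.array([0.10, 0.15, 0.60, 0.70]),
        "pine": np.array([0.40, 0.45, 0.30, 0.25])}
test = 0.8 * refs["oak"]                    # same shape, dimmer illumination
angles = {k: spectral_angle(v, test) for k, v in refs.items()}
print(min(angles, key=angles.get))          # "oak": gain does not change angle
```
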
k-nearest neighbor (k-NN)
Characteristic and description: k-NN is a nonparametric method in which a pixel/image object is classified by a majority vote of its neighbors, being assigned to the most common class among its k nearest neighbors.
Advantage and limitation: Simple and effective, and appropriate for samples that cross multiple classes, but it can be highly affected by how representative the training samples are for each class.
Major factor: Determination of k's value.
Examples: [96, 146, 159]
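
A minimal k-NN sketch with scikit-learn follows; k (n_neighbors) is the parameter flagged as the major factor, and the data are synthetic.

```python
# k-NN sketch; X and y are synthetic.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0.3, 0.05, (30, 4)),
               rng.normal(0.6, 0.05, (30, 4))])
y = np.array(["fir"] * 30 + ["aspen"] * 30)

knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)  # majority vote of k=5
print(knn.predict(X[:2]))                            # expect "fir"
```
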
Advanced classifiers/models (machine learning (ML) methods)

Random forest (RF)
Characteristic and description: A nonparametric classifier for both feature selection and target classification; achieving satisfactory results depends on finding the "best" tree structure and decision boundaries.
Advantage and limitation: Computationally fast, less sensitive to overfitting, and outputs the variables' importance. However, the "best" tree structure and decision boundaries are not easy to find, and it might overfit in the presence of noisy data.
Major factor: Finding the "best" tree structure.
Examples: [17, 34, 160]
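
The sketch below trains a small forest on synthetic spectra and prints the per-variable importance used for feature selection; the ensemble size is an illustrative choice.

```python
# Random forest sketch; X, y, and n_estimators are illustrative.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0.3, 0.05, (50, 6)),
               rng.normal(0.6, 0.05, (50, 6))])
y = np.repeat(["beech", "larch"], 50)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
print(rf.predict(X[:2]))         # expect "beech"
print(rf.feature_importances_)   # per-band importance for feature selection
```
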
Support vector machine (SVM)
Characteristic and description: A nonparametric classifier that maps data from spectral space into a feature space, wherein continuous predictors are partitioned into binary categories by an optimal n-dimensional hyperplane.
Advantage and limitation: Handles high-dimensional data efficiently, deals with noisy samples in a robust way, and uses only the so-called support vectors to construct the classification model. However, the data-mapping procedure is relatively complicated, and the kernel function parameters must be selected.
Major factor: Mapping data from the original input feature space to a kernel feature space.
Examples: [38, 107, 139]
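
The kernel mapping itself is implicit in scikit-learn's SVC; what remains for the analyst is selecting the kernel parameters, which the sketch below does with a small cross-validated grid over C and gamma on synthetic data.

```python
# RBF-kernel SVM sketch with parameter selection; X, y, and the grid are
# illustrative.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0.3, 0.05, (50, 6)),
               rng.normal(0.6, 0.05, (50, 6))])
y = np.repeat([0, 1], 50)

grid = GridSearchCV(SVC(kernel="rbf"),
                    {"C": [1, 10], "gamma": ["scale", 0.1]}, cv=3)
grid.fit(X, y)
print(grid.best_params_, grid.predict(X[:2]))   # expect class 0
```
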
Artificial neural network (ANN)
Characteristic and description: A nonparametric supervised classifier. A backpropagation algorithm is often used to train a multilayer perceptron neural network with various features as input and classes (TS) as output.
Advantage and limitation: Able to estimate the properties (patterns and trends) of data from limited training samples. However, the nature of the hidden layers is poorly understood, and it takes time to find a set of ideal structure parameters.
Major factor: Testing and finding a set of ideal architecture parameters.
Examples: [38, 92, 107]
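
A minimal backpropagation-trained multilayer perceptron follows; the hidden layer sizes are exactly the architecture parameters that must be tested, and the data are synthetic.

```python
# Multilayer perceptron sketch; X, y, and the hidden layer sizes are
# illustrative.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(6)
X = np.vstack([rng.normal(0.3, 0.05, (60, 8)),
               rng.normal(0.6, 0.05, (60, 8))])
y = np.repeat([0, 1], 60)

mlp = MLPClassifier(hidden_layer_sizes=(16, 8),   # architecture to tune
                    max_iter=2000, random_state=0).fit(X, y)
print(mlp.predict(X[:2]))                         # expect class 0
```
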
Convolutional neural networks (CNNs)
Characteristic and description: CNNs are constructed from neurons and links with learnable weights and biases. Each neuron receives a weighted sum of several inputs, passes it through an activation function, and responds with an output. A common CNN architecture consists of an input layer, stacks of convolution and pooling layers, a fully connected layer, and an output layer.
Advantage and limitation: Efficiently processes multidimensional images and often yields better classification results than other classifiers. However, the model structure is complex, and tools/software for different CNN architectures are often not readily available or accessible.
Major factor: Building the convolutional layers.
Examples: [107, 119, 161]
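
As a sketch of the architecture just described, the PyTorch model below stacks convolution and pooling layers ahead of a fully connected output layer; the 5-band, 32 x 32 patch size and layer widths are illustrative assumptions, not from the cited studies.

```python
# Small CNN sketch mirroring input -> conv/pool stacks -> fully connected
# -> output; band count, patch size, and widths are illustrative.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, n_bands=5, n_species=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(n_bands, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                     # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                     # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(32 * 8 * 8, n_species)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

patches = torch.randn(2, 5, 32, 32)              # a batch of image patches
print(SmallCNN()(patches).shape)                 # torch.Size([2, 4])
```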