Mapping Tree Species Using Advanced Remote Sensing Technologies: A State-of-the-Art Review and Perspective

Table 5

A summary of methods/algorithms frequently used in tree species (TS) classification.

Each entry below gives the algorithm/model, its characteristic and description, its advantage and limitation, and (where reported) the major factor in applying it.

Spectral mixture analysis

Linear spectral mixture model (LSM)

Characteristic and description: LSM models each pixel's spectrum as a linear combination of the spectral signatures of surface materials, weighted by their corresponding areal proportions. The endmembers (EMs, e.g., TS) are the same for every pixel in the image.

Advantage and limitation: Simple, and provides a physically meaningful measure of TS abundance within mixed pixels. LSM cannot account for subtle spectral differences between TS, and the maximum number of EMs is limited by the number of bands.

Major factor: Define and extract representative EM training spectra (TS).
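The linear mixing step can be sketched as a least-squares inversion. The endmember spectra and the mixed pixel below are hypothetical, and a real application would typically add sum-to-one and nonnegativity constraints on the fractions:

```python
import numpy as np

# Hypothetical endmember (EM) spectra as columns: two tree species, 4 bands.
E = np.array([
    [0.10, 0.40],
    [0.15, 0.45],
    [0.30, 0.20],
    [0.50, 0.10],
])

# Synthetic mixed pixel: 70% species A, 30% species B.
true_fracs = np.array([0.7, 0.3])
pixel = E @ true_fracs

# Unconstrained least-squares unmixing: solve E f ≈ pixel for the fractions f.
fracs, *_ = np.linalg.lstsq(E, pixel, rcond=None)
print(fracs)  # ≈ [0.7, 0.3]
```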

Multiple endmember spectral mixture analysis (MESMA)

Characteristic and description: The number of EMs (TS) is not limited by the number of bands, and the EMs are allowed to vary from pixel to pixel. A series of candidate 2-/3-EM models is evaluated for each pixel, and an optimal model is finally adopted based on selection criteria.

Advantage and limitation: Overcomes the limitations of LSM: the number and types of EMs (TS) are not limited by the number of bands in the image. Defining a large number of 2-/3-EM models takes time.

Major factor: Define and extract training spectra (TS); define a large number of 2-/3-EM models.
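The per-pixel model selection can be illustrated as follows; the three-species spectral library and the use of fit RMSE as the selection criterion are illustrative assumptions:

```python
import numpy as np
from itertools import combinations

def unmix_rmse(E, pixel):
    """Least-squares fractions for endmember matrix E, plus the fit RMSE."""
    fracs, *_ = np.linalg.lstsq(E, pixel, rcond=None)
    resid = pixel - E @ fracs
    return fracs, np.sqrt(np.mean(resid ** 2))

# Hypothetical 4-band library spectra for three candidate species.
library = {
    "speciesA": np.array([0.10, 0.15, 0.30, 0.50]),
    "speciesB": np.array([0.40, 0.45, 0.20, 0.10]),
    "speciesC": np.array([0.25, 0.30, 0.25, 0.20]),
}

# A pixel actually mixed from species A and B.
pixel = 0.6 * library["speciesA"] + 0.4 * library["speciesB"]

# Evaluate every candidate 2-EM model; keep the one with the lowest RMSE.
best = min(
    (unmix_rmse(np.column_stack([library[a], library[b]]), pixel) + ((a, b),)
     for a, b in combinations(library, 2)),
    key=lambda t: t[1],
)
fracs, rmse, model = best
print(model, fracs, rmse)
```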

Maximum likelihood classifier (MLC)

Characteristic and description: A standard parametric classifier that assumes the training data are normally distributed; the training data are used to compute a probability density function for each class, and the decision rule is based on probability.

Advantage and limitation: Unbiased given large sample sizes; biased for small samples, and sensitive to the number of input variables.

Major factor: Meet the required minimum training sample size for each class (TS).
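The probability-based decision rule can be sketched as below, assuming hypothetical per-class means and covariances estimated from training samples:

```python
import numpy as np

def mlc_classify(x, class_stats):
    """Assign x to the class with the highest Gaussian log-likelihood.
    class_stats maps class name -> (mean vector, covariance matrix)."""
    best, best_ll = None, -np.inf
    for name, (mu, cov) in class_stats.items():
        d = x - mu
        # Multivariate-normal log-likelihood, up to a shared constant.
        ll = -0.5 * (np.log(np.linalg.det(cov)) + d @ np.linalg.solve(cov, d))
        if ll > best_ll:
            best, best_ll = name, ll
    return best

# Hypothetical 2-band class statistics (species names are placeholders).
stats = {
    "oak":  (np.array([0.20, 0.45]), np.eye(2) * 0.01),
    "pine": (np.array([0.30, 0.30]), np.eye(2) * 0.01),
}
print(mlc_classify(np.array([0.21, 0.44]), stats))  # → oak
```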

Linear discriminant analysis (LDA)

Characteristic and description: A parametric classifier that assumes normally distributed training data. The output is a linear combination of the input variables, and decision boundaries in feature space are based on multispectral distance measurements among the training classes.

Advantage and limitation: Easier interpretation of between-class differences and of each input variable's contribution to the output classes; less sensitive to ill-posed problems, but limited in its ability to deal with multicollinearity and outliers.
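For two classes, the linear-combination output can be sketched as Fisher's discriminant; the class means, the pooled covariance, and the midpoint threshold below are hypothetical:

```python
import numpy as np

# Hypothetical class means and a shared (pooled) covariance, 2 bands.
mu0, mu1 = np.array([0.2, 0.4]), np.array([0.5, 0.1])
cov = np.eye(2) * 0.01

# Fisher's linear discriminant: the weights define a linear combination rule.
w = np.linalg.solve(cov, mu1 - mu0)
threshold = w @ (mu0 + mu1) / 2.0

def lda_classify(x):
    """Class 1 if the linear score exceeds the midpoint threshold, else 0."""
    return int(w @ x > threshold)

print(lda_classify(np.array([0.21, 0.41])))  # near mu0 → 0
```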

Logistic regression (LR)

Characteristic and description: LR uses maximum likelihood estimation to evaluate the probability of categorical samples. It defines the odds of an event as the ratio of the probability of occurrence to the probability of nonoccurrence.

Advantage and limitation: No specific assumptions are required on the distribution of the training data, and it accepts both qualitative and quantitative input variables. Sensitive to outliers, and prone to overfitting with unbalanced input data.
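The odds definition above corresponds to a model in which the log-odds are linear in the inputs. A minimal sketch, with hypothetical fitted weights:

```python
import numpy as np

def predict_proba(x, w, b):
    """Logistic model: the log-odds are a linear function of the inputs."""
    log_odds = w @ x + b
    return 1.0 / (1.0 + np.exp(-log_odds))

# Hypothetical fitted weights for two spectral features.
w, b = np.array([2.0, -1.0]), 0.5
x = np.array([1.0, 1.0])
p = predict_proba(x, w, b)
odds = p / (1.0 - p)  # ratio of occurrence to nonoccurrence
print(p, odds)        # odds equals exp(w @ x + b)
```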

Spectral angle mapper (SAM)

Characteristic and description: Calculates the spectral similarity between a reference spectrum and a test spectrum as the "angle" between them, treating both spectra as vectors in the band space of an MS or HS image.

Advantage and limitation: The spectral similarity measure is insensitive to gain factors (illumination). SAM does not consider the variation within a pixel.

Major factor: The image data must have been reduced to "apparent reflectance."
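The angle computation, and its insensitivity to a gain factor, can be sketched with hypothetical 4-band spectra:

```python
import numpy as np

def spectral_angle(r, t):
    """Angle (radians) between reference and test spectra viewed as vectors."""
    cos = np.dot(r, t) / (np.linalg.norm(r) * np.linalg.norm(t))
    return np.arccos(np.clip(cos, -1.0, 1.0))

ref = np.array([0.10, 0.15, 0.30, 0.50])
# A brighter copy of the same material: the gain changes, the angle does not.
test_bright = 1.8 * ref
test_other = np.array([0.40, 0.45, 0.20, 0.10])

print(spectral_angle(ref, test_bright))  # ≈ 0.0
print(spectral_angle(ref, test_other))   # clearly > 0
```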

k-nearest neighbors (k-NN)

Characteristic and description: k-NN is a nonparametric method in which a pixel/IO is classified by a majority vote of its neighbors and assigned to the most common class among its k nearest neighbors.

Advantage and limitation: Simple and effective, and appropriate for samples that cross multiple classes, but it can be highly affected by how representative the training samples are for each class.
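The majority vote can be sketched directly; the 2-band training spectra and species labels are hypothetical:

```python
import numpy as np
from collections import Counter

def knn_classify(x, X_train, y_train, k=3):
    """Majority vote among the k training samples nearest to x."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

# Hypothetical 2-band training spectra for two species.
X = np.array([[0.20, 0.40], [0.22, 0.42], [0.21, 0.39],
              [0.50, 0.10], [0.52, 0.12], [0.49, 0.11]])
y = ["oak", "oak", "oak", "pine", "pine", "pine"]

print(knn_classify(np.array([0.21, 0.41]), X, y, k=3))  # → oak
```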

Decision tree (DT)

Characteristic and description: A nonparametric classifier used for both feature selection and target classification; achieving satisfactory results depends on determining the "best" tree structure and decision boundaries.

Advantage and limitation: Computationally fast, less sensitive to overfitting, and reports the importance of input variables. The "best" tree structure and decision boundaries are not easy to "find," and it might overfit in the presence of noisy data.
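A hand-built two-level tree illustrates the structure a learner would search for; the band names and thresholds are hypothetical:

```python
def classify(pixel):
    """A hand-built two-level decision tree on band thresholds.
    The thresholds are illustrative; a tree learner would search for
    the 'best' split values from training data."""
    nir, red = pixel["nir"], pixel["red"]
    if nir < 0.3:      # low NIR: non-vegetation or shadow
        return "shadow"
    if red < 0.1:      # high NIR, very low red: conifer-like
        return "pine"
    return "oak"

print(classify({"nir": 0.5, "red": 0.05}))  # → pine
```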

Support vector machine (SVM)

Characteristic and description: A nonparametric classifier that maps data from spectral space into a feature space, in which continuous predictors are partitioned into binary categories by an optimal n-dimensional hyperplane.

Advantage and limitation: Handles high-dimensional data efficiently, deals with noisy samples robustly, and uses only the so-called support vectors to construct the classification model. The data-mapping procedure is relatively complicated, and the kernel function parameters must be selected.

Major factor: Mapping data from the original input feature space to a kernel feature space.
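The kernel decision function, which sums over the support vectors only, can be sketched as below; the support vectors, signed weights, and bias stand in for a trained model:

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel: an implicit mapping into a high-dimensional feature space."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def svm_decision(x, support_vectors, alphas_y, b, gamma=1.0):
    """Kernel SVM decision value: a weighted sum over support vectors only."""
    k = np.array([rbf_kernel(sv, x, gamma) for sv in support_vectors])
    return alphas_y @ k + b

# Hypothetical trained model: two support vectors with signed weights a_i*y_i.
svs = np.array([[0.2, 0.4], [0.5, 0.1]])
alphas_y = np.array([1.0, -1.0])
bias = 0.0

d = svm_decision(np.array([0.21, 0.41]), svs, alphas_y, bias)
print("class +1" if d > 0 else "class -1")  # nearer the +1 support vector
```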

Artificial neural network (ANN)

Characteristic and description: A nonparametric supervised classifier. A backpropagation algorithm is often used to train a multilayer perceptron network, with various features as input and classes (TS) as output.

Advantage and limitation: Able to estimate properties (patterns and trends) of the data from limited training samples. The nature of the hidden layers is poorly understood, and it takes time to find a set of ideal structural parameters.

Major factor: Test and find a set of ideal architecture parameters.
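A forward pass through a one-hidden-layer perceptron can be sketched as below; random weights stand in for ones learned by backpropagation, and the layer sizes are arbitrary:

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """One-hidden-layer perceptron: features in, class probabilities out."""
    h = np.tanh(W1 @ x + b1)           # hidden-layer activations
    scores = W2 @ h + b2
    e = np.exp(scores - scores.max())  # softmax over output classes (TS)
    return e / e.sum()

rng = np.random.default_rng(0)
# Hypothetical weights: 4 input features, 3 hidden units, 2 classes.
W1, b1 = rng.normal(size=(3, 4)), np.zeros(3)
W2, b2 = rng.normal(size=(2, 3)), np.zeros(2)

p = mlp_forward(np.array([0.1, 0.2, 0.3, 0.4]), W1, b1, W2, b2)
print(p)  # two class probabilities summing to 1
```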

Convolutional neural network (CNN)

Characteristic and description: CNNs are constructed from neurons and links with learnable weights and biases. Each neuron receives a weighted sum of several inputs, passes it through an activation function, and responds with an output. A common CNN architecture consists of an input layer, stacks of convolution and pooling layers, a fully connected layer, and an output layer.

Advantage and limitation: Efficiently processes multidimensional images and often yields better classification results than other classifiers. The model structure is complex, and corresponding tools/software for different CNN architectures are often not available or accessible.
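The convolution–activation–pooling building blocks can be sketched in plain numpy; the toy image and the fixed edge filter are illustrative, whereas a real CNN learns its kernels during training:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D convolution (cross-correlation) of a single channel."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Common activation function: keep positives, zero out negatives."""
    return np.maximum(x, 0.0)

def max_pool2(x):
    """2x2 max pooling with stride 2."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Toy 6x6 "image" and a hypothetical 3x3 vertical-edge filter.
img = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

feat = max_pool2(relu(conv2d(img, kernel)))
print(feat.shape)  # (2, 2)
```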