Research Article  Open Access
Sen Jia, Zhangwei Zhan, Meng Xu, "Shearlet-Based Structure-Aware Filtering for Hyperspectral and LiDAR Data Classification", Journal of Remote Sensing, vol. 2021, Article ID 9825415, 25 pages, 2021. https://doi.org/10.34133/2021/9825415
Shearlet-Based Structure-Aware Filtering for Hyperspectral and LiDAR Data Classification
Abstract
The joint interpretation of hyperspectral images (HSIs) and light detection and ranging (LiDAR) data has developed rapidly in recent years due to continuously evolving image processing technology. Nowadays, most feature extraction methods convolve the raw data with fixed-size filters, so the structural and texture information of objects at multiple scales cannot be sufficiently exploited. In this article, a shearlet-based structure-aware filtering approach, abbreviated as ShearSAF, is proposed for HSI and LiDAR feature extraction and classification. Specifically, superpixel-guided kernel principal component analysis (KPCA) is first applied to the raw HSI to reduce its dimensionality. Then, the KPCA-reduced HSI and LiDAR data are converted to the shearlet domain for texture and area feature extraction. Meanwhile, a superpixel segmentation algorithm is applied to the raw HSI data to obtain an initial oversegmentation map. Subsequently, by utilizing a well-designed minimum merging cost that fully considers spectral (HSI and LiDAR data), texture, and area features, a region merging procedure is gradually conducted to produce a final merging map. Further, a scale map that locally indicates the filter size is obtained by calculating the distance to the nearest edge. Finally, the KPCA-reduced HSI and LiDAR data are convolved with the locally adaptive filters for feature extraction, and a random forest (RF) classifier is adopted for classification. The effectiveness of our ShearSAF approach is verified on three real-world datasets, and the results show that ShearSAF can achieve higher accuracy than the comparison methods, especially with small training sample sets. The codes of this work will be available at http://jiasen.tech/papers/ for the sake of reproducibility.
1. Introduction
Recently, continuously evolving remote sensing sensor technologies have made it possible to capture multisource data over the same area [1, 2]. Among these numerous remote sensing data, hyperspectral images (HSIs) contain joint spectral and spatial information, providing a distinctive ability to discriminate Earth's surface objects. HSIs have hundreds or thousands of narrow spectral bands, covering the spectral region from the visible to the infrared field [3]. In particular, HSIs exhibit both spatial and spectral smoothness, which not only produces detailed and accurate descriptions of objects but also results in a high correlation between adjacent bands [4–6]. For these reasons, some obstacles and challenges exist in the interpretation of HSI information. Specifically, HSIs are prone to information redundancy as a result of the high inter-band correlation, and to the Hughes phenomenon caused by the high spectral dimension [7, 8]. In addition, environmental factors, such as clouds and noise, also cause information confusion when the remote sensor captures scene data [9]. Compared with HSI, LiDAR integrates a laser ranging system, a global positioning system, and an inertial navigation system, so it can collect the position and intensity information of objects in three-dimensional space [10–12]. However, LiDAR works in a single band and lacks semantic information; thus, it has poor ability to distinguish targets with similar heights but different spectra [13].
Many works in the literature have proven the effectiveness of combined HSI and LiDAR interpretation, indicating that the intensity information provided by LiDAR can compensate for the HSI's deficiencies regarding target height and shape [5, 14–17]. In 2013, the HSI and LiDAR data fusion competition organized by the IEEE Geoscience and Remote Sensing Society greatly promoted research on HSI and LiDAR data fusion methods for classification [18]. In general, these fusion methods can be roughly divided into three categories: pixel-level fusion, feature-level fusion, and decision-level fusion. Pixel-level fusion relies on concatenating the multisource data directly at the raw-data level, which requires geometric registration. Feature-level fusion is considered a better approach in that it achieves better classification performance in most cases [19–22]; it conducts feature extraction for each source individually and then combines the results. Decision-level fusion methods aim to integrate several rough classification results of multisource data into a final classification [23, 24]. Although their computational complexity is relatively low, they rely heavily on the original classification results and the integration strategy. In fact, due to the inherent shortcomings of single-level fusion methods, many articles have explored classification frameworks that combine feature-level and decision-level fusion [25–27].
In addition, it must be mentioned that deep learning-based approaches have also been widely favored in recent years for hyperspectral feature extraction and classification. The pioneering work is the stacked autoencoder proposed by Chen et al., which has been used for hyperspectral high-level feature extraction [28]. Subsequently, convolutional neural networks (CNNs) [29–31] and recurrent neural networks (RNNs) [32, 33] have been widely developed. Among them, the representative 3D-CNN can effectively extract joint spatial-spectral information and has shown good performance [34]. Furthermore, deep learning-based frameworks that combine traditional features and network structures have also been explored [35, 36]. However, the parameter optimization of these deep learning-based models generally relies on a large number of training samples, which greatly limits their applicability due to the difficulty of sample labeling in the remote sensing field. Motivated by the small-sample-set circumstance, some new strategies have been developed. For example, Yu et al. proposed a novel CNN model combining data augmentation and convolutional kernels [37]. More recently, semi-supervised CNNs and PCANet-derived methods have also received constant attention because of their performance with limited training samples [38–41].
Alternatively, wavelet analysis is an important mathematical tool because of its optimal approximation properties in signal processing. However, in the case of multidimensional images with discontinuity curves, the traditional wavelet loses its sparsity in the edge response [42, 43]. Thus, multidimensional directional wavelets are required. Gabor, which is widely used in texture analysis and feature extraction, can be considered an early directional wavelet [44, 45]. Its inherent drawback is that the directions on each scale are fixed once sampled. A series of directional wavelets has emerged in the past few decades, such as contourlets, bandlets, curvelets, and shearlets [46–49]. They all provide a flexible mathematical framework while capturing geometric features in applications. Among these methods, the shearlet possesses remarkable properties: it accurately captures edge directions, has an optimal sparse representation for multidimensional data, uses a well-organized multiscale structure, and exhibits fast algorithm implementation and efficient calculation [50–52]. In its simplest form, the shearlet starts with the construction of a so-called mother function and then adopts three basic operations (scaling, translation, and shearing) to derive functions with more shapes and directions. Two well-known properties of shearlets are highlighted as follows: (1) if a point is far away from an edge, its shearlet coefficients decay rapidly as the scale decreases; (2) if a point is an edge point or a corner point, its shearlet coefficients decay slowly in the normal direction, while decaying rapidly in the other directions. Therefore, shearlets have been widely used for edge and corner detection [53–55]. In addition, owing to its frequency-domain division, there are also some pioneering works using it for denoising, feature extraction, and data fusion [56–60].
During the past two decades in the computer vision field, another emerging and rapidly spreading concept is superpixel segmentation. Specifically, a superpixel is considered a homogeneous area containing pixels with similar texture or structure [61–63]. The edge of a superpixel is a continuous closed curve, unlike the output of edge extraction algorithms, in which scattered, discontinuous points may exist. Moreover, a superpixel should also possess region compactness, shape regularity, and boundary smoothness [64]. Currently, superpixel algorithms are roughly divided into three categories: cluster-based approaches represented by simple linear iterative clustering (SLIC) and simple non-iterative clustering (SNIC) [65, 66], graph-based approaches represented by entropy rate superpixel segmentation (ERS) and normalized cuts (NCut) [67, 68], and gradient-based methods represented by spatial-constrained watershed (SCoW) and superpixels using the shortest gradient distance (SSGD) [69, 70]. Notably, fuzzy superpixels have been proposed in the past two years and have further enriched the content of superpixels, especially in cases of low spatial resolution such as remote sensing images [71, 72]. However, regardless of the kind of superpixel algorithm, direct or indirect spatial constraints exist to meet the compactness requirement. For example, SLIC directly adds the spatial distance to the clustering metric, and ERS requires each superpixel to be as close to the same size as possible. This spatial constraint inevitably leads to conflicts between oversegmentation and undersegmentation, and the situation worsens because the resistance of the constraint grows as a power series. In other words, under a strong spatial constraint, a large homogeneous region has to be divided into several small superpixels, while under almost no spatial restriction, a small area may contain several objects.
Some heuristic solutions have been proposed for selecting the number of superpixels [73, 74], but this inherent limitation of superpixels has not been effectively relaxed.
In particular, filters play an extremely important role in image processing. Many filters already exist for different applications, such as the mean, Gaussian, and Gabor filters for feature extraction [75, 76], and the Laplacian, Sobel, and Laplacian of Gaussian (LoG) filters for edge capture. However, a fixed-size filter cannot obtain the best description of surface objects at various scales, and thus it is undeniably difficult for a uniform-size filter to achieve globally satisfactory results. In other words, near an edge, a larger filter introduces more confusing information, whereas at the center of a region, a smaller filter hardly works well when abnormal points exist. Some researchers have made attempts in this area, such as the multiscale spectral-spatial classification method with adaptive filtering [77–79] and the spatially adaptive multiscale filtering technique [80, 81]. However, the features or classification results at multiple filtering scales were simply concatenated or combined; it is more desirable to take full advantage of the internal structure of objects and achieve local structure-aware filtering (i.e., automatically adjusting the size of the filter kernel according to the local position).
In this article, we propose a shearlet-based structure-aware filtering framework, abbreviated as ShearSAF, for HSI and LiDAR data classification with the help of the above tools. First, superpixel-guided kernel principal component analysis (KPCA) is applied to the raw HSI for dimension reduction and information concentration, which greatly helps the subsequent calculation and processing. Then, the shearlet transform is applied to the KPCA-reduced HSI and LiDAR data, yielding a structural description in the frequency domain; i.e., the high-frequency and low-frequency parts, respectively, contain the texture information and region information. They are further processed by energy superposition and time-frequency conversion to obtain region features and texture features. Second, a gradual region merging procedure is developed to alleviate the superpixel spatial constraints and enhance the robustness of the proposed ShearSAF method. Specifically, the SNIC superpixel algorithm acts on the raw HSI to produce a deliberate oversegmentation that ensures the homogeneity of each small region, and then the superpixels are progressively merged according to a well-designed merging criterion that takes the spectral, region, and texture information into account, eventually yielding the final merging map. Third, by locating the edges in the final merging map and calculating the shortest distance between each pixel and the edges, we obtain a distance map reflecting the relationship between points and edges. Through geometric optimization and threshold processing, a scale map is extracted to control the filter size, in which each point value indicates the size of the convolution kernel; thus, an adaptive-size filter for each spatial pixel is obtained. Finally, the KPCA-reduced HSI and LiDAR data are convolved with these structure-aware mean filters for feature extraction and classification to verify the effectiveness of the proposed ShearSAF approach.
For a better description, the detailed process of our ShearSAF is shown in Figure 1, and the main contributions of this article are summarized as follows:

(i) First, we design a structure-aware filtering scheme for HSI and LiDAR feature extraction. These locally adaptive-size filters have a small kernel near edges to protect the information from being disturbed by nearby objects, while the kernel is larger at the center of an area to filter out noise and abnormal points. Since the structure-aware filter size reflects the spatial structure of objects more precisely, the convolution procedure becomes more elegant and the discriminability of the extracted features is promoted. To the best of our knowledge, this is the first time a method of extracting structure-aware features for HSI and LiDAR data processing has been discussed.

(ii) Second, the shearlet transform is employed for structure description on both HSI and LiDAR data. After dividing the shearlet features into low-frequency and high-frequency energy parts, they are converted into an area feature and a texture feature, respectively. Since the local region and edge structure of objects can be well characterized by the extracted area and texture features, both features are taken into account (conventional methods use only one of them, for edge detection or noise reduction), which provides valuable guidance for the subsequent superpixel fusion.

(iii) Third, we propose an elaborate design that effectively combines HSI and LiDAR information in a distance measurement for gradual region merging. The well-designed evaluation criterion integrates the spectral (from a Euclidean distance viewpoint), area, and texture (from a statistical distance viewpoint) features, which provides a more comprehensive description of the real scene and can greatly increase the structural representation ability of objects.

(iv) Finally, all the parameters can be either preset and kept unchanged for different HSI and LiDAR datasets or heuristically determined; therefore, the robustness and generalization ability of the proposed ShearSAF method can be ensured. Meanwhile, since the structure-aware feature extraction procedure is unrelated to the training samples and only needs to be calculated once, our ShearSAF is also highly efficient. The source codes of this work will be available at http://jiasen.tech/papers/ for the sake of reproducibility.
We would like to point out that the structure-aware filtering design presented here is essentially very general and can be easily utilized for other features (such as morphological and attribute features). Experimental results compared with several state-of-the-art methods on three real datasets demonstrate the effectiveness of the proposed ShearSAF approach.
The organization of this paper is as follows. Section 2 introduces related works on the shearlet transform, the SNIC superpixel algorithm, and KPCA. Section 3 describes the design of the shearlet-based structure-aware filtering in detail. Section 4 presents the experimental data and two ablation experiments. The experimental results of the proposed ShearSAF method compared with a number of alternatives are given in Section 5. Finally, Section 6 provides the conclusions of this paper.
2. Related Works
This section introduces the theory of the shearlet transform, the SNIC superpixel algorithm, and KPCA.
2.1. Shearlet Transform
Shearlets have received great attention for their optimal approximation properties in representing images and were first introduced in [43, 82]. The shearlet is regarded as a multiscale representation system and possesses the ability to capture directional and geometric features [50, 53]. Suppose there exist a dilation matrix $A_a = \begin{pmatrix} a & 0 \\ 0 & \sqrt{a} \end{pmatrix}$ and a shear matrix $S_s = \begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix}$, where $a$ and $s$ are called the dilation factor and shear factor, respectively. The shearlet mother function is expressed as
$$\psi_{a,s,t}(x) = a^{-3/4}\,\psi\bigl(A_a^{-1} S_s^{-1}(x - t)\bigr),$$
where $t$ is a translation factor and $x$ is the coordinate in the spatial domain. $a$ and $s$, respectively, control the scale and orientation of the shearlet. In the frequency domain, the function $\hat{\psi}$ can be factorized as
$$\hat{\psi}(\xi_1, \xi_2) = \hat{\psi}_1(\xi_1)\,\hat{\psi}_2\!\left(\frac{\xi_2}{\xi_1}\right),$$
where $\xi_1$ and $\xi_2$ are the two coordinates in the frequency domain. Consequently, $\psi_{a,s,t}$ in the frequency domain can be expressed as follows:
$$\hat{\psi}_{a,s,t}(\xi) = a^{3/4}\, e^{-2\pi i \langle \xi, t\rangle}\, \hat{\psi}_1(a\xi_1)\, \hat{\psi}_2\!\left(a^{-1/2}\left(\frac{\xi_2}{\xi_1} + s\right)\right),$$
where $\xi = (\xi_1, \xi_2)$ is the coordinate in the frequency domain, which is just the concatenation of $\xi_1$ and $\xi_2$. $\hat{\psi}_1$ and $\hat{\psi}_2$ are the continuous wavelet function and bump function, respectively, each meeting a certain support condition.
In fact, the shearlet compact framework in the frequency domain can be divided into three parts: the low-frequency region, the horizontal cone, and the vertical cone. Notably, the above factorization (4) and equation (5) are used for the horizontal cone (in the following section, the shearlet in equation (5) is renamed to mark the horizontal cone). Alternatively, in the vertical cone, the corresponding functions are denoted analogously, with the roles of the two frequency coordinates exchanged.
For the low-frequency domain, the shearlet has neither dilation nor shear, so it can be written in terms of the scaling function $\phi$. The frequency supports of the low-frequency region, a representative horizontal cone, and a representative vertical cone are shown in Figure 2.
In practice, the above continuous shearlet needs to be discretized for digital image processing. Let us consider a single-band digital image mapped onto a two-dimensional grid; the related parameters, i.e., the dilation, shear, and translation factors, can then be computed from the scale index, where the number of scales is fixed in advance. By substituting these discretized parameters (9) into equations (5), (7), and (8), the three shearlets in the frequency domain can be rewritten (with the fixed factor ignored). To avoid overly complicated mathematical formulas and theoretical derivations, we directly give the final expression of the shearlet transform (11): the frequency response of the image is multiplied by each shearlet frequency response, and the result is transformed back by the two-dimensional inverse Fourier transform. In this expression, the support domains of the first, second, and third parts are the low-frequency region, the horizontal cone, and the vertical cone, respectively. Additional details of shearlet construction and derivation can be found in [83].
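The final expression of the transform amounts to pointwise multiplication of the image spectrum with each frequency response, followed by an inverse FFT. A minimal sketch, assuming the frequency-domain shearlet filter bank `psi_hat` has already been constructed (its construction follows [83] and is not reproduced here; `psi_hat` below is only a placeholder):

```python
import numpy as np

def shearlet_coefficients(image, psi_hat):
    """Filter a 2-D image with a bank of frequency-domain responses.

    image   : (H, W) real array.
    psi_hat : (K, H, W) frequency responses, one per scale/direction
              part of the shearlet frame (construction not shown).
    Returns a (K, H, W) array of coefficient maps.
    """
    img_hat = np.fft.fft2(image)                      # image spectrum
    coeffs = np.fft.ifft2(img_hat * psi_hat, axes=(-2, -1))
    return np.real(coeffs)                            # one map per part

# With a single all-pass "filter", the response equals the image itself.
img = np.random.rand(8, 8)
out = shearlet_coefficients(img, np.ones((1, 8, 8)))
```

Because the filters live entirely in the frequency domain, a whole bank of scales and directions costs one forward FFT plus one inverse FFT per filter.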
2.2. Multichannel SNIC
SNIC stands for the simple non-iterative clustering superpixel algorithm, which has both low computational complexity and good segmentation results [66]. Through parameter control in SNIC, the number of superpixels and the weights of the spectral-spatial information can be set manually. In this article, a multichannel SNIC is adopted, which is more suitable for multichannel hyperspectral image segmentation. Specifically, SNIC starts from the initialization of the centroids and adds the elements to a priority queue. Next, when an element is taken from the priority queue, the surrounding pixels are labeled and added to the queue, and the coordinates of the centroid are updated accordingly. This process continues until the queue is empty. The comparison criterion of the priority queue is the distance between an element $i$ and a centroid $k$, defined as
$$d_{i,k} = \sqrt{\frac{\|\mathbf{c}_i - \mathbf{c}_k\|^2}{m_c} + \frac{\|\mathbf{x}_i - \mathbf{x}_k\|^2}{m_s}},$$
where $\mathbf{c}$ and $\mathbf{x}$ represent the spectral vector and space coordinates, respectively, and $m_c$ and $m_s$ are their corresponding weights.
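The priority criterion above can be sketched directly; the default weights below are illustrative placeholders, not the paper's tuned values:

```python
import numpy as np

def snic_distance(c_i, x_i, c_k, x_k, m_c=1.0, m_s=0.5):
    """Spectral-spatial distance between pixel i and centroid k used to
    order the SNIC priority queue (weights here are placeholders)."""
    d_spec = np.sum((np.asarray(c_i, float) - np.asarray(c_k, float)) ** 2)
    d_spat = np.sum((np.asarray(x_i, float) - np.asarray(x_k, float)) ** 2)
    return float(np.sqrt(d_spec / m_c + d_spat / m_s))
```

A pixel spectrally identical to the centroid still pays a spatial penalty, which is exactly what keeps SNIC superpixels compact.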
2.3. Superpixel-Guided KPCA
As a generalization of principal component analysis (PCA), KPCA maps the input data into a high-dimensional or Hilbert space by a mapping function and can well reflect complex structures in the corresponding high-dimensional space [84, 85]. However, in the sample selection strategy of the original KPCA, random sampling or using all samples (possibly on the order of millions) usually causes feature degradation or computational explosion, respectively. Therefore, a superpixel-based KPCA scheme that takes advantage of superpixel homogeneity is applied for dimension reduction [86].
Specifically, for a raw HSI with given spatial and spectral dimensions, SNIC is applied (here, the number of superpixels is simply preset) and a superpixel map is obtained. Then, the mean vector of each region is calculated, forming the input sample set. Subsequently, the mapping function converts the input low-dimensional sample data into high-dimensional features. Let us then consider the covariance matrix of the mapped samples in the feature space.
Therefore, the characteristic equation can be written in the standard eigendecomposition form, where the eigenvalue matrix is diagonal with eigenvalues arranged from large to small and the eigenvector matrix is composed of the corresponding eigenvectors. For convenience of calculation, an ingenious substitution is made for the eigenvectors, introducing a coefficient matrix that expresses them in terms of the mapped samples. Next, by left-multiplying both sides of equation (16) by the mapped sample matrix, we obtain an equation involving the kernel function. From equations (18) and (19), the problem can be reduced to a new characteristic equation on the kernel matrix. Finally, for each spectral vector of the HSI, the dimensionality reduction is carried out by projecting its kernel vector onto the leading eigenvectors, where the number of reserved dimensions is chosen accordingly.
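The steps above (RBF kernel matrix, centering, eigendecomposition of the kernel, projection of a new spectral vector) can be sketched numerically as follows. This is a hedged sketch, not the paper's implementation: names are illustrative, and the centering of the test kernel vector is omitted for brevity.

```python
import numpy as np

def superpixel_kpca(S, z, gamma=1.0, d=10):
    """Superpixel-guided KPCA sketch with an RBF kernel.

    S : (n, B) matrix of superpixel mean spectra (training samples).
    z : (B,)  spectral vector to project.
    Returns the first d kernel principal components of z.
    """
    n = S.shape[0]
    sq = np.sum((S[:, None, :] - S[None, :, :]) ** 2, axis=-1)
    K = np.exp(-gamma * sq)                      # kernel matrix K(s_i, s_j)
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                               # center in feature space
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:d]             # leading eigenpairs
    # scale eigenvectors so projections have unit-norm feature directions
    alpha = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    k_z = np.exp(-gamma * np.sum((S - z) ** 2, axis=1))  # K(s_i, z)
    return alpha.T @ k_z                         # (d,) reduced vector
```

Training on superpixel means keeps the kernel matrix at the size of the superpixel count rather than the pixel count, which is the point of the superpixel-guided variant.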
3. Shearlet-Based Structure-Aware Filtering
We now consider the proposed shearlet-based structure-aware filtering design. Briefly, in order to take full advantage of the structural information of objects, the following idea is followed: when a point approaches an edge, the size of the filter shrinks, while when it is at the center of a region, the size of the filter enlarges. Our ShearSAF approach to obtaining this adaptive-size filter involves the following four steps: preprocessing, shearlet-based feature extraction, gradual region merging, and structure-aware filter design. Table 1 summarizes some important mathematical symbols used in this paper for additional clarification.

3.1. Preprocessing
This part mainly includes two aspects: superpixel-guided KPCA for HSI dimension reduction and multichannel SNIC for superpixel oversegmentation.
3.1.1. Superpixel-Guided KPCA for HSI Dimension Reduction
For high-dimensional HSI data with complex structures, KPCA has superior capabilities for dimensionality reduction. In our superpixel-guided KPCA, the radial basis function (RBF) kernel is adopted and 99.5% of the energy is maintained in the principal components. Afterward, the information-focused hyperspectral data is attained.
3.1.2. Multichannel SNIC for Superpixel Oversegmentation
SNIC is an emerging superpixel algorithm offering both low computational complexity and good segmentation results. The multichannel SNIC is applied to the raw HSI data, and an initial oversegmentation map can be obtained, in which the homogeneity of each superpixel is largely ensured. Instead of directly providing the number of superpixels, the number of pixels inside each superpixel is preset (the value of this parameter will be discussed in the experimental section), and the two weight parameters are set to their default values (the latter as 0.5).
3.2. Shearlet-Based Feature Extraction
The shearlet is a tight frame with clear mathematical meaning that provides directional scale decomposition. In the high-frequency part, it can effectively capture texture information and is thus used for edge detection and corner detection [55]. Alternatively, in the low-frequency part, it can effectively capture area information and is thus used for denoising [87].
Let us start with the single-band LiDAR data. The number of scales in equation (9) is set as 3 by default in our shearlet transform, while the construction of the wavelet function (i.e., the Meyer wavelet function) and the bump function is the same as in [83].
In the shearlet compact frame, when the scale is 0, 1, and 2, there are 4, 8, and 16 support cones with different directions, respectively (including the horizontal and vertical cones). Among them, the 16 highest-frequency parts are related to the texture information, while the remaining parts can well characterize the area information; therefore, the shearlet-based frequency features are divided into two parts, which respectively represent the highest-frequency and the remaining-frequency information of the LiDAR data.
Furthermore, for the 16 highest-frequency parts, the sum of the absolute coefficients in the 16 directions is used as the measure of the texture feature. For the other 13 remaining frequency parts, including the low-frequency region and the remaining high-frequency cones at the coarser scales, the inversion of the shearlet transform is applied, followed by the absolute value operator, to acquire the area information.
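This split can be sketched as follows, assuming the 29 sub-band coefficient maps (1 low-frequency part, 4 + 8 coarser cones, and 16 finest cones) are stacked along the first axis; the `inverse_transform` callable stands in for the shearlet inversion operator of [83] and is supplied by the caller:

```python
import numpy as np

def split_texture_area(coeffs, inverse_transform):
    """Separate shearlet coefficients into texture and area features.

    coeffs            : (29, H, W) stack, finest 16 directional cones last.
    inverse_transform : callable reconstructing an image from a subset of
                        coefficients (placeholder for the inversion in [83]).
    """
    high = coeffs[-16:]                     # finest cones -> texture
    rest = coeffs[:-16]                     # low freq + coarser cones -> area
    texture = np.sum(np.abs(high), axis=0)  # energy sum over 16 directions
    area = np.abs(inverse_transform(rest))  # reconstruct, then magnitude
    return texture, area
```

For the KPCA-reduced HSI, the same function would be applied per component and the outputs concatenated.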
Correspondingly, for the KPCA-reduced HSI data, the above frequency separation process is performed on each component, and the results are then concatenated, yielding the texture information and area information of every component of the HSI data. The detailed procedure of shearlet-based feature extraction is displayed in Figure 3.
3.3. Gradual Region Merging
The previous steps provide an oversegmentation map and three different types of description of objects: spectral information, area information, and texture information (each for both the HSI and LiDAR data). Apparently, it is advantageous to investigate the three features in a unified framework to guide the fusion process of the oversegmentation map.
Specifically, the oversegmentation map is mapped onto an undirected graph. Each superpixel is regarded as a node, and an edge exists only when two superpixels are adjacent. In order to make the description of the proposed progressive region merging process clearer and more intuitive, it is divided into two parts: merging cost definition and region merging procedure. Figure 4 illustrates the gradual region merging procedure.
3.3.1. Merging Cost Definition
It can easily be found that the merging cost between two adjacent regions is related not only to the size of the regions and the length of their shared edge but also to the similarity among the three different kinds of features, including the spectral, area, and texture features. Suppose there are two adjacent regions in the oversegmentation map; the distance between the two regions in the spectral domain is calculated from the gaps between their mean values, i.e., the mean value of each region in every band of the KPCA-reduced HSI data and in the LiDAR data.
On the other hand, for the area information and texture information, it is necessary to adopt a statistical manner to measure the region distance, since all the area and texture features are extracted in the frequency domain. Taking the LiDAR texture information as an example, it is first normalized into the interval [0, 256] for convenience. Then, we select a set of interval endpoints (including 0 and 256) to divide the whole interval into parts of the same length. These endpoints are considered as bins of a histogram, and the count of each bin in a region is accumulated from the LiDAR texture values falling into it (this process can be seen in Figure 5). Thus, the frequency histogram of the region is calculated by normalizing these counts.
After obtaining the frequency distribution in each region, a statistical distance measurement is applied to each pair of adjacent regions.
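The histogram construction and a statistical distance between two region histograms can be sketched as follows. The paper's exact statistic and endpoint count are not reproduced here: the chi-square statistic below is one common choice for comparing histograms, and `n_endpoints=17` is an arbitrary placeholder.

```python
import numpy as np

def frequency_histogram(values, n_endpoints=17):
    """Normalized histogram of a region's feature values over [0, 256],
    with equally spaced endpoints serving as bins (endpoint count is a
    placeholder, not the paper's setting)."""
    values = np.asarray(values, dtype=float)
    edges = np.linspace(0.0, 256.0, n_endpoints)
    # assign each value to its nearest endpoint/bin
    idx = np.argmin(np.abs(values[:, None] - edges[None, :]), axis=1)
    counts = np.bincount(idx, minlength=n_endpoints).astype(float)
    return counts / counts.sum()

def chi2_distance(h_i, h_j, eps=1e-12):
    """Chi-square statistic between two frequency histograms (a common
    choice; the paper's exact statistic may differ)."""
    h_i, h_j = np.asarray(h_i), np.asarray(h_j)
    return 0.5 * np.sum((h_i - h_j) ** 2 / (h_i + h_j + eps))
```

Identical distributions give distance zero, and the distance grows as the two regions' texture (or area) statistics diverge.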
Thus, the LiDAR texture distance can be expressed accordingly. Similarly, the statistical distance of the LiDAR area feature can be computed in the same way. Concerning the texture and area features extracted from the hyperspectral data, the above statistical calculation procedure is applied on each band, and the corresponding per-band texture and area distances are obtained.
Since the information contained in each spectral band of the HSI and in the LiDAR data is inconsistent, it is necessary to fuse these distance measures in a weighted manner, with weights computed based on the homogeneity of each segmentation area [88]. Specifically, if a segmentation area has good homogeneity, it should have a high weight; conversely, if the segmented area is heterogeneous, the weight value should be small. In our framework, a locally adaptive approach is implemented, in which the weight is the maximal frequency in the corresponding band, and the weighted combinations give the overall area distance and texture distance, respectively.
As indicated so far, the dissimilarity of two adjacent regions can be defined as a combination of the spectral distance and the statistical distances, where a balance factor ensures that the spectral distance and the statistical distance are of the same order of magnitude. In our experiments, the balance factor and the number of interval endpoints are set as fixed values.
As mentioned before, the merging cost of two regions is related not only to their dissimilarity but also to the size of the regions and the length of their shared edge: the smaller the regions and the longer the shared boundary between them, the more easily the two regions merge. Based on this point of view, the merging cost is defined in terms of the dissimilarity, the length of the shared boundary, and the numbers of pixels in the two regions.
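Since only the qualitative behavior of the cost is described here, the sketch below is one plausible instantiation under that assumption (smaller regions and longer shared boundaries make merging cheaper), not the paper's definitive formula:

```python
def merging_cost(dissimilarity, n_i, n_j, shared_len):
    """Illustrative merging cost for regions R_i and R_j.

    dissimilarity : combined spectral/area/texture distance.
    n_i, n_j      : pixel counts of the two regions.
    shared_len    : length of their shared boundary.
    The cost grows with the dissimilarity and the smaller region size,
    and shrinks as the shared boundary gets longer.
    """
    return dissimilarity * min(n_i, n_j) / shared_len
```

Any formula with the same monotonicity would drive the merging queue in the same qualitative direction.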
3.3.2. Region Merging Procedure
A progressive region merging technique is introduced to effectively alleviate the conflict between oversegmentation and undersegmentation of superpixels and to largely guarantee the homogeneity of the final merging map. Oversegmented superpixels ensure the homogeneity of each region, while the region merging, which gradually combines two adjacent similar regions, does not introduce an undersegmentation problem thanks to the shearlet-extracted features.
Specifically, for the initial oversegmentation map, a data structure is utilized to record each pair of adjacent nodes together with their merging cost, and a priority queue is built to store all these structures. Based on the queue, the structure with the smallest cost is chosen, and the corresponding two regions are obtained as well. Subsequently, all structures related to these two regions in the priority queue are removed. After adding all points of one region into the other, new structures are created to record the reconstructed region and its neighborhood, and these are put into the priority queue. This progressive region merging procedure is carried out until the number of regions reaches a predefined value. At last, the oversegmentation map is gradually transformed into a final merging map.
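The queue-driven loop above can be sketched with `heapq`; data structures here are illustrative, and instead of physically removing stale pairs (as the text describes), this sketch uses the standard lazy-deletion trick of skipping entries whose regions no longer exist:

```python
import heapq
import itertools

def merge_regions(adjacency, cost_fn, n_target):
    """Greedy region merging sketch.

    adjacency : dict mapping region id -> set of neighbouring ids
                (modified in place).
    cost_fn   : cost_fn(r, s) -> merging cost of adjacent regions.
    n_target  : predefined final number of regions.
    """
    regions = set(adjacency)
    tie = itertools.count()                 # tie-breaker for equal costs
    heap = []
    for r in adjacency:
        for s in adjacency[r]:
            if r < s:
                heapq.heappush(heap, (cost_fn(r, s), next(tie), r, s))
    while len(regions) > n_target and heap:
        _, _, r, s = heapq.heappop(heap)
        if r not in regions or s not in regions:
            continue                        # stale entry, skip lazily
        # absorb region s into region r and rewire the neighbourhood
        regions.discard(s)
        adjacency[r].discard(s)
        for t in adjacency[s] - {r}:
            adjacency[t].discard(s)
            adjacency[t].add(r)
            adjacency[r].add(t)
            heapq.heappush(heap, (cost_fn(r, t), next(tie), r, t))
    return regions
```

Note that costs between the grown region and its pre-existing neighbours are left stale in this sketch; a faithful implementation would refresh them as the text describes.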
3.4. Structure-Aware Filter Designing
For a point close to an edge, the surrounding labels are more likely to differ in object classification, indicating that the neighboring spatial relationship should be given less consideration. However, when the point is located at the center of a local region, the surrounding objects tend to be the same; thus, the neighboring spatial relationship should be paid more attention. This perspective motivates us to design an adaptive structure-aware filter whose kernel size changes with the distance from a point to the edge.
In fact, it is difficult to obtain accurate edges between objects in HSIs due to the inherently low spatial resolution of remote sensing images. Fortunately, by applying the well-designed shearlet-based gradual region merging scheme to the SNIC oversegmentation map, a final merging map with fewer spatial constraint conflicts is achieved, in which the homogeneity of local regions is largely ensured. Meanwhile, the junctions between regions are regarded as edges. In particular, for each point p in the merging map, the boundary of its region is a continuous closed curve, which means the number of edge points is limited. Therefore, all spatial distances between this point and its region boundary can be calculated, and the smallest value is selected to form the distance map D:

D(p) = min_{q ∈ B(p)} ‖x_p − x_q‖_2,

where B(p) is the boundary of the region containing p, and x_p and x_q represent the two-dimensional spatial coordinates of points p and q, respectively.
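A brute-force sketch of the distance map computation follows, using 4-neighbour boundary detection and an exhaustive search over each region's boundary pixels; the paper's implementation is presumably more efficient, and the function name is illustrative:

```python
import numpy as np

def distance_map(labels):
    """Per-pixel Euclidean distance to the nearest boundary of its own region.

    `labels` is a 2-D integer map of merged regions. A pixel is a boundary
    pixel if any 4-neighbour carries a different label or if it touches
    the image border. Brute force; fine for small maps.
    """
    h, w = labels.shape
    boundary = np.zeros((h, w), dtype=bool)
    boundary[0, :] = boundary[-1, :] = True       # image border counts
    boundary[:, 0] = boundary[:, -1] = True
    boundary[:-1, :] |= labels[:-1, :] != labels[1:, :]
    boundary[1:, :] |= labels[1:, :] != labels[:-1, :]
    boundary[:, :-1] |= labels[:, :-1] != labels[:, 1:]
    boundary[:, 1:] |= labels[:, 1:] != labels[:, :-1]

    dist = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            # boundary pixels belonging to this pixel's own region
            same = (labels == labels[y, x]) & boundary
            by, bx = np.nonzero(same)
            dist[y, x] = np.sqrt((by - y) ** 2 + (bx - x) ** 2).min()
    return dist
```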
However, the direction from different points to their nearest boundary point is not fixed, implying that directly using D(p) as the filter size may cause the filter to be oversized and introduce disturbing information from other ground objects. As we know, the diagonal of a square is longer than any other straight line inside it. In other words, as long as the diagonal length of the adaptive-size filter does not exceed 2·D(p), the filter centered at point p will not cross the boundary. Therefore, we convert the distance map D into the so-called scale map S by taking S(p) as the largest odd integer no greater than √2·D(p) (and at least 1).
In addition, when a point lies at the center of a large region, an overly large filter size may still cover outside-region points, which could degrade the feature representation ability. Thus, a threshold-truncated version is introduced:

S(p) ← min(S(p), τ).
In our experiments, the threshold τ is simply set as 55.
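The conversion from distance map to threshold-truncated scale map can be sketched as follows; the exact rounding to an odd size is an assumption consistent with the diagonal constraint above:

```python
import numpy as np

def scale_map(dist, tau=55):
    """Convert a distance map into an odd-valued filter-size map (a sketch).

    A square filter of side k centred on a pixel stays inside its region
    when its half-diagonal k*sqrt(2)/2 does not exceed the distance d to
    the region boundary, i.e. k <= sqrt(2)*d. The side is rounded down to
    the nearest odd integer, floored at 1, and truncated at tau (55 in
    the paper); the paper's exact rounding may differ.
    """
    k = np.floor(np.sqrt(2.0) * dist).astype(int)
    k = np.where(k % 2 == 0, k - 1, k)   # force odd side length
    return np.clip(k, 1, tau)
```

A boundary pixel (distance 0) thus gets a 1 × 1 filter, while very deep pixels are capped at 55 × 55.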
Figure 6 illustrates three circumstances of the filter size determination procedure. Concretely, for points inside a region, the dotted frames denote the adaptive filters derived from the scale map, while the solid frames denote fixed-size filters; for a point whose adaptive size would exceed the threshold, the dotted and solid frames represent the filters without and with the threshold process, respectively. Clearly, the dotted frames fit the region structure more precisely than the solid frames. Besides, for a point on a region edge, the filter size is only 1, which also obeys our filter size calculation process.
A final note is that all the points in the scale map are assigned an odd value ranging from 1 to 55, indicating the filter size at each pixel. For each spatial pixel p, the corresponding structure-aware filter is formulated as

F_p = (1 / S(p)^2) · 1_{S(p)×S(p)},

where 1_{S(p)×S(p)} represents an S(p) × S(p) matrix in which all element values are 1. Obviously, F_p can be considered a mean filter with an adaptive size for each spatial pixel, which can be visually seen in Figure 1. Hence, the obtained adaptive-size filter achieves structure awareness based on the geometric position of the convolution center. This flexible filter can well preserve the differences between objects at the edges while suppressing abnormal points in the center areas.
3.5. Feature Extraction and Classification
Since the edges in the final merging map may not be perfectly accurate, classification errors can occur more frequently near them. Hence, the formulated structure-aware filter is solely used for feature extraction rather than for regularizing the classification results. Taking the LiDAR data L as an example, the filtering process at each spatial pixel p can be expressed as

f_L(p) = (L ∗ F_p)(p),

where ∗ is the convolution operator. After applying the convolution procedure to each pixel of L, the LiDAR feature f_L can be extracted. Similarly, by applying the convolution procedure to each band of the KPCA-reduced HSI, the corresponding feature cube f_H can be obtained. By concatenating the features f_H and f_L along the spectral direction, the final feature F can thus be achieved.
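A direct (unoptimized) sketch of the per-pixel adaptive mean filtering is given below; clipping the window at the image borders is an assumption, since the paper does not specify its border handling:

```python
import numpy as np

def adaptive_mean_filter(band, sizes):
    """Filter one band with a per-pixel adaptive-size mean filter (a sketch).

    `sizes` is the odd-valued scale map; each pixel is replaced by the
    mean of the k x k window centred on it, with the window clipped at
    the image borders.
    """
    h, w = band.shape
    out = np.empty((h, w), dtype=float)
    for y in range(h):
        for x in range(w):
            r = sizes[y, x] // 2           # window radius
            win = band[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
            out[y, x] = win.mean()
    return out
```

With a scale map of all ones the band is unchanged; with a 3 × 3 window everywhere, each interior pixel becomes the mean of its neighbourhood.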
During classification, the random forest (RF) classifier is chosen, which not only achieves high classification accuracy but also possesses fast computation speed. Meanwhile, RF is resistant to overfitting and noise. Notably, RF consists of two steps: randomly selecting training subsets with replacement and building multiple decision trees, which involves the bagging sampling technique. In the experiments, the default subspace size of RF is the floor of the logarithm of the number of features, and the number of trees in the forest is set as 500. Finally, by employing the RF classifier on the extracted feature F, the classification map can be obtained. At last, the pseudocode of the proposed ShearSAF approach for HSI and LiDAR feature extraction and classification is outlined in Algorithm 1.
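Using scikit-learn's RandomForestClassifier as a stand-in for the paper's RF implementation, the described configuration (500 trees; per-split subspace size equal to the floor of the logarithm of the feature count, base 2 assumed here) looks roughly like this on toy data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy stand-in for the extracted ShearSAF feature matrix: 200 "pixels"
# with 64-dimensional features whose labels depend on two informative dims.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

rf = RandomForestClassifier(
    n_estimators=500,                                  # 500 trees, as in the paper
    max_features=int(np.floor(np.log2(X.shape[1]))),   # subspace size (log base assumed)
    random_state=0,
)
rf.fit(X[:150], y[:150])
acc = rf.score(X[150:], y[150:])
```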

The computational complexity of the proposed ShearSAF can be divided into three parts. Firstly, there are the costs of SNIC and SNIC-guided KPCA, together with that of the shearlet transform. Secondly, since the number of regions adjacent to any region is limited, each priority queue operation is inexpensive, so the complexity of the region merging process is dominated by the number of initial superpixels. Finally, there are the costs of the convolution process and RF classification.
4. Experimental Data and Ablation Analysis
In this section, three real HSI and LiDAR datasets covering diverse areas are used to evaluate the effectiveness of the proposed ShearSAF framework. Firstly, the three HSI and LiDAR datasets are presented. Secondly, the parameters of ShearSAF are analyzed. Thirdly, two ablation experiments are carried out to validate the advantage of the well-designed structure-aware filtering scheme and the superiority of the proposed ShearSAF method over other related filters.
4.1. Datasets
4.1.1. Houston Dataset
The first dataset was captured over the University of Houston campus [18]; the Houston HSI contains 144 spectral bands ranging from 380 to 1050 nm. Each band contains 349 × 1905 pixels with a spatial resolution of 2.5 m. Meanwhile, the corresponding LiDAR data has the same spatial size and records the height information of surface materials. Fifteen land-cover classes and 15,029 labeled samples are given in the ground-truth image, as shown in Table 2 and Figure 7.

4.1.2. Trento Dataset
The second dataset was collected over the south of Trento, Italy, and consists of 63 spectral bands that range from 400 to 980 nm [89]. Each band is 600 × 166 pixels with a spatial resolution of 1 m. Likewise, the LiDAR data has a single band of the same spatial size. The six land-cover classes and 30,414 labeled pixels are listed in Table 3 and Figure 8.

4.1.3. MUUFL Gulfport Dataset
The third dataset was collected over the Gulf Park Campus of the University of Southern Mississippi [90, 91]. The spatial size of both the HSI and LiDAR data is 325 × 220 with a spatial resolution of 1 m. After removing eight noisy bands from the original 72 bands of the HSI data, 64 spectral bands are employed in the experiments. The details are given in Table 4 and Figure 9.

4.2. Parameter Setting
In our proposed ShearSAF framework, several parameters should be carefully specified. The scale parameter of the shearlet transform is set according to the original paper. With respect to the two weight parameters of SNIC, they correspond to the spectral and spatial dimensions, respectively. Meanwhile, the number of internal endpoints is chosen to facilitate the subsequent statistical distance computation. For the reduced dimension of KPCA, the corresponding number should guarantee that 99.5% of the energy is preserved.
In fact, two parameters in the gradual region merging procedure need to be determined: the initial number of pixels inside each superpixel block of the oversegmentation map (denoted as N) and the number of regions in the final merging map (denoted as K). It is difficult to fix the final number of homogeneous regions to a single value for different datasets because of differences in object distributions, spatial complexity, and so on. Here, we propose a heuristic way to calculate K from the class number, the spatial complexity, and the spatial size of the scene, taking the floor of the resulting value (equation (39)). The spatial complexity is defined as follows: the Sobel operator is applied to the three normalized principal components of the HSI and to the normalized LiDAR data to calculate their gradients, and the mean of the absolute gradient values is then used as the spatial complexity. By this heuristic method, a value of K is obtained for each of the Houston, Trento, and MUUFL Gulfport datasets.
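The Sobel-based spatial complexity can be sketched for a single normalized band as below; aggregating over the three principal components and the LiDAR band, and the exact normalization inside the heuristic for K, are specific to equation (39) and therefore omitted. The function name is illustrative:

```python
import numpy as np

def spatial_complexity(img):
    """Mean absolute Sobel gradient of a normalized single-band image (a sketch).

    Gradients are taken with the horizontal and vertical Sobel kernels
    via explicit shift-and-accumulate (edge-replicated padding), and the
    mean of |gx| + |gy| over all pixels is returned.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = img.shape
    pad = np.pad(img, 1, mode='edge')
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            patch = pad[dy:dy + h, dx:dx + w]
            gx += kx[dy, dx] * patch
            gy += ky[dy, dx] * patch
    return (np.abs(gx) + np.abs(gy)).mean()
```

A flat image has zero complexity, while any step edge yields a positive value.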
To prove the effectiveness of our strategy, we conduct a series of experiments to track the process of gradual region merging and record the overall accuracy (OA, computed by dividing the number of correctly predicted samples by the number of testing ones) while varying K and N. Figure 10 shows how the OA varies with the parameters K and N for the Houston, Trento, and MUUFL Gulfport datasets. Here, the parameter N ranges from 20 to 100 with a step of 10, while the parameter K ranges from 50 to 600 with a step of 50 for the Trento and MUUFL Gulfport datasets and from 1000 to 6000 with a step of 500 for the Houston dataset. As far as the small sample set scenario is concerned, only 3, 5, 10, and 15 samples per class are randomly chosen from the labeled set, and the remaining labeled samples are used for testing. Each experiment is executed 20 times to obtain the mean value. It can be seen from Figure 10 that the OA is best when K is, respectively, 3500, 100, and 200 for the three datasets, which is close to the values heuristically calculated by equation (39).
Two more observations can be drawn from Figure 10. First, the OA increases at first and then decreases as K decreases. This is reasonable because adjacent regions containing similar objects are merged at the beginning, improving feature performance, while adjacent regions containing different objects start to be merged once K falls below a critical value, leading to a decline in classification performance. Second, the OA increases slightly as N decreases on the three datasets. In fact, the parameter N is used to ensure the homogeneity of each oversegmented region, so too many pixels inside a superpixel region would decrease its homogeneity. In the experiments, N is set as 50 for the three datasets, which not only keeps the regions homogeneous but also speeds up the region merging. To be clearer, Figure 11 illustrates the result of the gradual region merging procedure on the three datasets; it can be easily observed that the structure information of various materials is well represented.
At last, the parameter settings of the proposed ShearSAF approach are summarized in Table 5. Apparently, all the parameters involved can be either preset and kept unchanged for different experimental datasets or heuristically computed (such as K); hence, the robustness and generalization ability of ShearSAF can be guaranteed, which is a distinct advantage of the proposed approach.
4.3. Ablation Analysis
In this part, two ablation experiments are carried out to validate the effectiveness and superiority of the proposed structure-aware filtering scheme. On the one hand, our ShearSAF is compared with fixed-size mean filters whose kernel sizes range from 1 to 55 with a step of 2. That is, the features are obtained by convolving the KPCA-reduced HSI and LiDAR with a mean filter of fixed spatial size, and the RF classifier is then employed. Similarly, each experiment is executed 20 times due to the small training sample scenario, and the OA of the mean filters with different sizes on the Houston, Trento, and MUUFL Gulfport datasets is illustrated in Figure 12. It should be mentioned that the four curves from bottom to top (blue, red, green, and black) indicate the performance of the fixed-size mean filters under the conditions of 3, 5, 10, and 15 training samples per class, respectively; correspondingly, the horizontal dotted lines from bottom to top (blue, red, green, and black) represent the performance of ShearSAF with the same training sets.
It can be easily observed from Figure 12 that the OA of each curve rises while the filter size is relatively small. Analytically, the ability to filter noise and abnormal points improves as the kernel size increases, since more neighborhood relations are considered. The OA then drops as the filter size continues to increase, because an ever-larger filter damages the feature representation at the junctions between objects. Moreover, it can be clearly seen that our ShearSAF approach always shows the best performance, implying that our structure-aware filter design does protect the edges while filtering noise in the center regions. Besides, it is worth mentioning that the optimal kernel size is inconsistent across datasets: as illustrated in Figure 12, the optimal filter size is 9, 7, and 3 for Houston, Trento, and MUUFL Gulfport, respectively, so the filter size is hard to determine in advance in practice. Alternatively, our structure-aware filter design can automatically adjust the filter size according to the well-designed scale map and achieve higher accuracy, indicating the advantage and feasibility of the proposed ShearSAF approach.
The combination of our structure-aware design with other filters is also examined, as illustrated in Figure 13. Here, both Gaussian and Gabor filters are taken into consideration. Specifically, ShearSAF-Gaussian means that a two-dimensional (2D) Gaussian filter with structure-aware size is applied to the stacked HSI and LiDAR data. In other words, we obtain the scale map in the same way as in ShearSAF, and each point in the scale map gives the corresponding Gaussian filter size. The structure-aware Gaussian filters are then convolved with the stacked HSI and LiDAR data to obtain the related features, and the RF classifier is utilized for classification. Similarly, a series of 2D Gabor filters (four scales and six orientations) with adaptive spatial size is applied to the stacked HSI and LiDAR data for feature extraction, called ShearSAF-Gabor. It can be seen from Figure 13 that ShearSAF-Gabor performs better than ShearSAF-Gaussian on the Trento dataset, while the opposite is observed on the Houston and MUUFL Gulfport datasets. This is reasonable since the spatial distribution of objects in the Trento dataset (as shown in Figure 8) is more regular than in the other two datasets (as shown in Figures 7 and 9), so the features obtained by the 2D Gabor filters with various orientations and scales can be more specific than those extracted by the Gaussian filter. Furthermore, the proposed ShearSAF consistently achieves the best results on all three datasets, validating the importance and suitability of the simple mean filter for our ShearSAF approach.
5. Experimental Results
In this section, a number of state-of-the-art feature extraction and fusion algorithms are compared with the proposed ShearSAF approach. Firstly, two baseline methods, the RF classifier on the raw HSI data (named RawH) and on the concatenation of both HSI and LiDAR data (named Raw), are used as benchmarks. Secondly, three deep learning-based methods are taken into consideration for HSI and LiDAR data classification: 3DCNN (3D convolutional neural network [34], a classic deep learning-based method that can simultaneously capture joint spatial-spectral information), miniGCN (minibatch graph convolutional network [92], an emerging deep learning-based method that allows large-scale GCNs to be trained in a minibatch fashion), and SAELR (stacked autoencoder with logistic regression [28], an autoencoder-based deep learning method that can preserve abstract and invariant information in deeper features). Thirdly, five widely used feature extraction and fusion algorithms, namely, NMFL (nonlinear multiple feature learning-based classification [93], which explores different types of available features in a collaborative and flexible way), EMAP (extended morphological attribute profile [94]), GGF (generalized graph-based fusion [22]), EPCA (a novel ensemble classifier [95]), and OTVCA (orthogonal total variation component analysis [96], which obtains the best low-rank representation and shows strong antinoise ability), are also applied to both HSI and LiDAR data. For the classification task, 3 to 15 samples per class are randomly selected from the labeled dataset to form the training set, while the rest are used as the testing set. At the same time, each experiment is run twenty times to reduce the effects of random factors, and both the mean values and standard deviations are reported. Besides the OA measure, the kappa coefficient (κ), which accounts for chance agreement among the classes, is also adopted to evaluate the classification performance.
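The two evaluation measures can be computed directly from the prediction results; a minimal numpy sketch (function names are illustrative):

```python
import numpy as np

def overall_accuracy(y_true, y_pred):
    """OA: correctly predicted samples divided by the number of test samples."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float((y_true == y_pred).mean())

def kappa(y_true, y_pred):
    """Cohen's kappa coefficient computed from the confusion matrix."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    classes = np.unique(np.concatenate([y_true, y_pred]))
    n = len(y_true)
    cm = np.zeros((len(classes), len(classes)))
    for i, c in enumerate(classes):
        for j, d in enumerate(classes):
            cm[i, j] = np.sum((y_true == c) & (y_pred == d))
    po = np.trace(cm) / n                                 # observed agreement
    pe = np.sum(cm.sum(axis=1) * cm.sum(axis=0)) / n ** 2  # chance agreement
    return float((po - pe) / (1 - pe))
```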
Figures 14–16 show the OA and κ of the eleven compared methods, including RawH, Raw, 3DCNN, miniGCN, SAELR, NMFL, EPCA, GGF, EMAP, OTVCA, and our ShearSAF, when the number of training samples per class ranges from 3 to 15. It should be noted that the OA obtained with the LiDAR data alone is much smaller than that of the other methods; thus, it is not included for comparison. Generally, classification performs better as the number of training samples grows for the three datasets. Compared to the Raw method, RawH, which uses only the HSI data, shows lower classification accuracies, confirming that the supplementary LiDAR information can improve the performance of HSI classification. Specifically, HSI data provides abundant spectral information for distinguishing materials with different physical properties, while LiDAR provides shape and height information that can be used to distinguish different targets of the same material. Among the compared methods, the proposed ShearSAF always yields the highest results, which is reasonable since the designed structure-aware filters can shrink to avoid interclass interference near edges and introduce more neighborhood information to reduce environmental effects at the region centers.
In addition, it should be noted that the deep learning-based methods behave poorly on the three datasets when the training samples are limited. Specifically, SAELR gives the worst performance on the Houston and MUUFL Gulfport datasets, while 3DCNN performs the worst on the Trento dataset and the second worst on the Houston dataset. As for miniGCN, it also falls below the traditional classification methods in most cases. Analytically, deep learning-based methods usually need a large quantity of training samples to fit the enormous number of parameters during model training, but the small sample sets in the experiments significantly limit their performance. Meanwhile, the training process of deep learning methods requires considerable time as well.
Furthermore, when there are only five training samples per class, the classification performances, including the per-class accuracies, OA, and κ of the eleven methods, are summarized in Tables 6–8 for the Houston, Trento, and MUUFL Gulfport datasets, respectively. It can be seen that ShearSAF delivers the best performance in most cases, which confirms the superiority of our method. In more detail, considering the C5 class (vineyard) of the Trento dataset, it can be found from the ground-truth map (Figure 8) that the spatial distribution of C5 is very regular, and ShearSAF effectively filters the noise within the area while protecting the edges; thus, the accuracy increases from 72.58% for the Raw method to 98.23% for our approach, as illustrated in Table 7. Alternatively, concerning the C10 class (yellow curbs) of the MUUFL Gulfport dataset in Table 8, its samples are scattered across the scene and are even hard to see in Figure 9. Although our method is not optimal on the C10 class, the structure-aware filter does work by reducing its own size and keeping the target information from being interfered with by neighboring objects. To illustrate, the ground truth and the complete classification maps of the eleven compared methods for all three datasets are shown in Figures 17–19. It can be easily observed that our ShearSAF approach stands out from the others, demonstrating the effectiveness of the proposed method.



Finally, when there are five training samples per class, the computation time is given in Table 9, as recorded on a workstation with a 24-core Intel processor at 2.20 GHz and 128 GB of RAM. As expected, the deep learning-based methods (3DCNN, miniGCN, and SAELR) take more time than the others because model training and parameter optimization require considerable time. It can be observed that the time cost of our ShearSAF method is lower than that of the other methods, mainly because the structure-aware feature extraction procedure is independent of the training set. That is to say, the feature extraction procedure in ShearSAF is executed only once, and the RF classifier has a low computational cost; therefore, the proposed ShearSAF method is computationally efficient and applicable to remote sensing images with large spatial sizes, which proves the superiority of our method once again.

6. Conclusions
In this paper, a newly designed shearlet-based structure-aware filtering approach has been proposed for HSI and LiDAR feature extraction. Specifically, the shearlet transform is applied to the KPCA-reduced HSI and LiDAR data for area and texture feature extraction. Then, the spectral, area, and texture features are used to guide the gradual region merging procedure, which converts the initial oversegmentation map into a final merging map in which the spatial structure of objects is well characterized. By calculating the edge distances in the final merging map, the scale map is acquired, which is utilized to adaptively select the filter size for convolution. Finally, the RF classifier is used for classification.
In summary, the most important contribution of this article is the structure-aware filtering design. In this process, we innovatively propose a shearlet-based area and texture feature representation that can effectively measure the distance between two adjacent areas. At the same time, the structure-aware filter is constructed in an elegant manner that ensures a pixel near an edge has a small kernel to protect its information from being disturbed by nearby objects, while a point at the center of an area has a larger kernel to filter noise and abnormal points. Two ablation experiments with various fixed-size mean filters and other adaptive-size filters (Gaussian and Gabor) demonstrate the effectiveness of the proposed ShearSAF method. Meanwhile, comparisons with several state-of-the-art methods (3DCNN, miniGCN, SAELR, NMFL, EPCA, GGF, EMAP, and OTVCA) consistently show the superiority of the proposed ShearSAF approach. At last, we emphasize that the structure-aware filtering design presented here can be further combined with other kinds of features. For instance, manifold learning-based methods, such as LLE (locally linear embedding) and ISOMAP (isometric mapping), can be used for dimension reduction and feature extraction. Furthermore, the structure-aware filter design pattern can be integrated with other center-based filters (including Gaussian and median filters) to extract discriminative features and improve the robustness of the whole framework. All these aspects are worthy of further attention.
Data Availability
The codes of this work are available at http://jiasen.tech/papers/.
Conflicts of Interest
The authors declare no conflicts of interest.
Authors’ Contributions
S. Jia, Z. Zhan, and M. Xu proposed the method. Z. Zhan implemented the experiments. S. Jia, Z. Zhan, and M. Xu wrote the manuscript. All authors read and approved the final manuscript.
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China under Grant 41971300 and Grant 61901278; in part by the Key Project of Department of Education of Guangdong Province under Grant 2020ZDZX3045; and in part by the Shenzhen Scientific Research and Development Funding Program under Grant JCYJ20180305124802421 and Grant JCYJ20180305125902403.
References
 J. Richards, Remote Sensing Digital Image Analysis: An Introduction, Springer, 2013. View at: Publisher Site
 G. CampsValls, D. Tuia, L. GómezChova, S. Jiménez, and J. Malo, “Remote Sensing Image Processing,” Synthesis Lectures on Image, Video, and Multimedia Processing, vol. 5, no. 1, pp. 1–192, 2011. View at: Publisher Site  Google Scholar
 J. BioucasDias, A. Plaza, N. Dobigeon et al., “Hyperspectral unmixing overview: geometrical, statistical, and sparse regressionbased approaches,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 5, no. 2, pp. 354–379, 2012. View at: Publisher Site  Google Scholar
 J. BioucasDias, A. Plaza, G. CampsValls, P. Scheunders, N. Nasrabadi, and J. Chanussot, “Hyperspectral remote sensing data analysis and future challenges,” IEEE Geoscience and Remote Sensing Magazine, vol. 1, no. 2, pp. 6–36, 2013. View at: Publisher Site  Google Scholar
 M. Khodadadzadeh, J. Li, S. Prasad, and A. Plaza, “Fusion of hyperspectral and LiDAR remote sensing data using multiple feature learning,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 8, no. 6, pp. 2971–2983, 2015. View at: Publisher Site  Google Scholar
 M. Kishore and S. Kulkarni, “Approches and challenges in classification for hyperspectral data: a review,” in 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT), pp. 3418–3421, Chennai, India, 2016. View at: Publisher Site  Google Scholar
 S. Jia, Z. Zhu, L. Shen, and Q. Li, “A twostage feature selection framework for hyperspectral image classification using few labeled samples,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 7, no. 4, pp. 1023–1035, 2014. View at: Publisher Site  Google Scholar
 Y. Zhou, J. Peng, and C. Chen, “Dimension reduction using spatial and spectral regularized local discriminant embedding for hyperspectral image classification,” IEEE Transactions on Geoscience and Remote Sensing, vol. 53, no. 2, pp. 1082–1095, 2015. View at: Publisher Site  Google Scholar
 P. Hartzell, C. Glennie, and S. Khan, “Terrestrial hyperspectral image shadow restoration through lidar fusion,” Remote Sensing, vol. 9, no. 5, p. 421, 2017. View at: Publisher Site  Google Scholar
 S. Sun and C. Salvaggio, “Aerial 3d building detection and modeling from airborne LiDAR point clouds,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 6, no. 3, pp. 1440–1449, 2013. View at: Publisher Site  Google Scholar
 C. Paris and L. Bruzzone, “A threedimensional modelbased approach to the estimation of the tree top height by fusing lowdensity LiDAR data and very high resolution optical images,” IEEE Transactions on Geoscience and Remote Sensing, vol. 53, no. 1, pp. 467–480, 2015. View at: Publisher Site  Google Scholar
 P. Ghamisi and B. Höfle, “LiDAR data classification using extinction profiles and a composite kernel support vector machine,” IEEE Geoscience and Remote Sensing Letters, vol. 14, no. 5, pp. 659–663, 2017. View at: Publisher Site  Google Scholar
 J. Rau, J. Jhan, and Y. Hsu, “Analysis of oblique aerial images for land cover and point cloud classification in an urban environment,” IEEE Transactions on Geoscience and Remote Sensing, vol. 53, no. 3, pp. 1304–1319, 2015. View at: Publisher Site  Google Scholar
 M. Soleimanzadeh, A. Karami, and P. Scheunders, “Fusion of hyperspectral and LiDAR images using nonsubsampled shearlet transform,” in IEEE Int. Geosci. Remote Sens. Symp. (IGARSS), pp. 8873–8876, Valencia, Spain, 2018. View at: Google Scholar
 M. Zhang, P. Ghamisi, and W. Li, “Classification of hyperspectral and LiDAR data using extinction profiles with feature fusion,” Remote Sensing Letters, vol. 8, no. 10, pp. 957–966, 2017. View at: Publisher Site  Google Scholar
 B. Bigdeli and P. Pahlavani, “A Dempster Shaferbased fuzzy multisensor fusion system using airborne LiDAR and hyperspectral imagery,” International Journal of Remote Sensing, vol. 39, no. 21, pp. 7718–7737, 2018. View at: Publisher Site  Google Scholar
 C. Ge, Q. Du, W. Li, Y. Li, and W. Sun, “Hyperspectral and LiDAR data classification using kernel collaborative representation based residual fusion,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 12, no. 6, pp. 1963–1973, 2019. View at: Publisher Site  Google Scholar
 C. Debes, A. Merentitis, R. Heremans et al., “Hyperspectral and LiDAR data fusion: outcome of the 2013 GRSS data fusion contest,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 7, no. 6, pp. 2405–2418, 2014. View at: Publisher Site  Google Scholar
 Z. Zhong, B. Fan, K. Ding, H. Li, S. Xiang, and C. Pan, “Efficient multiple feature fusion with hashing for hyperspectral imagery classification: a comparative study,” IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 8, pp. 4461–4478, 2016. View at: Google Scholar
 B. Rasti, P. Ghamisi, and R. Gloaguen, “Hyperspectral and LiDAR fusion using extinction profiles and total variation component analysis,” IEEE Transactions on Geoscience and Remote Sensing, vol. 55, no. 7, pp. 3997–4007, 2017. View at: Publisher Site  Google Scholar
 W. Chen, X. Dai, B. Pan, and T. Huang, “A novel discriminant criterion based on feature fusion strategy for face recognition,” Neurocomputing, vol. 159, pp. 67–77, 2015. View at: Publisher Site  Google Scholar
 W. Liao, A. Pizurica, R. Bellens, S. Gautama, and W. Philips, "Generalized graph-based fusion of hyperspectral and LiDAR data using morphological features," IEEE Geoscience and Remote Sensing Letters, vol. 12, no. 3, pp. 552–556, 2015.
 Z. Ye, S. Prasad, W. Li, J. Fowler, and M. He, "Classification based on 3D DWT and decision fusion for hyperspectral image analysis," IEEE Geoscience and Remote Sensing Letters, vol. 11, no. 1, pp. 173–177, 2014.
 K. Schindler, "An overview and comparison of smooth labeling methods for land-cover classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 50, no. 11, pp. 4534–4545, 2012.
 W. Liao, R. Bellens, A. Pizurica, S. Gautama, and W. Philips, "Combining feature fusion and decision fusion for classification of hyperspectral and LiDAR data," in 2014 IEEE Geoscience and Remote Sensing Symposium (IGARSS), pp. 1241–1244, Quebec City, QC, Canada, 2014.
 R. Luo, W. Liao, H. Zhang, Y. Pi, and W. Philips, "Classification of cloudy hyperspectral image and LiDAR data based on feature fusion and decision fusion," in 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pp. 2518–2521, Beijing, China, 2016.
 R. Luo, W. Liao, H. Zhang et al., "Fusion of hyperspectral and LiDAR data for classification of cloud-shadow mixed remote sensed scene," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 10, no. 8, pp. 3768–3781, 2017.
 Y. Chen, Z. Lin, X. Zhao, G. Wang, and Y. Gu, "Deep learning-based classification of hyperspectral data," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 7, no. 6, pp. 2094–2107, 2014.
 K. Makantasis, K. Karantzalos, A. Doulamis, and N. Doulamis, "Deep supervised learning for hyperspectral data classification through convolutional neural networks," in 2015 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pp. 4959–4962, Milan, Italy, 2015.
 M. Zhang, W. Li, and Q. Du, "Diverse region-based CNN for hyperspectral image classification," IEEE Transactions on Image Processing, vol. 27, no. 6, pp. 2623–2634, 2018.
 A. Ben Hamida, A. Benoit, P. Lambert, and C. Ben Amar, "3D deep learning approach for remote sensing image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 56, no. 8, pp. 4420–4434, 2018.
 L. Mou, P. Ghamisi, and X. X. Zhu, "Deep recurrent neural networks for hyperspectral image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 55, no. 7, pp. 3639–3655, 2017.
 A. Ma, A. Filippi, Z. Wang, and Z. Yin, "Hyperspectral image classification using similarity measurements-based deep recurrent neural networks," Remote Sensing, vol. 11, no. 2, p. 194, 2019.
 Y. Li, H. Zhang, and Q. Shen, "Spectral-spatial classification of hyperspectral imagery with 3D convolutional neural network," Remote Sensing, vol. 9, no. 1, p. 67, 2017.
 P. Ghamisi, B. Höfle, and X. X. Zhu, "Hyperspectral and LiDAR data fusion using extinction profiles and deep convolutional neural network," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 10, no. 6, pp. 3011–3024, 2017.
 Q. Cao, Y. Zhong, A. Ma, and L. Zhang, "Urban land use/land cover classification based on feature fusion fusing hyperspectral image and LiDAR data," in 2018 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pp. 8869–8872, Valencia, Spain, 2018.
 S. Yu, S. Jia, and C. Xu, "Convolutional neural networks for hyperspectral image classification," Neurocomputing, vol. 219, pp. 88–98, 2017.
 B. Liu, X. Yu, P. Zhang, X. Tan, A. Yu, and Z. Xue, "A semi-supervised convolutional neural network for hyperspectral image classification," Remote Sensing Letters, vol. 8, no. 9, pp. 839–848, 2017.
 H. Wu and S. Prasad, "Semi-supervised deep learning using pseudo labels for hyperspectral image classification," IEEE Transactions on Image Processing, vol. 27, no. 3, pp. 1259–1270, 2018.
 B. Pan, Z. Shi, and X. Xu, "R-VCANet: a new deep-learning-based hyperspectral image classification method," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 10, no. 5, pp. 1975–1986, 2017.
 B. Pan, Z. Shi, and X. Xu, "MugNet: deep learning for hyperspectral image classification using limited samples," ISPRS Journal of Photogrammetry and Remote Sensing, vol. 145, pp. 108–119, 2018.
 K. Guo, D. Labate, W.-Q. Lim, G. Weiss, and E. Wilson, "Wavelets with composite dilations and their MRA properties," Applied and Computational Harmonic Analysis, vol. 20, no. 2, pp. 202–236, 2006.
 G. Easley, D. Labate, and W.-Q. Lim, "Sparse directional image representations using the discrete shearlet transform," Applied and Computational Harmonic Analysis, vol. 25, no. 1, pp. 25–46, 2008.
 S. Jia, L. Shen, J. Zhu, and Q. Li, "A 3D Gabor phase-based coding and matching framework for hyperspectral imagery classification," IEEE Transactions on Cybernetics, vol. 48, no. 4, pp. 1176–1188, 2018.
 S. Jia, Z. Lin, B. Deng, J. Zhu, and Q. Li, "Cascade superpixel regularized Gabor feature fusion for hyperspectral image classification," IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 5, pp. 1638–1652, 2020.
 E. Candès, L. Demanet, D. Donoho, and L. Ying, "Fast discrete curvelet transforms," Multiscale Modeling & Simulation, vol. 5, no. 3, pp. 861–899, 2006.
 J. Ma and G. Plonka, "A review of curvelets and recent applications," IEEE Signal Processing Magazine, vol. 27, 2011.
 A. L. Da Cunha, J. Zhou, and M. N. Do, "The nonsubsampled contourlet transform: theory, design, and applications," IEEE Transactions on Image Processing, vol. 15, no. 10, pp. 3089–3101, 2006.
 W. Lim, "The discrete shearlet transform: a new directional transform and compactly supported shearlet frames," IEEE Transactions on Image Processing, vol. 19, no. 5, pp. 1166–1180, 2010.
 G. R. Easley, D. Labate, and W. Lim, "Optimally sparse image representations using shearlets," in 2006 Fortieth Asilomar Conference on Signals, Systems and Computers, pp. 974–978, Pacific Grove, CA, USA, 2006.
 P. S. Negi and D. Labate, "3D discrete shearlet transform and video processing," IEEE Transactions on Image Processing, vol. 21, no. 6, pp. 2944–2954, 2012.
 W. Lim, "Nonseparable shearlet transform," IEEE Transactions on Image Processing, vol. 22, no. 5, pp. 2056–2065, 2013.
 S. Yi, D. Labate, G. R. Easley, and H. Krim, "A shearlet approach to edge analysis and detection," IEEE Transactions on Image Processing, vol. 18, no. 5, pp. 929–941, 2009.
 K. Guo, D. Labate, and W.-Q. Lim, "Edge analysis and identification using the continuous shearlet transform," Applied and Computational Harmonic Analysis, vol. 27, no. 1, pp. 24–46, 2009.
 M. A. Duval-Poo, F. Odone, and E. De Vito, "Edges and corners with shearlets," IEEE Transactions on Image Processing, vol. 24, no. 11, pp. 3768–3780, 2015.
 G. R. Easley, D. Labate, and F. Colonna, "Shearlet-based total variation diffusion for denoising," IEEE Transactions on Image Processing, vol. 18, no. 2, pp. 260–268, 2009.
 S. Häuser and G. Steidl, "Convex multiclass segmentation with shearlet regularization," International Journal of Computer Mathematics, vol. 90, no. 1, pp. 62–81, 2013.
 Y. Li, L. Po, C. Cheung et al., "No-reference video quality assessment with 3D shearlet transform and convolutional neural networks," IEEE Transactions on Circuits and Systems for Video Technology, vol. 26, no. 6, pp. 1044–1057, 2016.
 M. Zaouali, S. Bouzidi, and E. Zagrouba, "3D shearlet transform based feature extraction for improved joint sparse representation HSI classification," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 11, no. 4, pp. 1306–1314, 2018.
 H. Rezaei, A. Karami, and P. Scheunders, "Hyperspectral and multispectral image fusion based on spectral matching in the shearlet domain," in 2018 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pp. 8070–8073, Valencia, Spain, 2018.
 A. Moore, S. Prince, J. Warrell, U. Mohammed, and G. Jones, "Superpixel lattices," in 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8, Anchorage, AK, USA, 2008.
 W. Wang, D. Xiang, Y. Ban, J. Zhang, and J. Wan, "Superpixel segmentation of polarimetric SAR images based on integrated distance measure and entropy rate method," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 10, no. 9, pp. 4045–4058, 2017.
 F. Meng, H. Li, Q. Wu, B. Luo, C. Huang, and K. N. Ngan, "Globally measuring the similarity of superpixels by binary edge maps for superpixel clustering," IEEE Transactions on Circuits and Systems for Video Technology, vol. 28, no. 4, pp. 906–919, 2018.
 S. Patel and B. Kadhiwala, "Comparative analysis of cluster-based superpixel segmentation techniques," in 2018 2nd International Conference on Trends in Electronics and Informatics (ICOEI), pp. 1454–1459, Tirunelveli, India, 2018.
 R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Süsstrunk, "SLIC superpixels compared to state-of-the-art superpixel methods," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 11, pp. 2274–2282, 2012.
 R. Achanta and S. Süsstrunk, "Superpixels and polygons using simple non-iterative clustering," in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4895–4904, Honolulu, HI, USA, 2017.
 J. Shi and J. Malik, "Normalized cuts and image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888–905, 2000.
 M. Liu, O. Tuzel, S. Ramalingam, and R. Chellappa, "Entropy rate superpixel segmentation," in CVPR 2011, pp. 2097–2104, Colorado Springs, CO, USA, 2011.
 Z. Hu, Q. Zou, and Q. Li, "Watershed superpixel," in 2015 IEEE International Conference on Image Processing (ICIP), pp. 349–353, Quebec City, QC, Canada, 2015.
 N. Zhang and L. Zhang, "SSGD: superpixels using the shortest gradient distance," in 2017 IEEE International Conference on Image Processing (ICIP), pp. 3869–3873, Beijing, China, 2017.
 Y. Guo, L. Jiao, S. Wang, S. Wang, F. Liu, and W. Hua, "Fuzzy superpixels for polarimetric SAR images classification," IEEE Transactions on Fuzzy Systems, vol. 26, no. 5, pp. 2846–2860, 2018.
 C. Wu, J. Zheng, Z. Feng et al., "Fuzzy SLIC: fuzzy simple linear iterative clustering," IEEE Transactions on Circuits and Systems for Video Technology, p. 1, 2020.
 S. Jia, X. Deng, J. Zhu, M. Xu, J. Zhou, and X. Jia, "Collaborative representation-based multiscale superpixel fusion for hyperspectral image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 10, pp. 7770–7784, 2019.
 Q. Leng, H. Yang, J. Jiang, and Q. Tian, "Adaptive multiscale segmentations for hyperspectral image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 58, no. 8, pp. 5847–5860, 2020.
 L. Shen and S. Jia, "Three-dimensional Gabor wavelets for pixel-based hyperspectral imagery classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 49, no. 12, pp. 5039–5046, 2011.
 F. Mirzapour and H. Ghassemian, "Multiscale Gaussian derivative functions for hyperspectral image feature extraction," IEEE Geoscience and Remote Sensing Letters, vol. 13, no. 4, pp. 525–529, 2016.
 Y. Teng, Y. Zhang, Y. Chen, and C. Ti, "Adaptive morphological filtering method for structural fusion restoration of hyperspectral images," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 9, no. 2, pp. 655–667, 2016.
 S. Wu, J. Zhang, C. Shi, and W. Li, "Multiscale spectral-spatial hyperspectral image classification with adaptive filtering," in 2018 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pp. 2591–2594, Valencia, Spain, 2018.
 Z. Sun, Z. Zhang, Y. Chen, S. Liu, and Y. Song, "Frost filtering algorithm of SAR images with adaptive windowing and adaptive tuning factor," IEEE Geoscience and Remote Sensing Letters, vol. 17, no. 6, pp. 1097–1101, 2020.
 Y. Yang, W. Wan, S. Huang, F. Yuan, S. Yang, and Y. Que, "Remote sensing image fusion based on adaptive IHS and multiscale guided filter," IEEE Access, vol. 4, pp. 4573–4582, 2016.
 C. Kadam and S. B. Borse, "An improved image denoising using spatial adaptive mask filter for medical images," in 2017 International Conference on Computing, Communication, Control and Automation (ICCUBEA), pp. 1–5, Pune, India, 2017.
 D. Labate, W.-Q. Lim, G. Kutyniok, and G. Weiss, "Sparse multidimensional representation using shearlets," in Wavelets XI, San Diego, CA, USA, Aug. 2005.
 S. Häuser and G. Steidl, "Fast finite shearlet transform," 2014, https://arxiv.org/abs/1202.1773.
 M. Fauvel, J. Chanussot, and J. A. Benediktsson, "Kernel principal component analysis for the classification of hyperspectral remote sensing data over urban areas," EURASIP Journal on Advances in Signal Processing, vol. 2009, no. 1, pp. 1–14, 2009.
 H. Halim, S. Isa, and S. Mulyono, "Comparative analysis of PCA and KPCA on paddy growth stages classification," in 2016 IEEE Region 10 Symposium (TENSYMP), pp. 167–172, Bali, Indonesia, 2016.
 S. Jia, Z. Zhan, M. Zhang et al., "Multiple feature-based superpixel-level decision fusion for hyperspectral and LiDAR data classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 59, no. 2, pp. 1437–1452, 2021.
 A. Karami, R. Heylen, and P. Scheunders, "Band-specific shearlet-based hyperspectral image noise reduction," IEEE Transactions on Geoscience and Remote Sensing, vol. 53, no. 9, pp. 5054–5066, 2015.
 Z. Hu, Z. Wu, Q. Zhang, Q. Fan, and J. Xu, "A spatially-constrained color–texture model for hierarchical VHR image segmentation," IEEE Geoscience and Remote Sensing Letters, vol. 10, no. 1, pp. 120–124, 2013.
 M. Zhang, W. Li, Q. Du, L. Gao, and B. Zhang, "Feature extraction for classification of hyperspectral and LiDAR data using patch-to-patch CNN," IEEE Transactions on Cybernetics, vol. 50, no. 1, pp. 100–111, 2020.
 P. Gader, A. Zare, R. Close, J. Aitken, and G. Tuell, MUUFL Gulfport Hyperspectral and LiDAR Airborne Data Set, University of Florida, Gainesville, 2013.
 X. Du and A. Zare, "Technical report: scene label ground truth map for MUUFL Gulfport data set," Tech. Rep., University of Florida, Gainesville, 2017.
 D. Hong, L. Gao, J. Yao, B. Zhang, A. Plaza, and J. Chanussot, "Graph convolutional networks for hyperspectral image classification," IEEE Transactions on Geoscience and Remote Sensing, pp. 1–13, 2020.
 J. Li, X. Huang, P. Gamba et al., "Multiple feature learning for hyperspectral image classification," IEEE Transactions on Geoscience and Remote Sensing, vol. 53, pp. 1592–1606, 2015.
 M. Dalla Mura, J. Atli Benediktsson, B. Waske, and L. Bruzzone, "Extended profiles with morphological attribute filters for the analysis of hyperspectral data," International Journal of Remote Sensing, vol. 31, no. 22, pp. 5975–5991, 2010.
 J. Xia, N. Yokoya, and A. Iwasaki, "Fusion of hyperspectral and LiDAR data with a novel ensemble classifier," IEEE Geoscience and Remote Sensing Letters, vol. 15, no. 6, pp. 957–961, 2018.
 B. Rasti, D. Hong, R. Hang et al., "Feature extraction for hyperspectral imagery: the evolution from shallow to deep: overview and toolbox," IEEE Geoscience and Remote Sensing Magazine, vol. 8, no. 4, pp. 60–88, 2020.
Copyright
Copyright © 2021 Sen Jia et al. Exclusive Licensee Aerospace Information Research Institute, Chinese Academy of Sciences. Distributed under a Creative Commons Attribution License (CC BY 4.0).