An Efficient Hyperspectral Image Retrieval Method: Deep Spectral-Spatial Feature Extraction with DCGAN and Dimensionality Reduction Using t-SNE-Based NM Hashing
Jing Zhang, Lu Chen, Li Zhuo, Xi Liang, Jiafeng Li
2018
Remote Sensing
Hyperspectral images are one of the most important fundamental and strategic information resources, imaging the same ground object with hundreds of spectral bands ranging from the ultraviolet to the microwave. With the emergence of huge volumes of high-resolution hyperspectral images produced by all sorts of imaging sensors, processing and analysis of these images require effective retrieval techniques. Ensuring both retrieval accuracy and efficiency is a challenging task in the field of hyperspectral image retrieval. In this paper, an efficient hyperspectral image retrieval method is proposed. In principle, our method includes the following steps: (1) to build powerful representations of hyperspectral images, deep spectral-spatial features are extracted with the Deep Convolutional Generative Adversarial Networks (DCGAN) model; (2) because the deep spectral-spatial features are high-dimensional, t-Distributed Stochastic Neighbor Embedding-based Nonlinear Manifold (t-SNE-based NM) hashing is used to reduce their dimensionality by learning compact binary codes embedded on the intrinsic manifolds of the features, balancing learning efficiency against retrieval accuracy; and (3) multi-index hashing in Hamming space is used to find similar hyperspectral images. Five comparative experiments are conducted to verify the effectiveness of the deep spectral-spatial features, the dimensionality reduction of t-SNE-based NM hashing, and the similarity measurement of multi-index hashing. The experimental results on NASA datasets show that our hyperspectral image retrieval method can achieve comparable or superior performance with less computational time.
Remote Sens. 2018, 10, 271
... data, and similarity measurement matches features between the query image and other images in the database [6]. A hyperspectral image is represented as an intricate 3D image cube that records light intensity for a large number of contiguous spectral bands (typically a few tens to several hundred) from the ultraviolet to the microwave range, and thus includes not only visual features but also the most significant spectral and spatial features [7]. Handcrafted features, such as spectral, spatial, and texture features, are customarily extracted for traditional hyperspectral image retrieval.
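Step (3) of the method, similarity search over binary codes in Hamming space, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 16-bit code length, the number of substrings `m`, and the toy codes are all assumptions chosen for illustration. The sketch uses the pigeonhole idea behind multi-index hashing: if two codes differ in at most `radius` bits and `radius < m`, at least one of their `m` substrings must match exactly, so only codes sharing a substring with the query need a full Hamming comparison.

```python
from collections import defaultdict


def hamming(a, b):
    """Hamming distance between two binary codes stored as integers."""
    return bin(a ^ b).count("1")


def split_code(code, n_bits, m):
    """Split an n_bits-wide code into m substrings of equal width."""
    assert n_bits % m == 0, "n_bits must be divisible by m"
    width = n_bits // m
    mask = (1 << width) - 1
    return [(code >> (i * width)) & mask for i in range(m)]


def build_index(codes, n_bits, m):
    """Build one hash table per substring position."""
    tables = [defaultdict(list) for _ in range(m)]
    for idx, code in enumerate(codes):
        for t, sub in enumerate(split_code(code, n_bits, m)):
            tables[t][sub].append(idx)
    return tables


def search(query, codes, tables, n_bits, m, radius):
    """Return sorted (distance, index) pairs within `radius` of the query.

    Simplified case: requires radius < m, so exact substring matches
    are enough to find every true neighbor (pigeonhole principle).
    """
    assert radius < m, "this sketch only handles radius < m"
    candidates = set()
    for t, sub in enumerate(split_code(query, n_bits, m)):
        candidates.update(tables[t].get(sub, []))
    # Verify candidates with the full Hamming distance.
    hits = [(hamming(query, codes[i]), i) for i in candidates]
    return sorted(pair for pair in hits if pair[0] <= radius)


# Toy database of hypothetical 16-bit codes.
codes = [0xAAAA, 0xAAAB, 0x5555, 0xABAA]
tables = build_index(codes, n_bits=16, m=4)
result = search(0xAAAA, codes, tables, n_bits=16, m=4, radius=2)
print(result)
```

Larger search radii would require probing substrings within a small Hamming ball rather than by exact match; that extension is omitted here to keep the sketch short.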
However, hyperspectral imagery contains hundreds of bands, and the most useful information can be concentrated in a few bands after a band transform [8]. Due to the complexity of hyperspectral imagery, its feature representation requires a stronger descriptive ability [9]. The traditional low-level handcrafted features are inadequate for meeting this requirement [10, 11]. Deep Learning (DL) represents the latest research in artificial intelligence, combining low-level features to form more abstract high-level feature representations [12]. Some researchers have adopted DL networks, such as the Deep Belief Network (DBN) and the Convolutional Neural Network (CNN), to extract deep spectral-spatial features from hyperspectral images [13–15]; these networks show excellent performance when large-scale labeled samples are available. However, there is no public labeled dataset for hyperspectral image retrieval. Moreover, hyperspectral images capture the spectrum of a scene in hundreds of narrow wavelength ranges of the electromagnetic spectrum, including visible and invisible bands, so the limits of human vision make it very difficult to obtain labeled samples. Recently, as one of the most popular DL methods, the Deep Convolutional Generative Adversarial Networks (DCGAN) model has been shown to learn image features effectively in both unsupervised (learning without labeled training samples) and supervised (learning with labeled training samples) ways when labeled data is scarce [16]. Given these characteristics of hyperspectral images, DCGAN could provide a new approach for extracting more descriptive hyperspectral image features from limited labeled data. However, a deep feature usually has up to thousands of dimensions, which further aggravates the "curse of dimensionality", especially for hyperspectral images, and greatly affects retrieval performance [17].
To solve this problem, various dimensionality reduction methods have been developed, such as Principal Component Analysis (PCA), Locally Linear Embedding (LLE), and Locality Sensitive Hashing (LSH) [18–20]. Hashing techniques that map high-dimensional data points to low-dimensional compact binary codes have attracted considerable attention due to their computational and storage efficiency [18]. Manifold learning-based hashing techniques, including nonlinear and linear manifold learning-based hashing, are in contrast better able to model the intrinsic structure embedded in the original high-dimensional data [21]. In general, nonlinear manifold learning-based hashing methods are more powerful for dimensionality reduction than linear manifold techniques, as they preserve the local structure of the input data more effectively without assuming global linearity [22]. However, two problems hinder the use of nonlinear manifold learning for hashing. The first is that nonlinear manifold learning-based hashing methods are unsuitable for large datasets: constructing the neighborhood graph costs O(n^2) for n data points, so when the dataset contains 10,000 samples the computational complexity reaches O(10^8), which becomes intractable as the dataset grows. The second is the loss of semantic information in a feature after dimensionality reduction with nonlinear manifold learning-based hashing methods. How best to preserve the semantic information in this process is a practical issue [23, 24]. To address the first problem, a small representative dataset is selected to replace the entire dataset in the computation. In this way, the computational complexity does not increase even as the dataset grows, because the size of the representative dataset stays the same.
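The arithmetic behind the first problem, and the effect of working on a representative subset, can be sketched as follows. The source does not state how the representative dataset is chosen, so uniform random sampling is used here purely as an assumed placeholder; the subset size `m = 200`, the 8-dimensional toy features, and all function names are likewise hypothetical.

```python
import random


def pairwise_cost(n):
    """Number of entries in a dense n x n neighborhood matrix."""
    return n * n


def pick_representatives(n_samples, m, seed=0):
    """Assumed placeholder: choose m representative indices uniformly at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_samples), m))


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def neighborhood_matrix(points):
    """Dense pairwise squared-distance matrix, O(len(points)^2) work."""
    return [[sq_dist(p, q) for q in points] for p in points]


n, dim, m = 10_000, 8, 200  # m and dim are hypothetical values

# Toy features standing in for deep spectral-spatial features.
rng = random.Random(1)
features = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]

full_cost = pairwise_cost(n)    # 10,000^2 = 10^8 pairs on the full dataset
subset_cost = pairwise_cost(m)  # 200^2 = 40,000 pairs on the subset

reps = pick_representatives(n, m)
subset_matrix = neighborhood_matrix([features[i] for i in reps])

print(full_cost, subset_cost, len(subset_matrix))
```

Because the subset size is fixed, the cost of building the neighborhood matrix stays at m^2 no matter how large n becomes, which is the point made in the text above.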
As for the second problem, t-Distributed Stochastic Neighbor Embedding (t-SNE), the best performing of the manifold learning methods considered, is introduced; it has been shown to be effective in discovering semantic manifolds among a set of features with lower computational complexity [25]. Based on the above analysis, t-Distributed Stochastic Neighbor Embedding-based Nonlinear Manifold (t-SNE-based NM) hashing
doi:10.3390/rs10020271
fatcat:ei7mrzq3lvgl3fy6wjjveehbmm