Virtual dimensionality for hyperspectral imagery
A versatile, novel approach enables assessment of the number of spectrally distinct signatures in data covering a large spectral range.
28 September 2009, SPIE Newsroom. DOI: 10.1117/2.1200909.1749
Estimates of how many sources are present in a given data set are very difficult to obtain in a signal/image-processing context. This is particularly challenging for hyperspectral data (defined as spanning a large fraction of the electromagnetic spectrum), because many unknown signal sources are usually uncovered on the basis of significantly improved spectral resolution. Nevertheless, several theoretical criteria have been used in conjunction with sensor arrays to estimate the presence and number of signal sources. However, we have shown that they are neither appropriate nor effective.1,2
Virtual dimensionality (VD) provides an effective alternative.1,2 Its basic, underlying concept is the ‘pigeon-hole principle’.3 If we represent a signal source by a pigeon and a spectral band by a pigeon hole, we can use a given spectral band to accommodate one particular source.4 Thus, if a signal source is present in our data, we should be able to detect it in the relevant spectral band. In essence, this implies that we must calculate the eigenvalues of both the data-correlation matrix and the data-covariance matrix for the lth band. A signal source is present if their difference (the correlation eigenvalue minus the covariance eigenvalue) is positive.
A binary-composite-hypothesis test is formulated for each spectral band such that the null and alternative hypotheses (H0 and H1) represent two scenarios, corresponding to the absence and presence of a signal in the lth band, respectively. We have developed the Harsanyi-Farrand-Chang (HFC) method,5 as well as a noise-whitened version (NWHFC),1,2 on the basis of Neyman-Pearson detection theory to explore how many times the test fails for all spectral bands and for a given false-alarm probability, PF. The corresponding VD is defined as the number of failures, which indicates the number of signal sources present in the data. The VD is thus completely determined by PF.
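The band-by-band test described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published implementation. It assumes that each eigenvalue difference is approximately Gaussian under H0 and that its variance is roughly 2(correlation eigenvalue² + covariance eigenvalue²)/N for N pixels. The function name `hfc_virtual_dimensionality` and the parameter `p_fa` are hypothetical. The noise-whitening step of NWHFC is not shown.

```python
from statistics import NormalDist

import numpy as np


def hfc_virtual_dimensionality(pixels, p_fa=1e-3):
    """Sketch of an HFC-style VD estimate.

    pixels: array of shape (N pixels, L bands).
    p_fa:   false-alarm probability used to set each band's threshold.
    """
    pixels = np.asarray(pixels, dtype=float)
    n_samples, _ = pixels.shape

    # Sample correlation matrix R and covariance matrix K (R = K + mean*mean^T).
    corr = pixels.T @ pixels / n_samples
    mean = pixels.mean(axis=0)
    cov = corr - np.outer(mean, mean)

    # Sort eigenvalues in descending order so the lth entries pair up.
    eig_corr = np.sort(np.linalg.eigvalsh(corr))[::-1]
    eig_cov = np.sort(np.linalg.eigvalsh(cov))[::-1]

    # Test statistic for each band: correlation minus covariance eigenvalue.
    diff = eig_corr - eig_cov

    # ASSUMPTION: asymptotic standard deviation of each difference under H0.
    sigma = np.sqrt(2.0 * (eig_corr**2 + eig_cov**2) / n_samples)

    # Neyman-Pearson threshold: P(diff > threshold | H0) = p_fa.
    threshold = NormalDist().inv_cdf(1.0 - p_fa) * sigma

    # VD is the number of bands in which H0 is rejected.
    return int(np.sum(diff > threshold))
```

Calling `hfc_virtual_dimensionality(cube.reshape(-1, n_bands), p_fa=1e-3)` on a hypothetical image cube of shape (rows, columns, n_bands) returns one VD estimate. Repeating the call over a range of `p_fa` values shows how the estimate depends on the chosen false-alarm probability.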
Figure 1. (a) Hyperspectral digital imagery collection experiment (HYDICE) scene containing vehicles and panels. (b) Ground-truth map of the objects' recovered spatial locations.
VD has found many applications in hyperspectral-data exploitation.6,7 Here, we present a number of illustrative examples. Because it uses hundreds of contiguous spectral bands, a hyperspectral image contains vast amounts of data. Dimensionality reduction (DR) and band selection (BS) are two major approaches that take advantage of high band correlation to remove data redundancy. Both require a parameter to be chosen: the number of dimensions, q, retained after DR or the number of bands needed for BS. We have demonstrated that VD provides a good estimate of q.8 We have also shown that VD can be used to estimate the number of bands required for BS.9 In addition, VD can be used to predict how many spectrally distinct signal sources are present in the data. Two applications relevant to this question, endmember extraction and linear spectral-mixture analysis (LSMA), are of particular interest. An endmember is an idealized, pure signature specifying a spectral class. Estimating the number of endmembers present in the data is a fundamental and crucial step in understanding hyperspectral data.10,11 LSMA is one of the most widely used data-analysis techniques in remote sensing. It assumes that data vectors are linearly mixed using a set of p basic, spectrally distinct signal signatures that are known a priori to be present in the data. Unfortunately, in many real applications knowledge of these signatures is either too difficult to obtain or may not be reliable. We have demonstrated that VD then provides an effective means of estimating p and lays the foundation for unsupervised LSMA.12–14
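To make the linear-mixing assumption behind LSMA concrete, the sketch below models a pixel spectrum as a signature matrix times an abundance vector and recovers the abundances by unconstrained least squares. It is an illustration only: the function name `unmix_least_squares` is hypothetical, the p signatures are assumed to be known, and practical unmixing often adds non-negativity and sum-to-one constraints that are omitted here.

```python
import numpy as np


def unmix_least_squares(pixel, signatures):
    """Unconstrained least-squares abundance estimate for one pixel.

    pixel:      spectrum of shape (L bands,).
    signatures: matrix of shape (L bands, p signatures), assumed known.
    """
    # Solve min ||signatures @ abundances - pixel||^2 for the abundances.
    abundances, *_ = np.linalg.lstsq(signatures, pixel, rcond=None)
    return abundances


# Hypothetical example: p = 3 random signatures over L = 50 bands.
rng = np.random.default_rng(1)
example_signatures = rng.random((50, 3))
true_abundances = np.array([0.6, 0.3, 0.1])
mixed_pixel = example_signatures @ true_abundances  # noise-free linear mixture
estimated = unmix_least_squares(mixed_pixel, example_signatures)
```

In this setting p is the number of columns of the signature matrix, which is the quantity a VD estimate would supply when no prior knowledge is available.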
To demonstrate the applicability of the VD, we used several well-known hyperspectral image-data sets for illustrative experiments. Figure 1 shows a hyperspectral digital imagery collection experiment (HYDICE) scene and its ground-truth counterpart covering 200×64 pixels, acquired using 210 spectral channels ranging from 0.4 to 2.5 μm with spatial and spectral resolutions of approximately 1.56 m and 10 nm, respectively. The ground-truth map in Figure 1(b) highlights five distinct signatures, specifying the 15 panels on the left side of the scene. On the right, two different types of vehicles with sizes of 4×8 and 6×3 m² are recovered in the bottom row, while three objects are shown at the top. Thus, this particular scene reveals at least ten distinct man-made signatures in addition to background signals.
Figure 2. Airborne Visible IR Imaging Spectrometer image scenes and their ground-truth counterparts. (a) Purdue Indiana Indian Pine test site (the numbers refer to the distinct object and background classes). (b) Cuprite. A: Alunite. B: Buddingtonite. C: Calcite. K: Kaolinite. M: Muscovite. (c) Lunar Crater Volcanic Field.
Figure 2 shows three different image scenes acquired by the Airborne Visible IR Imaging Spectrometer (AVIRIS) using 224 spectral channels ranging from 0.4 to 2.5 μm with spatial and spectral resolutions of approximately 20 m and 10 nm, respectively. Figure 2(a) shows 145×145-pixel representations of the well-studied Purdue Indiana Indian Pine test site,15 covering an area of mixed agriculture and forestry in northwestern Indiana.16 The ground-truth image shows 16 distinct object classes and an additional background class. Figure 2(b) shows detection of five different types of minerals in another well-known AVIRIS scene, Cuprite.17 We used two types of Cuprite images, ‘Reflectance’ and ‘Radiance.’ Figure 2(c) is an image scene of the Lunar Crater Volcanic Field, located in northern Nye County, Nevada. There are five targets of interest, highlighted by the radiance spectra of red oxidized basaltic cinders, rhyolite, playa (dry lake), vegetation, and shade.18
The VD values estimated using the HFC and NWHFC methods for the three AVIRIS image scenes in Figure 2 are listed in Table 1. VD provides reasonable estimates of the number of signatures present in the data sets when the false-alarm probability is limited to PF ≤ 10⁻³. In addition, the NWHFC method is generally more effective than the HFC approach.
Table 1. Estimated VD on the basis of the HFC and NWHFC methods.
Estimating signal sources in hyperspectral imagery is a very challenging problem to which VD provides an effective solution. A number of alternative methods based on different criteria have also been developed19–22 (for example, to estimate the number of anomalies in anomaly detection). Their performance varies for different applications and is not necessarily superior to that achieved by the HFC/NWHFC methods. Since the latter estimate VD on the basis of the pigeon-hole principle, they effectively identify spectral rather than spatial classes (as generally done in traditional spatial-domain-based analysis). Consequently, VD may not be suitable for use with applications and image data characterized by significant spatial correlations. Therefore, we recently extended and generalized VD, providing an ‘effective spectral dimensionality.’ We are currently investigating a number of techniques related to this new concept, such as approaches based on linear spectral mixing and maximum orthogonal subspace projection.
Remote Sensing Signal and Image Processing Laboratory
University of Maryland, Baltimore County
Chein-I Chang is a professor. He received his PhD in electrical engineering from the University of Maryland at College Park. He has published over 100 journal articles, authored a book (Hyperspectral Imaging), edited two books (Recent Advances in Hyperspectral Signal and Image Processing and Hyperspectral Data Exploitation: Theory and Applications), and co-edited a fourth book (High Performance Computing in Remote Sensing). He is currently working on his second single-author book, Hyper.