Advanced Signal Processing Algorithms, Architectures, and Implementations

Wavelets and related time-scale transforms

Patrick Flandrin

Show abstract

In a very recent past, new techniques, referred to as time-scale methods and making use of the so-called wavelet transform, have been proposed for the analysis of nonstationary or time-varying signals. They are basically devoted to the description of signal time evolutions at different observation scales; this is achieved by using shifted and dilated versions of some elementary analyzing waveform along the time axis. The purpose of this paper is twofold: it is intended (1) to provide a brief overview of linear wavelet techniques (continuous and discrete transforms) and bilinear time-scale methods (time-scale energy distributions), and (2) to put them in some common perspective with existing Signal Processing tools (Gabor decompositions, constant-Q analysis, quadrature mirror filters, wideband ambiguity functions, time-frequency energy distributions). Existing or potentially relevant applications are also pointed out.

Tomographic techniques in image and signal processing

Donald L. Snyder

Show abstract

An iterative solution is given for solving deblurring problems having nonnegativity constraints through the use of methods motivated by tomographic imaging.

Higher-order statistics (spectra) and their application in signal processing

Jerry M. Mendel

Show abstract

Most real-world signals are non-Gaussian. If they were Gaussian then they could be completely characterized by their first- and second-order statistics, because the probability density function (p.d.f.) for a Gaussian signal is completely described by these statistics. Because most real-world signals are not Gaussian, we need to use more than just first- and second-order statistics, i.e., we need to use "higher-order statistics." We could use higher-order moments, e.g., triplecorrelations, quadruple-correlations, etc., or we could use cumulants. Cumulants are related to higher-order moments, but do not necessarily always equal these moments. Reasons for preferring cumulants over moments are explained below.

Recursive least-squares learning algorithms for neural networks

Paul S. Lewis, Jenq Neng Hwang

Show abstract

This paper presents the development of a pair of recursive least squares (ItLS) algorithms for online training of multilayer perceptrons which are a class of feedforward artificial neural networks. These algorithms incorporate second order information about the training error surface in order to achieve faster learning rates than are possible using first order gradient descent algorithms such as the generalized delta rule. A least squares formulation is derived from a linearization of the training error function. Individual training pattern errors are linearized about the network parameters that were in effect when the pattern was presented. This permits the recursive solution of the least squares approximation either via conventional RLS recursions or by recursive QR decomposition-based techniques. The computational complexity of the update is 0(N2) where N is the number of network parameters. This is due to the estimation of the N x N inverse Hessian matrix. Less computationally intensive approximations of the ilLS algorithms can be easily derived by using only block diagonal elements of this matrix thereby partitioning the learning into independent sets. A simulation example is presented in which a neural network is trained to approximate a two dimensional Gaussian bump. In this example RLS training required an order of magnitude fewer iterations on average (527) than did training with the generalized delta rule (6 1 BACKGROUND Artificial neural networks (ANNs) offer an interesting and potentially useful paradigm for signal processing and pattern recognition. The majority of ANN applications employ the feed-forward multilayer perceptron (MLP) network architecture in which network parameters are " trained" by a supervised learning algorithm employing the generalized delta rule (GDIt) [1 2]. The GDR algorithm approximates a fixed step steepest descent algorithm using derivatives computed by error backpropagatiori. The GDII algorithm is sometimes referred to as the backpropagation algorithm. However in this paper we will use the term backpropagation to refer only to the process of computing error derivatives. While multilayer perceptrons provide a very powerful nonlinear modeling capability GDR training can be very slow and inefficient. In linear adaptive filtering the analog of the GDR algorithm is the leastmean- squares (LMS) algorithm. Steepest descent-based algorithms such as GDR or LMS are first order because they use only first derivative or gradient information about the training error to be minimized. To speed up the training process second order algorithms may be employed that take advantage of second derivative or Hessian matrix information. Second order information can be incorporated into MLP training in different ways. In many applications especially in the area of pattern recognition the training set is finite. In these cases block learning can be applied using standard nonlinear optimization techniques [3 4 5].

Real-time SAR change-detection using neural networks

Christopher John Oliver, Richard Geoffrey White

Show abstract

This paper describes the techniques evolved at RSRE for the production of undistorted, focused synthetic aperture radar (SAR) images, target detection using a neural network method and the automatic detection of changes between pairs of SAR images. All these processes are achievable in a single pipelined process operating on an input data rate in excess of 10 Mbytes/second.

Nonlinear signal processing using radial basis functions

Terence J. Shepherd, David S. Broomhead

Show abstract

A procedure for multidimensional nonlinear modeling and interpolation is described which employs the method of radial basis function analysis. A systolic array for efficiently performing the associated computation for both the modeling and interpolation modes recursively in time is also described. Conditions are given for the further improvement of efficiency in the algorithm when the input data constitute a time series, and an associated processing structure is outlined.

Nonlinear classification and adaptive structures

Colin F. N. Cowan, Peter M. Grant, Shang-Liang Chen, et al.

Show abstract

The main purpose of this paper is to examine a number of possible architectures for nonlinear adaptive filtering specifically related to adaptive equalisation. The approach taken proceeds by first reformulating the filtering process as a form of classification task in N dimensions. In the case of filtering the dimensionality is determined by the number of data samples in the filter data input vector. The task of classification then proceeds using a number of possible strategies i. e. the multilayer perceptron Volterra series modeling and cluster analysis. The techniques are evaluated in comparison with normal linear equalisation procedures.

Global search of adaptive IIR filter error surfaces using stochastic learning automata

Clive K. K. Tang, Philip Mars

Show abstract

The multimodal nature of adaptive IIR filter error surfaces limits the use of gradient search adaptive algorithms. Recently an intelligent learning approach was suggested by the authors to tackle the problem. This paper describes further research results and shows that stochastic learning automata are capable of locating the global minimum under different conditions. Computer simulation results for a system identification application are presented to illustrate that it is possible to achieve global convergence irrespective of insufficient filter order or input colouration or both. Stability during adaptation is also maintained.

Adaptive lattice bilinear filters

Heung Ki Baik, V. John Mathews, Robert T. Short

Show abstract

This paper presents two fast least-squares lattice algorithms for adaptive nonlinear filters equipped with bilinear system models. Bilinear models are attractive for adaptive filtering applications because they can approximate a large class of nonlinear systems adequately and usually with considerable parsimony in the number of coefficients required. The lattice filter formulation transforms the nonlinear filtering problem into an equivalent multichannel linear filtering problem and then uses multichannel lattice filtering algorithms to solve the nonlinear filtering problem. The first of the two approaches is an equation-error algorithm that uses the measured desired response signal directly to compute the adaptive filter outputs. This method is conceptually very simple however it will result in biased system models in the presence of measurement noise. The second approach is an approximate least-squares output-error solution. In this case the past samples of the output of the adaptive system itself are used to produce the filter output at the current time. This approach is expected to reduce the effect of measurement noise on the behavior of the system. Results of several experiments that demonstrate and compare the properties of the adaptive bilinear filters are also presented in the paper.

Efficient beam-based adaptive processing for planar arrays

Steven G. Kratzer

Show abstract

Beam-based adaptive processing is an economical way to achieve good interference rejection performance from an adaptive receiving array, at much less computational cost than full element-based methods. However, to exploit this potential for planar arrays it is necessary to identify, in real time, which beams must be retained for adaptive concellation. This paper analyzes the beam-selection problem and presents a computationally efficient algorithm that performs real-time beam selection.

Direction finding using a modified minimum-eigenvector technique

Meng Hwa Er

Show abstract

In this paper, an adaptive algorithm for direction-finding of correlated sources is presented. The algorithm is low in computational complexity and it does not require determination of the effective rank of the array correlation matrix. The algorithm employs a gradient technique to determine the minimum eigenvector of the correlation matrix and an orthogonalization technique to determine the second minimum eigenvector. The two noise eigenvectors are then used to compute the spatial spectra. The angle of arrivals can then be found by superimposing the two spectra. To verify further the true arrivals, additional spatial spectral can be computed using a combination of the two noise eigenvectors. Numerical results show that the proposed algorithms are effective in resolving correlated sources.

Translation, rotation, and scaling invariant object and texture classification using polyspectra

Michail K. Tsatsanis, Georgios B. Giannakis

Show abstract

The problem addressed in this paper is the detection and classification of deterministic objects and random textures in a noisy scene. An energy detector is developed in the cumulant domain, by exploiting the noise insensitivity of higher-order statistics. An efficient implementation of this detector is described, using matched filtering. Its performance is analyzed using asymptotic distributions in a binary hypothesis testing framework. Object and texture classifiers are derived using higher-order statistics. They are minimum distance classifiers in the cumulant domain, and can be efficiently implemented using a bank of matched filters. Further, they are robust to additive Gaussian noise and insensitive to object shifts. Extensions, which can handle object rotation and scaling are also discussed. An alternate texture classifier is derived from an ML viewpoint, that is more efficient at the expense of complexity. The application of these algorithms to texture modeling is shown and consistent parameter estimators are obtained. Simulations are shown for both the object and the texture classification problems.

Algorithms for instantaneous frequency estimation: a comparative study

Boualem Boashash, Peter O'Shea, Morgan J. Arnold

Show abstract

This paper examines the problem of instantaneous frequency (IF) estimation for Frequency Modulated (FM) signals imbedded in white Gaussian noise. It reviews currently available techniques and in addition proposes some new ones based on a modelling of the signal phase as a polynomial. Both linear least-squares techniques and Maximum Likelihood (ML) techniques are investigated for estimating the polynomial coefficients. It is seen that the linear least squares approach is efficient (i. e. unbiased and meets the Cramer-Rao bounds) for high SNR while the ML scheme is efficient for a much larger range of SNR. Theoretical lower variance bounds are given for estimating the polynomial coefficients and are compared with the results of simulations. Guidelines are given as to which estimation method should be used for a given signal class and Signal to Noise Ratio (SNR) level.

Distributions concentrated along the instantaneous frequency

Leon Cohen

Show abstract

We show that for purely frequency modulated signals one can always find a distribution which is infinitely concentrated along its instantaneous frequency. For signals that are both frequency and amplitude modulated we consider a class of distribution which are explicit functionals of the instantaneous frequency. lated

Time-frequency filtering and synthesis from convex projections

Langford B. White

Show abstract

This paper describes the application of the theory of projections onto convex sets to time-frequency filtering and synthesis problems. We show that the class of Wigner-Ville Distributions (WVD) of L2 signals form the boundary of a closed convex subset of L2(R2). This result is obtained by considering the convex set of states on the Heisenberg group of which the ambiguity functions form the extreme points. The form of the projection onto the set of WVDs is deduced. Various linear and non-linear filtering operations are incorporated by formulation as convex projections. An example algorithm for simultaneous time-frequency filtering and synthesis is suggested.

Instantaneous frequency and kernel requirements for discrete time-frequency distributions

Jechang Jeong, Gregory S. Cunningham, William J. Williams

Show abstract

In this paper we introduce a new definition for the instantaneous frequency of a discrete-time analytic signal. Unlike the existing definition which uses only two data samples around a particular time this method utilizes all the data samples for estimating the instantaneous frequency. We prove that this quantity is identical to the average frequency evaluated at the particular time in the discrete-time TFD. This property is consistent with the analogous continuous-time property. We also derive requirements on the discrete-time kernel needed to satisfy this property. Using computer-generated signals and real data performance comparisons are made between the proposed approach and the existing one.

Optimal kernels for time-frequency analysis

Richard G. Baraniuk, Douglas L. Jones

Show abstract

Current bilinear time-frequency representations apply a fixed kernel to smooth the Wigner distribution. However the choice of a fixed kernel limits the class ofsignals that can be analyzed effectively. This paper presents optimality criteria for the design of signal-dependeni kernels that suppress cross-components while passing as much auto-component energy as possible irrespective of the form of the signal. A fast algorithm for the optimal kernel solution makes the procedure competitive computationaily with fixed kernel methods. Examples demonstrate the superior performance of the optimal kernel for a frequency modulated signal.

Segmentation and classification of nasal phonemes based on their time-frequency representation

Edgar F. Velez, Richard G. Absher

Show abstract

Time-frequency analysis of speech provides detailed information which may be used for the study of its production mechanisms and for recognition purposes. The smoothed Wigner-Ville distribution can be used as an effective speech analysis tool when a high temporal resolution is required. Smoothing demands a careful balance between resolution and suppression of interferences in the distribution while taking into account the characteristics of the speech signal. The resulting representation is applied to analyze nasal phonemes and transitions in continuous speech and shown to unveil important transitory features not revealed by the spectrogram and to result in accurate segmentation and discrimination of their place of articulation.

Kernel synthesis for generalized time-frequency distributions using the method of projections onto convex sets

Seho Oh, Robert Jackson Marks II, Les E. Atlas, et al.

Show abstract

The kernel in Cohen's generalized time frequency representation (GTFR) requires is chosen in accordance to certain desired performance attributes. Properties of the kernel are typically expressed as constraints. We establish that many commonly used constraints are convex in the sense that all allowable kernels satisfying a given constraint form a convex set. Thus, for a given set of constraints, the kernel can be designed by alternately projecting among these sets. If there exists a nonempty intersection among the constraint sets, then the theory of projeciion onto convex seis ( POCS) guarantees convergence to a point in the intersection. If the constraints can be partitioned into two sets, each with a nonempty intersection, then POCS guarantees convergence to a kernel that satisfies the inconsistent constraints with minimum mean square error.

Frequency versus time-delay approach for acquiring multiple unknown chirp signals

Hsieh-Sheng Hou, Wai-Hung Ng

Show abstract

A frequency versus timedelay (FVTD) technique is introduced to acquire unknown parameters of received chirp signals. This technique can precisely determine a single chirp clearly distinguish completely overlapped upchirp and downchirp and is also capable of realizing various overlapped multiple chirp signals. The basic implementation concept of this approach is relatively simple. We first employ a bank of bandpass filters to noncoherently process the incoming chirps. The filtered and sampled signals are then shifted into a set of frequency time and power distribution sequences which provide enough information for acquiring the unknown parameters of the received chirp signals. Examples and figures are used to illustrate this procedure.

Comparison of the Gabor and short-time Fourier transforms for signal detection and feature extraction in noisy environments

Louis Auslander, C. Buffalano, Richard S. Orr, et al.

Show abstract

Effective signal detection and feature extraction in noisy environments generally depend on exploiting some knowledge of the signal. The short-time Fourier transform and the Gabor transform are two methods that exploit signal envelope information. This paper compares the two transforms and makes the case that the Gabor representation can often be more compact, and may require substantially less computation and storage in some applications. There is a sense in which the Gabor achieves a preferential trade of SNR for resolution, and because of this, one can also expect better signal recognition and feature reconstructions from the Gabor transform in the presence of noise.

Applications of the fast wavelet transform

Christopher E. Heil

Show abstract

The fast wavelet transform is an order-N algorithm, due to S. Mallat, which performs a time and frequency localization of a discrete signal. It is based on the existence of orthonormal bases ( for the space of finite-energy signals on the real line) which are constructed from translates and dilates of a single fixed function, the "mother wavelet" (the Haar system is a classical example of such a basis; recent continuous examples with compact support are due to I. Daubechies). We discuss the derivation of the Mallat wavelet transform, give some examples showing its potential for use in edge detection or texture discrimination, and finally discuss how to generate Daubechies' orthonormal bases.

Radar signal synthesis with constraints: a group theoretic approach

Langford B. White

Show abstract

This paper addresses the problem of designing signals for general group representations subject to constraints which are formulated as convex sets in the Hilbert space of the group states. In particular, the paper considers irreducible representations in an infinite dimensional Hilbert space and derives an iterative producedure for proceeding from an arbitrary element of the Hilbert space to a state of the group subject to a priori imposed constraints with closed convex range. As examples, the paper focuses on narrowband and wideband radar ambiguity synthesis.

Source-bearing estimation and sensor positioning with the Propagator method

Sylvie Marcos, Messaoud Benidir

Show abstract

This paper presents several properties of the Propagator a linear propagation operator recently introduced for source bearing estimation problems. Among these properties the so-called Propagator method does not require any eigendecomposition of the cross-spectral matrix (CSM) of the received signals and can be performed by adaptive techniques which reduce the computational burden. Moreover the Propagator method does not require any modeling of the background noise but if the noise is spatially white its asymptotical resolution power is theoretically infinite. Besides other propagation parameters like the sensor location or the wavefront geometry can be easily determined. In the case of a uniform (equispaced sensors) rectilinear antenna a linear prediction polynomial can easily be constructed the zeros of which give estimates of the source bearings.

Wavelets, tomography, and line-segment image representations

Richard A. Altes

Show abstract

Conventional scale dependent wavelet analysis represents a signal or iniage as a superposition of translated differently scaled versions of the same basis function. When the basis function for time series analysis is a chirp with linear frequency modulation a scale dependent wavelet representation is equivalent to a sequence of projections of the signal timefrequency distribution along differently rotated lines and reconstruction of the signal from its chirped wavelet representation is analogous to tomographic reconstruction from time frequency projections. The same analogy applies in two dimensions if scaled basis functions are replaced by rotated ones such that an image is represented by a superposition of translated differently rotated versions of the same basis function. For rotation dependent wavelet analysis basis functions consisting of very long line segments yield a tomographic representation while shorter line segments yield a line segment image representation as in the primate visual cortex. Applications include binocular robot vision and synthetic aperture radar.

Truncated sampling for the Fourier-Mellin transform with applications to wideband WVD computation

Jeffrey Allen

Show abstract

The Fourier-Mellin transform (FMT) of an input function is defined as and is the magnitude squared of the Mellin transform of the magnitude squared of the Fourier transform of the input function [1]. As such the FMT is unchanged by translations and dilations of the input function. While the FMT has found applications in optical pattern recognition [3] [5] ship classification by sonar and radar [15] and image processing [10] only cursory attention has been paid to the truncation error incurred by using a finite number of samples of the input function. This paper establishes truncation bounds for computing the FMT for band-limited functions from a finite number of samples of the input function. These bounds naturally suggest an implementation of the FMT by the method of direct expansions [4] [14]. This approach readily generalizes to a direct expansion for the Wigner-Ville distribution [13] and the Q distribution [2]. 1 Principal Notation u(x) fff00 e_2tu(t)dt Fourier transform of u M(u s) fD X_i2r8() Mellin transform of u . FM(u s) M(lI(x)I2 s)________ Fourier-Mellin transform of u Q(U V f002rt U(wft)_V(w/fr) Q distribution of U and V W(U V t w) fe_i2ntY U(w + y/2) V(w y/2) dy Wigner-Ville distribution of U and V

Relationships between the Fourier transform and the wavelet transform

Howard L. Resnikoff, C. Sidney Burrus

Show abstract

This paper derives the relationship between the coefficients of the Fourier expansion of a periodic signal and the coefficients of the wavelet expansion of the same signal. The formula derived shows how the Fourier concept of frequency and the wavelet concept of scale are related and how the wavelet coefficients display the information contained in the signal in a new way. The relationship also shows how the wavelet expansion can be used to approximately calculate the Fourier coefficients.

Generalized parametric estimator for multiple-wideband signals

Qiang Wu, James P. Reilly, Kon Max Wong

Show abstract

This paper presents a generalized parametric estimator for the directions of arrival (DOA) of wide-band signals. This estimator is derived by extenting the geometrical explanation of the ML estimator of narrow-band signals to the focussed correlation matrix [1 The consistency of the estimator for estimating DOA has been proved. This estimator can be considered as a coherent signal processing method by which the computation complexity can be reduced approximately by the number of the frequency bins. We have also shown that under certain condition the proposed estimator is equivalent to the ML estimator derived by applying the likelihood principle on the Fourier coefficients of each frequency bin. Such an equivalence implies that the MLE has some inherent advantages from the perspective of improving performance and that the focussing techniques are not necessary for the ML estimator. The simulation results that the new parametric method possesses higher resolution than the eigendecomposition methods especially at a small sampling size. However the price paid for the better performance is the increase on computation complexity comparing to the eigendecomposition methods.

Adaptive Lanczos methods for recursive condition estimation

William R. Ferng, Gene H. Golub, Robert J. Plemmons

Show abstract

Estimates for the condition number of a matrix are useful in many areas of scientific computing including: recursive least squares computations optimization eigenanalysis and general nonlinear problems solved by linearization techniques where matrix modification techniques are used. The purpose of this paper is to propose an adaptive Lanczos estimator scheme which we call ale for tracking the condition number of the modified matrix over time. Applications to recursive least squares (RLS) computations using the covariance method with sliding data windows are considered. ale is fast for relatively small n - parameter problems arising in RLS methods in control and signal processing and is adaptive over time i. e. estimates at time t are used to produce estimates at time t + 1 . Comparisons are made with other adaptive and non-adaptive condition estimators for recursive least squares problems. Numerical experiments are reported indicating that ale yields a very accurate recursive condition estimator.

Updating signal subspaces

Christian H. Bischof, Gautam M. Shroff

Show abstract

We develop an algorithm for adaptively estimating the noise subspace of a data matrix as is required in signal processing applications employing the ''signal subspace'' approach. The noise subspace is estimated using a rank-revealing QR factorization instead of the more expensive singular value or eigenvalue decompositions. Using incremental condition estimation to monitor the smallest singular values of triangular matrices we can update the rank-revealing triangular factorization inexpensively when new rows are added and old rows are deleted. Experiments demonstrate that the new approach usually requires 0(n2) work to update an n x n matrix and accurately tracks the noise subspace.

CORDIC algorithms in four dimensions

Jean-Marc Delosme, Shen-Fu Hsiao

Show abstract

CORDIC algorithms offer an attractive alternative to multiply-and-add based algorithms for the implementation of two-dimensional rotations preserving either norm: (x2 + 2) or (x2 _ y2)/2 Indeed these norms whose computation is a significant part of the evaluation of the two-dimensional rotations are computed much more easily by the CORDIC algorithms. However the part played by norm computations in the evaluation of rotations becomes quickly small as the dimension of the space increases. Thus in spaces of dimension 5 or more there is no practical alternative to multiply-and-add based algorithms. In the intermediate region dimensions 3 and 4 extensions of the CORDIC algorithms are an interesting option. The four-dimensional extensions are particularly elegant and are the main object of this paper.

Eigenvalue decomposition of a cumulant tensor with applications

Pierre Comon, Jean-Francois Cardoso

Show abstract

The so-called Independent Component Analysis (ICA) raises new numerical problems of particular nature. In fact contrary to the Principal Component Analysis (PCA) ICA requires the resorting to statistics of order higher than 2 involving necessarily objects with more than two indices such as the fourth-order cumulant tensor. ICA computation may be related to the diagonalization of the cumulant tensor with some particularities stemming from the symmetries it enjoys. Two algorithms are proposed. The first connects the problem to the computation of eigenpairs of an hermitian operator defmed on the space of square matrices. The second algorithm reduces the complexity by making use of the information redundancy inherent in the cumulant tensor its convergence properties are much alike those of the Jacobi algorithm for EVD calculation. Key words: Identification Cumulant Principal Component Independent Component Mixture Contrast

Parallel algorithm for the eigenvalues and eigenvectors of a general matrix

Gautam M. Shroff

Show abstract

A new parallel Jacobi-like algorithm for computing the eigenvalues of a general complex matrix is presented. The asymptotic convergence rate of this algorithm is provably quadratic and this is also demonstrated in numerical experiments. The algorithm promises to be suitable for real-time signal processing applications. In particular the algorithm can be implemented using n2/4 processors taking O(n log2 n) time for random matrices.

Algorithm for the singular value decomposition of a matrix product

Adam W. Bojanczyk, L. Magnus Ewerbring, Franklin T. Luk, et al.

Show abstract

In this paper we propose a new algorithm for computing a singular value decomposition of a matrix product. We show that our algorithm is numerically desirable in that all relevant residual elements will be numerically small.

Computing the generalized singular value decomposition on the Connection Machine

Haesun Park, L. Magnus Ewerbring

Show abstract

A new algorithm for computing the generalized singular value decomposition that diagonalizes two matrices is introduced. First two existing algorithms are studied one due to Paige and the other to Han and Veselic. The former requires only orthogonal plane transformations resulting in two matrices with parallel rows. The latter employs nonsingular (not necessarily orthogonal) transformations to directly diagonalize the two data matrices. For many applications it is preferable to be given both the diagonal matrices and the txansformation matrices explicitly. Our algorithm has the advantages that the nonsingular transformation is readily available computation of transformations is simple convergence test is efficient in a parallel computing environment and the condition of each nonsingular transformation can be controlled. We present implementation results obtained on a Connection Machine 2 to compare our new algorithm with that of Paige.

Solving unstructured grid problems on massively parallel computers

Steven W. Hammond, Robert Schreiber

Show abstract

We present a highly parailel graph mapping technique that enables one to efficiently solve unstructured grid problems on massively parallel computers. Many implicit and explicit methods for solving discretized partial differential equations require each point in the discretization to exchange data with its neighboring points every time step or iteration. The cost of this communication can negate the high performance promised by massively parallel computing. To eliminate this bottleneck we map the graph of the irregular problem into the graph representing the interconnection topology of the computer such that the sum of the distances that the messages travel is minimized. We show that using our heuristic mapping algorithm significantly reduces the communication time compared to a naive assignment of processes to processors.

FFT communications requirement optimizations on massively parallel architectures with local and global interprocessor communications capabilities

Christopher L. Kuszmaul

Show abstract

Fast Fourier Transforms [1] Batcher sorting [2] Cyclic Reduction [3] and a host of other recursively defined divide and conquer style algorithms can be implemented on massively parallel computers which provide for rapid communications between data elements whose indices differ by a power of two. This paper addresses the general issue of how two different communication mechanisms one Global and one Local can provide for hybrid performance that substantially exceeds what either could provide separately. In particular power of two communications schemes are explored for the MP-1 family ofmassively parallel computers. By using a combination of the eight way nearest neighbor toroidally wrapped grid and the Global Router on an MP-1 1200 series computer with 16 processors (PEs) the communications requirements for a 16 point FFT are shown to require less than 2 milliseconds.

Systolic array for Kalman filtering with algorithm-based fault tolerance

William F. Mitchell

Show abstract

The Kalman filter is an optimal recursive estimator for linear dynamic systems with white noise. Many digital signal processing applications require the real time computation of the Kalman filter e. g. tracking an aircraft with radar. Consequently many papers have been published on the use of parallel processing technology in the form of systolic arrays for fast computation of the Kalman filter. This paper presents a new systolic array implementation of the Kalman filter that is not excessive in either hardware or computation steps. For a dynamic system with N states and M observation components the array uses N(N-i-1) processors and about 4N+6M computation steps. In some applications it is also required that the processing system continue to function even after some of the components of the system fail. The Kalman filter systolic array is extended to one that is tolerant of faults in the processing elements of the array by using techniques of algorithm-based fault tolerance. Overhead for fault tolerance is about 47 additional hardware and 17 additional computational steps in the example of radar tracking.

Multiscale signal detection in fractional Brownian motion

Ahmed H. Tewfik, Myoungjin Kim

Show abstract

A multiresolution model of a discrete fractional Brownian motion is developed. The model leads to a multiscale algorithm for constructing the optimal filter that must be used in detection problems involving a fractional Brownian motion and white noise.

Radar super-range resolution and Bragg cell interferometry

Theagenis J. Abatzoglou, Larry K. Lam

Show abstract

This paper describes a Bragg Cell Interferometer (BCI) that is designed to measure a signal''s amplitude and phase spectrum precisely. The output of the BCI can be applied to a Superresolution range-Doppler estimator. It is shown that by using the Fourier Transform outputs, the estimation of range-Doppler from scatterers is transformed into a 2D sinusoidal frequency estimation problem, whose frequencies correspond to range and Doppler. Hence, superresolution techniques based on linear prediction may be applied to estimate the range-Doppler of scatterers with great accuracy.

Improved jammer localization using multiple focusing

Edgar F. Velez, Moeness G. Amin

Show abstract

Focusing techniques have proven efficient in direction-of-arrival estimation of broadband signals. However, when used alone at high frequency operation, these techniques cannot accurately locate the sources, due to the numerous spurious peaks in the spectrum. Since the spurious peaks depend on the array manifold, their location varies from one focusing frequency to another. Improved jammer localization can, therefore, be achieved by focusing at different frequencies and then averaging the corresponding MUSIC spectra. The averaging smooths out the undesired peaks while boosting the common spectral peaks, allowing correct detection and location of the waveforms impinging on the array.

Parallel algorithms for automatic target recognition using laser radar imagery

Arthur V. Forman Jr., Daniel J. Sullivan, Arthur W. Chang

Show abstract

This paper describes two techniques for automatic recognition of surface targets from an airborne platform using an imaging laser radar sensor. The first technique rotates a three-dimensional model of the target in real time to enable a generalized Hough transform to match the ladar image to the target''s key discriminating features as a basis for target identification. The second technique uses a variation on minimum average correlation energy filters to perform robust target identification. Examples illustrating the application of these algorithms are presented along with a description of a real-time implementation of the critical parts of the algorithms on a 40 systolic array based on the Geometric Arithmetic Parallel Processor (GAPP) chip.

Nonparametric estimation of autocorrelation and spectra using cumulants and polyspectra

Georgios B. Giannakis, Anastasios N. Delopoulos

Show abstract

Autocorrelation and specira of linear random processes can be expressed in terms of cumulants and polyspectra respectively. The insensitivity of the latter to additive Gaussian noise of unknown covariance is exploited in this paper to develop spectral estimators of deterministic and linear non-Gaussian signals using polyspectra. In the time-domain windowed projections of third-order cumulants are shown to yield consistent estimators of the autocorrelation sequence. Both batch and recursive algorithms are derived. In the frequency-domain a Fourier-slice solution and a least-squares approach are described for performing spectral analysis through windowed bi-periodograms. Asymptotic variance expressions of the time- and frequencydomain estimators are also presented. Two-dimensional extensions are indicated and potential applications are discussed. Simulations are provided to illustrate the performance of the proposed algorithms and compare them with conventional approaches. I.

Accurate characterization of error propagation in a highly parallel architecture

Venugopal R. Shamapant, Jacob A. Abraham, D. G. Saab

Show abstract

Errors caused by faults in a system manifest themselves in the output results in various patterns. These output patterns of erroneous data the nature of their propagation through the various modules of the system and their impact on the system are collectively referred to as error characerisics. This paper deals with efficient methods of obtaining these error characteristics. Both experimental and analytical approaches to the problem are considered and a probabilistic approach to this problem is proposed. The effectiveness of this approach with respect to the accuracy of results and amount of computation involved is discussed.

Implementation of the phase-gradient algorithm

Daniel E. Wahl, Paul H. Eichel, Charles V. Jakowatz Jr.

Show abstract

The recently introduced Phase Gradient Autofocus (PGA) algorithm is a non-parametric autofocus technique which has been shown to be quite effective for phase correction of Synthetic Aperture Radar (SAR) imagery. This paper will show that this powerful algorithm can be executed at near real-ume speeds and also be implemented in a relatively small piece of hardware. A briefreview ofthe PGA wiibe presented along with an overview ofsome critical implementation considerations. In addition a demonstration of the PGA algorithm running on a 7" xlO" printed circuit board containing a TMS32OC3O digital signal pro cessing (DSP) chip wilibe given. With this system using only the 20 range bins which contain the brightest points in the image the algorithm can correct a badly degraded 256x256 image in as little as 3 seconds. Using all range bins the algorithm can correct the image in 9 seconds.

Matrix triangularization by fixed-point redundant CORDIC with constant scale factor

Jeong-A Lee, Tomas Lang

Show abstract

We develop a redundani CORDIC scheme where the scale factor is forced to be constant while computing angles for 2 x 1 plane rotations. Based on the scheme we present a fixed-point implementation of matrix triangularization by Luk''s parallel algorithm with the following additional features: (1) the final scaling operation is done by shifting (2) the number of iterations in CORDIC rotation unit is reduced by about 25 by expressing the direction of the rotation in radix-2 and radix-4 and (3) the conventional number representation of rotated output is obtained on-thefly not from a carry-propagate adder. The number of hardware modules and the speed are evaluated and compared with the previous CORDIC schemes.

Evaluation of optical memory card system for patient records

Junji Shoda, Carlos Vallbona, Jack H. U. Brown, et al.

Show abstract

A new computer memory using a laser beam to impress dimples on a standard sized credit card has been devised and tested. The card contains approximately 2 megabytes of memory with associated software which provides random access and can be written upon or read with a personal computer equipped with a simple device. Software has been developed to place a complete medical record on the card and update it at encounters with the health care system. The complete system has been under test in an ambulatory patient clinic and has proven to be highly satisfactory. This paper presents details of the software, hardware and of the evaluation of the system.

Probabilistic model for the evaluation of fault-tolerant multiprocessor systems using concurrent error detection

V. S. Sukumar Nair, Jacob A. Abraham

Show abstract

The analysis of fault-tolerant multiprocessor systems that use concurrent error detection (CED) schemes is much more difficult than the analysis of conventional fault-tolerant architectures. Various analytical techniques have been proposed to evaluate CED schemes deterministically. However these approaches are based on the worst-case assumptions related to the failure of system components. Often the evaluation results are not good measures for comparing the fault tolerance capabilities of a system. In this paper we develop a probabilistic approach to evaluate and compare fault-tolerant multiprocessor systems. Various probabilities associated with CED schemes are identified and used in the framework of the matrix-based model which we had proposed previously. Based on these probabilistic matrices we analytically derive estimates for the fault tolerance capabilities of various systems.

Transient detection in the presence of strong narrowband interference via Wigner-Ville spectrum

Nenad M. Marinovic, Leonid M. Roytman

Show abstract

We consider the problem of detecting a known Gaussian random transient in the presence of a strong known random Gaussian narrowband interference. This can be regarded as a special case of the classical problem of detecting a known Gaussian random signal in known Gaussian colored noise. There exists a standard solution for such a problem based on the classical optimum detector for random signals in noise. However such a detector does not explicitly use the non-stationary character of the signal as a priori available information. Reformulation of the optimum detection in the time-frequency plane allows one to exploit this distinguishing signal feature and suppress the stationary interference and noise. This is accomplished here by use of the Wigner-Ville signal representation and an optimum signal/noise subspace decomposition that maximizes the transient signal to noise ratio. The new detection procedure eliminates the subspace where major part of the energy of random noise sample will fall while retaining almost all of the signal energy. In this fashion a gain in the output signal to noise ratio is achieved as verified by simulations.

Advanced Signal Processing Algorithms, Architectures, and Implementations

Volume Details

Table of Contents

Table of Contents