Proceedings Volume 2296

Advanced Signal Processing: Algorithms, Architectures, and Implementations V

View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 28 October 1994
Contents: 12 Sessions, 71 Papers, 0 Presentations
Conference: SPIE's 1994 International Symposium on Optics, Imaging, and Instrumentation
Volume Number: 2296

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Cumulants I
  • Cumulants II
  • Cumulants I
  • Cumulants II
  • Time Frequency Signal Analysis
  • Mobile Communications
  • Nonlinear Dynamics and Signal Processing
  • Subspace Tracking
  • Array Processing
  • Fast Solvers for Structured Systems
  • Regularization Methods
  • Frequency Representations and Machine Implementations
  • Architectures
  • Algorithm-Based Fault Tolerance
Cumulants I
Uses of cumulants in wavelet analysis
David R. Brillinger
Cumulants are useful in studying nonlinear phenomena and in developing (approximate) statistical properties of quantities computed from random process data. Wavelet analysis is a powerful tool for the approximation and estimation of curves and surfaces. This work considers wavelets and cumulants, developing some sampling properties of wavelet fits to a signal in the presence of additive stationary noise via the calculus of cumulants. Of some concern is the construction of approximate confidence bounds around a fit. Both linear and shrunken wavelet estimates are considered. Extensions to spatial processes, irregularly observed processes and long memory processes are discussed. The usefulness of the cumulants lies in their employment to develop some of the statistical properties of the estimates.
Adaptive rational subspace estimation: case of nonwhite additive noise
Inbar Fijalkow, Philippe Loubaton
In previous work, time domain algorithms were proposed to adaptively estimate a rational and orthonormal spanning of rational source and noise subspaces using a cascadable lossless structure. However, they apply only if the additive channel noise occurring at the receiver sensor array is a spatially and temporally white multivariate process. We propose a new approach for the case where the `useful' signal is corrupted by additive spatially and/or temporally non-white noise. The source and noise subspaces are characterized by means of fourth order statistics under pragmatic assumptions on the source and noise distributions. A new adaptive algorithm is deduced and its satisfactory asymptotic convergence properties are proved.
Cumulant-based cubic difference tone (CDT) detectors for otoacoustic emissions (OAE) signals
Ananthram Swami, Preeti Rao
Interactions among spontaneous otoacoustic emissions (SOEs), which are narrow-band signals emitted by the cochlea, include the cubic difference tone (CDT) that occurs at frequency 2f1 - f2 due to the nonlinear coupling of primary SOEs at f1 and f2. We show that appropriately defined trispectra and tricoherence may be used to detect the CDTs. OAE signals are also characterized by temporal variations in the amplitudes and frequencies; this leads to polynomial phase models, and multiplicative noise processes; we give closed-form expressions for the large sample CRLB for the parameters of the resulting model. We also consider cyclic approaches, and we compare polyperiodogram and periodogram based approaches for the detection of arbitrary frequency coupling laws. Some of the techniques are validated against real data.
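As a minimal illustration of the frequency relation exploited above (not of the authors' trispectral detector), the sketch below passes two synthetic primaries at hypothetical frequencies f1 and f2 through a weak cubic nonlinearity and checks for spectral energy at the cubic difference tone 2f1 - f2; all signal parameters are invented for the example.

```python
import numpy as np

# Minimal sketch (not the authors' trispectral detector): a cubic
# nonlinearity acting on two primaries at f1 and f2 produces a cubic
# difference tone (CDT) at 2*f1 - f2.  All values below are hypothetical.
fs, n = 8000.0, 8192
t = np.arange(n) / fs
f1, f2 = 1000.0, 1200.0                  # primary tones (assumed values)
x = np.cos(2 * np.pi * f1 * t) + np.cos(2 * np.pi * f2 * t)
y = x + 0.1 * x**3                       # weak cubic nonlinearity

spec = np.abs(np.fft.rfft(y * np.hanning(n)))**2
freqs = np.fft.rfftfreq(n, 1 / fs)
cdt = 2 * f1 - f2                        # expected CDT frequency: 800 Hz
k = np.argmin(np.abs(freqs - cdt))
print(f"power near {cdt:.0f} Hz:", spec[k - 2:k + 3].max())
```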
Statistical monitoring of rotating machinery by cumulant spectral analysis
Richard W. Barker, Melvin J. Hinich
Background on the rotating drill wear problem, including approaches using a combination of sensors and signal features, is briefly summarized prior to sharing results from a higher-order statistical study of accelerometer data. Vibroacoustic signals of rotating machinery are composed of sums of modulated periodicities, broadband random components, and occasionally a set of transient responses. These signals are not ergodic, as the modulated periodicities are partially coherent. Progressive wear of the rotating machine causes the nonlinear structure of the received signal to intensify, and nonlinearity results in transfer of energy between harmonics of the signal's periodic components. Statistics developed from bispectrum and second-order cumulant spectrum estimates of the measured signal are combined with power spectrum amplitudes as feature inputs for standard multivariate classifiers. The higher-order statistics measure, respectively, the extent of nonlinearity and intermodulation of the received signal. Classification results on actual drill wear data collected from a controlled experiment reveal that statistics from estimates of the second-order cumulant spectrum have increased discrimination power for detecting incipient drill wear.
Identification of spherical acoustic objects in motion
Roger F. Dwyer
Often in applications it is desired to remotely identify an object from its reflected signature. In active sonar, the object is insonified by a transmitted waveform which may be broadband (modulated) or narrowband, and the received pressure is processed to localize and identify the object. Identification usually requires a broadband transmitted waveform. In this paper, the received pressure time series from spherical acoustic objects in motion that have been insonified by linear frequency modulated waveforms are used to identify the objects. Specifically, the object's unknown Doppler is first extracted using a unique property of the fourth-order spectrum, and then its transfer function is extracted. The results for five spherical acoustic objects in motion are discussed. Three methods for extracting the object's transfer function, namely deconvolution, matched filtering, and fourth-order spectrum deconvolution, are compared.
Cumulants II
New fast algorithm for blind MA-system identification based on higher order cumulants
Karl-Dirk Kammeyer, B. Jelonnek
In this paper, a new approach to blind Moving Average system identification is presented. It is derived from a specific algorithm for fast blind equalization (EVA, Eigen Vector Algorithm) published recently. The novel approach leads to a generalized eigenvalue problem and is thus denoted EVI (Eigen Vector Identification). It will be shown that EVI allows a proper system (channel) estimation on the basis of a relatively small block of data samples. Another crucial property of EVI is its robustness against a system order overfit (in contrast to other blind algorithms known so far). Finally, the EVI performance will be evaluated in the presence of additive Gaussian noise; it will be demonstrated by simulation results that the degradation of the channel estimate is very small even at low signal-to-noise ratios.
Cumulants I
Time-space segmentation and motion estimation based on higher order statistics
Alessandro Neri, Giuseppe Russo, Stefania Colonnese, et al.
The paper illustrates a comprehensive method for the motion compensation to be used in predictive video coding. The method is based on the observation that structured artifacts, such as those consisting of isolated points, lines, edges, and organized textures, are directly perceived by the user, while artifacts resembling realizations of Gaussian processes can be considered less important. A fidelity criterion based on the Mean Fourth-Cumulant as an indirect estimate of the local entropy level is then applied to drive both the segmentation and the motion estimation phases. The motion estimator is conceptually similar to the higher order moments techniques employed in time delay estimation, and takes advantage of the Gaussian signal rejection capability typical of the higher order cumulants. The contribution describes the theoretical framework of cumulant based motion estimation. The performance of a coder based on the discrimination of the temporal activity by means of cumulants is illustrated through experimental data.
Cumulant slices for two-dimensional autoregressive signal modeling
Tania Stathaki, Anthony G. Constantinides
The research work reported in this paper is concerned with the use of higher order spectral estimation techniques as a means of deriving the parameters of 2D autoregressive (AR) models. Image analysis is examined from a higher order statistical perspective and in the context of noise. The objective is to develop analysis techniques through which robust autoregressive parameter estimation is accomplished. The approach taken involves the use of 2D AR models derived from third order cumulants. The directionality of the cumulant space influences the AR parameter estimation in a decisive manner. The specific application of the developed methods is in mammography, an area in which it is very difficult to discern the appropriate features. The results show significant discrimination gains through such techniques.
Second- and higher-order cyclostationary processing using acousto-optics
In this paper we consider cyclostationary signal processing techniques implemented via acousto-optics (AO). Cyclic processing methods are reviewed and motivated, including the cyclic correlation and the cyclic spectrum. We show how a 1D AO spectrum analyzer can be used to detect the presence, and estimate the value, of cycle frequencies. The cyclic correlation can then be computed at cycle frequencies of interest using a 1D time-integrating correlator. Next we consider the problem of computing the (2D) cyclic correlation for all cycle frequencies and lags simultaneously. This is accomplished via an AO triple-product processor, configured in a manner similar to that used for ambiguity function generation. The cyclic spectrum can be obtained in a post-processing step by Fourier transforming the cyclic correlation in one dimension. We then consider higher order extensions of the cyclic correlation and show how a 2D slice of the 3D cyclic triple-correlation can be computed using an AO four-product processor.
Decomposition of quantics in sums of powers
Pierre Comon, B. Mourrain
Symmetric tensors of order 3 and 4 arise more and more often in Signal Processing and Automatic Control, because of the recent complementary use of High-Order Statistics. However, very few special purpose tools are available for manipulating such objects in engineering problems. In this paper, diagonalization issues are analyzed with the help of the theory of homogeneous polynomials in several variables (i.e. quantics). A considerable number of results are known for binary quantics, but too few for general cubics and quartics. Efficient numerical algorithms are also lacking.
Cumulants II
Blind channel estimation and deconvolution in colored noise using higher-order cumulants
Jitendra K. Tugnait, Uma Gummadavelli
Existing approaches to blind channel estimation and deconvolution (equalization) focus exclusively on channel or inverse-channel impulse response estimation. It is well known that the quality of the deconvolved output also depends crucially upon the noise statistics. Typically it is assumed that the noise is white and the signal-to-noise ratio is known. In this paper we remove these restrictions. Both the channel impulse response and the noise model are estimated from the higher-order (e.g. fourth-order) cumulant function and the (second-order) correlation function of the received data via a least-squares cumulant/correlation matching criterion. It is assumed that the noise higher-order cumulant function vanishes (e.g., Gaussian noise, as is the case for digital communications). Consistency of the proposed approach is established under certain mild sufficient conditions. The approach is illustrated via simulation examples involving blind equalization of digital communications signals.
Noncausal nonminimum phase ARMA modeling of non-Gaussian processes
Athina P. Petropulu
A method is presented for the estimation of the parameters of a noncausal nonminimum phase ARMA model for non-Gaussian random processes. Using certain higher-order cepstra slices, the Fourier phases of two intermediate sequences, hmin(n) and hmax(n), can be computed, where hmin(n) is composed of the minimum phase parts of the AR and MA models, and hmax(n) of the corresponding maximum phase parts. Under the condition that the AR and MA models do not have common zeros, these two sequences can be estimated from their phases only, and lead to the reconstruction of the AR and MA parameters, within a scalar and a time shift. The AR and MA orders do not have to be estimated separately, but they are a byproduct of the parameter estimation procedure. Unlike existing methods, the estimation procedure is fairly robust if a small order mismatch occurs. Since the robustness of the method in the presence of additive noise depends on the accuracy of the estimated phases of hmin(n) and hmax(n), the phase errors occurring due to finite length data are studied.
Class of time-domain procedures for testing that a stationary time series is Gaussian
Eric Moulines, Karim Choukri, Jean-Francois Cardoso
In this contribution, a class of time-domain procedures for testing that a stationary time-series is Gaussian is presented. These tests are based on minimum chi-square statistics in the deviations of certain sample statistics from their ensemble counterparts. Exact asymptotic distributions of these tests are derived under the null hypothesis of Gaussianity and under a class of local and fixed alternatives. Two specific tests are then developed, based respectively on the third-order and fourth-order moments and on the characteristic functions. Extensive simulations are presented to illustrate the power of the test against various alternatives (including additive and non-additive contaminations and non-linear serial dependence).
Retrieving random amplitude modulated harmonics using higher-order statistics
Georgios B. Giannakis, Guotong Zhou
Additive as well as multiplicative noise appears in weather radar returns or when acoustic waves propagate through the ocean. Second order spectral methods fail to retrieve the frequencies of such random amplitude modulated harmonic time series, especially when the multiplicative noise is broadband. An existing fourth-order method assumes real Gaussian multiplicative and additive noise. A novel approach is proved herein to impose no constraints on the distribution or the color of the disturbances. It relies on special fourth-order cumulant or moment slices. In particular, the fourth-order moment spectrum is shown to offer computational advantages. Extension to multi-component models and generalizations to alternative sixth-order moments are also provided.
Time Frequency Signal Analysis
Bootstrapping confidence bands for the instantaneous frequency
Abdelhak M. Zoubir, D. R. Iskander, Branko Ristic, et al.
Data produced by a reproducible source contains redundant information which allows seismic inversion to simultaneously determine the high-frequency fluctuation in the p-wave velocity (or reflectivity) as well as the input energy source. The seismogram model is the plane-wave convolutional model derived from the constant density, variable sound velocity acoustic wave equation. The first step is to analyze this linearized model when the background velocity is constant. Then perturbations in the seismic data stably determine corresponding perturbations in the source and reflectivity. The stability of this determination improves as the slowness aperture over which the data is defined increases. Further, the normal operator for the convolutional seismogram model is continuous with respect to velocity. Thus the stability result for constant background velocities may be extended to more realistic background velocity models which vary slowly and smoothly with depth. The theory above is illustrated with four synthetic numerical examples derived from marine data. The examples indicate that for a wide slowness aperture, inversion is very effective in establishing the true shape of the reflectivity and the shape and location of the compactly supported energy source. As this aperture window narrows, the corresponding inversion-estimated model still describes the data quite accurately, but the inversion is not able to recover the original two distinct parameters.
Blind deconvolution of ultrasound images
Udantha Abeyratne, Athina P. Petropulu, John M. Reid
We address the problem of improving the resolution of ultrasound images using blind deconvolution. The transducer measurement, that forms the ultrasound image, can be expressed as the convolution of two terms, the tissue response and the ultrasonic system response, plus additive noise. The quantity of interest is the tissue response, however, due to the convolution operation, we measure a blurred version of it, which obscures the fine details in the image. Our goal is to remove the blurring caused by the ultrasonic system, in order to enhance the diagnostic quality of the ultrasound image. Under the assumption that speckle behaves as an i.i.d. zero-mean non-Gaussian process, we were able to reconstruct the blurring kernel using bicepstrum operations on corresponding A-scan lines. Based on the estimated blurring kernel we performed deconvolution on the measured image. Preliminary results obtained from ultrasound images of a tissue mimicking phantom indicate significant resolution improvement.
General class of chi-square statistics for goodness-of-fit tests for stationary time series
Karim Choukri, Eric Moulines
In this contribution, a class of time-domain goodness-of-fit procedures for stationary time-series is presented. These test procedures are based on minimum chi-square statistics in the deviations of certain sample statistics (obtained from finite-memory non-linear transformations of the process) from their ensemble counterparts. Two specific versions are derived, depending on the parameterization of the model manifold. Exact asymptotic distributions of these tests under the null hypothesis H0 and local alternatives are derived. Two applications of this general procedure are finally presented, aiming at assessing (1) that a stationary scalar time-series is autoregressive and (2) that a multivariate stationary time-series is a noisy instantaneous mixture of independent scalar time-series.
Damped harmonics and polynomial phase signals
Guotong Zhou, Georgios B. Giannakis
The concern here is with retrieving damped harmonics and polynomial phase signals in the presence of additive noise. The damping function is not limited to the exponential model, and in certain cases the additive noise does not have to be white. Three classes of algorithms are presented, namely DFT based methods, Kumaresan-Tufts type extensions, and subspace variants including the MUSIC algorithm. Preference among them should be based on the available data length and frequency separations. In addition, retrieval of self-coupled damped harmonics, which may be present when nonlinearities exist in physical systems, is investigated. Simulation examples illustrate the main points of the paper.
Fast computation of critically sampled time frequency signal representations
Nenad M. Marinovic
An algorithm is proposed to compute samples of any bilinear joint time-frequency representation of Cohen's class. The computation is performed on a decimated sampling grid, mapping the N signal samples into N = K × L critical samples of the joint representation in the time-frequency domain. This is in contrast with the usual approaches, which perform the computation on a much denser grid, mapping N signal samples into N × N samples in the time-frequency plane. The algorithm is based on the discrete Zak transform and represents an extension of the work by Auslander et al. on fast computation of the ambiguity function. For a number of popular representations, the algorithm is shown to have computational complexity about the same as an ordinary FFT.
Mobile Communications
Constant modulus factorization technique for smart antenna applications in mobile communications
Alle-Jan van der Veen, Arogyaswami Paulraj
A fundamental problem in sensor array signal processing is to separate and retrieve all independent co-channel signals that arrive at the antenna array. Such problems arise in smart antenna applications for mobile wireless communication, such as interference reduction and in-cell frequency reuse. In a mobile environment, the presence of large-delay multipath makes the array manifold poorly defined, and spatial model methods are not applicable. However, when the signals have a constant modulus property (as in FDMA/FM systems like AMPS or TACS), iterative algorithms such as Godard and CMA have been used to retrieve the signals. Because of a non-convex optimization criterion, these algorithms suffer from local minima and random convergence behavior, with no satisfactory remedy known as yet. In this paper, we present an algorithm to compute the exact solution to the underlying constant modulus (CM) factorization problem. With this new approach, it is possible to detect the number of CM signals present at the array, and to retrieve all of them exactly, rejecting other, non-CM signals. Only a modest number of samples is required. The algorithm is robust in the presence of noise, and is tested on real data collected from an experimental set-up.
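For context, the sketch below shows the classical stochastic-gradient CMA that the abstract contrasts with the exact CM factorization; it is the iterative baseline, not the authors' algorithm, and the step size, modulus target and initialization are illustrative assumptions.

```python
import numpy as np

def cma_beamformer(X, mu=1e-3, R2=1.0, n_iter=2000, seed=0):
    """Classical iterative CMA (Godard-type) on array snapshots.

    X: m x N complex array (m sensors, N snapshots).  This is the
    baseline gradient algorithm the paper contrasts with its exact
    CM factorization; it is NOT the authors' algorithm.
    """
    rng = np.random.default_rng(seed)
    m, N = X.shape
    w = rng.standard_normal(m) + 1j * rng.standard_normal(m)
    w /= np.linalg.norm(w)                   # random initial weight vector
    for it in range(n_iter):
        x = X[:, it % N]                     # one snapshot
        y = np.vdot(w, x)                    # y = w^H x
        e = y * (np.abs(y) ** 2 - R2)        # constant-modulus error
        w = w - mu * e.conjugate() * x       # stochastic gradient step
    return w
```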
Nonstationary model for the simulation of mobile radio channels
Bouchra Senadji
This paper addresses the problem of channel modelling for the simulation of radio communication sub-systems. A statistical test for stationarity is first applied on measured impulse responses of an urban channel. This test reveals non-stationary trends in the channel. A new model is then proposed which takes into account the channel non-stationarity. It is based on statistical modelling of the variations of an existing stationary model when moving through areas where local stationarity can be assumed.
New binary modulation process
Bernard Lacaze
The effect of clock changes on binary processes is to shift the transitions, and to multiply them in the case where they are represented by rapidly varying functions. When the clock change is periodic, we show that its effect is the same as the one for a particular frequency change family. A simple modulation process is deduced from this.
Nonlinear Dynamics and Signal Processing
Modeling and identification of nonlinear dynamic systems
Sheng Chen
The paper summarizes some results of nonlinear system modelling and identification. Connections with the dynamical systems theory and neural networks are emphasized. Two general modelling approaches are highlighted. Issues of identifiability and model validation are also discussed.
Nonlinear adaptive filters for time variant equalization
Colin F. N. Cowan
Adaptive equalization is a well established procedure used in data communications systems to compensate for channel distortions and thus allow higher signalling rates. Such systems are made adaptive because the channel characteristics are both a-priori unknown and, usually, time variant. In many systems this time variation is slow and, therefore, does not cause a problem for the adaptive processes employed. However, in the case of certain application areas, such as mobile terminals, the rate of time variation can be significant in relation to the signalling rate. Under these circumstances the performance of normal adaptive equalizers degrades rapidly. This paper presents a new adaptive equalizer structure of a non-linear form which makes use of both time and instantaneous amplitude information to provide a structure which is matched to this particular problem. Simulation results are presented which demonstrate the superiority of this structure relative to normal linear equalizers.
Filtering chaotic time series
Michael E. Davies
We consider the relationship between modelling dynamics with a nonlinear function approximation and the application of a noise reduction algorithm. Filtering the data with such an algorithm is then shown to provide a better deterministic model for the data than an ordinary least squares estimate.
Visualizing predictability with chaotic ensembles
Leonard A. Smith
The self-consistent prediction of nonlinear, potentially chaotic, systems must account for observational noise both when constructing the model and when determining the current state of the system to be used as `the' initial condition. In fact, there exists an ensemble of initial states of the system, whose members are each consistent with a given observation. Determining the reliability of a forecast, even under a perfect model, requires an understanding of the evolution of this ensemble, and implies that traditional prediction of a single trajectory is rarely consistent with the assumptions upon which the underlying model is based. The implications of ensemble forecasts for chaotic systems are considered, drawing heavily from research in forecast verification of numerical weather prediction models. Applications in `simpler' laboratory scale systems are made, where nonlinear dynamical systems theory can be put to the test. In brief, it is argued that self consistency requires the estimation of probability density functions through ensemble prediction even in the case of deterministic systems: if observational noise is considered in the construction of the model, it must also be accepted in the determination of the initial condition.
Filtering time series with topology
David S. Broomhead, J. P. Huke
Following a brief discussion of the potential relevance of chaotic noise models, we consider the problem of separating a signal from an additive mixture with nonlinear noise. The approach we take exploits various properties of linear filters: their linearity is, of course, important when dealing with additive mixtures of signals, but we also need to understand their effect on nonlinear processes. We describe how FIR and IIR filters differ radically in this respect, and discuss the ways in which each can be used in conjunction with various nonlinear transformations for signal separation.
Adaptive algorithms for bilinear filtering
V. John Mathews, Junghsi Lee
This paper presents an overview of adaptive nonlinear filters equipped with bilinear system models. Bilinear filters are recursive nonlinear systems that belong to the class of polynomial systems. Because of the feedback structure, such models are able to represent many nonlinear systems efficiently. The paper first describes stochastic gradient adaptive bilinear filters. The class of recursive least-squares adaptive bilinear filters is discussed next. Stability issues associated with bilinear system models and adaptive bilinear filters are also discussed. The paper concludes with an experimental comparison of the performance of linear, truncated second-order Volterra, and bilinear system models in a nonlinear channel equalization problem.
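A minimal sketch of stochastic-gradient (LMS-style) adaptation of a bilinear model, written in an equation-error form where past desired samples stand in for past outputs; the output-error and RLS variants and the stability analysis discussed above are not reproduced, and the orders and step size are assumptions.

```python
import numpy as np

def bilinear_lms(x, d, N=4, M=2, mu=0.01):
    """Equation-error stochastic-gradient adaptation of a bilinear model

        y(n) = sum_i a_i x(n-i) + sum_j b_j d(n-j)
               + sum_i sum_j c_ij x(n-i) d(n-j)

    x, d: equal-length float arrays (input and desired signals).
    Illustrative sketch only, not the paper's exact algorithms.
    """
    a = np.zeros(N); b = np.zeros(M); c = np.zeros((N, M))
    y = np.zeros(len(d))
    for n in range(max(N, M + 1), len(x)):
        xv = x[n - N + 1:n + 1][::-1]          # x(n), ..., x(n-N+1)
        dv = d[n - M:n][::-1]                  # d(n-1), ..., d(n-M)
        y[n] = a @ xv + b @ dv + xv @ c @ dv   # bilinear output
        e = d[n] - y[n]                        # a-priori error
        a += mu * e * xv                       # LMS-style updates
        b += mu * e * dv
        c += mu * e * np.outer(xv, dv)
    return a, b, c, y
```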
Subspace Tracking
Recursive least-squares-based subspace tracking
Bin Yang
In this paper, we introduce a new interpretation of the signal subspace as the solution of an unconstrained minimization problem. We show that recursive least squares techniques can be applied to track the signal subspace recursively by making an appropriate projection approximation of the cost function. The resulting algorithms have a computational complexity of O(nr), where n is the input vector dimension and r (r < n) is the number of desired eigencomponents. We demonstrate that this approach can also be extended to track the rank, i.e. the number of signals, at the same (approximately linear in n) order of computational complexity. Simulation results show that our algorithms offer comparable and in some cases more robust performance than the spherical tracker by DeGroat, the URV updating by Stewart, and even the exact eigenvalue decomposition.
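A compact sketch of a projection-approximation RLS subspace tracker of the kind described above, with illustrative initialization and forgetting factor; rank tracking and the comparisons mentioned in the abstract are omitted.

```python
import numpy as np

def subspace_track(X, r, beta=0.97):
    """Projection-approximation RLS subspace tracking (sketch).

    X: n x T data matrix processed column by column; returns an n x r
    basis tracking the dominant signal subspace with O(nr) work per
    update.  Constants and initialization are illustrative.
    """
    n, T = X.shape
    W = np.eye(n, r, dtype=complex)      # initial subspace estimate
    P = np.eye(r, dtype=complex)         # inverse correlation of y = W^H x
    for t in range(T):
        x = X[:, t]
        y = W.conj().T @ x               # project onto current basis
        h = P @ y
        g = h / (beta + np.vdot(y, h))   # RLS gain (O(r^2))
        P = (P - np.outer(g, h.conj())) / beta
        e = x - W @ y                    # projection-approximation error
        W = W + np.outer(e, g.conj())    # rank-one basis update (O(nr))
    return W
```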
Schur method for low-rank matrix approximation
Alle-Jan van der Veen
The usual way to compute a low-rank approximant of a matrix H is to take its truncated SVD. However, the SVD is computationally expensive. This paper describes a much simpler generalized Schur-type algorithm to compute similar low-rank approximants. For a given matrix H which has d singular values larger than ε, we find all rank-d approximants Ĥ such that H − Ĥ has 2-norm less than ε. The set of approximants includes the truncated SVD approximation. The advantages of the Schur algorithm are that it has a much lower computational complexity (similar to a QR factorization), and directly produces estimates of the column space of the approximants. This column space can be updated and downdated in an on-line scheme, amenable to implementation on a parallel array of processors.
Computing the ULLV decomposition
The ULLVD is an approximation of the generalized singular value decomposition; it is also an extension of the ULV decomposition. In this paper, we describe an implementation of a ULLVD algorithm and show how to compute the ULLVD accurately, stably, and efficiently. A MATLAB program is included.
Adaptive small-sample condition estimator for matrices with rank-one modifications
Tong J. Lee
A number of adaptive condition number estimators have been proposed in the past to dynamically estimate the sensitivity of the coefficient matrix of a linear system of equations. Applications of these techniques often arise in the context of signal processing, where the information matrix is updated with rank-one modifications. Various schemes, such as ACE, ALE and ICE, were proposed to cope with this problem. In this paper, we briefly review the past work and show how the small-sample condition estimator can be used in an adaptive manner.
Systematic exploration of the space of Jacobi algorithms
Hylke W. van Dijk, Ed F. A. Deprettere
The ordering of operations in the execution of Jacobi-type algorithms is not unique. Given a sequential imperative program specification of a Jacobi algorithm, there is a method of transformational reasoning to convert the program to any one of a set of input-output equivalent concurrent programs. The method explores associativity and (pseudo) commutativity properties in the algorithm to tune the program's critical path length to an optimal throughput in a desired parallel implementation. The method constructs a certain precedence graph in which vertices represent elementary transformation steps and edges expose step precedence relations. Every feasible cut-set of the precedence graph yields a dependence graph of a concurrent program which is input-output equivalent to the given one. Moreover, regular dependence graphs will be transformed into regular dependence graphs if the cut-set is chosen to keep that property invariant. The method has been successfully applied to time adaptive algorithms in which QR, inverse QR and SVD Jacobi algorithms play a crucial role. The time adaptive SVD algorithm is used in this paper to illustrate the power of the method. Starting off with the well-known cyclic-by-rows Kogbetliantz program, the transformations gradually decrease its critical path, thereby providing various alternative concurrent programs with correspondingly increasing parallel implementation throughput.
Numerically stable Jacobi array for parallel singular value decomposition (SVD) updating
Filiep J. Vanpoucke, Marc Moonen, Ed F. A. Deprettere
A novel algorithm is presented for updating the singular value decomposition in parallel. It is an improvement upon an earlier developed Jacobi-type SVD updating algorithm, where now the exact orthogonality of a certain matrix is guaranteed by means of a minimal factorization in terms of angles. Its orthogonality is known to be crucial for the numerical stability of the overall algorithm. The factored approach leads to a triangular array of rotation cells, implementing an orthogonal matrix-vector multiplication, and a novel array for SVD updating. Both arrays can be built up of CORDIC processors since the algorithms make exclusive use of orthogonal planar transformations.
Array Processing
Updating rate of Jacobi singular value decomposition (SVD) arrays and data nonstationarity
Flavio Lorenzelli, Kung Yao
An effective updating algorithm for singular value decomposition, based on Jacobi rotations, has recently been proposed. This algorithm is composed of two basic steps: QR updating and rediagonalization. By properly interleaving these two operations, parallel implementations with very high updating rates are possible. In this paper, we are concerned with the behavior of this algorithm for nonstationary data, and the effect of the pipeline rate on tracking accuracy. In order to overcome the trade-off between accuracy and updating rate intrinsic to the original algorithm, we propose two schemes which improve the overall performance when the rate of change of the data is high. In the `variable rotational rate' scheme, the number of Jacobi rotations per update is dynamically determined. The alternative approach is to make the forgetting factor variable and data-dependent. The behavior and performance of both schemes are discussed and compared.
Fast time series adaptive filtering algorithms based on the QRD inverse-updates method
Ian K. Proudler
We derive, from first principles, a new adaptive filtering algorithm for time-series data based on the QRD inverse-updates method of Pan and Plemmons. In common with other fast algorithms for time-series adaptive filtering, this algorithm only requires O(p) operations for the solution of a pth-order problem. Unlike previous fast algorithms based on the QRD technique, the algorithm presented here (in both square-root and square-root-free forms) explicitly produces the transversal filter weights. The results of some preliminary computer simulations of the algorithm, using finite-precision floating-point arithmetic, are presented.
Iterative version of the QRD for adaptive recursive least squares (RLS) filtering
Juergen Goetze
A modified version of the QR-decomposition (QRD) is presented. It uses approximate Givens rotations instead of exact Givens rotations, i.e., a matrix entry usually annihilated with an exact rotation by an angle σ is only reduced by applying an approximate rotation by an angle σ_t. The approximation of the rotations is based on the idea of CORDIC: evaluating a CORDIC-based approximate rotation amounts to determining the angle σ_t = arctan 2^(-t) that is closest to the exact rotation angle σ, and applying σ_t instead of σ. Using approximate rotations for computing the QRD results in an iterative version of the original QRD. A recursive version of this QRD using CORDIC-based approximate rotations is applied to adaptive RLS filtering. Only a few angles of the CORDIC sequence, say r of them (r << b, where b is the word length), work as well as using exact rotations (r = b, the original CORDIC). The misadjustment error decreases as r increases; the convergence of the QRD-RLS algorithm, however, is insensitive to the value of r. Adapting the approximation accuracy during the course of the QRD-RLS algorithm is also discussed. Simulations (channel equalization) confirm the results.
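A rough sketch of the idea of a CORDIC-based approximate Givens rotation: pick the single CORDIC angle closest to the exact annihilation angle and apply it, so the target entry is reduced rather than zeroed. The floating-point arithmetic and angle-set size below are illustrative; a hardware realization would use shift operations for the factor 2^(-t).

```python
import numpy as np

def approx_givens(a, b, n_angles=16):
    """Approximate Givens rotation applied to the pair (a, b).

    Instead of annihilating b exactly, choose the CORDIC angle
    sigma_t = arctan(2**-t) closest to the exact rotation angle and
    apply that rotation; b is only reduced, not zeroed.  Illustrative
    sketch, not the paper's full recursive QRD-RLS scheme.
    """
    sigma = np.arctan2(b, a)                       # exact rotation angle
    ts = np.arange(n_angles)
    cand = np.arctan(2.0 ** -ts)                   # CORDIC angle set
    t = ts[np.argmin(np.abs(np.abs(sigma) - cand))]
    s = np.sign(sigma) if sigma != 0 else 1.0
    tan = s * 2.0 ** -t                            # tan(sigma_t), signed
    c = 1.0 / np.sqrt(1.0 + tan * tan)             # cos(sigma_t)
    sn = c * tan                                   # sin(sigma_t)
    # rotated pair: [c sn; -sn c] @ [a, b]; second entry is the residue
    return c * a + sn * b, -sn * a + c * b
```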
Comparative analysis regarding numerical concerns for recursive least squares lattice (RLSL) algorithms
James R. Bunch, Richard C. LeBorne
Much interest has been devoted to the numerical behavior of the Recursive Least Squares Lattice (RLSL) algorithms. Two types of updating schemes that are equivalent in exact arithmetic have been proposed previously: the direct and the indirect versions of the RLSL algorithm. However, the numerical behavior of these algorithms is quite different, the direct updating scheme being the more robust of the two. We provide the results from an arithmetic error analysis that uses first order approximation techniques to give arithmetic error effects which propagate in both time and order for two fast adaptive RLSL algorithms.
Direction-of-arrival estimation in unknown correlated noise using MUSIC and higher order statistics
Hsien-Sen Hung, Keshi Chen
In this paper we consider the direction-of-arrival (DOA) estimation of Gaussian signals in non-Gaussian noise of unknown spatial correlation. The proposed method involves two primary steps. Firstly, the third-order cumulant function of the random wavefield received by an array of sensors is estimated and then projected onto the correlation domain to exploit the correlations of the noise across the sensors in the array. Secondly, the estimated noise correlation matrix is utilized in a correlation-based eigenstructure method, such as MUSIC, for DOA estimation. As expected, the proposed method improves the performance of MUSIC, which otherwise requires knowledge of the noise correlation. Numerical simulation results for both the standard MUSIC and the proposed algorithm are presented and compared.
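For reference, the sketch below shows only the standard correlation-based MUSIC pseudospectrum that the estimated noise correlation would feed into; the cumulant-based noise-correlation estimation, which is the paper's contribution, is not reproduced, and the uniform-linear-array steering model is an assumption.

```python
import numpy as np

def music_spectrum(R, n_sources, angles_deg, d=0.5):
    """Standard narrowband MUSIC pseudospectrum for a uniform linear
    array with sensor spacing d (in wavelengths).

    R: m x m spatial covariance (in the paper, pre-processed using the
    noise correlation estimated from third-order cumulants; that step
    is not shown here).
    """
    m = R.shape[0]
    w, V = np.linalg.eigh(R)                    # ascending eigenvalues
    En = V[:, : m - n_sources]                  # noise-subspace basis
    k = np.arange(m)
    p = []
    for th in np.deg2rad(angles_deg):
        a = np.exp(-2j * np.pi * d * k * np.sin(th))   # steering vector
        denom = np.linalg.norm(En.conj().T @ a) ** 2
        p.append(1.0 / denom)                   # peaks near true DOAs
    return np.array(p)
```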
Contribution of the polarization diversity in H.F. direction finding systems
Yvon Erhel, Alawi Edjeou, Louis Bertel
The aim of the paper is to estimate the contribution of polarization diversity in high frequency (3 - 30 MHz) direction finding systems. We first describe the peculiarities of H.F. propagation and the resulting signal model involved in the computer simulations. Next, we analyze the behavior of some particular direction finding systems using linear and circular geometries and polarization diversity. Several algorithms (non-linear frequency analysis, MUSIC) are tested under various conditions (narrowband or broadband signals, with or without reiterated polarization filtering, sub-sampling). Theoretical and experimental results show that polarization diversity, based upon knowledge of the antenna complex responses, greatly improves the efficiency of direction finding.
Fast Solvers for Structured Systems
Look-ahead Schur-type algorithm for solving general Toeplitz systems
Roland W. Freund
The classical fast Toeplitz solvers, such as the Levinson algorithm and Schur-type algorithms, require that all leading principal submatrices of the Toeplitz matrix be nonsingular, and they are numerically unstable for general Toeplitz systems. We present a Schur-type algorithm with look-ahead that can be applied to general Toeplitz systems with singular and nonsingular, but ill-conditioned leading principal submatrices.
Implementation of a superfast algorithm for symmetric positive definite linear equations of displacement rank 2
Thomas K. Huckle
In this paper we describe the implementation and first numerical results for the superfast algorithm based on a modified version of the Bitmead/Anderson algorithm for real symmetric positive definite matrices of displacement rank 2. The total number of arithmetic operations for this algorithm is of order 93.75 n log²(n) flops. The method is based on repeatedly dividing the original problem into two subproblems involving a leading principal submatrix and the related Schur complement. All occurring matrices are represented by generating vectors of their displacement-rank characterization.
Very fast algorithm for single- and multi-microphone noise cancellation
Dov Ramm, Dan Chazan
Many applications, such as directive microphone arrays, employ linearly filtered signals from a number of sensors to enhance an audio source embedded in noise produced by other audio sources. It turns out that the noise cancellation properties improve with the length of the filters, and filters possessing as many as 4000 taps may be used beneficially. For such filters, the techniques of adaptive filtering are inadequate, resulting in poor convergence and noise cancellation properties. In this work it will be shown how to combine fast filter adaptation with accurate noise cancellation. A judicious choice of basis functions (modulated Gaussians) leads to a sparse system of equations, resulting in a low computational complexity of the solver. In the first stage of the algorithm the relevant correlation functions are computed very rapidly using an FFT technique. This calculation requires about N log N operations and produces the coefficients of a system of linear equations for the filter coefficients. In the second stage these equations are solved with a complexity proportional to the number of equations, N, rather than in N² operations as is the case for the usual Toeplitz solvers or N log N for contemporary fast Toeplitz solvers. The proposed scheme is superior to previously used schemes for filter lengths of 50 and above. The algorithm has been implemented in a single-microphone noise canceller and in a directive microphone array.
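A small sketch of the kind of O(N log N) FFT-based correlation computation mentioned as the first stage; the sparse Gaussian-basis solver of the second stage is not shown.

```python
import numpy as np

def xcorr_fft(x, y, max_lag):
    """Cross-correlation r_xy(k) = sum_n x(n+k) y(n) for |k| <= max_lag,
    computed via the FFT in O(N log N), as in the first stage described
    above (illustrative only; the paper's sparse solver is not shown).
    """
    n = len(x) + len(y) - 1
    nfft = int(2 ** np.ceil(np.log2(n)))       # zero-pad to avoid wrap-around
    X = np.fft.rfft(x, nfft)
    Y = np.fft.rfft(y, nfft)
    r = np.fft.irfft(X * np.conj(Y), nfft)     # linear correlation
    # reorder so lags run from -max_lag ... +max_lag
    return np.concatenate((r[-max_lag:], r[:max_lag + 1]))
```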
Time-variant structured matrices: an application to instrumental variable methods
Ali H. Sayed
We derive a recursive algorithm for the time-update of the triangular factors of non-Hermitian time-variant matrices with structure. These are matrices that undergo low-rank modifications as time progresses, special cases of which often arise in adaptive filtering and instrumental variable (IV) methods. A natural implementation of the algorithm is via two coupled triangular arrays of processing elements. We consider, in particular, an IV parameter estimation problem and show how the arrays collapse to a coupled parallelizable solution of the identification problem.
Generalization of Strang's preconditioner with applications to iterative deconvolution
Raymond Hon-fu Chan, Michael Kwok-po Ng, Robert J. Plemmons
In this paper, we propose a method to generalize Strang's circulant preconditioner for arbitrary n-by-n matrices An. The [n/2]th column of our circulant preconditioner Sn is equal to the [n/2]th column of the given matrix An. Thus if An is a square Toeplitz matrix, then Sn is just the Strang circulant preconditioner. When Sn is not Hermitian, our circulant preconditioner can be defined as (Sn*Sn)^(1/2). This construction is similar to the forward-backward projection method used in constructing preconditioners for tomographic inversion problems in medical imaging. Comparisons of our preconditioner Sn with other circulant-based preconditioners are carried out for some 1D Toeplitz least squares problems: min ||b - Ax||_2. Preliminary numerical results show that Sn performs quite well. Test results are also reported for a 2D deconvolution problem arising in ground-based atmospheric imaging.
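A minimal sketch of building such a circulant preconditioner from the [n/2]th column of a real square matrix and applying its inverse with FFTs; the Hermitian correction (Sn*Sn)^(1/2) and the 2D case are omitted.

```python
import numpy as np

def circulant_preconditioner(A):
    """Build a circulant S whose (n//2)-th column matches the (n//2)-th
    column of A, and return a function applying S^{-1} via the FFT.
    Minimal sketch for a real square matrix; the non-Hermitian
    correction described above is not implemented.
    """
    n = A.shape[0]
    m = n // 2
    c = np.roll(A[:, m], -m)      # first column of the circulant S
    eig = np.fft.fft(c)           # eigenvalues of S (DFT of first column)

    def apply_inverse(v):
        # S^{-1} v computed by pointwise division in the Fourier domain
        return np.real(np.fft.ifft(np.fft.fft(v) / eig))

    return apply_inverse

# usage inside a preconditioned iteration:
#   M_inv = circulant_preconditioner(A); z = M_inv(residual)
```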
Regularization Methods
Fast restoration of atmospherically blurred images
James G. Nagy, Robert J. Plemmons, Todd C. Torgersen
This paper concerns solving deconvolution problems for atmospherically blurred images by the preconditioned conjugate gradient algorithm, where a new approximate inverse preconditioner is used to increase the rate of convergence. Removing a linear, shift-invariant blur from a signal or image can be accomplished by inverse or Wiener filtering, or by an iterative least squares deblurring procedure. Because of the ill-posed characteristics of the deconvolution problem, in the presence of noise, filtering methods often yield poor results. On the other hand, iterative methods often suffer from slow convergence at high spatial frequencies. Theoretical results are established to show that fast convergence for our iterative algorithm can be expected, and test results are reported for a ground-based astronomical imaging problem.
Reducing boundary distortion in image restoration
Franklin T. Luk, David Vandevoorde
The ill-conditioned nature of the inverse problem of image restoration produces dramatic consequences when the boundaries of the image are incorrectly modeled. Solutions to this problem are difficult to find in the literature. In this paper, we develop and analyze two sets of new approaches that are effective with algebraic restoration procedures.
Fast algorithms for the regularization of banded Toeplitz least squares problems
An algorithm for computing solutions to ill-conditioned banded Toeplitz least squares problems by a rank-revealing URV factorization is considered. The factorization is computed in O(βn log n + βn²) operations, where β is the bandwidth of the coefficient matrix. An approximate solution to ill-conditioned banded Toeplitz systems, in the presence of noise, is then obtained by truncating the factorization. Numerical results are provided that illustrate that the truncated URV can compute solutions comparable to the more expensive truncated singular value decomposition.
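For comparison, a short sketch of the truncated-SVD regularized solution that the truncated URV approach is measured against; the tolerance is an illustrative assumption, and the URV factorization itself is not implemented here.

```python
import numpy as np

def truncated_svd_solve(A, b, tol=1e-3):
    """Regularized least-squares solution obtained by discarding small
    singular values -- the (more expensive) reference method that the
    truncated URV factorization is compared against.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    keep = s > tol * s[0]                    # drop the ill-conditioned part
    x = Vt[keep].T @ ((U[:, keep].T @ b) / s[keep])
    return x
```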
Regularization approach for signal extrapolation with wavelets
We propose a scale-limited signal model based on wavelet representation and study the reconstructability of scale-limited signals via extrapolation in this research. In analogy with the band-limited case, we define a scale-limited time-concentrated operator, and examine various vector spaces associated with such an operator. It is proved that the scale-limited signal space can be decomposed into the direct sum of two subspaces and only the component in one subspace can be exactly reconstructed, where the reconstructable subspace can be interpreted as a space consisting of scale/time-limited signals. Due to the ill-posedness of scale-limited extrapolation, a regularization process is introduced for noisy data extrapolation.
Frequency Representations and Machine Implementations
Transformation of operators and scale
Explicit expressions are given for the transformation of an operator from one representation to another. Both continuous and discrete representations are considered. Examples are given for the time-frequency and scale operators in various representations, including the Hermite, Fourier, discrete Fourier, scale, time, and sampling theorem representations. Also, we generalize the characteristic operator method, where we use an arbitrary complete set rather than complex exponentials.
Frequency-selective techniques based on singular value decomposition (SVD), total least squares (TLS), and bandpass filtering
Hua Chen, Sabine Van Huffel, Joos Vandewalle
Much research has focused on estimating the parameters of a sum of exponentially damped sinusoids. In some applications, such as nuclear magnetic resonance signal processing, only a few of the sinusoids are of interest. This paper presents some new frequency-selective techniques for estimating the parameters of sinusoids within a specified frequency region using a subspace and SVD-based estimation algorithm and an FIR filter matrix. The applicable estimation algorithms in this technique are the linear prediction method, the matrix pencil method, Kung et al.'s method and its total-least-squares variant, called the HTLS method. The benefits of the frequency-selective technique combined with the HTLS estimation method are confirmed through simulations.
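A single-channel, least-squares sketch in the spirit of the subspace methods cited above (Hankel matrix, truncated SVD, shift invariance); the frequency-selective FIR filtering and the TLS refinements that are the point of the paper are omitted, and the Hankel width is a heuristic assumption.

```python
import numpy as np

def damped_sinusoid_poles(x, K, L=None):
    """Estimate the poles z_k = exp(-d_k + 2j*pi*f_k) of a sum of K
    exponentially damped sinusoids from samples x, using a Hankel
    matrix, a truncated SVD and the shift invariance of the signal
    subspace (least-squares sketch, not the paper's HTLS variant).
    """
    N = len(x)
    L = L or N // 3                          # Hankel width (heuristic)
    m = N - L + 1
    H = np.array([x[i:i + L] for i in range(m)])      # m x L Hankel matrix
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    Uk = U[:, :K]                            # signal-subspace basis
    Phi = np.linalg.lstsq(Uk[:-1], Uk[1:], rcond=None)[0]
    z = np.linalg.eigvals(Phi)               # signal poles
    freqs = np.angle(z) / (2 * np.pi)        # normalized frequencies
    damps = -np.log(np.abs(z))               # damping factors
    return z, freqs, damps
```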
Pipelining multiple singular value decomposition (SVDs) on a single processor array
Kishore Kota, Joseph R. Cavallaro
We present a new family of architectures for processor arrays to implement Jacobi SVD which allow systolic loading and unloading of input and result matrices. Unlike most of the previous SVD arrays in the literature, our architectures do not require special handling of external I/O and hence are closer to the traditional concept of systolic architectures. The boundary processors communicate with the host the same way any of the interior processors communicate with their neighbors. The arrays are surprisingly uniform and simple. The various architectures in the family represent different throughput-hardware tradeoffs corresponding to the degree to which the multiple sweeps have been unrolled and determine the number of independent SVDs which may be pipelined on the array. We achieved systolic loading by using the flexibility provided by the cyclic Jacobi method on the order in which pivot pairs may be chosen. The array operates on the matrix data even as it is being loaded. Once the pipeline is full, the ordering is very similar to odd-even ordering. Our ordering is equivalent to cyclic-by-rows ordering and hence the algorithm is guaranteed to converge. Our systolic loading scheme is very important in an I/O limited system, since it allows more communication to occur in parallel, where the communication includes the loading and unloading operations. The array with the highest throughput in our family of architectures, which implement one-sided Jacobi (either Hestenes' method or Eberlein and Park's method), is a linear array of processors with unidirectional links between neighbors. The architectures with lower throughput require fewer processors connected in a ring, allowing data to recirculate among the processors. The input matrix is loaded one column at a time from the left and the results stream out one column at a time from the right.
Architecture of a complex arithmetic processor for communication signal processing
Susan L. Gilfeather, John B. Gehman Jr., Calvin Harrison
The Complex Arithmetic Processor (CAP) is a high performance, single chip Digital Signal Processor optimized for communication signal processing operations. The CAP VLSI provides the communication system building block necessary to meet the increased signal processing requirements of complex modulation types, voice and image compression while maintaining the requirement for small, low power implementations. The chip is intended for high speed, low power digital communication system applications such as hand held spread spectrum communications systems. The CAP architecture has been developed specifically for the complex arithmetic functions required in communication signal processing. The CAP is a software programmable, highly integrated parallel array of processors containing the arithmetic resources, memories, address generation, bit manipulation and logic functions necessary to support the sophisticated processing required in advanced communication equipment. The CAP executes a 1024 point complex Fast Fourier Transform in 131 microseconds.
Architectures
Highly scalable hardware accelerator for digital signal processing
Chi-Hung Chi, Siu-Chung Lau
In this paper, a highly scalable hardware accelerator design for digital signal processing is presented. The key features of this accelerator are minimum I/O operations, highly scalable massive parallelism, easy programming, and modularity and regularity. With a very large register file (> 1000 registers) per processing element, the reuse factor per datum in this accelerator can be increased significantly compared to traditional DSP architectures. System performance is improved because the amount of data transfer between the on-chip cache and the off-chip cache/memory is reduced by the same factor. Since the basic building block of this accelerator is simply a VLSI chip with several processing elements, scalable massive parallelism can be achieved by connecting multiple chips together in a SIMD `vector-like' fashion. Finally, programming of this accelerator is not difficult because it is operated in the SIMD `vector-like' mode. With the VLSI technology expected in the next few years, the throughput of one single accelerator chip can approach GFLOPS performance. Hence, the high computing power needed by digital signal processing applications can be provided by connecting just a small number of these chips together.
Parallel digital signal processing architectures for image processing
Shirish P. Kshirsagar, David Andrew Hartley, David Mark Harvey, et al.
This paper describes research into a high-speed image processing system using parallel digital signal processors for the processing of electro-optic images. The objective of the system is to reduce the processing time of non-contact inspection problems, including industrial and medical applications. A single processor cannot deliver the processing power required by such applications; hence, a MIMD system is designed and constructed to enable fast processing of electro-optic images. The Texas Instruments TMS320C40 digital signal processor is used due to its high-speed floating-point CPU and its support for the parallel processing environment. A custom-designed VISION bus is provided to transfer images between processors. The system is being applied to solder joint inspection of high-technology printed circuit boards.
Massively parallel signal processing architecture based on response of retina cells
A massively parallel signal and image processing architecture is considered. The architecture is composed of 2D arrays of cells that simulate the response of retina neurons. The results of simulations are compared to previously published experimental results, and the system is applied to the detection of spatio-temporal features in sequences of images representative of pulse-Doppler radar images. By arranging the output layer so that the cells respond to various key input features, an array of feature extraction cells can be obtained. The system is characterized by developing an image-space to feature-space mapping.
Compilation time analysis to minimize run-time overhead in preemptive scheduling on multiprocessors
Piet Wauters, Rudy Lauwereins, J. Peperstraete
This paper describes a scheduling method for hard real-time Digital Signal Processing (DSP) applications implemented on a multiprocessor. Due to the very high operating frequencies of DSP applications (typically hundreds of kHz), run-time overhead should be kept as small as possible. Because static scheduling introduces very little run-time overhead, it is used as much as possible. Dynamic pre-emption of tasks is allowed if and only if it leads to better performance in spite of the extra run-time overhead. We essentially combine static scheduling with dynamic pre-emption using static priorities. Since we are dealing with hard real-time applications, we must be able to guarantee at compile time that all timing requirements will be satisfied at run time. We will show that our method performs at least as well as any static scheduling method. It also reduces the total number of dynamic pre-emptions compared with run-time methods such as deadline-monotonic scheduling.
Nonlinear dynamical systems analyzer
Adrian S. Coffey, Martin Johnson, Robin Jones
Computationally intensive algorithms are an ever more common requirement of modern signal processing. Following the work of Gentleman and Kung, McWhirter, Shepherd and Proudler suggested that certain matrix-orientated algorithms can be mapped onto systolic array architectures for adaptive linear signal processing. This has been extended by Broomhead et al. to the calculation of nonlinear predictive models and applied by Jones et al. to target identification and recognition. We shall show that predictive models are extremely sharp discriminators. Our chosen problem, if implemented as a systolic array, would require 3403 processors, which would result in a high throughput rate at excessive cost. We are developing an efficient sub-optimally implemented systolic array, with one processor servicing more than one systolic node. We describe a prototype Heuristic Processor which computes a multi-dimensional, nonlinear, predictive model. It consists of a Radial Basis Function Network and a least squares optimizer using QR decomposition. The optimized solution of a set of simultaneous equations in 81 unknowns is calculated in 150 μs. The QR section emulates a triangular systolic array by the novel use of an array of 40 mature silicon DSP chips costing under $100 each. The DSP chips operate in synchronism at a 50 MHz clock rate, passing data to each other through multi-port memories on a dead-letter-box principle; there are no memory access conflicts and only two-port and three-port memories are required. The processor provides 1 GFLOP of computing power per cubic foot of electronics for a component cost of approximately $15,000.
Algorithm-Based Fault Tolerance
Recent advances in algorithm-based fault tolerance
Jacob A. Abraham
The technique of Algorithm-Based Fault Tolerance was developed as a low-cost scheme for providing concurrent error detection and fault location in computation-intensive applications. This paper describes some of the recent work in this area, including new encoding schemes, modeling techniques, and experimental evaluation. Possible directions for future research are pointed out.
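As a reminder of the basic ABFT idea surveyed here, the sketch below shows the classical checksum-encoded matrix multiplication: checksums are carried through the computation and checked afterwards for consistency. It illustrates the general scheme only, not the new encodings discussed in the paper, and the tolerance is an assumption.

```python
import numpy as np

def abft_matmul(A, B, tol=1e-8):
    """Checksum-encoded matrix multiplication in the spirit of
    algorithm-based fault tolerance: append a column checksum to A and
    a row checksum to B, multiply, and verify that the checksums of the
    result are consistent.  Minimal illustration only.
    """
    Ac = np.vstack([A, A.sum(axis=0)])                 # column-checksum matrix
    Br = np.hstack([B, B.sum(axis=1, keepdims=True)])  # row-checksum matrix
    C = Ac @ Br                                        # full checksum product
    body = C[:-1, :-1]                                 # the ordinary product A @ B
    col_ok = np.allclose(C[-1, :-1], body.sum(axis=0), atol=tol)
    row_ok = np.allclose(C[:-1, -1], body.sum(axis=1), atol=tol)
    return body, col_ok and row_ok                     # result + consistency flag
```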
Fault-tolerant adaptive FIR filters using variable detection threshold
L. K. Lin, G. Robert Redinbo
Adaptive filters are widely used in many digital signal processing applications, where the tap weights of the filters are adjusted by stochastic gradient search methods. Block adaptive filtering techniques, such as the block least mean square and block conjugate gradient algorithms, were developed to speed up convergence and to improve tracking capability, two important factors in designing real-time adaptive filter systems. Even though algorithm-based fault tolerance can be used as a low-cost, high-level fault-tolerant technique to protect such systems from hardware failures with minimal hardware overhead, the issue of choosing a good detection threshold remains a challenging problem. First, these systems usually have only limited computational resources, so concurrent error detection and correction is not feasible. Second, prior knowledge of the input data is very difficult to obtain in practical settings. We propose a checksum-based fault detection scheme using two-level variable detection thresholds that depend dynamically on the past syndromes. Simulations show that the proposed scheme reduces the probability of false alarms and has a high degree of fault coverage in adaptive filter systems.
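A minimal software sketch of the idea is given below, under assumptions not stated in the abstract: LMS adaptation, a single checksum of the tap weights, and a simple "mean plus six standard deviations of recent syndromes" rule standing in for the paper's two-level threshold.

```python
import numpy as np

# Checksum syndrome check for an LMS-adapted FIR filter with a threshold
# set adaptively from the history of past syndrome values (illustrative only).
rng = np.random.default_rng(1)
N, mu = 16, 0.01
w = np.zeros(N)
cs = w.sum()                          # running checksum of the tap weights
syndromes = []

for n in range(2000):
    x = rng.standard_normal(N)                          # input vector
    d = x @ np.ones(N) + 0.01 * rng.standard_normal()   # desired signal
    e = d - w @ x
    w += mu * e * x                   # LMS weight update
    cs += mu * e * x.sum()            # update the checksum in the same way
    if n == 1500:
        w[3] += 0.5                   # inject a transient fault in one tap

    syndrome = abs(cs - w.sum())      # nonzero (beyond round-off) => fault
    hist = np.array(syndromes[-100:]) if syndromes else np.array([0.0])
    threshold = hist.mean() + 6 * hist.std() + 1e-12     # variable threshold
    if syndrome > threshold:
        print("possible fault detected at iteration", n)
        break
    syndromes.append(syndrome)
```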
Fault-tolerant FIR adaptive filter using the conjugate gradient algorithm
Bernard A. Schnaufer, W. Kenneth Jenkins
In several recent papers we have introduced the notion of Adaptive Fault Tolerance (AFT), in which the adaptive process is used to compensate automatically for faults occurring in the adaptive coefficients. Several adaptive filtering structures have been proposed which use AFT to provide fault tolerance with very low hardware overhead while maintaining reasonable convergence behavior for certain classes of input signals. The previous structures were adapted with the LMS algorithm, which is known to have poor convergence properties when the input signal is highly colored. In this paper we propose using the Conjugate Gradient (CG) algorithm to update the filter coefficients. The CG algorithm does not suffer from the same performance drawbacks as the LMS algorithm and allows rapid convergence of the fault-tolerant adaptive filter regardless of the input statistics. Simulations are presented to demonstrate the performance of the CG algorithm, comparisons with previous work are made, and implementation issues are discussed.
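A minimal sketch of a block CG coefficient update is given below; the block size and number of CG iterations are arbitrary choices, and the fault-tolerant structure of the paper is not reproduced.

```python
import numpy as np

# One block update of an adaptive FIR filter by conjugate-gradient iterations
# on the block-estimated normal equations R w = p (illustrative sketch).
def cg_block_update(w, X, d, n_iter=3):
    R = X.T @ X                      # block autocorrelation estimate
    p = X.T @ d                      # block cross-correlation estimate
    r = p - R @ w                    # initial residual
    q = r.copy()
    for _ in range(n_iter):
        Rq = R @ q
        alpha = (r @ r) / (q @ Rq)
        w = w + alpha * q
        r_new = r - alpha * Rq
        if np.sqrt(r_new @ r_new) < 1e-12:   # converged
            return w
        beta = (r_new @ r_new) / (r @ r)
        q = r_new + beta * q
        r = r_new
    return w

# Toy usage: identify an 8-tap system from one data block.
rng = np.random.default_rng(7)
h = rng.standard_normal(8)
X = rng.standard_normal((256, 8))    # rows = tap-delay-line input vectors
d = X @ h
w = cg_block_update(np.zeros(8), X, d, n_iter=8)
print("max coefficient error:", np.abs(w - h).max())
```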
Fault-tolerant QR decomposition for adaptive signal processing
M. P. Connolly, Patrick Fitzpatrick
We present an algorithm-based fault-tolerant scheme for QR decomposition that extends the well-known Gentleman-Kung-McWhirter triangular systolic array architecture. Assuming that the array is subject to transient faults, widely separated in time and each affecting a single processor, we give an algorithm that corrects the full triangular array with computational overhead equivalent, on average, to the interpolation of a single extra vector into the data stream.
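The flavour of algorithm-based checking for QR can be conveyed with a simple column-checksum invariant: orthogonal transformations preserve linear relations among columns, so an appended checksum column must remain the sum of the others after the decomposition. The sketch below shows only this detection-side invariant, not the correction scheme of the paper.

```python
import numpy as np

# Append a column checksum to the data; after any sequence of orthogonal
# (e.g. Givens) transformations the last column of R must still equal the
# sum of the other columns, so a violation signals a fault.
rng = np.random.default_rng(2)
A = rng.standard_normal((8, 4))
A_aug = np.hstack([A, A.sum(axis=1, keepdims=True)])   # checksum column

Q, R_aug = np.linalg.qr(A_aug)
syndrome = np.abs(R_aug[:, -1] - R_aug[:, :-1].sum(axis=1)).max()
print("checksum syndrome (~0 when fault-free):", syndrome)
```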
Algorithm-based fault tolerance for noncomputationally intensive applications
V. S. Sukumar Nair, S. Venkatesan
Algorithm-based fault tolerance (ABFT) has been proposed as a cost-effective approach to concurrent error detection. So far, the application of ABFT has been limited to computationally intensive applications that lend themselves easily to high-level fault modeling. In this paper we extend the application of ABFT to non-computationally intensive applications. To that end, we first develop a fault model for such systems. Based on this fault model, we develop ABFT schemes for a set of graph-theoretic as well as set-theoretic problems. Application of the new schemes is illustrated with examples.
Application of algorithm-based fault tolerance to high-level synthesis of signal flow graphs
Tanay Karnik, Shankar Ramaswamy, Steve M. Kang, et al.
Algorithm-based fault tolerance techniques have been used for systolic processors and general-purpose multiprocessors. In this paper, we apply an algorithm-based fault tolerance technique to the high-level synthesis of signal flow graphs. The technique incorporates reliability into synthesized DSP filters. Given a signal flow graph representation of these filters, our schemes synthesize a reliable schedule for the operations in them and allocate functional units to the operations subject to hardware and reliability constraints. We impose reliability constraints to avoid compensating errors in fault detection. Our first scheme uses the well-known duplicate-and-compare approach, while the second uses a novel linearity-based checking approach drawn from algorithm-based fault tolerance methods for matrix computations. The schemes have been implemented, and results obtained by using them on sample signal flow graphs are presented. These results show that the linearity-based scheme has a low time overhead; for example, it takes about 10% extra time for the reliable synthesis of a 10th-order IIR filter. The proposed work extends previous work in the area in three directions: (1) the use of linearity-based checks rather than duplication-based checks, (2) the handling of cyclic flow graphs, whereas previous proposals address acyclic graphs, and (3) reliability-constrained hardware mapping. Extensions of the schemes to nonlinear flow graphs are also proposed.
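The linearity check itself is easy to illustrate: a fault-free linear filter must satisfy F(x1 + x2) = F(x1) + F(x2), so comparing the two sides, computed on independent paths, exposes non-compensating faults. The sketch below uses an arbitrary FIR filter and random test signals, not the paper's synthesized hardware.

```python
import numpy as np

# Linearity-based check on a linear (here FIR) filter:
# F(x1 + x2) must equal F(x1) + F(x2) up to round-off.
def fir(x, h):
    return np.convolve(x, h)[:len(x)]

rng = np.random.default_rng(3)
h = rng.standard_normal(11)              # filter under test
x1, x2 = rng.standard_normal((2, 256))   # independent test signals

syndrome = np.abs(fir(x1 + x2, h) - (fir(x1, h) + fir(x2, h))).max()
print("linearity syndrome (~0 when fault-free):", syndrome)
```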
Canonical computational model for algorithm-based fault tolerance (ABFT) systems
Ragini Narasimhan, Daniel J. Rosenkrantz, S. S. Ravi
Fault tolerance measures (such as fault detectability and fault locatability) of systems employing Algorithm-Based Fault Tolerance (ABFT) are determined by a binary relationship between fault patterns and error patterns. This relationship specifies whether a given fault pattern can induce a given error pattern. We develop a succinct and canonical representation of this relationship and present an efficient algorithm for obtaining it. We show that two ABFT systems have the same fault tolerance measures only when their canonical representations are identical.
Redundant finite rings for fault-tolerant signal processors
Graham A. Jullien, S. S. Bizzan, Neil M. Wigley, et al.
Redundant Residue Number Systems (RRNS) have been proposed as suitable candidates for fault tolerance in compute-intensive applications. The redundancy is based on multiple projections onto subsets of the moduli and a search for results that lie in a so-called illegitimate range. This paper presents RRNS fault-tolerant procedures for a recently introduced finite polynomial ring mapping procedure (the modulus replication RNS). The mapping technique dispenses with the need for many relatively prime ring moduli, which is a major drawback of conventional RRNS systems. Although double, triple, and quadruple modular redundancy can be implemented in the polynomial mapping structure, the polynomial coefficient circuitry, or the independent direct-product ring computational channels for error detection and/or correction, this paper discusses the implementation of redundant rings generated by (1) redundant residues, (2) spare general computational channels, or (3) a combination of the two. The first architecture is suitable for RNS embedding in the MRRNS, and the second for single-modulus mappings. The combined architecture allows a trade-off between the two extremes. The application area is fault-tolerant, compute-intensive DSP arrays.
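As background, the sketch below shows single-fault detection with one redundant modulus in a conventional RNS (the paper's modulus-replication mapping is not reproduced): the result is reconstructed from the non-redundant residues via the Chinese remainder theorem and checked for consistency against the redundant residue. The moduli, the test value, and the injected fault are all illustrative.

```python
from math import prod

# Conventional RNS with one redundant modulus; a corrupted residue makes
# the reconstructed value inconsistent with the redundant residue.
moduli = [7, 9, 11, 13]          # non-redundant, pairwise coprime
redundant = [17]                 # redundant modulus

def to_residues(x, ms):
    return [x % m for m in ms]

def crt(residues, ms):
    M = prod(ms)
    x = 0
    for r, m in zip(residues, ms):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)
    return x % M

x = 1234                         # within the legitimate range (< 7*9*11*13)
res = to_residues(x, moduli + redundant)
res[1] = (res[1] + 3) % moduli[1]            # inject a single-residue fault

x_hat = crt(res[:len(moduli)], moduli)       # reconstruct from non-redundant part
consistent = all(x_hat % m == r
                 for m, r in zip(redundant, res[len(moduli):]))
print("fault detected" if not consistent else "no fault detected")
```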
Algorithm-based fault-tolerant array architecture for Fermat number transform
Jamel M. Tahir, Satnam Singh Dlay, Raouf N. Gorgui-Naguib, et al.
For many real-time and scientific applications, it is desirable to perform signal and image processing algorithms on special hardware at very high speeds. With the advent of VLSI technology, large collections of processing elements that cooperate with each other to achieve high-speed computation have become economically feasible. In such systems, some level of fault tolerance must be provided to ensure the validity of the results. Fermat number transforms (FNTs) are attractive for the implementation of digital convolution because the computations are carried out in modular arithmetic, which introduces no round-off error. In this paper we present a fault-tolerant linear array design for the FNT based on the weighted checksum approach. The results show that the approach is ideally suited to the FNT, since it offers fault tolerance at very low cost and is free from round-off error and overflow problems.
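The weighted-checksum idea for a linear transform can be illustrated independently of the array design: precompute the checksum-weighted combination of the transform's rows once, and compare it with the checksum of the computed outputs. The sketch below applies this to a direct (non-fast) length-8 FNT modulo the Fermat number 17; the weight vector is an illustrative choice, not the paper's.

```python
# Weighted-checksum error detection for a length-8 Fermat number transform
# modulo F_2 = 2**4 + 1 = 17, with root of unity alpha = 2.
F, N, alpha = 17, 8, 2
A = [[pow(alpha, k * n, F) for n in range(N)] for k in range(N)]   # FNT matrix

w = list(range(1, N + 1))                                          # checksum weights
c = [sum(w[k] * A[k][n] for k in range(N)) % F for n in range(N)]  # precomputed w^T A

x = [3, 1, 4, 1, 5, 9, 2, 6]
X = [sum(A[k][n] * x[n] for n in range(N)) % F for k in range(N)]  # transform

X[2] = (X[2] + 5) % F                                # inject a fault in one output
lhs = sum(w[k] * X[k] for k in range(N)) % F         # checksum of the outputs
rhs = sum(c[n] * x[n] for n in range(N)) % F         # checksum via precomputation
print("fault detected" if lhs != rhs else "no fault detected")
```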
Error detection in digital neural networks: an algorithm-based approach for inner product protection
Luca Breveglieri, Vincenzo Piuri
Artificial neural networks are an interesting solution for several real-time applications in the area of signal and image processing, in particular since recent advances in VLSI integration technologies allow for efficient hardware realizations. The use of dedicated circuits implementing neural networks in mission-critical applications requires a high level of protection against errors due to faults, in order to guarantee output credibility and system availability. In this paper, the problem of concurrent error detection in dedicated neural networks is addressed by adopting an algorithm-based approach to check the inner products, i.e., most of the computation performed in the neural network. The effectiveness and efficiency of this technique are shown and evaluated for widely used classes of neural paradigms.
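A standard form of the inner-product check, sketched below under illustrative layer sizes, precomputes the column sums of the weight matrix so that the sum of a layer's pre-activations can be verified with a single extra inner product; this is one common realization of the idea, not necessarily the authors' exact scheme.

```python
import numpy as np

# Checksum protection of a neural layer's inner products: the column sums
# of the weight matrix are precomputed, so sum(W @ x) can be verified
# against a single extra inner product cs @ x.
rng = np.random.default_rng(4)
W = rng.standard_normal((32, 64))    # layer weights
cs = W.sum(axis=0)                   # precomputed checksum row

x = rng.standard_normal(64)
y = W @ x                            # the protected inner products
y[5] += 1.0                          # inject a fault in one neuron's sum

syndrome = abs(y.sum() - cs @ x)
print("fault detected" if syndrome > 1e-6 else "no fault detected")
```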
Frequency Representations and Machine Implementations
Rapid prototyping of application-specific signal processors: architectural issues
Walter G. Protas, Fred Shirley
The Rapid Prototyping of Application Specific Signal Processors (RASSP) program has as its goal the dramatic improvement of the process by which signal processors are specified, designed, documented, manufactured, and supported. A key part of the RASSP process is the method of architecture selection. An architecture overview is presented, together with a discussion of architecture classification, architecture evaluation factors, and signal processing flow. Two important architecture support aspects are briefly discussed: open systems architecture and integrated diagnostics. Recommendations for preferred approaches and rules of thumb for architecture selection are given. Finally, the RASSP architecture selection process is outlined, including a discussion of a standard hardware/software module interface.
Mobile Communications
Adaptive beamformer delay compensation for active sonar
James M. Alsup
Delay compensation as a preliminary step to adaptive beamforming (ABF) is described. Such compensation can be used for linear or planar arrays whenever the temporal width of a signal's matched-filter output peak is significantly shorter than the propagation time across the array. Preliminary delay compensation is not normally used in passive ABF processing, since the signals being processed are usually steady-state or can be regarded as such. However, when transient events must be processed, such as those associated with impulsive echoes or with matched-filter outputs of large time-bandwidth (TW) signals, compensation prior to ABF may be required in order to obtain the full array gain for the signal of interest. Methods for achieving wide-band delay compensation are described, the parameters that define the minimal compensation required are explained, and the losses suffered when compensation is not used are discussed.
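One common way to realize wide-band delay compensation ahead of beamforming is to advance each sensor channel by its (possibly fractional-sample) propagation delay with a linear phase ramp in the frequency domain and then sum the aligned channels. The sketch below illustrates only that generic delay-and-sum step; the signal, delays, and sampling rate are assumptions, not the paper's sonar configuration.

```python
import numpy as np

# Wide-band delay compensation: advance each channel by tau seconds via a
# linear phase shift in the frequency domain, then sum the aligned channels.
def delay_and_sum(channels, delays, fs):
    """channels: (n_sensors, n_samples); delays: seconds to advance each channel."""
    n = channels.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    out = np.zeros(n)
    for ch, tau in zip(channels, delays):
        CH = np.fft.rfft(ch)
        out += np.fft.irfft(CH * np.exp(2j * np.pi * freqs * tau), n)
    return out / len(channels)

# Toy usage: a transient arriving 3 samples later at each successive sensor.
fs = 8000.0
t = np.arange(1024) / fs
sig = np.sin(2 * np.pi * 400 * t) * np.exp(-((t - 0.05) / 0.005) ** 2)
chans = np.stack([np.roll(sig, 3 * k) for k in range(4)])
aligned = delay_and_sum(chans, delays=[3 * k / fs for k in range(4)], fs=fs)
print("max misalignment after compensation:", np.abs(aligned - sig).max())
```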
Subspace Tracking
Computing low-dimensional signal subspaces
Ricardo D. Fierro, Per Christian Hansen
A two-sided (or complete) orthogonal decomposition of an m X n matrix A is a product of an orthogonal matrix, a triangular matrix, and another orthogonal matrix. Two examples are the URV and ULV decompositions. In this paper we present and analyze URV and ULV algorithms that are efficient whenever the numerical rank k of the matrix is much less than min(m,n). We also prove that good estimates of the singular vectors, needed in the algorithms, lead to good approximations of the singular subspaces of A.