Proceedings Volume 4791

Advanced Signal Processing Algorithms, Architectures, and Implementations XII


View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 6 December 2002
Contents: 10 Sessions, 47 Papers, 0 Presentations
Conference: International Symposium on Optical Science and Technology 2002
Volume Number: 4791

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Time-Frequency and Time-Scale Analysis I
  • Time-Frequency and Time-Scale Analysis II
  • Real-Time Implementations
  • Structured Matrices
  • Regularization Techniques
  • Image Processing
  • Wireless Communication
  • Array Processing
  • High-Performance Arithmetic for Real-Time Applications I
  • High-Performance Arithmetic for Real-Time Applications II
Time-Frequency and Time-Scale Analysis I
Phase-amplitude study of clouds
Using artificially generated clouds, we study the contributions of spectral phase and amplitude to the cloud image. This is done by reconstructing the cloud image from the spectral amplitude and/or phase only. Images are also reconstructed from partial phase and amplitude in such a way that one may control the relative contributions of the two. We conclude that both phase and amplitude contribute to the cloud-like appearance.
The speech scale, the Mel scale, and the tube model for speech
Srinivasan Umesh, Leon Cohen, Douglas J. Nelson
We use the tube model of speech production to study the speech-hearing connection. Recently, using real speech, we showed that sounds made by different individuals and perceived to be the same can be transformed into each other by a universal warping function. We call this transformation function the speech scale, and we have shown that it is similar to the Mel scale, thus experimentally establishing the speech-hearing connection. In this paper we explore the possible origins of the speech scale and attempt to understand it from the point of view of the tube model of speech. We use the two-tube model for various vowels and study the effect of varying the lengths of the tubes on the locations of the formant frequencies. We show that under the common assumption that the length of the front tube does not change significantly, compared to the back tube, across different individuals enunciating the same sound, the corresponding formant frequencies are non-uniformly scaled. Using the same method we used for real speech, we compute the warping function.
Generalized equivalent bandwidth defined by using Renyi's entropy and its estimation
Hisashi Yoshida, Sho Kikkawa
In this paper, we present a definition of the generalized equivalent bandwidth (EBW) of a stochastic process. The generalized EBW is defined by W(α) = exp(H(α))/2, where H(α) is Renyi's entropy, H(α) = (1/(1−α)) log ∫_{−∞}^{+∞} p^α(f) df, p(f) is the normalized power spectrum, and α ≥ 0 is the order of the EBW. The generalized EBW is a new class of EBW that can represent the major equivalent bandwidths in a uniform way. We also discuss an interpretation of the generalized EBW from a different perspective. In the latter part of this article, we examine the estimation properties of the generalized EBW. When a smoothed power spectrum estimate is obtained by convolving the periodogram with a smoothing window, we evaluate how the smoothing window length, the data length, and the variance of the estimated spectrum affect the estimate of the generalized EBW. The results indicate that if we increase the data length while keeping the variance constant, the rate of increase of the generalized EBW caused by the smoothing window decreases. On the other hand, if we decrease the variance while holding the data length fixed, the generalized EBW of the estimated power spectrum increases.
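As a rough illustration of the definition above, the following sketch (our own, not the authors' code) estimates W(α) from a discretized power spectrum; the function name, the `psd`/`df` inputs, and the α → 1 Shannon-limit branch are our assumptions.

```python
import numpy as np

def generalized_ebw(psd, df, alpha):
    # Normalize the power spectrum so that sum(p) * df = 1.
    p = psd / (np.sum(psd) * df)
    if np.isclose(alpha, 1.0):
        # The limit alpha -> 1 recovers the Shannon entropy.
        h = -np.sum(p * np.log(p) * df)
    else:
        # Renyi entropy H(alpha) = (1/(1-alpha)) log integral p^alpha(f) df.
        h = np.log(np.sum(p**alpha * df)) / (1.0 - alpha)
    # Generalized equivalent bandwidth W(alpha) = exp(H(alpha)) / 2.
    return np.exp(h) / 2.0
```

For a flat spectrum of width B, the result is B/2 for every order α, which matches the intuition of an "equivalent" rectangular bandwidth.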
Multiple-tube resonance model
Lawrence H. Smith, Douglas J. Nelson
In speech analysis, a recurring acoustical problem is the estimation of the resonant structure of a tube of non-uniform cross-sectional area. We model such tubes as a finite sequence of cylindrical tubes of arbitrary, non-uniform length. From this model, we derive a closed-form expression for the resonant structure of the model and analytically derive the boundary conditions for the case of constant group delay. Since it has been noted in the literature that the group delay of the vocal tract is constant, these boundary conditions hold for the vocal tract. In the limiting case, the non-uniform tube model reduces to the well-studied uniform tube model. For this limiting case, we derive an expression for the tube's resonant structure in terms of a Fourier transform. Finally, we derive wave equations from the model, which are consistent with the wave equations for the telegraph-wire problem.
Time-frequency-guided quadratic filters
Time-frequency distributions (TFDs) of Cohen's class often dramatically reveal complex structures that are not evident in the raw signal. Standard linear filters are often unable to separate the underlying signal from background clutter and noise. The essence of the signal can often be extracted from the TFD by evaluating strategic slices through the TFD for a series of frequencies. However, TFDs are often computationally intensive compared to other methods. This paper demonstrates that quadratic filters may be designed to capture the same information as is available in specific slices through the TFD, at considerably lower computational cost. The outputs of these filters can be combined to provide a robust impulse-like response to the chosen signal. This is particularly useful when the exact time-series representation of the signal is unknown due to variations, background clutter, and noise. It is also noted that Teager's method is closely related to TFDs and is an example of a quadratic filter. Results using an ideal matched filter and the TFD-motivated quadratic filter are compared to give insight into their relative responses.
Local spectral and spatial frequency moments of shallow-water sound propagation
Underwater sound propagation is inherently nonstationary, particularly in shallow water where the ocean surface and bottom act like waveguide boundaries, giving rise to structural (or geometric) dispersion. The spectrogram has been a principal means to study the nonstationarities and dispersion characteristics of shallow-water sound propagation. In this paper, we give the low-order conditional time-frequency moments of a wave propagating in a waveguide. Comparison of these results is made to spectrograms of explosive source sound propagation in the Yellow Sea.
Space-time-frequency moment densities
The frequency operator, Ω ≡ i ∂/∂t, is not necessarily Hermitian when acting on nonstationary signals. Central moment densities of Ω and its conjugate, the temporal operator Τ, are proportional to the real parts of local central moments derived from the Wigner distribution. A nonrelativistic space-time-frequency Wigner distribution forms the backdrop and motivation for the present investigation.
Characterization of near-field scattering using a multiple weighted summed beamformer
The Sensor-Angle Distribution (SAD) is a recently introduced tool representing the power arriving at each sensor as a function of angle (or spatial frequency). It can be used to characterize near-field scatter environments. The SAD, as originally introduced, under-sampled the spatial correlation of the received signal (measured at each sensor) causing the SAD to be aliased for common source location cases. In this paper we indicate how this may be overcome. Additional results are provided showing that the SAD may be implemented as a multiple weighted subarray beamformer.
Tuning time-frequency methods for the detection of metered HF speech
Douglas J. Nelson, Lawrence H. Smith
Speech is metered if the stresses occur at a nearly regular rate. Metered speech is common in poetry, and it can occur naturally in speech if the speaker is spelling a word or reciting words or numbers from a list. In radio communications, the CQ request, call sign, and other codes are frequently metered. In tactical communications and air traffic control, location, heading, and identification codes may be metered. Moreover, metering may be expected to survive even in HF communications, which are corrupted by noise, interference, and mistuning. For this environment, speech recognition and conventional machine-based methods are not effective. We describe time-frequency methods which have been adapted successfully to the problems of mitigating HF signal conditions and detecting metered speech. These methods are based on modeled time and frequency correlation properties of nearly harmonic functions. We derive these properties and demonstrate a performance gain over conventional correlation and spectral methods. Finally, in addressing HF single-sideband (SSB) communications, the problems of carrier mistuning, interfering signals such as manual Morse, and fast automatic gain control (AGC) must be addressed. We demonstrate simple methods which may be used to blindly mitigate mistuning and narrowband interference, and to effectively invert the fast automatic gain function.
Wavelet-LMS algorithm-based echo cancellers
Lalith Kumar Seetharaman, Sathyanarayana S. Rao
This paper presents echo cancellers based on a wavelet-LMS algorithm. The performance of the least mean square (LMS) algorithm in the wavelet transform domain is observed, and its application to echo cancellation is analyzed. The Widrow-Hoff LMS algorithm is the most widely used algorithm for the adaptive filters that serve as echo cancellers. Present-day communication signals are largely non-stationary in nature, and errors crop up when the LMS algorithm is used in echo cancellers handling such signals. The analysis of non-stationary signals often involves a compromise between how well transitions or discontinuities can be located in time and how finely frequency content can be resolved. The multi-scale, or multi-resolution, view of signal analysis, which is the essence of the wavelet transform, makes wavelets popular in non-stationary signal analysis. In this paper, we present a wavelet-LMS algorithm wherein the wavelet coefficients of a signal are modified adaptively using the LMS algorithm and then reconstructed to give an echo-free signal. The echo canceller based on this algorithm is found to have better convergence and a comparatively lower mean square error (MSE).
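A minimal sketch of the idea, under our own assumptions (a single-level orthonormal Haar transform, 4 taps, step size 0.05; the paper's filter bank and parameters are not specified here):

```python
import numpy as np

def haar_matrix(n):
    # Orthonormal single-level Haar analysis matrix (n even) --
    # an illustrative stand-in for the paper's wavelet decomposition.
    T = np.zeros((n, n))
    s = 1.0 / np.sqrt(2.0)
    for i in range(n // 2):
        T[i, 2 * i], T[i, 2 * i + 1] = s, s                      # averages
        T[n // 2 + i, 2 * i], T[n // 2 + i, 2 * i + 1] = s, -s   # details
    return T

def wavelet_lms(x, d, n_taps=4, mu=0.05):
    # Adapt the weights on Haar-transformed tap vectors and return the
    # residual, i.e. the echo-cancelled signal.
    T = haar_matrix(n_taps)
    w = np.zeros(n_taps)
    e = np.zeros(len(x))
    for k in range(n_taps - 1, len(x)):
        u = T @ x[k - n_taps + 1:k + 1][::-1]  # transformed tap vector
        e[k] = d[k] - w @ u                    # error = desired - estimate
        w += 2.0 * mu * e[k] * u               # Widrow-Hoff LMS update
    return e
```

Because the Haar matrix is orthonormal, the adapted weights converge to the transform of the echo path, and the residual decays toward zero for a noiseless echo.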
Time-Frequency and Time-Scale Analysis II
Time-frequency domain reflectometry for smart wiring systems
Yong-June Shin, Eun-Seok Song, Joo-Won Kim, et al.
In this paper, a new high-resolution reflectometry scheme, time-frequency domain reflectometry, is proposed to detect and locate faults in wiring. Traditional reflectometry methods operate in either the time domain or the frequency domain only. Time-frequency domain reflectometry, however, utilizes both the time and frequency information of a transient signal to detect and locate the fault. The approach described in this paper is characterized by time-frequency reference-signal design and post-processing of the reference and reflected signals to detect and locate the fault. Time-frequency domain reflectometry has been demonstrated using a computational electromagnetic model of a coaxial cable with a fault. Knowledge of the time- and frequency-localized information for the reference and reflected signals, gained via time-frequency analysis, allows one to detect the fault and estimate its location accurately.
Modeling the diurnal variation of ionospheric layers via Thom canonical potentials: time-frequency signatures
The A3 Thom canonical polynomial is used in a Newtonian gradient system to model the diurnal variation of ionospheric E and F layers. Time-frequency signatures of the plasma frequency variation over a scaled day are modeled.
Segmented chirp features and hidden Gauss-Markov models for classification of wandering-tone signals
A new feature set and decision function are proposed for classifying transient wandering-tone signals. Signals are partitioned in time and modeled as having piecewise-linear instantaneous frequency and piecewise-constant amplitude. The initial frequency, chirp rate, and amplitude are estimated in each segment. The resulting sequences of estimates are used as features for classification. The decision function employs a linear Gaussian dynamical model, or hidden Gauss-Markov model (HGMM). The parameters that characterize the HGMM for each class are estimated from labeled training sequences, and the trained models are used to evaluate the class-conditional likelihoods of an unlabeled signal. The signal is assigned to the class whose model gives the maximum conditional likelihood. Simulation experiments demonstrate perfect classification performance in a three-class forced-choice problem.
Real-Time Implementations
Efficient implementation of a projection-based wavefront sensor
In this paper, a new wavefront sensor design that utilizes the benefits of image projections is described and analyzed. The projection-based wavefront sensor is similar to a Shack-Hartmann wavefront sensor but uses a correlation algorithm, as opposed to a centroiding algorithm, to estimate optical tilt. This allows the projection-based wavefront sensor to estimate optical tilt parameters while guiding off point sources and extended objects at very low signal-to-noise ratios. The implementation of the projection-based wavefront sensor is described in detail, showing the important signal processing steps on and off the sensor's focal plane array. In this paper the design is tested in simulation for speed and accuracy by processing simulated astronomical data. These simulations demonstrate the accuracy of the projection-based wavefront sensor and its superior performance relative to the traditional Shack-Hartmann wavefront sensor. A timing analysis is presented which shows how the collection and processing of image projections is computationally efficient and lends itself to a wavefront sensor design that can produce adaptive optical control signals at rates of up to 500 Hz.
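The core idea above, estimating tilt as a shift between 1-D image projections via correlation rather than centroiding, can be sketched as follows (our own toy version; the function name and the circular-correlation choice are illustrative, and a real sensor would interpolate for sub-pixel tilt):

```python
import numpy as np

def projection_shift(ref, img):
    # Collapse each frame to a 1-D column-sum projection.
    p_ref = ref.sum(axis=0)
    p_img = img.sum(axis=0)
    # Circular cross-correlation via FFT; the peak lag is the x-tilt in pixels.
    xc = np.fft.ifft(np.fft.fft(p_img) * np.conj(np.fft.fft(p_ref))).real
    lag = int(np.argmax(xc))
    n = len(p_ref)
    return lag if lag <= n // 2 else lag - n   # map to a signed shift
```

Rolling a synthetic spot by a few pixels returns exactly that shift, and the same projections work for extended objects where a centroid would be poorly defined.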
Implementation of pattern recognition algorithm based on RBF neural network
Sophie Bouchoux, Vincent Brost, Fan Yang, et al.
In this paper, we present implementations of a pattern recognition algorithm which uses an RBF (Radial Basis Function) neural network. Our aim is to develop an efficient system that performs real-time face tracking and identity verification in natural video sequences. Hardware implementations have been realized on an embedded system developed in our laboratory. This system is based on a TMS320C6x DSP (Digital Signal Processor). Optimization of the implementations allows us to obtain a processing speed of 4.8 images (240x320 pixels) per second, with a correct face tracking and identity verification rate of 95%.
Face tracking and recognition: from algorithm to implementation
This paper describes a system capable of performing face detection and tracking in video sequences. In developing this system, we have used an RBF neural network to locate and categorize faces of different dimensions. The face tracker can be applied to a video communication system which allows users to move freely in front of the camera while communicating. The system works in several stages. First, we extract useful parameters by low-pass filtering to compress the data, and we compose our codebook vectors. Then, the RBF neural network performs face detection and tracking on a specific board.
Fixed-point arithmetic for mobile devices: a fingerprint verification case study
Yiu Sang Moon, Franklin T. Luk, Ho Ching Ho, et al.
Mobile devices use embedded processors with low computing capabilities to reduce power consumption. Since floating-point arithmetic units are power hungry, computationally intensive jobs must be accomplished with either digital signal processors or hardware co-processors. In this paper, we propose to perform fixed-point arithmetic on an integer hardware unit. We illustrate the advantages of our approach by implementing fingerprint verification on mobile devices.
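The fixed-point idea can be illustrated with a tiny Q16.16 sketch (the format and helper names are our invention; the paper's actual word lengths are not given here):

```python
FRAC_BITS = 16  # Q16.16: 16 integer bits, 16 fractional bits (illustrative)

def to_fixed(x):
    # Encode a real number as a scaled integer.
    return int(round(x * (1 << FRAC_BITS)))

def to_float(q):
    # Decode the scaled integer back to a real number.
    return q / (1 << FRAC_BITS)

def q_mul(a, b):
    # Integer multiply, then drop the extra fractional bits.
    return (a * b) >> FRAC_BITS

def q_div(a, b):
    # Pre-shift the numerator to preserve fractional precision.
    return (a << FRAC_BITS) // b
```

All four operations run on the integer ALU alone, which is why fixed-point arithmetic suits power-constrained embedded processors that lack a floating-point unit.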
Structured Matrices
Iterative refinement techniques for the spectral factorization of polynomials
A. Bacciardi, Luca Gemignani
In this paper we propose a superfast implementation of Wilson's method for the spectral factorization of Laurent polynomials, based on a preconditioned conjugate gradient algorithm. The new computational scheme follows by exploiting several recently established connections between the considered factorization problem and the solution of certain discrete-time Lyapunov matrix equations whose coefficients are in controllable canonical form. The results of many numerical experiments, including some involving polynomials of very high degree, are reported and discussed, showing that our preconditioning strategy is quite effective even when the iterative phase is started with only a rough approximation of the sought factor. Thus, our approach provides an efficient refinement procedure which is particularly well suited to be combined with linearly convergent factorization algorithms that suffer from very slow convergence due to roots close to the unit circle.
Orthogonal rational functions and diagonal-plus-semiseparable matrices
Marc Van Barel, Dario Fasino, Luca Gemignani, et al.
The space of all proper rational functions with prescribed real poles is considered. Given a set of points zi on the real line and the weights wi, we define the discrete inner product (formula in paper). In this paper we derive an efficient method to compute the coefficients of a recurrence relation generating a set of orthonormal rational basis functions with respect to the discrete inner product. We will show that these coefficients can be computed by solving an inverse eigenvalue problem for a diagonal-plus-semiseparable matrix.
Displacement properties of the product of two finite recursive matrices
Marilena Barnabei, Laura B. Montefusco
We study the displacement properties, with respect to a suitable displacement operator, of the product of two finite sections of recursive matrices, and we give an explicit evaluation of the displacement rank of such a product in the case when the second matrix is a finite Toeplitz or Hankel matrix.
Regularization Techniques
Human vision model for the objective evaluation of perceived image quality applied to MRI and image restoration
Kyle A. Salem, David L. Wilson
We are developing a method to objectively quantify image quality and applying it to the optimization of interventional magnetic resonance imaging (iMRI). In iMRI, images are used for live-time guidance of interventional procedures such as the minimally invasive treatment of cancer. Hence, not only does one desire high quality images, but they must also be acquired quickly. In iMRI, images are acquired in the Fourier domain, or k-space, and this allows many creative ways to image quickly such as keyhole imaging where k-space is preferentially subsampled, yielding suboptimal images at very high frame rates. Other techniques include spiral, radial, and the combined acquisition technique. We have built a perceptual difference model (PDM) that incorporates various components of the human visual system. The PDM was validated using subjective image quality ratings by naive observers and task-based measures defined by interventional radiologists. Using the PDM, we investigated the effects of various imaging parameters on image quality and quantified the degradation due to novel imaging techniques. Results have provided significant information about imaging time versus quality tradeoffs aiding the MR sequence engineer. The PDM has also been used to evaluate other applications such as Dixon fat suppressed MRI and image restoration. In image restoration, the PDM has been used to evaluate the Generalized Minimal Residual (GMRES) image restoration method and to examine the ability to appropriately determine a stopping condition for such iterative methods. The PDM has been shown to be an objective tool for measuring image quality and can be used to determine the optimal methodology for various imaging applications.
A hybrid GMRES and TV-norm-based method for image restoration
Total variation-penalized Tikhonov regularization is a popular method for the restoration of images that have been degraded by noise and blur. The method is particularly effective, when the desired noise- and blur-free image has edges between smooth surfaces. The method, however, is computationally expensive. We describe a hybrid regularization method that combines a few steps of the GMRES iterative method with total variation-penalized Tikhonov regularization on a space of small dimension. This hybrid method requires much less computational work than available methods for total variation-penalized Tikhonov regularization and can produce restorations of similar quality.
Image Processing
Restoration methods for astronomical images at mid-infrared wavelengths
Mario Bertero, Patrizia Boccacci, A. Custo, et al.
Ground-based astronomical observations at mid-infrared wavelengths (around 10-20 microns) face the problem of extracting the weak astronomical signal from the large background due to atmosphere and telescope emission. The solution is provided by a differential technique, known as chopping and nodding, which can be modeled as the application of a second-difference operator to the image that would be detectable in the absence of background. However, since the chopped and nodded images are distorted by large negative counterparts of the sources, a method for restoring the original non-negative image is required. In our previous work we proposed a viable iterative method which, in some cases, provides restored images affected by annoying artifacts related to the huge non-uniqueness of the restoration problem. Therefore, in this paper we present an alternative method which can be used when the source morphology or the data acquisition strategy allows one to reduce the degree of non-uniqueness of the solution. By means of numerical simulations, we show that the new method does not produce the artifacts of the previous one; the implications of this result are briefly discussed.
Application of multigrid techniques to image restoration problems
Raymond Hon-fu Chan, M. Donatelli, Stefano Serra-Capizzano, et al.
We briefly describe a multigrid strategy for unilevel and two-level linear systems whose coefficient matrix An belongs either to the Toeplitz class or to the cosine algebra of type III and such that An can be naturally associated, in the spectral sense, with a polynomial function f. The interest of the technique is due to its optimal cost of O(N) arithmetic operations, where N is the size of the algebraic problem. We remark that these structures arise in certain 2D image restoration problems or can be used as preconditioners for more complicated image restoration problems.
Matching of a 3D model into a 2D image using a hypothesize-and-test alignment method
This paper presents three novel matching algorithms, in which a hypothesis of a 3D object is matched into a 2D image. The three algorithms are compared with respect to speed and precision on several examples. A hypothesis consists of the object model and its six degrees of freedom. The hypothesis is projected into the image plane using a pinhole camera model. The model of the object in use is a feature-attributed 3D geometric model. It contains various local features and their rules of visibility. After the projection into the image plane, the local environment of each projected feature is searched for the best match value of the various features. There is a trade-off between the rigidity of the object and the best-match positions of the local features in the image. After the matching, a 2D-3D pose estimation is run to obtain an updated pose. Three novel algorithms for matching the local features under consideration of their geometric formation are described in this paper. The first algorithm combines the local features into a graph. The graph is viewed as a network of springs, where the spring forces constrain the object's rigidity. The quality of the local best matches is represented by additional forces introduced into the nodes of the graph. The second matching algorithm decouples the local features from each other, moving them independently. This imposes no constraints on the rigidity of the object and does not consider the feature quality. The third matching method takes the feature quality into account by using it within the pose estimation.
Large-scale optimization techniques for nonnegative image restorations
Marielba Rojas, Trond Steihaug
We describe an optimization method for large-scale nonnegative regularization. The method is an interior-point iteration that requires the solution of a large-scale and possibly ill-conditioned parameterized trust-region subproblem at each step. The method relies on recently developed techniques for the large-scale trust-region subproblem. We present preliminary numerical results on image restoration problems.
Wireless Communication
Adaptive coding for joint power control and beamforming over wireless networks
Han Zhu, Kuo Juey Ray Liu
Co-channel interference and the time-varying nature of the channel are two main impairments that degrade the performance of a wireless link. Power control, antenna beamforming, and adaptive coding are approaches for improving performance in wireless networks by appropriately allocating resources, such as energy, in the time and space domains. Traditional joint power control and beamforming with fixed coding finds it hard to guarantee each user's quality of service (QoS) during deep fades. In this work, we introduce adaptive coding for joint power control and beamforming, using the theory of "water filling" in the time domain. When the channels are bad, we use lower coding rates and lower source rates to guarantee the bit error rate (BER). When the channels are good, we use higher coding rates and higher source rates to increase the overall network throughput. We use rate-compatible punctured convolutional (RCPC) codes for adaptive coding, because the lower-rate RCPC codes are compatible with the higher-rate codes, so only one RCPC transceiver is needed for the different rates. At each time, the network throughput is held constant to maintain system performance. For each user, the time-averaged throughput is kept constant to ensure fairness. The simulation results show that our schemes reduce overall transmitted power by up to 90% and increase network throughput by about 40% at BERs of 10^-3 and 10^-6. We also introduce a sub-optimal algorithm with a complexity of only O(N^2 log N) and relatively good performance.
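For reference, the classic water-filling allocation invoked above can be sketched as follows (a generic textbook version over per-slot channel gains with unit noise power; the function and variable names are ours, not the authors'):

```python
import numpy as np

def water_filling(gains, total_power):
    # "Vessel floor" heights: inverse channel gains (unit noise power assumed).
    inv = 1.0 / np.asarray(gains, dtype=float)
    inv_sorted = np.sort(inv)
    # Find the water level mu over the largest feasible active set.
    for k in range(len(inv_sorted), 0, -1):
        mu = (total_power + inv_sorted[:k].sum()) / k
        if mu > inv_sorted[k - 1]:
            break
    # Pour power into channels whose floor lies below the water level.
    return np.maximum(mu - inv, 0.0)
```

Good channels (high gain, low floor) receive more power, and channels in a deep fade receive none, mirroring the "transmit more when the channel is good" policy described above.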
Channel allocation in OFDMA-based wireless ad-hoc networks
Gautam Kulkarni, Vijay Raghunathan, Mani B. Srivastava, et al.
Wireless ad hoc networks find great utility in situations where there is no wired base station infrastructure. For large networks wireless channel access needs to be performed in a distributed manner. Orthogonal Frequency Division Multiple Access (OFDMA) is an emerging multiple access technique that is used in several new networking technologies. In this paper we study the use of OFDMA for ad hoc networks. Each point-to-point link does not transmit over the entire band and uses a subset of the total number of available subcarriers. This enables flexible network resource management with the ability to vary the output link capacity depending on the traffic load. Specifically, we address the problem of subcarrier allocation for point-to-point links of ad hoc networks. We present a distributed algorithm for subcarrier allocation and compare its performance to a centralized graph-theoretic heuristic. The simulation results also show that our protocol is robust and scalable.
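A centralized graph-theoretic baseline of the kind the authors compare against can be sketched as greedy coloring of the link-conflict graph (a standard heuristic; the data structures here are our own, not the paper's protocol):

```python
def greedy_allocation(links, conflicts, n_channels):
    # conflicts: set of frozensets {u, v} of links that interfere.
    assign = {}
    for link in links:
        # Channels already taken by conflicting, already-assigned links.
        used = {assign[other] for other in assign
                if frozenset((link, other)) in conflicts}
        free = [c for c in range(n_channels) if c not in used]
        assign[link] = free[0] if free else None  # None: link is blocked
    return assign
```

In the OFDMA setting a "channel" would be a block of subcarriers, and the paper's distributed protocol approximates this assignment through local message exchange instead of global knowledge.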
Performance analysis of space-time block-coded OFDM over correlated Nakagami fading channels
Zhengjiu Kang, Kung Yao, Flavio Lorenzelli
In this paper, we analytically evaluate the bit error rate (BER) performance of space-time coded orthogonal frequency division multiplex (OFDM) transmit diversity over correlated Nakagami-m fading channel. Coherent and incoherent detection of binary frequency shift-keying (FSK) and phase-shift keying (PSK) signals are considered. Numerical results of the BER corresponding to different fading parameters and correlation coefficients are demonstrated.
Computationally efficient ASIC implementation of space-time block decoding
Enver Cavus, Babak Daneshrad
In this paper, we describe a computationally efficient ASIC design that leads to a highly power- and area-efficient implementation of a space-time block decoder, compared to a direct implementation of the original algorithm. Our study analyzes alternative methods of evaluating, as well as implementing, the previously reported maximum-likelihood algorithms (Tarokh et al. 1998) for a more favorable hardware design. In our previous study (Cavus et al. 2001), after defining some intermediate variables at the algorithm level, highly computationally efficient decoding approaches, namely the sign and double-sign methods, were developed, and their effectiveness was illustrated for 2x2, 8x3, and 8x4 systems using BPSK, QPSK, 8-PSK, or 16-QAM modulation. In this work, alternative architectures for the decoder implementation are investigated, and a low-complexity implementation is proposed. The techniques applied at the algorithm and architecture levels lead to a substantial simplification of the hardware and significantly reduced power consumption. The proposed architecture is being fabricated in a TSMC 0.18 μm process.
Quasi-orthogonal space-time block codes with full diversity
Weifeng Su, Xiang-Gen Xia
Space-time block codes from orthogonal designs, proposed by Alamouti and by Tarokh-Jafarkhani-Calderbank, have attracted much attention lately due to their fast maximum-likelihood (ML) decoding and full diversity. However, the maximum symbol transmission rate of a space-time block code from complex orthogonal designs for complex constellations is only 3/4 for three and four transmit antennas. Recently, Jafarkhani and Tirkkonen-Boariu-Hottinen proposed space-time block codes from quasi-orthogonal designs, in which the orthogonality is relaxed to provide higher symbol transmission rates. With the quasi-orthogonal structure, these codes still have a fast ML decoding algorithm, but they do not have full diversity. The performance of these codes is better than that of the codes from orthogonal designs at low SNR, but worse at high SNR, because the slope of the BER-SNR curve depends on the diversity. In this paper, we design quasi-orthogonal space-time block codes with full diversity by properly choosing the signal constellations. In particular, we propose that half of the symbols in a quasi-orthogonal design be drawn from a signal constellation A and the other half be optimal selections from the rotated constellation e^{jφ}A. The optimal rotation angles φ are obtained for some commonly used signal constellations. The resulting codes have both full diversity and fast ML decoding. Simulation results show that the proposed codes outperform the codes from orthogonal designs at both low and high SNRs.
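The effect of the constellation rotation can be checked numerically. The sketch below (our own illustration, using a simplified pairwise difference criterion rather than the paper's exact determinant condition) shows that for QPSK the minimum difference term vanishes at rotation φ = 0, signalling lost diversity, but stays bounded away from zero at φ = π/4:

```python
import numpy as np
from itertools import product

QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def min_diversity_term(phi, const=QPSK):
    # Nonzero symbol differences of the constellation.
    diffs = [a - b for a, b in product(const, repeat=2) if a != b]
    rot = np.exp(1j * phi)
    # Smallest |da - e^{j phi} db| over nonzero difference pairs; a value of
    # zero indicates a rank-deficient codeword difference, i.e. no diversity.
    return min(abs(da - rot * db) for da, db in product(diffs, repeat=2))
```

This is the mechanism behind the design: the rotation separates the difference sets of A and e^{jφ}A so that no nonzero pair can cancel.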
Transmit diversity for TDD systems
Multiple input multiple output (MIMO) communications have been a hot research area in recent years. Most literature makes the assumption that the channel information is not known at the transmitter but known perfectly at the receiver. We focus on the situation where both the transmitter and the receiver know the channel information. We consider a transmit diversity scheme that maximizes the signal to noise ratio at the receiver. We analyze its performance in terms of capacity, duality and asymptotic behavior. By simulation, we compare this scheme with Alamouti's transmit diversity to show the advantage of utilizing the channel side information to improve the performance of the wireless systems.
Array Processing
Cramer-Rao bound analysis of wideband source localization and DOA estimation
Lean Yip, Joe C. Chen, Ralph E. Hudson, et al.
In this paper, we derive the Cramér-Rao Bound (CRB) for wideband source localization and DOA estimation. The resulting CRB formula can be decomposed into two terms: one that depends on the signal characteristics and one that depends on the array geometry. For a uniformly spaced circular array (UCA), a concise analytical form of the CRB can be given by using some algebraic approximations. We further define a DOA beamwidth based on the resulting CRB formula. The DOA beamwidth can be used to design the angular sampling spacing for the maximum-likelihood (ML) algorithm. For a randomly distributed array, we use an elliptical model to determine the largest and smallest effective beamwidths. The effective beamwidth and the CRB analysis of source localization allow us to design an efficient algorithm for the ML estimator. Finally, our simulation results for the Approximated Maximum Likelihood (AML) algorithm are shown to match the CRB analysis well at high SNR.
Resolution improvement techniques for microwave imaging in random media using small wideband adaptive arrays
Mark Curry, Yasuo Kuga
In this work we review and extend an approach to radar imaging using small, wideband, adaptive arrays, employing a ray-tracing algorithm to simulate non-homogeneous environments. This allows rapid investigation of radar imaging in the more realistic environments that might be encountered in practice. We have previously shown that such arrays can be effective for short-range backscatter imaging and source localization in free space, using experimental arrays of four elements at 1 GHz with 20% bandwidth as well as twelve elements operating from 2-3 GHz. These arrays were constructed to test the proposed algorithms and have demonstrated good results. We review the spatial resampling technique for array focusing and discuss Approximate Signal Subspace Projection (ASSP) for clutter suppression; this technique allows more control over the angular resolution and the background clutter level. We also review the ray-tracing algorithm required to generate data for imaging. Computer simulations demonstrate the use of adaptive array imaging in several non-homogeneous environments. The motivations for this work are indoor personnel localization systems, automotive radar, and radar-imaging seekers.
High-Performance Arithmetic for Real-Time Applications I
Packed arithmetic on a prefix adder (PAPA)
This paper describes a new method for performing packed arithmetic on a prefix adder that enables sub-wordlength additions and subtractions to be performed in parallel on any prefix adder topology. A major benefit of the proposed technique is that the critical path length of the prefix carry tree is unaltered when measured as the number of complex CMOS logic gates. Moreover, there is no restriction on the prefix tree's cell topology and the adder is also capable of performing packed absolute difference and packed rounded average calculations.
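The core trick of packed (SWAR-style) addition — keeping sub-word lanes from leaking carries into each other — can be modeled in software. The sketch below is an illustrative behavioral model of the idea, not the paper's prefix-adder circuit: the lane MSBs are masked out before the add (so no carry can cross a lane boundary) and then restored with XOR.

```python
def pack(lanes, width=8):
    """Pack a list of lane values (LSB lane first) into one integer word."""
    word = 0
    for i, v in enumerate(lanes):
        word |= (v & ((1 << width) - 1)) << (i * width)
    return word

def unpack(word, n_lanes=4, width=8):
    return [(word >> (i * width)) & ((1 << width) - 1) for i in range(n_lanes)]

def packed_add(a, b, n_lanes=4, width=8):
    """Add corresponding lanes of a and b with no carry crossing lane boundaries."""
    high = sum(1 << (i * width + width - 1) for i in range(n_lanes))  # MSB of each lane
    low = ~high & ((1 << (n_lanes * width)) - 1)
    s = (a & low) + (b & low)      # low bits sum within each lane; cannot carry out
    return s ^ ((a ^ b) & high)    # fold the lane MSBs back in without propagation
```

Each lane wraps modulo 2^width independently, exactly the behavior of hardware packed addition.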
Constant-delay MSB-first bit-serial adder
Chang Yong Kang, Earl E. Swartzlander Jr.
A new MSB-first bit-serial adder/subtracter architecture is proposed. The architecture utilizes a modified Manchester carry chain to accommodate carries from future LSBs. The carry chain is shown to have a constant delay of two AND gates and one XOR gate regardless of the operand width, which allows a fast, constant operational clock frequency. Compared to the conventional parallel-addition approach, where the operand bits are stored and then added in parallel, the proposed architecture also provides a significant area saving. It is also shown that the proposed architecture can be generalized to radix-r operands.
Number representation optimization for low-power multiplier design
Multipliers using different number representation systems have different power/area/delay characteristics. This paper studies the effects of number representation on power consumption and proposes optimization techniques for two's-complement multipliers. By examining existing radix-4 recoding schemes, two power-improved designs are proposed for standard-cell CMOS technology. With the new recoding schemes, the power efficiency of radix-4 multipliers versus radix-2 multipliers is re-investigated. To exploit the power efficiency of sign-magnitude representation, number representation conversion schemes are proposed. For a typical data set from the djpeg application, the conversion schemes consume less than 30% of the power of the baseline schemes.
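For reference, radix-4 (modified Booth) recoding maps the multiplier into digits in {-2, -1, 0, 1, 2}, halving the number of partial products relative to radix-2. A minimal functional sketch (names are ours; the paper's recoding schemes are hardware-level refinements of this idea):

```python
def booth4(y, bits):
    """Radix-4 modified Booth recoding of a two's-complement bit pattern y (bits even):
    digit_i = y[2i-1] + y[2i] - 2*y[2i+1], with y[-1] = 0."""
    assert bits % 2 == 0
    bit = lambda k: 0 if k < 0 else (y >> k) & 1
    return [bit(2 * i - 1) + bit(2 * i) - 2 * bit(2 * i + 1) for i in range(bits // 2)]

def booth4_multiply(x, y, bits=8):
    """x * y reconstructed from the recoded digits: sum(d_i * 4**i * x)."""
    digits = booth4(y % (1 << bits), bits)  # mask y into its two's-complement pattern
    return sum(d * (4 ** i) * x for i, d in enumerate(digits))
```

Each digit selects 0, ±x, or ±2x (a shift and optional negation), which is why the recoding is cheap to implement and why its switching activity is a natural target for power optimization.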
Design tradeoffs using truncated multipliers in FIR filter implementations
This paper presents a general FIR filter architecture utilizing truncated tree multipliers for computation. The average error, maximum error, and variance of error due to truncation are derived for the proposed architecture. A novel technique that reduces the average error of the filter is presented, along with equations for computing the signal-to-noise ratio of the truncation error. A software tool written in Java is described that automatically generates structural VHDL models for specific filters based on this architecture, given parameters such as the number of taps, operand lengths, number of multipliers, and number of truncated columns. We show that a 22.5% reduction in area can be achieved for a 24-tap filter with 16-bit operands, 4 parallel multipliers, and 12 truncated columns. For this implementation, the average reduction error is only 9.18 × 10⁻⁵ ulps, and the reduction error SNR is only 2.4 dB less than the roundoff SNR of an equivalent filter without truncation.
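The effect of truncating partial-product columns can be modeled directly: each partial product loses its bits in the k least-significant columns, so the result underestimates the exact product by a bounded amount. The toy model below shows only the truncation itself; the paper's architecture additionally applies error-compensation techniques not sketched here:

```python
def truncated_multiply(a, b, n=16, k=12):
    """n x n unsigned multiply with the k least-significant partial-product
    columns discarded before summation (no compensation applied)."""
    drop = ~((1 << k) - 1)   # mask clearing the k truncated columns
    total = 0
    for i in range(n):
        if (b >> i) & 1:
            total += (a << i) & drop   # bits falling in truncated columns are lost
    return total
```

Since each of the n partial products loses strictly less than 2^k, the total error is bounded by n · 2^k, which is the kind of bound the paper's average/maximum error analysis refines.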
LNS for low-power MPEG decoding
Floating-point and fixed-point arithmetic are expensive for portable multimedia devices. Low-cost Logarithmic Number System (LNS) arithmetic can reduce the power consumption of MPEG decoding in exchange for barely perceptible video artifacts. Different number representations need different word sizes to produce the same quality image; LNS can produce good visual results using fewer bits than fixed point. Round-to-nearest is commonly used with fixed point and floating point, but LNS allows a cheaper unrestricted faithful-rounding mode that does not degrade the visual quality of MPEG outputs. This paper also describes how the Berkeley MPEG tools were modified to carry out these MPEG arithmetic experiments.
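In LNS, a value is stored as a fixed-point base-2 logarithm, so multiplication becomes plain addition, while addition needs a logarithmic correction term that hardware usually tabulates in a small ROM. A floating-point sketch of the arithmetic, with an illustrative 12-bit log fraction (parameters and names are ours, not the paper's):

```python
import math

FRAC = 12  # fractional bits of the fixed-point logarithm (illustrative)

def to_lns(x):
    """Encode positive x as a fixed-point base-2 logarithm (sign handled separately)."""
    return round(math.log2(x) * (1 << FRAC))

def from_lns(e):
    return 2.0 ** (e / (1 << FRAC))

def lns_mul(ea, eb):
    """Multiplication is exact in LNS: just add the log-domain exponents."""
    return ea + eb

def lns_add(ea, eb):
    """Addition needs the Gaussian-log correction log2(1 + 2^-d); in hardware this
    comes from a lookup table indexed by the exponent difference d."""
    hi, lo = max(ea, eb), min(ea, eb)
    d = (hi - lo) / (1 << FRAC)
    corr = math.log2(1.0 + 2.0 ** (-d))
    return hi + round(corr * (1 << FRAC))
```

The word-size trade-off discussed in the abstract corresponds to choosing FRAC: fewer fractional bits shrink the correction ROM at the cost of larger representation error.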
New one-hot RNS structures for high-speed signal processing
Richard Conway, Thomas Conway, John Nelson
New efficient structures using the one-hot residue number system (OHRNS) are presented. Normally the RNS uses a binary representation for the residues, though recently there has been renewed interest in the OHRNS, which uses a simple but novel representation for the residues. The basic component of the OHRNS is the barrel shifter, making the OHRNS suitable for very high-speed applications. The first of the new structures reduces the power dissipation in OHRNS adder trees. A modification to the normal barrel shifter is proposed, which reduces the power dissipated by as much as 30%. This improvement is obtained through the use of the modified barrel shifter and the appropriate connection of active-low and active-high stages. This power reduction offers the possibility of using the OHRNS in place of a typical full-adder-based tree in high-speed DSP applications. A new storage register for one-hot representations is detailed, which overcomes the problem of having to use a large number of registers. A new architecture is also presented for fast OHRNS sign detection; sign detection is complex and slow to perform in the RNS, and a mixed radix conversion (MRC) is typically used for it in the OHRNS. The new sign detection architecture is based on a new property of the Chinese Remainder Theorem (CRT) and is significantly faster than the MRC approach for large moduli sets. Simulation results using SPICE are detailed for the new structures.
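The reason the barrel shifter is the basic OHRNS component is that mod-m addition of a one-hot residue reduces to a circular shift: adding y just rotates the one-hot vector by y positions. A bit-level sketch, assuming the standard one-hot encoding (helper names are ours):

```python
def one_hot(r, m):
    """One-hot encoding of residue r modulo m: a length-m bit vector with bit r set."""
    return 1 << (r % m)

def ohrns_add(x_hot, y, m):
    """Mod-m addition as a circular left shift of the one-hot vector by y positions,
    which is exactly what a barrel shifter implements in one pass."""
    shifted = x_hot << (y % m)
    return (shifted | (shifted >> m)) & ((1 << m) - 1)
```

Because the operation is a pure rotation, its delay is independent of the residue values, which is what makes OHRNS attractive for very high-speed adder trees.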
Application of symmetric redundant residues for fast and reliable arithmetic
Despite difficulties in general division, magnitude comparison, and sign detection, residue number system arithmetic has been used for many special-purpose systems in light of its parallelism and modularity for the most common arithmetic operations of addition/subtraction and multiplication. Computation in RNS requires modular reduction, both for the initial conversion from binary to RNS and after each operation to bring the result back to within a valid residue range. Use of redundant residues simplifies this critical operation, leading to even faster arithmetic. One type of redundant mod-m residue, which keeps the representational redundancy to the minimum of 1 bit per residue, has the nearly symmetric range (-m, m) and allows two values for each pseudoresidue: ⟨x⟩_m or ⟨x⟩_m − m. We study the extent of simplification and speed-up in the modular reduction process afforded by such redundant residues and discuss its potential implications for the design of RNS arithmetic circuits. In particular, we show that besides cost and performance benefits, the introduction of error checking and fault tolerance in arithmetic computations is facilitated when such redundant residues are used.
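The appeal of such redundant residues is that bringing a sum back into the valid range needs at most one correction by ±m, rather than a full comparison-driven reduction. A sketch of the arithmetic, assuming pseudoresidues kept in the nearly symmetric range (-m, m):

```python
def redundant_reduce(v, m):
    """Bring v back into the nearly symmetric range (-m, m).
    For v in (-2m, 2m) a single +-m correction always suffices."""
    if v >= m:
        v -= m
    elif v <= -m:
        v += m
    return v

def redundant_add(a, b, m):
    """Add two pseudoresidues from (-m, m); the result is again in (-m, m)
    and congruent to a + b modulo m."""
    return redundant_reduce(a + b, m)
```

Both 4 and 4 − 13 = −9 are valid pseudoresidues of 17 mod 13; either may appear, but every operation keeps the result congruent and in range, which is the 1-bit redundancy the abstract describes.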
High-Performance Arithmetic for Real-Time Applications II
Parity-preserving transformations in computer arithmetic
Parity checking is a low-redundancy method for the design of reliable digital systems. While quite effective for detecting single-bit transmission or storage errors, parity encoding has not been widely used for checking the correctness of arithmetic results, because parity is not preserved during arithmetic operations and parity prediction requires fairly complex circuits in most cases. We propose a general strategy for designing parity-checked arithmetic circuits that takes advantage of redundant intermediate representations. Because redundancy is often used for high performance anyway, the incremental cost of our proposed method is quite small. Unlike conventional binary numbers, redundant representations can be encoded and manipulated in such a way that parity is preserved in each step. Additionally, the lack of carry propagation ensures that the effect of a fault is localized rather than catastrophic. After establishing the framework for our parity-preserving transformations in computer arithmetic, we illustrate some applications of the proposed strategy to the design of parity-checked adder/subtractors, multipliers, and other arithmetic structures used in signal processing.
Digital filtering using the multidimensional logarithmic number system
Vassil S. Dimitrov, Graham A. Jullien, Konrad Walus
We introduce the multidimensional logarithmic number system (MDLNS) as a generalization of the classical one-dimensional logarithmic number system (LNS) and analyze its use in DSP applications. The major drawback of the LNS is the requirement for very large ROM arrays to implement addition and subtraction, which limits its use to low-precision applications. MDLNS allows an exponential reduction in the size of the ROMs used without affecting the speed of the computational process; moreover, the calculations over different bases and digits are completely independent, which makes this representation perfectly suitable for massively parallel DSP architectures. The use of more than one base has at least two additional advantages. First, the proposed architecture yields the final result directly in binary form, so there is no need for the exponential amplifier used in known LNS architectures. Second, the second base can be optimized in accordance with the characteristics of the specific digital filter. This leads to a dramatic reduction in the size of the exponents used and, consequently, to large area savings. We offer many examples showing the computational advantages of the proposed approach.
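The flavor of an MDLNS representation can be shown with a brute-force two-base encoder: a value is approximated as 2^a · D^b, and multiplication adds the (a, b) exponent pairs independently. This toy uses D = 3 and small exponent ranges purely for illustration; the paper optimizes the second base per filter, and real implementations use tables rather than search:

```python
import math

def mdlns_encode(x, D=3.0, a_range=range(-8, 9), b_range=range(-4, 5)):
    """Brute-force best single-term MDLNS fit: x ~= 2**a * D**b (two bases, one digit).
    Exhaustive search over small exponent ranges, for illustration only."""
    target = math.log2(x)
    best = None
    for a in a_range:
        for b in b_range:
            err = abs(target - (a + b * math.log2(D)))
            if best is None or err < best[0]:
                best = (err, a, b)
    return best[1], best[2]

def mdlns_mul(ab1, ab2):
    """Multiplication in MDLNS: the exponents over each base add independently."""
    return (ab1[0] + ab2[0], ab1[1] + ab2[1])
```

The per-base independence in mdlns_mul is what the abstract exploits for massively parallel architectures: each base's exponent channel is a separate, narrow adder.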
Low-power array multiplier design by topology optimization
Left-to-right (L-R) linear array multiplication provides an interesting alternative to conventional right-to-left (R-L) array multiplication, as L-R computation has the potential to save power and delay. This paper presents topology optimization techniques for low-power L-R array multipliers. These techniques include interconnect reorganization, addition modules other than 3-to-2 carry-save adders for partial product (PP) reduction, and split-array architectures. Our experiments indicate that interconnect reorganization can be a primary choice for L-R array multipliers when power is the critical concern: L-R schemes with optimized interconnect achieve the lowest power consumption in most cases with relatively small delay. When a small power-delay product is the main goal, the more complex split-array architectures are good candidates.
Design alternatives for barrel shifters
Barrel shifters are often utilized by embedded digital signal processors and general-purpose processors to manipulate data. This paper examines design alternatives for barrel shifters that perform the following functions: shift right logical, shift right arithmetic, rotate right, shift left logical, shift left arithmetic, and rotate left. Four different barrel shifter designs are presented and compared in terms of area and delay for a variety of operand sizes. This paper also examines techniques for detecting results that overflow and results of zero in parallel with the shift or rotate operation. Several Java programs are developed to generate structural VHDL models for each of the barrel shifters. Synthesis results show that data-reversal barrel shifters have less area and mask-based data-reversal barrel shifters have less delay than other designs. Mask-based data-reversal barrel shifters are especially attractive when overflow and zero detection is also required, since the detection is performed in parallel with the shift or rotate operation.
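The data-reversal technique mentioned above reuses a single right shifter/rotator for left operations: reverse the operand, shift right, and reverse back. The behavioral sketch below shows a log-stage rotator with mask-based right shifting, using an 8-bit word for brevity; it models the dataflow only, not the paper's gate-level designs:

```python
def reverse_bits(x, n=8):
    """Bit-reverse an n-bit word (the 'data-reversal' stage)."""
    out = 0
    for i in range(n):
        out = (out << 1) | ((x >> i) & 1)
    return out

def rotate_right(x, amt, n=8):
    """Log-stage barrel rotator: one 2:1 mux stage per bit of the shift amount."""
    amt %= n
    mask = (1 << n) - 1
    for stage in range(n.bit_length() - 1):   # log2(n) stages for power-of-two n
        if (amt >> stage) & 1:
            k = 1 << stage
            x = ((x >> k) | (x << (n - k))) & mask
    return x

def shift_right_logical(x, amt, n=8):
    """Mask-based shifting: rotate, then mask off the bits that wrapped around."""
    return rotate_right(x, amt, n) & ((1 << (n - amt)) - 1)

def shift_left_logical(x, amt, n=8):
    """Data-reversal trick: a left shift is a right shift on the reversed operand."""
    return reverse_bits(shift_right_logical(reverse_bits(x, n), amt, n), n)
```

The masking step is also where parallel zero detection fits naturally: the wrapped-around bits are already isolated, so they can be checked while the shift completes.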
Parametric time delay modeling for floating point units
A parametric time delay model to compare floating point unit implementations is proposed. This model is used to compare a previously proposed floating point adder using a redundant number representation with other high-performance implementations. The operand width, the fan-in of the logic gates and the radix of the redundant format are used as parameters to the model. The comparison is done over a range of operand widths, fan-in and radices to show the merits of each implementation.
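A parametric delay model of this kind can be as simple as counting logic levels as a function of operand width and gate fan-in. The toy model below is our own illustration of the approach, not the paper's calibrated model:

```python
def prefix_adder_delay(n_bits, fan_in=2, t_gate=1.0):
    """Toy parametric model: delay = (prefix-tree levels + pre/post stages) * gate delay.
    Levels = ceil(log_fanin(n_bits)), computed without floating point; the constant
    '+2' stands in for the generate/propagate and final-sum stages."""
    levels, span = 0, 1
    while span < n_bits:
        span *= fan_in
        levels += 1
    return (levels + 2) * t_gate
```

Sweeping n_bits, fan_in, and t_gate (and, in the paper, the redundant-format radix) then lets implementations be compared on equal footing across operand widths.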
Design of a hybrid prefix adder for nonuniform input arrival times
Youngmoon Choi, Earl E. Swartzlander Jr.
This paper examines the design of a hybrid prefix adder under the condition of non-uniform input signal arrival times, as encountered in the final adder of fast parallel multipliers that use column-compression reduction. The prefix-graph scheme efficiently accommodates the non-uniform arrival times. Rules are presented for designing hybrid prefix adders under such conditions; these rules produce adders that are faster and less complex than previous designs.