Proceedings Volume 1569

Stochastic and Neural Methods in Signal Processing, Image Processing, and Computer Vision

Su-Shing Chen
View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 1 October 1991
Contents: 11 Sessions, 46 Papers, 0 Presentations
Conference: San Diego '91, 1991
Volume Number: 1569

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Evidential Reasoning and Belief Networks
  • Simulated Annealing, Markov Random Fields, and Genetic Algorithms I
  • Neural Methods I
  • Statistical Methods I
  • Neural Methods II
  • Simulated Annealing, Markov Random Fields, and Genetic Algorithms II
  • Statistical Methods II
  • Filtering, Transform, and Reconstruction Techniques
  • Statistical Methods III
  • Neural Methods III
  • Color and Other Methods
Evidential Reasoning and Belief Networks
Considering multiple-surface hypotheses in a Bayesian hierarchy
Steven M. LaValle, Seth A. Hutchinson
This paper presents a probabilistic approach to segmentation that maintains a set of competing, plausible segmentation hypotheses. This is in contrast to previous approaches, in which probabilistic methods are used to converge to a single segmentation. The benefit of the approach is that belief values associated with segmentation hypotheses can be used to guide the recognition process, and the recognition process can, in turn, exert influence on the belief values associated with segmentation hypotheses in the network. In this way, segmentation and recognition can be coupled together to achieve a combination of expectation-driven segmentation and data-driven recognition. Algorithms were based on the formalism of Bayesian belief networks. By storing segmentation hypotheses in a tree structured network, the storage demands associated with maintaining the competing hypotheses can be limited. An implicit representation for segmentation hypotheses is introduced (without this implicit representation, the power set of region groupings would need to be enumerated). Likelihood measures are used both to control the expansion of the hypothesis tree and to evaluate belief in hypotheses. Local likelihood measures are used during an expansion phase, in which leaf nodes are refined into more specific hypotheses. Global likelihood measures are applied during an evaluation phase. The global likelihood measures are derived by fitting quadric surfaces to the range data. By using this expand and evaluate approach guided by a measure of entropy defined on the leaves of the tree, the application of costly numerical fitting algorithms can be limited to a small number of nodes in the tree.
Example of a Bayes network of relations among visual features
John Mark Agosta
Bayes probability networks, also termed `influence diagrams,' promise to be a versatile, rigorous, and expressive uncertainty reasoning tool. This paper presents an example of how a Bayes network can express constraints among visual hypotheses. An example is presented of a model composed of cylindric primitives, inferred from a line drawing of a plumbing fixture. Conflict between interpretations of candidate cylinders is expressed by two parameters, one for the presence and one for the absence of visual evidence of their intersection. It is shown how `partial exclusion' relations are so generated and how they determine the degree of competition among the set of hypotheses. Solving this network obtains the assemblies of cylinders most likely to form an object.
Recursive computation of a wire-frame representation of a scene from dynamic stereo using belief functions
Arun P. Tirumalai, Brian G. Schunck, Ramesh C. Jain
This paper presents a stereo algorithm to recursively compute a boundary-level structural description of a scene, from a sequence of stereo images. The majority of existing stereo algorithms deal with individual points as the basic primitive to match between two or more images. While this keeps the implementation simple, the output description, which is a depth/disparity map, is represented as a composition of individual points. This is often undesirable as no semblance of the underlying structure of the scene is explicitly represented. A stereo matching algorithm is presented, based on connected line segments as the basic match primitive, which yields a description composed primarily of boundaries of objects in the scene. A description of this nature is very useful for obstacle avoidance and path planning for mobile robots. The stereo matching algorithm is integrated into a dynamic stereo vision system to compute and incrementally refine such a structural description recursively, using belief functions. The stereo camera motion between two viewpoints, which is necessary to register the two views, is recovered as part of the stereo computations. The approach is illustrated with a real dynamic stereo sequence acquired from a mobile robot.
Hierarchical Dempster-Shafer evidential reasoning for image interpretation
Keith M. Andress
A hierarchical evidence accumulation scheme developed for use in blackboard systems is described. This scheme, based on the Dempster-Shafer formalism, uses a computationally efficient variation of Dempster's rule of combination enabling the system to deal with the overwhelming amount of information present in image data. This variation of Dempster's rule allows the reasoning process to be embedded into the abstraction hierarchy by allowing for the propagation of belief values between elements at different levels of abstraction. The evidence accumulation scheme described here was originally designed to be embedded in PSEIKI, a blackboard system for expectation-driven interpretation of image data. PSEIKI performs expectation-driven processing by matching image-elements, such as edges and regions, with model-elements from a supplied expected scene. PSEIKI builds abstraction hierarchies in image data using cues taken from the supplied abstractions in the expected scene. Hypothesized abstractions in the image data are geometrically compared with the known abstractions in the expected scene; the metrics used for these comparisons translate into belief values. The evidence accumulation system is described in detail and a few representative metrics also are presented.
Application of Dempster-Shafer theory to a novel control scheme for sensor fusion
The combination of imperfect evidence contributed by different sensors is a basic problem for sensor fusion in autonomous mobile robots. Current implementations of sensor fusion systems are restricted to fusing only certain classes of evidence because of the lack of a general framework for the combination of evidence. The authors' approach to this problem is to first develop a model of sensor fusion without committing to a particular theory of evidence, then to formulate a combination-of-evidence framework based on the requirements of the model. Their previous work has proposed such a model. This paper discusses the evidential demands of the model and one possible implementation using Dempster-Shafer theory. Three drawbacks of DS theory (computational intractability, weak assumptions of statistical independence, and counterintuitive averaging of strongly biased evidence) are eliminated by applying DS theory within the constraints of the model. An example based on simulated sensor data illustrates this application of Dempster-Shafer theory.
Simulated Annealing, Markov Random Fields, and Genetic Algorithms I
Bayesian signal reconstruction from Fourier transform magnitude and x-ray crystallography
A signal reconstruction problem motivated by x-ray crystallography is solved using a Bayesian statistical approach. A Markov random field is used to describe the a priori information concerning the 0 - 1 signal. The data are inaccurate measurements of the magnitudes of the Fourier coefficients of the signal. The solution exploits the parallel between Bayesian statistics and statistical mechanics and uses the spherical model and asymptotic small noise approximations.
Simulated annealing image reconstruction for an x-ray coded source tomograph
Mohsine El Alaoui, Isabelle E. Magnin, Michel Amiel
The reconstruction of a 3-D object from its 2-D coded radiograph is considered. The setup is a restricted-view-angle tomographic system. The physical acquisition conditions are simulated. A 3-D phantom is reconstructed using the simulated annealing (SA) algorithm. Three initial solutions are envisaged: a zero volume; a solution obtained by SA with T = 0; and an algebraic reconstruction provided by a specific tomosynthesis algorithm. The quality of the reconstructions provided by the initial solutions is enhanced by using the algorithm. The root mean square distance between the coded projection of the actual object and the coded projection of the reconstructed object is minimized. The advantage of starting from a given initial solution is to increase the convergence speed and thus to minimize computation time. A computer simulation is performed in which a 9-source-point distribution irradiates a 3-D object (64 X 64 X 64 voxels). The initial (first step) and final (second step) reconstructions of the objects are presented. The speed of convergence and the quality of the results are discussed.
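As a rough illustration of the general recipe described above (a sketch only, not the authors' coded-source tomograph), the following minimal simulated-annealing loop starts from a supplied initial solution, flips one voxel of a binary object at a time, and uses the Metropolis rule to minimize the RMS distance between a measured projection and the projection of the current estimate; the `project` operator, cooling constants, and binary-object assumption are all illustrative.

```python
import numpy as np

def project(volume):
    # Hypothetical stand-in for the coded-source projection operator.
    return volume.sum(axis=0)

def rms_cost(volume, measured):
    return np.sqrt(np.mean((project(volume) - measured) ** 2))

def anneal(measured, init, t0=1.0, alpha=0.95, sweeps=50, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    volume = init.copy()
    cost = rms_cost(volume, measured)
    temp = t0
    for _ in range(sweeps):
        for _ in range(volume.size):
            idx = tuple(rng.integers(0, s) for s in volume.shape)
            candidate = volume.copy()
            candidate[idx] = 1 - candidate[idx]              # flip one voxel of a binary object
            new_cost = rms_cost(candidate, measured)
            if new_cost < cost or rng.random() < np.exp((cost - new_cost) / temp):
                volume, cost = candidate, new_cost           # Metropolis acceptance rule
        temp *= alpha                                        # exponential cooling schedule
    return volume, cost
```

Starting `init` from an algebraic reconstruction rather than a zero volume, as the abstract suggests, simply amounts to passing a better initial estimate to `anneal`.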
General method for accelerating simulated annealing algorithms for Bayesian image restoration
A new stochastic technique is described for the Bayesian restoration of gray-level images corrupted by white noise. The proposed technique is related to simulated annealing but generates candidates more efficiently for gray-level images than either the Gibbs sampler or the Metropolis procedure. For a logarithmic cooling schedule, asymptotic convergence of the algorithm is proved by analyzing the corresponding inhomogeneous Markov chain. For an exponential cooling schedule, the new technique is shown experimentally to restore floating point images in 1/50 of the time required for the usual simulated annealing. Experimental restorations of gray-level images corrupted by white noise are presented.
Neural Methods I
Timbre discrimination of signals with identical pitch using neural networks
Samir I. Sayegh M.D., Carlos A. Pomalaza-Raez, E. Tepper, et al.
Pitch recognition and timbre discrimination for a string instrument are investigated using artificial neural networks. Pitch recognition, the easier task, is realized with a linear classifier, while timbre discrimination is achieved with a multiple layer perceptron using gradient back propagation learning.
Fast algorithm for a neocognitron neural network with back-propagation
Kent Pu Qing, Robert W. Means
The neocognitron is a neural network that consists of many layers of partially connected cells. A new neocognitron architecture called the multilayer neocognitron with backpropagation learning (MNEOBP) is proposed, and it is shown that the original neocognitron trained by backpropagation is a special case of what is proposed here. The MNEOBP has a number of advantages: (1) The algorithm for the MNEOBP is four times faster than the earlier algorithm, since the number of cells calculated is reduced by a factor of four. (2) During the learning process, the mask (kernel) size is changed; this can speed up the training time by almost a factor of three. (3) The MNEOBP architecture can be implemented with a new digital neural network VLSI chip set called the Vision Processor (ViP). The ViP exploits the convolution structure of the network and can process a single 32 X 32 input layer in only 25.6 microseconds with an 8 X 8 receptive field kernel.
Target identification by means of adaptive neural networks in thermal infrared images
Marc P. J. Acheroy, Wim Mees
A generic method for target recognition is presented. The stress is put on methods based on neural networks, and more specifically on adaptive resonance theory (ART) models. This type of artificial neural network (ANN) has the advantage of being unsupervised and adaptive: it is able to acquire and adapt its long-term memory, taking into account the evolution of the context. ART networks very quickly recognize classes that are already known, and they also learn new images very fast. Two versions of ART are investigated: ART1, which only works with binary data, and ART2, which works with analog data. In practice, ART1 seems to need larger images than ART2 to achieve the same efficiency, but it is obviously faster. A preprocessor has been developed whose output is invariant to translation, rotation, and scale changes of the input. The most important feature of this preprocessor is its ability to preserve visual interpretation, which is not the case for the more classical methods using Fourier-like and log/polar transforms.
Neural network approach for object orientation classification
Keith K. Yeung, Pierre Zakarauskas, Allan G. McCray
A neural network approach to determine the heading orientation of a known object in a noisy image is presented. The gray-scale image is first preprocessed by the following four procedures: 1) an edge map of the object is extracted using the Sobel edge operator; 2) the discrete 2-D Fourier transform is applied to the edge map to eliminate the translational variance; 3) the Fourier power coefficients are mapped into a polar coordinate system; 4) the amplitudes of the Fourier coefficients in each five-degree angular sector are summed to form a 1-D input vector to the neural network. A backpropagation neural network with one hidden layer was trained with a sequence of seven noise-free object outlines with headings ranging from 0 to 90 deg in 15 deg increments. After the training was complete, the network was tested with three noisy images taken from randomly selected object orientations. The network successfully classified the appropriate headings in each case. These results illustrate the robustness of this neural network design in performing heading classification from noisy images.
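The four preprocessing steps can be sketched compactly; the snippet below is a generic illustration under assumed array sizes and a 72-sector (five-degree) partition, not the authors' exact configuration.

```python
import numpy as np
from scipy import ndimage

def orientation_features(image, n_sectors=72):
    # 1) Sobel edge map of the object
    gx = ndimage.sobel(image.astype(float), axis=1)
    gy = ndimage.sobel(image.astype(float), axis=0)
    edges = np.hypot(gx, gy)
    # 2) 2-D FFT magnitude of the edge map (removes translational variance)
    power = np.abs(np.fft.fftshift(np.fft.fft2(edges)))
    # 3)-4) map to polar angles about the spectrum centre and sum over 5-degree sectors
    rows, cols = power.shape
    y, x = np.indices(power.shape)
    angles = np.degrees(np.arctan2(y - rows / 2, x - cols / 2)) % 360
    sector = (angles // (360 / n_sectors)).astype(int)
    features = np.bincount(sector.ravel(), weights=power.ravel(), minlength=n_sectors)
    return features / features.sum()        # normalised 1-D input vector for the network
```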
Shape discrimination using invariant Fourier representation and a neural network classifier
Hsien-Huang Peter Wu, Robert A. Schowengerdt
A neural network approach for classification of images represented by translation, scale, and rotation invariant features is presented. The invariant features are the Fourier descriptors (FDs) derived from the boundary (shape) of the object. The network is a multilayer perceptron (MLP) classifier with one hidden layer and back propagation training (MLP-BP). Performance of the MLP algorithm is compared to optimal curve matching (OCM) for the recognition of mechanical tools. The test data were 14 objects with eight images per object, each image having significant differences in scaling, translation, and rotation. Only 10 harmonics of the 1024 FD coefficients were used as the input vector. The neural network approach proved to be more stable and faster than the optimal curve matching algorithm in classifying the objects after the training phase. The simple calculations needed for the Fourier descriptors and the small number of coefficients needed to represent the boundary result in an efficient system, excluding training, which can be done off-line. Results are shown comparing the classification accuracy of the OCM method with the MLP-BP algorithm using different size training sets. The method can be extended to any patterns that can be discriminated by shape information.
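For readers unfamiliar with Fourier descriptors, the following is a small sketch of one common way to compute translation-, scale-, and rotation-invariant descriptors from a closed boundary, keeping only the first few harmonics; the normalization shown is a standard choice and not necessarily the one used in the paper.

```python
import numpy as np

def fourier_descriptors(boundary_xy, n_harmonics=10):
    # boundary_xy: (N, 2) array of boundary points, ordered along the contour
    z = boundary_xy[:, 0] + 1j * boundary_xy[:, 1]   # complex contour signal
    coeffs = np.fft.fft(z)
    coeffs[0] = 0.0                                  # drop DC term: translation invariance
    coeffs = coeffs / np.abs(coeffs[1])              # normalise by first harmonic: scale invariance
    mags = np.abs(coeffs)                            # magnitudes: rotation/start-point invariance
    return mags[1:n_harmonics + 1]
```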
Statistical Methods I
Bayesian estimation of smooth object motion using data from direction-sensitive velocity sensors
David Yushan Fong, Carlos A. Pomalaza-Raez
A two-stage process involving Bayesian estimates of smooth velocity vectors is used to detect physical object movements in an image sequence containing noisy background motions. This process computes the probability that a velocity vector is `smooth' with respect to a vector in the previous frame. Those vectors with a high probability are assembled into `paths' and paths longer than a threshold are retained. When this process is applied to the output of a velocity-sensitive network, random movements from the background are filtered out from the sequence, retaining only the smooth motion vectors.
Recovering absolute depth and motion of multiple objects from intensity images
Fan Jiang, Brian G. Schunck
This paper reports an algorithm that recovers absolute depth and motion of multiple objects from intensity images. It has been shown that absolute depth of multiple objects can be recovered from relative normal flows. With normal flows and depth estimated from images, motion recovery becomes a linear regression problem. As the depth estimates can be very noisy, the method of least-median-of-squares (LMS) is used to robustly recover the object motion. To decompose image points into groups that correspond to independently moving objects, strong-edge points at which normal flows are estimated are grouped into segments. The segments are then grouped into rigidly moving objects if they can be interpreted as such. This algorithm is designed for the case of general motion. However, experiments suggest that rotation makes it difficult to estimate good-quality normal flows and brings large errors into depth and motion recovery. The algorithm is tested on real images of translating objects.
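The robust estimation step can be illustrated with a generic least-median-of-squares fit (a sketch of the general technique, not the authors' motion-recovery formulation): many random minimal subsets are tried, and the candidate whose squared residuals have the smallest median wins.

```python
import numpy as np

def lmeds_fit(X, y, n_trials=500, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    n, p = X.shape
    best_beta, best_med = None, np.inf
    for _ in range(n_trials):
        idx = rng.choice(n, size=p, replace=False)      # minimal subset of observations
        try:
            beta = np.linalg.solve(X[idx], y[idx])
        except np.linalg.LinAlgError:
            continue                                    # degenerate subset, skip
        med = np.median((y - X @ beta) ** 2)            # median squared residual over all data
        if med < best_med:
            best_beta, best_med = beta, med
    return best_beta, best_med
```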
Stochastic field-based object recognition in computer vision
Dongping Zhu, A. A. Beex, Richard W. Conners
This study explores the application of a stochastic texture modeling method toward a machine vision system for log inspection in the forest products industry. This machine vision system uses computerized tomography (CT) imaging to locate and identify internal defects in hardwood logs. To apply CT to these industrial vision problems requires efficient and robust image analysis methods. The paper addresses one aspect of the problem of creating such a computer vision system, i.e., the issue of statistical image texture modeling for wood defect recognition using a stochastic field-based approach. In particular, it describes a parametric model-based method for studying the spatial stochastic processes -- wood grain textures, with each grain texture being modeled by a parametric random field model. A robust algorithm for parameter estimation is applied to obtain model parameters for individual defects occurring inside a log. By making use of the estimated model features, a simple minimum distance classifier is constructed to classify an unknown defect into one of the prototypical defects. Experimental results of the proposed method with CT images from red oaks are given to show the efficacy of the proposed approach.
Robust statistical method for background extraction in image segmentation
Arturo A. Rodriguez, O. Robert Mitchell
A method that adaptively extracts the gray-tone distribution of the background of the image without a priori knowledge is described, and the method's performance is shown to be superior when log-transformed image data is used. The image is decomposed into rectangular regions to adaptively extract the background's gray-tone distribution throughout the image. The background distribution of each region is modeled with a left-half and right-half Gaussian to compensate for its asymmetrical nature. Statistical criteria are used to classify each rectangular region as background-homogeneous, object-homogeneous, or uncertain. Measured statistical parameters of background homogeneous regions are propagated throughout the image to estimate the statistics of object-homogeneous regions and uncertain regions. The local background of each region is extracted by using the measured or estimated statistical parameters to compute the left and right shoulder thresholds of the background distribution. A logarithmic transformation implementation for gray-tone image data and a procedure to map log-transformed data into integer-valued histograms are proposed. The background extraction method is shown to yield superior results when log-transformed image data is used. The proposed algorithm is robust by virtue of the logarithmic transformation implementation; it can perform over a wide range of applications without parameter adjustments or human interaction. The algorithm performs successfully whether the image contains objects darker and/or brighter than the background, or no object at all. The method is demonstrated on background-only scenes imaged under different lighting conditions, industrial scenes, and outdoor scenes of moderate complexity that exhibit different smooth backgrounds within the same image.
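A hedged sketch of the background model described above: the histogram of a log-transformed region is modeled with separate left-half and right-half Gaussians around its mode, and the shoulder thresholds are placed a few sigmas out on each side. The bin count and the factor `k` are illustrative choices, not the paper's settings.

```python
import numpy as np

def background_shoulders(region, k=2.5):
    logged = np.log1p(region.astype(float))              # log transform of the gray tones
    hist, edges = np.histogram(logged, bins=256)
    centers = 0.5 * (edges[:-1] + edges[1:])
    mode = centers[np.argmax(hist)]                      # peak of the background distribution
    left = logged[logged <= mode]
    right = logged[logged >= mode]
    sigma_left = np.sqrt(np.mean((left - mode) ** 2))    # spread of the left half-Gaussian
    sigma_right = np.sqrt(np.mean((right - mode) ** 2))  # spread of the right half-Gaussian
    return mode - k * sigma_left, mode + k * sigma_right
```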
Challenges of vision theory: self-organization of neural mechanisms for stable steering of object-grouping data in visual motion perception
Jonathan A. Marshall
Psychophysical studies on motion perception suggest that human visual systems perform certain nonlocal operations. In some cases, data about one part of an image can influence the processing or perception of data about another part of the image, across a long spatial range. In others, data about nearby parts of an image can fail to influence one another strongly, despite their proximity. Several types of nonlocal interaction may underlie cortical processing for accurate, stable perception of visual motion, depth, and form: (1) trajectory-specific propagation of computed moving stimulus information to successive image locations where a stimulus is predicted to appear; (2) grouping operations (establishing linkages among perceptually related data); (3) scission operations (breaking linkages between unrelated data); and (4) steering operations, whereby visible portions of a visual group or object can control the representations of invisible or occluded portions of the same group. Nonlocal interactions like these could be mediated by long-range excitatory horizontal intrinsic connections (LEHICs), discovered in visual cortex of several animal species. LEHICs often span great distances across cortical image space. Typically, they have been found to interconnect regions of like specificity with regard to certain receptive field attributes, e.g., stimulus orientation. It has recently been shown that several visual processing mechanisms can self-organize in model recurrent neural networks using unsupervised `EXIN' (excitatory + inhibitory) learning rules. Because the same rules are used in each case, EXIN networks provide a means to unify explanations of how different visual processing modules acquire their structure and function. EXIN networks learn to multiplex (or represent simultaneously) multiple spatially overlapping components of complex scenes, in a context-sensitive fashion. Modeled LEHICs have been used together with the EXIN learning rules to show how visual experience can shape neural mechanisms for nonlocal, context-sensitive processing of visual motion data.
Neural Methods II
Learning spatially coherent properties of the visual world in connectionist networks
Suzanna Becker, Geoffrey E. Hinton
In the unsupervised learning paradigm, a network of neuron-like units is presented with an ensemble of input patterns from a structured environment, such as the visual world, and learns to represent the regularities in that input. The major goal in developing unsupervised learning algorithms is to find objective functions that characterize the quality of the network's representation without explicitly specifying the desired outputs of any of the units. The sort of objective functions considered cause a unit to become tuned to spatially coherent features of visual images (such as texture, depth, shading, and surface orientation), by learning to predict the outputs of other units which have spatially adjacent receptive fields. Simulations show that using an information-theoretic algorithm called IMAX, a network can be trained to represent depth by observing random dot stereograms of surfaces with continuously varying disparities. Once a layer of depth-tuned units has developed, subsequent layers are trained to perform surface interpolation of curved surfaces, by learning to predict the depth of one image region based on depth measurements in surrounding regions. An extension of the basic model allows a population of competing neurons to learn a distributed code for disparity, which naturally gives rise to a representation of discontinuities.
Utilizing the central limit theorem for parallel multiple-scale image processing with neural architectures
Jezekiel Ben-Arie
A set of neural lattices that are based on the central limit theorem is described. Each of the described lattices generates in parallel a set of multiple-scale Gaussian smoothings of its input array. The recursive smoothing principle of the lattices can be extended to any dimension. In addition, the lattices can generate in real time a variety of multiple-scale operators such as Canny's edge detectors, Laplacians of Gaussians, and multidimensional sine, cosine, Fourier, and Gabor transforms.
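The central-limit-theorem idea behind these lattices can be illustrated in a few lines: repeated convolution with any small averaging mask converges to a Gaussian, so cascading identical smoothing stages yields progressively coarser Gaussian-like scales. The binomial mask below is an arbitrary example, not the paper's lattice.

```python
import numpy as np
from scipy import ndimage

def multiscale_smoothings(image, n_levels=4):
    kernel = np.array([1.0, 2.0, 1.0])
    kernel = np.outer(kernel, kernel)
    kernel /= kernel.sum()                               # small binomial averaging mask
    levels, current = [], image.astype(float)
    for _ in range(n_levels):
        current = ndimage.convolve(current, kernel, mode="nearest")
        levels.append(current.copy())                    # each pass behaves like a Gaussian of larger sigma
    return levels
```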
Heterogeneous input neuration for network-based object recognition architectures
John F. Gnazzo
The utilization of artificial neural networks (ANN) in the area of signal and image processing applications is showing great promise. The simplification of the classical object recognition methodology is illustrated by the network-based algorithm development of a simple 2-D character recognition system. The hardware implementation of such a system is also discussed. An example of a network-based solution to a target recognition problem utilizing single-sensor acoustic data is also addressed. The term heterogeneous input neuration is introduced.
Simulated Annealing, Markov Random Fields, and Genetic Algorithms II
Mean-field theory for grayscale texture synthesis using Gibbs random fields
Ibrahim M. Elfadel, Alan L. Yuille
This paper shows how methods developed in the context of statistical physics can be used to analyze gray-scale texture synthesis procedures based on the probabilistic paradigm of Gibbs random fields (GRF). In particular, using the mean-field equations of the texture GRF, the existence of a bifurcation point indicating the presence of a phase transition in the textural pattern is shown. Using simulations, it is shown that a number of interesting phenomena occur at the bifurcation temperature like a sudden decrease in energy, sharp peaks in similarity measures between textural patterns, and sudden saturation of the mean-field variables. For texture synthesis, it is sufficient to simulate the mean-field equations near the bifurcation temperature.
Application of neural network to restoration of signals degraded by a stochastic, shift-variant impulse response function and additive noise
Mehmet Bilgen, Hsien-Sen Hung
An artificial neural network is adopted for estimating discrete (sampled in time and quantized in amplitude) signals degraded by a stochastic, shift-variant impulse response (blur) function in the presence of noise. The signal restoration problem is formulated as a combinatorial optimization problem wherein a nonlinear cost function, termed stochastic constrained restoration error energy, is to be minimized. By matching the cost function with the energy function of the associated neural network, the interconnection strengths and bias inputs of the neural network are related to the degraded signal, blur statistics, and constraint parameters. The solution which minimizes the energy function of the neural network is thus obtained iteratively by the simulated annealing algorithm. Simulation results show the effectiveness of the proposed algorithm which has, in addition, the capability of imposing level constraints on the original signal.
Image segmentation with genetic algorithms: a formulation and implementation
Gunasekaran Seetharaman, Amruthur Narasimhan, Anand Sathe, et al.
Image segmentation is an important step in any computer vision system. Segmentation refers to the partitioning of the image plane into several regions, such that each region corresponds to a logical entity present in the scene. The problem is inherently NP, and the theory on the existence and uniqueness of the ideal segmentation is not yet established. Several methods have been proposed in the literature for image segmentation. With the exception of the state-space approach to segmentation, other methods lack generality. The state-space approach, however, amounts to searching for the solution in a large search space of 2^(2n^2) possibilities for an n X n image. In this paper, a classic approach based on state-space techniques for segmentation, due to Brice and Fennema, is reformulated using genetic algorithms. The state-space representation of a partially segmented image lends itself to binary strings, in which the dominant substrings are easily explained in terms of chromosomes. Operations such as crossover and mutation are also easily abstracted. In particular, when multiple images are segmented from an image sequence, the fusion of constraints from one image to the next becomes clear under this formulation.
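A generic genetic-algorithm skeleton of the kind this formulation maps segmentation onto is sketched below: candidate solutions are binary strings, and fitness-proportional selection, one-point crossover, and bit-flip mutation drive the search. The fitness function is a placeholder supplied by the caller, not the Brice-Fennema region criterion.

```python
import numpy as np

def genetic_search(fitness, n_bits, pop_size=50, generations=100, p_mut=0.01, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    pop = rng.integers(0, 2, size=(pop_size, n_bits))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        probs = scores - scores.min() + 1e-9
        probs = probs / probs.sum()                      # fitness-proportional selection
        parents = pop[rng.choice(pop_size, size=pop_size, p=probs)]
        children = parents.copy()
        cuts = rng.integers(1, n_bits, size=pop_size // 2)
        for i, c in enumerate(cuts):                     # one-point crossover on consecutive pairs
            children[2 * i, c:], children[2 * i + 1, c:] = (
                parents[2 * i + 1, c:].copy(), parents[2 * i, c:].copy())
        mutate = rng.random(children.shape) < p_mut      # bit-flip mutation
        pop = np.where(mutate, 1 - children, children)
    scores = np.array([fitness(ind) for ind in pop])
    return pop[np.argmax(scores)], scores.max()
```

For example, `genetic_search(lambda s: -abs(int(s.sum()) - 10), n_bits=32)` maximizes a toy fitness over 32-bit strings.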
Markov random fields for texture classification
Texture features obtained by fitting generalized Ising, auto-binomial, and Gaussian Markov random fields to homogeneous textures are evaluated and compared by visual examination and by standard pattern recognition methodology. The Markov random field model parameters capture the strong cues for human perception, such as directionality, coarseness, and/or contrast. The limited experiments for the classification of natural textures and sandpaper textures by using various classifiers suggest that both feature extraction and classifier design be carefully considered.
Statistical Methods II
Statistical image algebra: a Bayesian approach
Jennifer L. Davidson, Noel A. C. Cressie
A mathematical structure used to express image processing transforms, the AFATL image algebra has proven itself useful in a wide variety of applications. The theoretical foundation for the image algebra includes many important constructs for handling a wide variety of image processing problems: questions relating to linear and nonlinear transforms, including decomposition techniques; mapping of transformations to computer architectures; neural networks; recursive transforms; and data manipulation on hexagonal arrays. However, statistical notions have so far been included only at a very elementary level, with more sophisticated treatments appearing elsewhere in the literature. This paper presents an extension of the current image algebra that includes a Bayesian statistical approach. It is shown how images are modeled as random vectors, probability functions or mass functions are modeled as images, and conditional probability functions are modeled as templates. The remainder of the paper gives a brief discussion of the current image algebra, an example of the use of image algebra to express high-level image processing transforms, and the presentation of the statistical development of the image algebra.
New inverse synthetic aperture radar algorithm for translational motion compensation
Richard P. Bocker, Thomas B. Henderson, Scott A. Jones, et al.
Inverse synthetic aperture radar (ISAR) is an imaging technique that shows real promise in classifying airborne targets in real time under all weather conditions. Over the past few years a large body of ISAR data has been collected and considerable effort has been expended to develop algorithms to form high-resolution images from this data. One important goal of workers in this field is to develop software that will do the best job of imaging under the widest range of conditions. The success of classifying targets using ISAR is predicated upon forming highly focused radar images of these targets. Efforts to develop highly focused imaging computer software have been challenging, mainly because the imaging depends on and is affected by the motion of the target, which in general is not precisely known. Specifically, the target generally has both rotational motion about some axis and translational motion as a whole with respect to the radar. The slant-range translational motion kinematic quantities must be first accurately estimated from the data and compensated before the image can be focused. Following slant-range motion compensation, the image is further focused by determining and correcting for target rotation. The use of the burst derivative measure is proposed as a means to improve the computational efficiency of currently used ISAR algorithms. The use of this measure in motion compensation ISAR algorithms for estimating the slant-range translational motion kinematic quantities of an uncooperative target is described. Preliminary tests have been performed on simulated as well as actual ISAR data using both a Sun 4 workstation and a parallel processing transputer array. Results indicate that the burst derivative measure gives significant improvement in processing speed over the traditional entropy measure now employed.
Some analytical and statistical properties of Fisher information
The concept of Fisher information I is introduced. Smoothness properties of I, and its relation to entropy, disorder, and uncertainty are explored. Information I is generalized to N-component problems, and is expressed both in direct and Fourier spaces. Applications to ISAR radar imaging and to the derivation of physical laws are discussed.
Robust regularized image restoration
Taek-Mu Kwon, Michael E. Zervakis
Many image processing applications involve long-tailed noise processes, which introduce outliers in the gray-level distribution of the image. The performance of conventional restoration algorithms is highly degraded by such noise processes. A novel restoration approach is introduced, which combines the properties of regularized and robust estimation schemes. Most prominent regularized approaches attempt to compensate for the ill-posedness of the pseudo-inverse solution. Regularization is achieved by constraining the least squares solution in terms of a smoothing criterion. The optimization approach introduced in this paper further modifies the regularized criterion according to the notion of M-estimation. Thus, an influence function is employed in restraining the contribution of large estimate-deviations in the optimization criterion. The modified criterion provides nonlinear estimates, which do not suffer from artifacts due to the presence of long-tailed noise. The computation of the robust regularized estimate is based on the simple structure of a steepest-descent iterative procedure. One of the most important factors associated with the concept of regularization is the regularization parameter. Adaptive schemes concerning the selection of this parameter at every iteration step are introduced. The convergence properties of the robust and the adaptive algorithms introduced are extensively studied. The capabilities of the robust regularized algorithms are demonstrated through restoration examples.
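A compact sketch of one possible robust regularized iteration (an illustration under assumed parameters, not the authors' exact algorithm): steepest descent on a criterion that passes the data residual through a Huber-type influence function and adds a Laplacian smoothing penalty weighted by a regularization parameter.

```python
import numpy as np
from scipy import ndimage

def huber_influence(r, delta=1.0):
    return np.clip(r, -delta, delta)                     # psi(r): linear near zero, bounded for outliers

def robust_restore(degraded, blur_sigma=1.0, lam=0.05, step=0.2, n_iter=100):
    estimate = degraded.astype(float).copy()
    for _ in range(n_iter):
        reblurred = ndimage.gaussian_filter(estimate, blur_sigma)   # assumed (symmetric) blur operator H
        residual = reblurred - degraded
        data_grad = ndimage.gaussian_filter(huber_influence(residual), blur_sigma)  # H^T psi(Hx - y)
        smooth_grad = ndimage.laplace(ndimage.laplace(estimate))    # gradient of ||Lx||^2 up to a constant
        estimate -= step * (data_grad + lam * smooth_grad)          # steepest-descent update
    return estimate
```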
Error probabilities of minimum-distance classifiers
Helene Poublan, Francis Castanie
In the Gaussian case, the Bayes classifiers and the minimum-distance classifiers are compared. The comparison is based on the error probability and on the bias introduced by the estimation of the law parameters. It is shown, both theoretically and by simulations, that even a suboptimal use of the minimum-distance classifiers may be justified when the finite design sample size is small with regard to dimensionality. An application to signal classification is studied where the best model of the signal is not the best representation for classification.
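In the Gaussian case the two rules being compared reduce to a quadratic discriminant and a nearest-mean rule; the toy functions below illustrate that contrast (dimensions and data are arbitrary).

```python
import numpy as np

def bayes_classify(x, means, covs, priors):
    # Quadratic (Gaussian) Bayes rule: needs means, covariances, and priors.
    scores = []
    for mu, cov, p in zip(means, covs, priors):
        diff = x - mu
        _, logdet = np.linalg.slogdet(cov)
        scores.append(np.log(p) - 0.5 * logdet - 0.5 * diff @ np.linalg.solve(cov, diff))
    return int(np.argmax(scores))                        # maximum a posteriori class

def min_distance_classify(x, means):
    # Minimum-distance rule: only the class means need to be estimated.
    dists = [np.linalg.norm(x - mu) for mu in means]
    return int(np.argmin(dists))
```

The minimum-distance rule estimates far fewer parameters, which is why it can win when the design sample is small relative to the dimensionality.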
Bayesian matching technique for detecting simple objects in heavily noisy environment
John S. Baras, Emmanuel N. Frantzeskakis
The template matching problem, for binary images corrupted with spatially white, binary, symmetric noise, is studied. Matching is compared both directly on the pixel-valued image data and on data coded by two simple schemes: a modification of the Hadamard basis and direct coarsening of resolution. Bayesian matching rules based on M-ary hypothesis tests are developed. The performance evaluation of these rules is provided. A study of the trade-off between the quantization level and the ability to detect an object in the image is presented. This trade-off depends on the (external) noise generated at the moment the uncoded image is received. The sum-of-pixels and histogram statistics are introduced in order to reduce the computational load inherent in the correlation statistic, with the resulting penalty of a higher false-alarm rate. The present work demonstrates by examples that it is beneficial for recognition to combine an image coding technique with the algorithm extracting some `basic' information from the image. In other words, coding (for compression) helps recognition. Numerical results illustrate this claim.
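With i.i.d. pixel flips of probability q < 0.5 and equal priors, the M-ary Bayesian rule on uncoded binary data reduces to choosing the template at minimum Hamming distance from the observation, as the short sketch below illustrates (templates and shapes are arbitrary).

```python
import numpy as np

def match_template(observed, templates):
    # observed: binary array; templates: list of binary arrays of the same shape
    hamming = [np.count_nonzero(observed != t) for t in templates]
    return int(np.argmin(hamming))                       # ML / MAP choice under equal priors
```

The log-likelihood of a template with N pixels and Hamming distance d is (N - d) log(1 - q) + d log q, which decreases monotonically in d when q < 0.5, so the flip probability never needs to be known explicitly.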
Filtering, Transform, and Reconstruction Techniques
Two-dimensional signal deconvolution: design issues related to a novel multisensor-based approach
Nicholaos D. Sidiropoulos, John S. Baras, Carlos A. Berenstein
Recent results of analysis in several complex variables are employed to come up with a set of compactly supported approximate deconvolution kernels for the reconstruction of a two-dimensional signal based on multiple linearly degraded versions of the signal with a family of kernels that satisfies suitable technical conditions. The question of convergence of the proposed deconvolution kernels is discussed, simulation results that demonstrate the gain in bandwidth are presented, and two data-parallel grid layouts for the off-line computation of the deconvolution kernels are proposed.
Novel transform for image description and compression with implementation by neural architectures
Jezekiel Ben-Arie, Raghunath K. Rao
A general method for signal representation using nonorthogonal basis functions that are composed of Gaussians is described. The Gaussians can be combined into groups with a predetermined configuration that can approximate any desired basis function. The same configuration at different scales forms a set of self-similar wavelets. The general scheme is demonstrated by representing a natural signal employing an arbitrary basis function. The basic methodology is demonstrated by two novel schemes for efficient representation of 1-D and 2-D signals using Gaussian basis functions (BFs). Special methods are required here since the Gaussian functions are nonorthogonal. The first method employs a paradigm of maximum energy reduction interlaced with the A* heuristic search. The second method uses an adaptive lattice system to find the minimum-squared-error projection of the BFs onto the signal, and a lateral-vertical suppression network to select the most efficient representation in terms of data compression.
Linear feature SNR enhancement in radon transform space
John R. Meckley
Many image features of interest are either linear in nature or are composed of piecewise linear segments. When the initial imaging process does not produce a signal-to-noise ratio sufficient for detection, a predetection filter is required to enhance the feature SNR. This filter must be invariant to feature position, orientation, and size in order to produce the highest processing gain with minimum distortion. The Fourier transform of the radon transform of linear features is shown to be invariant with respect to position and orientation, while varying slowly with respect to feature size. This permits optimum filtering for SNR enhancement. After filtering the radon transform, the image is reconstructed through a backprojection algorithm. Detection and segmentation of the linear features is significantly enhanced in the filtered image.
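A simplified sketch of the pipeline is given below: a Radon transform built from image rotations, a 1-D high-emphasis filter applied along each projection, and an unfiltered backprojection to return to image space. The filter shape is an illustrative assumption, not the paper's optimal predetection filter.

```python
import numpy as np
from scipy import ndimage

def radon(image, angles):
    # Projections obtained by rotating the image and summing along columns.
    return np.array([ndimage.rotate(image, a, reshape=False, order=1).sum(axis=0)
                     for a in angles])

def backproject(sinogram, angles, shape):
    recon = np.zeros(shape)
    for proj, a in zip(sinogram, angles):
        smear = np.tile(proj, (shape[0], 1))             # spread each projection back across the image
        recon += ndimage.rotate(smear, -a, reshape=False, order=1)
    return recon / len(angles)

def enhance_linear_features(image, n_angles=180):
    angles = np.linspace(0.0, 180.0, n_angles, endpoint=False)
    sino = radon(image.astype(float), angles)
    freq = np.fft.rfftfreq(sino.shape[1])
    emphasis = freq / (freq + 0.05)                      # crude high-emphasis filter per projection
    filtered = np.fft.irfft(np.fft.rfft(sino, axis=1) * emphasis, n=sino.shape[1], axis=1)
    return backproject(filtered, angles, image.shape)
```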
Aperture synthesis in astronomical radio-interferometry using maximum entropy on the mean
Guy Le Besnerais, Jorge Navaza, Guy Demoment
A new algorithm for aperture synthesis in radio astronomy is presented. It is based on the principle of maximum entropy on the mean. The procedure jointly performs image reconstruction and estimation of the unknown phase aberrations that corrupt the Fourier data. It partly derives from a preexisting imaging method developed in the field of crystallography. A simulated example of aperture synthesis indicates the efficiency of the method.
Statistical Methods III
Fractional Brownian motion and its fractal dimension estimation
Peng Zhang, Andrew B. Martinez, Herbert S. Barad
A mathematical model of stochastic processes -- fractional Brownian motion -- is addressed. The power-law behaviors of FBM increments are studied in detail for moments, correlation functions, and power spectra. A moment method is proposed to do model testing of fractional Brownian motion. The results of FBM model testing of six simulators show that the covariance matrix transforming algorithm can provide samples with very good approximation of self-affinity. The self-affinity of the FBM samples generated by Fourier transform filtering is not very obvious. The statistical properties of fractal dimension estimation methods are analyzed. The simulation results show that the variance method provides good performance when only estimates of variances with small time lags are used in the least-squares estimation. For the power spectrum method, the bias is not ignorable because of the aliasing and the window effect.
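The variance method referred to above rests on the scaling law Var[B(t + d) - B(t)] being proportional to d^(2H) for fractional Brownian motion, so a log-log least-squares fit over small lags estimates the Hurst exponent H (and hence the fractal dimension 2 - H of a 1-D trace). A minimal sketch, with the lag range as an illustrative choice:

```python
import numpy as np

def hurst_variance_method(signal, max_lag=20):
    lags = np.arange(1, max_lag + 1)
    variances = np.array([np.var(signal[lag:] - signal[:-lag]) for lag in lags])
    slope, _ = np.polyfit(np.log(lags), np.log(variances), 1)   # slope of the log-log scaling law
    hurst = slope / 2.0
    return hurst, 2.0 - hurst                                   # (H, fractal dimension of the trace)
```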
Computer-generated correlated noise images for various statistical distributions
The evaluation of image processing algorithms generally assumes images that are degraded by known statistical noise. The types of noise distributions that are needed depend on the nature of the application. The noise distributions that are commonly used are the Gaussian, negative exponential, and uniform distributions. Typically, these computer-generated noise images are spatially uncorrelated. It is the purpose of this paper to present computer-generated two-dimensional correlated and uncorrelated noise images that can be readily used in the evaluation of various image processing algorithms. Several statistical distributions including the negative exponential, the Rayleigh, and the K-distribution are generated from Gaussian statistical noise and are presented. For the generation of correlated noise images, the correlation function is defined either by describing the correlation function directly or by specifying the power spectral density (PSD) function using the Wiener-Khinchine theorem. These computer-synthesized images are then compared against the expected theoretical results. Additionally, the autocorrelation function for the computer-generated noise images is computed and compared against the specified autocorrelation function. Also included in the theoretical analysis are the effects of quantization and finite pixel intensity, i.e., 0 - 255. Finally, several uncorrelated and correlated noise images are presented.
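A rough sketch of such a generation pipeline (parameters and the example PSD are assumptions, not the paper's specification): white Gaussian noise is shaped in the frequency domain by the square root of a specified power spectral density, and Rayleigh and negative-exponential fields are then derived from pairs of correlated Gaussian components.

```python
import numpy as np

def correlated_gaussian(shape, psd, rng):
    white = rng.standard_normal(shape)
    shaped = np.fft.ifft2(np.fft.fft2(white) * np.sqrt(psd)).real   # PSD shaping (Wiener-Khinchine)
    return (shaped - shaped.mean()) / shaped.std()

def correlated_noise_images(shape=(256, 256), corr_sigma=5.0, seed=0):
    rng = np.random.default_rng(seed)
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    psd = np.exp(-(fx ** 2 + fy ** 2) * (2 * np.pi * corr_sigma) ** 2)   # example Gaussian-shaped PSD
    g1 = correlated_gaussian(shape, psd, rng)
    g2 = correlated_gaussian(shape, psd, rng)
    rayleigh = np.hypot(g1, g2)                          # Rayleigh field from two Gaussian quadratures
    neg_exponential = 0.5 * (g1 ** 2 + g2 ** 2)          # negative-exponential intensity field
    return g1, rayleigh, neg_exponential
```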
Artificial scenes and simulated imaging
Stephen E. Reichenbach, Stephen K. Park, Rachel Alter-Gartenberg, et al.
A software simulation environment for controlled image processing research is described. The simulation is based on a comprehensive model of the end-to-end imaging process that accounts for statistical characteristics of the scene, image formation, sampling, noise, and display reconstruction. The simulation uses a stochastic process to generate super-resolution digital scenes with variable spatial structure and detail. The simulation of the imaging process accounts for the important components of digital imaging systems, including the transformation from continuous to discrete during acquisition and from discrete to continuous during display. This model is appropriate for a variety of problems that involve image acquisition and display including system design, image restoration, enhancement, compression, and edge detection. By using a model-based simulation, research can be conducted with greater precision, flexibility, and portability than is possible using physical systems and experiments can be replicated on any general-purpose computer.
Neural Methods III
Frequency-based pattern recognition using neural networks
Simon Wenfeng Lu
A pattern recognition algorithm in the frequency domain, using a backpropagation neural network, is proposed. The algorithm extracts distinct frequency features from reference patterns and compares them with the corresponding features of an unknown pattern. The feature sets are learned and recognized through backpropagation neural networks. Multiple neural networks are used to form a classification network. Since the feature set extracted is significantly smaller than the pattern image, the neural network is fast and accurate. Preliminary results indicate that certain features in the frequency domain remain consistent for images of the same pattern and that it is possible to extract these features for pattern recognition. Experimental results have also indicated that these features can be used to distinguish one pattern from another accurately. The system presented herein exhibits advantages over previous systems: increased recognition accuracy (i.e., a lower false identification rate), increased recognition speed, and decreased data storage space. A method to dynamically identify frequency features in any set of reference patterns and to classify an unknown feature set using a neural network is described in detail.
Generalized neocognitron model for facial recognition
Su-Shing Chen, Young-Sik Hong
Fukushima's neocognitron model is generalized to a parallel neocognitron architecture, which is applied to gray-scale facial recognition. Experiments show that the system can recognize human faces after learning. Results using the single neocognitron model for facial recognition or other gray-scale image recognition problems are not satisfactory.
Fuzzy logic and neural networks in artificial intelligence and pattern recognition
Elie Sanchez
With the use of fuzzy logic techniques, neural computing can be integrated into symbolic reasoning to solve complex real-world problems. In fact, artificial neural networks, expert systems, and fuzzy logic systems, in the context of approximate reasoning, share common features and techniques. A model of a Fuzzy Connectionist Expert System is introduced, in which an artificial neural network is designed to construct the knowledge base of an expert system from training examples (this model can also be used for specification of rules in fuzzy logic control). Two types of weights are associated with the synaptic connections in an AND-OR structure: primary linguistic weights, interpreted as labels of fuzzy sets, and secondary numerical weights. Cell activation is computed through min-max fuzzy equations of the weights. Learning consists in finding the (numerical) weights and the network topology. This feedforward network is described and first illustrated in a biomedical application (medical diagnosis assistance from inflammatory-syndromes/proteins profiles). Then, it is shown how this methodology can be utilized for handwritten pattern recognition (characters play the role of diagnoses): in a fuzzy neuron describing a number, for example, the linguistic weights represent fuzzy sets on cross-detecting lines and the numerical weights reflect the importance (or weakness) of connections between cross-detecting lines and characters.
Application of neural networks to range-Doppler imaging
Xiaoqing Wu, Zhaoda Zhu
The use of neural networks is investigated for 2-D range-Doppler microwave imaging. The range resolution of the microwave image is obtained by transmitting a wideband signal, and the cross-range resolution is achieved by the Doppler frequency gradient in the same range bin. Hopfield neural networks are used to estimate the Doppler spectrum to enhance the cross-range resolution and reduce the processing time. A large number of neurons is needed for high cross-range resolution. In order to cut down the number of neurons, the reflectivities are replaced with their minimum-norm estimates. The original Hopfield networks often converge to a local minimum instead of the global minimum. Simulated annealing is applied to control the gain of the Hopfield networks to yield better convergence to the global minimum. Results of imaging a model airplane from real microwave data are presented.
Color and Other Methods
Color space analysis of road detection algorithms
Jill D. Crisman
Color space analysis of classification-based segmentation algorithms provides insights into the capabilities of the color vision system. In standard pattern recognition theory, classification algorithms are categorized by their discriminant functions in feature space. This analysis is applied to color classification methods used in road detection systems. There have been many systems that use classification techniques. By examining these road detection systems in color space, a relationship can be seen between their color model representation and their capabilities and limitations.
Statistical Methods III
Optic flow: multiple instantaneous rigid motions
Xinhua Zhuang, Tao Wang, Peng Zhang
In Zhuang et al. (1988), a linear algorithm was presented to estimate a single instantaneous rigid motion from optic flow image point data. In order to obtain reasonable answers, however, the data must be quite accurate, as was shown by numerous simulated experiments. As is well recognized, all machine vision feature extractors, recognizers, and matchers explicitly or implicitly needed for computing optic flow are unavoidably error prone and make occasional errors which indeed are blunders. The realistic assumption for errors in optic flow is a contaminated Gaussian noise, which is a regular white Gaussian noise with probability 1 - epsilon plus an outlier process with probability epsilon (Huber, 1981). Both the linear algorithm and the least-squares estimator are very sensitive to minor deviations from the Gaussian noise model assumption. In Haralick et al. (1989), the classical M-estimator was successfully applied to solve a single pose estimation from corresponding point data. However, many experiments conducted in Haralick et al. (1989) showed that the M-estimator only allowed a low proportion of outliers. For multiple pose segmentation and estimation, an estimator of high robustness is needed. A highly robust estimator, called the MF-estimator, for general regression is presented and is applied to an important problem in computer vision, i.e., segmenting and estimating multiple instantaneous rigid motions from optic flow data. To be realistic, the observed or processed optic flow data are contaminated by various noises including outliers. Notationally, `MF' is an abbreviation of `Model Fitting.' The MF-estimator is a result of partially modeling the unknown log likelihood function.
Color and Other Methods
Transformation from tristimulus RGB to Munsell notation HVC in a colored computer vision system
Guofan Jin, Zimin Zhu, Xinglong Yu
The transformation from tristimulus RGB to the triattribute HVC of human vision is an important problem in computer vision and CAD. The primary obstacle to finding a simple algorithm is the complicated form that the Munsell loci take in existing chromatic spaces. This article introduces the concept of the fuzzy set into the computation and then handles the Munsell system with a fitting method so as to simplify the complex function. A better result has been achieved, and such an approach has been satisfactorily used in a diagnosis system for tongue features in traditional Chinese medicine.
Statistical Methods III
Shape-from-focus: surface reconstruction of hybrid surfaces
Su-Shing Chen, Wu-bin Tang, Jianhua Xu
The shape-from-focus method uses different focus levels to obtain a sequence of object images and to estimate depth of image points. Recently, S. K. Nayar and Y. Nakagawa have treated the shape extraction problem of rough surfaces. This paper extends their approach to hybrid surfaces -- partially rough and partially smooth.
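As a brief illustration of the general shape-from-focus recipe (a sketch only; the focus measure and window size are assumptions, not the method of the paper or of Nayar and Nakagawa), a local focus measure is evaluated for every image in the focus stack and each pixel is assigned the focus level at which its measure peaks.

```python
import numpy as np
from scipy import ndimage

def depth_from_focus(stack, window=9):
    # stack: (n_levels, H, W) array of images taken at increasing focus settings
    focus = np.empty_like(stack, dtype=float)
    for i, image in enumerate(stack):
        img = image.astype(float)
        local_mean = ndimage.uniform_filter(img, window)
        focus[i] = ndimage.uniform_filter(img ** 2, window) - local_mean ** 2   # local variance as focus measure
    return np.argmax(focus, axis=0)                      # best-focus level index per pixel
```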