Proceedings Volume 5299

Computational Imaging II

View the digital version of this volume at SPIE Digital Library.

Volume Details

Date Published: 21 May 2004
Contents: 9 Sessions, 40 Papers, 0 Presentations
Conference: Electronic Imaging 2004
Volume Number: 5299

Table of Contents

  • Physics-Based Inverse Methods I
  • Temporal Imaging
  • Physics-Based Inverse Methods II
  • Computational Image Processing
  • Image Modeling
  • Multigrid Processing Methods
  • Registration and Mosaicing
  • Geometric Inversion
  • Poster Session
Physics-Based Inverse Methods I
Use of the FDTD method for time reversal: application to microwave breast cancer detection
Panagiotis Kosmas, Carey Rappaport
The feasibility of microwave breast cancer detection with a time reversal algorithm is examined. This time reversal algorithm, based on the finite difference time domain method (FDTD), time reverses not only the recorded field, but also the medium. It compensates for the wave decay and therefore is suitable for lossy media. We present two-dimensional (2D) breast models and geometries, and assume knowledge of the system's response in the absence of tumor (distorted wave Born approximation). Our results illustrate the system's detection and localization abilities, and its robustness to dispersion and measurement noise. Good performance using a simple time reversal mirror shows that this method is a promising technique for microwave imaging, and encourages us to further examine its applicability to microwave breast cancer detection.
Linear and nonlinear reconstruction for diffuse optical tomography in an inhomogeneous background
Gregory Boverman, Eric L. Miller, David A. Boas
Diffuse Optical Tomography is a novel approach to imaging the body's optical properties non-invasively using non-ionizing electromagnetic radiation in the visible and near-infrared range. As spectral information at a number of measurement wavelengths can give important information about functional properties of tissue relatively deep within the body, it is hoped that Optical Tomography will be clinically useful, particularly for detecting breast tumors and distinguishing between tumors and benign lesions. This paper formulates the fully three-dimensional linear and nonlinear inverse problems for Diffuse Optical Tomography and compares the linear and nonlinear reconstructions in a heterogeneous medium for a number of cases. In the first case, the background has a randomly varying lowpass structure. In the second case, the background has a layered structure with a sharp transition between layers, and, in the third case, the structure of the background is known, but not the corresponding optical properties.
Representing scattering functions with spherical harmonics of spectral Fourier components
Huiying Xu, Yinlong Sun
A fundamental problem in imaging science and engineering is to characterize wave scattering from a small region of surface or volume. This behavior is generally described by a multidimensional scattering function. This paper proposes a new representation method of scattering functions to optimize data compression. Our method first performs a Fourier transform in the wavelength dimension and then a spherical harmonic transform for each Fourier coefficient in the dimensions for spatial directions. The representation errors are studied numerically for different levels of spherical harmonics and different numbers of Fourier components. This method stores scattering-function data efficiently and has considerable potential for applications in imaging science and engineering.
Temporal Imaging
Reconstruction of image sequences using motion compensation
Yongyi Yang, Erwan J. Gravier
In this paper we study a motion-compensated approach for simultaneous reconstruction of image frames in a time sequence. We treat the frames in a sequence collectively as a single function of both space and time, and define a temporal prior to account for the temporal correlations in a sequence. This temporal prior is defined in a form of motion-compensation, aimed to follow the curved trajectories of the object motion through space-time. The image frames are then obtained through estimation using the expectation-maximization (EM) algorithm. The proposed algorithm was evaluated extensively using the 4D gated mathematical cardiac-torso (gMCAT) D1.01 phantom to simulate gated SPECT perfusion imaging with Tc99m. Our experimental results demonstrate that the use of motion compensation for reconstruction can lead to significant improvement in image quality and reconstruction accuracy.
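As a rough illustration of the EM estimation step mentioned in this abstract, the sketch below applies the standard maximum-likelihood EM (MLEM) update for Poisson emission data to a toy problem. It is not the authors' algorithm: the motion-compensated temporal prior and the gMCAT phantom are left out entirely, and the system matrix `A`, the helper names, and the data are assumptions made only for this example.

```python
import math

def forward(A, x):
    """Project image x through the system matrix A (a list of rows)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def log_likelihood(A, x, y):
    """Poisson log-likelihood, up to a constant that does not depend on x."""
    return sum(yi * math.log(pi) - pi for yi, pi in zip(y, forward(A, x)))

def mlem(A, y, iterations=50):
    """Standard MLEM update: x <- x / (A^T 1) * A^T (y / (A x))."""
    n = len(A[0])
    sens = [sum(row[j] for row in A) for j in range(n)]  # sensitivity A^T 1
    x = [1.0] * n                                        # positive start image
    history = [log_likelihood(A, x, y)]
    for _ in range(iterations):
        ratio = [yi / pi for yi, pi in zip(y, forward(A, x))]
        back = [sum(A[i][j] * ratio[i] for i in range(len(A))) for j in range(n)]
        x = [xj * bj / sj for xj, bj, sj in zip(x, back, sens)]
        history.append(log_likelihood(A, x, y))
    return x, history

# Hypothetical 4-measurement, 3-pixel system with noise-free data.
A = [[1.0, 0.5, 0.0],
     [0.0, 1.0, 0.5],
     [0.5, 0.0, 1.0],
     [1.0, 1.0, 1.0]]
truth = [2.0, 4.0, 1.0]
y = forward(A, truth)
estimate, history = mlem(A, y, iterations=200)
```

In this toy setting the log-likelihood recorded in `history` should not decrease from one iteration to the next, which is the property that makes EM attractive as the inner estimator; a temporal prior such as the one described above would add a penalty term to this objective.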
Physics-Based Inverse Methods II
Computational algorithm for reconstructing the profile of 2D rough surfaces
The presented reconstruction algorithm is based on merging the fast multipole method (FMM) for the forward solver with the rapidly convergent descent method for the cost function minimization algorithm (by Fletcher and Powell 1964 and 1970). Parametric results are presented showing the potential of the proposed computational algorithm.
On imaging multiple physical parameters in an inverse problems context
The extraction of information regarding multiple, space-varying parameters from limited, tomographic-type data represents an ill-posed inverse problem that is increasingly of interest in a range of application areas. From a physical perspective one would expect some degree of co-variation among the desired quantities; however, traditional regularization methods do not exploit such prior information. Thus, here we introduce a correlation-type metric to enforce a degree of common “structure” among the desired parameters, where we consider structure to be defined by the gradients of the individual profiles. The analytical and algorithmic details of our method are presented and its performance evaluated using photothermal nondestructive evaluation as a driving example.
Computational Image Processing
Color filter array design based on a human visual model
To reduce cost and complexity associated with registering multiple color sensors, most consumer digital color cameras employ a single sensor. A mosaic of color filters is overlaid on a sensor array such that only one color channel is sampled per pixel location. The missing color values must be reconstructed from available data before the image is displayed. The quality of the reconstructed image depends fundamentally on the array pattern and the reconstruction technique. We present a design method for color filter array patterns that use red, green, and blue color channels in an RGB array. A model of the human visual response for luminance and opponent chrominance channels is used to characterize the perceptual error between a fully sampled and a reconstructed sparsely-sampled image. Demosaicking is accomplished using Wiener reconstruction. To ensure that the error criterion reflects perceptual effects, reconstruction is done in a perceptually uniform color space. A sequential backward selection algorithm is used to optimize the error criterion to obtain the sampling arrangement. Two different types of array patterns are designed: non-periodic and periodic arrays. The resulting array patterns outperform commonly used color filter arrays in terms of the error criterion.
Inverting color transforms
We consider experimental methods for creating regular grids for applications such as color management where the grids must be estimated from non-grid samples. To estimate the regular grid, we propose applying a generalization of linear interpolation, called linear interpolation with maximum entropy (LIME). Evaluating different estimation methods for this problem is difficult and does not correspond to the standard statistical learning paradigm of using iid training and test sets in order to compare algorithms. In this paper we consider the experimental issues and propose considering the end goal of the regular grid in evaluating an estimated grid's value. Preliminary experimental results compare LIME, traditional linear interpolation, linear regression and ridge regression.
Restoration of images with optical aberrations and quantization in a transform domain
Edmund Y. Lam, Michael K. Ng
Digital images generally suffer from two main sources of degradations. The first includes errors introduced in imaging, such as blurring due to optical aberrations and sensor noise. The second includes errors introduced during the processing. One particular example is the quantization noise arising from lossy compression. While image restoration is concerned with the recovery of the object from these degradations, often we only deal with one type of error at a time. In this paper, we present a restoration algorithm that handles images with optical aberrations and quantization in a transform domain. We show that it can be cast in a joint optimization setting, and demonstrate how it can be solved efficiently through alternating minimization. We also prove analytically that the algorithm is globally convergent to a unique solution when the restoration uses either H1-norm or TV-norm regularization. Simulation results show that this joint minimization produces images with smaller relative errors compared to a standard regularization model.
Optimal unsharp mask for image sharpening and noise removal
We consider the problem of restoring a noisy blurred image using an adaptive unsharp mask filter. Starting with a set of very high quality images, we use models for both the blur and the noise to generate a set of degraded images. With these image pairs, we optimally train the strength parameter of the unsharp mask to smooth flat areas of the image and to sharpen areas with detail. We characterize the blur and the noise for a specific hybrid analog/digital imaging system in which the original image is captured on film with a low-cost analog camera. A silver-halide print is made from this negative; and this is scanned to obtain a digital image. Our experimental results for this imaging system demonstrate the superiority of our optimal unsharp mask compared to a conventional unsharp mask with fixed strength.
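The idea of training the unsharp mask strength from pairs of clean and degraded images can be sketched in one dimension. This is a simplified stand-in, not the authors' adaptive filter: it fits a single global strength by least squares, ignores noise, and assumes a [1, 2, 1]/4 binomial kernel as the blur model. All function names and the test signal are hypothetical.

```python
def binomial_blur(signal):
    """[1, 2, 1] / 4 smoothing with edge replication (an assumed blur model)."""
    n = len(signal)
    return [(signal[max(i - 1, 0)] + 2.0 * signal[i] + signal[min(i + 1, n - 1)]) / 4.0
            for i in range(n)]

def unsharp_mask(signal, strength):
    """Classic unsharp mask: add back a scaled copy of the high-pass detail."""
    blurred = binomial_blur(signal)
    return [s + strength * (s - b) for s, b in zip(signal, blurred)]

def fit_strength(clean, degraded):
    """Strength that minimizes ||clean - unsharp_mask(degraded, strength)||^2."""
    detail = [d - b for d, b in zip(degraded, binomial_blur(degraded))]
    residual = [c - d for c, d in zip(clean, degraded)]
    denom = sum(t * t for t in detail)
    if denom == 0.0:
        return 0.0  # no detail to amplify, leave the signal unchanged
    return sum(r * t for r, t in zip(residual, detail)) / denom

def sq_error(a, b):
    """Sum of squared differences between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical training pair: a clean pulse and its blurred version.
clean = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
degraded = binomial_blur(clean)
strength = fit_strength(clean, degraded)
restored = unsharp_mask(degraded, strength)
```

Because the restored signal is linear in the strength, the fit has a closed form. An adaptive filter of the kind described in the abstract would instead choose the strength locally, for instance as a function of local contrast, so that flat areas are smoothed and detailed areas are sharpened.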
A two-stage classifier system for normal mammogram identification
Yajie Sun, Charles F. Babbs, Edward J. Delp III
In this paper, we present a unique two-stage classifier system for identifying normal mammograms. We present methods that extract features from breast regions characterizing normal and cancerous tissue. A subset of the features is used to construct a classifier. This classifier is then used to classify each mammogram as normal or abnormal. We designed a unique two-stage cascading classifier system. A binary decision tree classifier was used in the first stage. Cost constraints can be set to correctly classify cancerous regions. The regions classified as abnormal in the first-stage were used as input to the second-stage classifier, a linear classifier. We will show that the overall performance of our two-stage cascading classifier is better than a single classifier. Results of full-field normal mammogram analysis using this cascading classifier are comparable to a human reader.
Image model: new perspective for image processing and computer vision
We propose a new image model in which the image support and image quantities are modeled using algebraic topology concepts. The image support is viewed as a collection of chains encoding combinations of pixels grouped by dimension and linking different dimensions with the boundary operators. Image quantities are encoded using the notion of a cochain, which associates values with pixels of a given dimension; these values can be scalar, vector, or tensor depending on the problem considered. This allows algebraic equations to be obtained directly from the physical laws. The coboundary and codual operators, which are generic operations on cochains, allow the classical differential operators, as applied to field functions and differential forms, to be formulated in both global and local forms. This image model makes the association between the image support and the image quantities explicit, which results in several advantages: it allows the derivation of efficient algorithms that operate in any dimension and the unification of mathematics and physics to solve classical problems in image processing and computer vision. We show the effectiveness of this model by considering isotropic diffusion.
The likelihood term in restoration of transform-compressed imagery
Compression of imagery by quantization of the data's transform coefficients introduces an error in the imagery upon decompression. When processing compressed imagery, often a likelihood term is used to provide a statistical description of how the observed data are related to the original noise-free data. This work derives the statistical relationship between compressed imagery and the original imagery, which is found to be embodied in a (in general) non-diagonal covariance matrix. Although the derivations are valid for transform coding in general, the work is motivated by considering examples for the specific cases of compression using the discrete cosine transform and the discrete wavelet transform. An example application of motion-compensated temporal filtering is provided to show how the presented likelihood term might be used in a restoration scenario.
Design and optimization of computational imaging systems
A new methodology, called Wavefront Coding, allows the joint optimization of optics, mechanics, detection and signal processing of computational imaging systems. This methodology gives the system designer access to a large design trade space. This trade space can be exploited to enable the design of imaging systems that can image with high quality, with fewer physical components, lighter weight, and less cost compared to traditional optics. This methodology is described through an example conformal single lens IR imaging system. The example system demonstrates a 50% reduction in physical components, and an approximate 45% reduction in weight compared to a traditional two lens system.
Applications of wavefront coded imaging
Ramkumar Narayanswamy, Alan E. Baron, Vladislav Chumachenko, et al.
Imaging systems using aspheric imaging lenses with complementary computation can deliver performance unobtainable in conventional imaging systems. These new imaging systems, termed Wavefront coded imaging systems, use specialized optics to capture a coded image of the scene. Decoding the intermediate image provides the "human-usable" image expected of an imaging system. Computation for the decoding step can be made completely transparent to the user with today's technology. Real-time Wavefront coded systems are feasible and cost-effective. This "computational imaging" technology can be adapted to solve a wide range of imaging problems. Solutions include the ability to provide focus-free imaging, to increase the field of view, to increase the depth of read, to correct for aberrations (even in single lens systems), and to account for assembly and temperature induced misalignment. Wavefront coded imaging has been demonstrated across a wide range of applications, including microscopy, miniature cameras, machine vision systems, infrared imaging systems and telescopes.
Image Modeling
On optimizing knot positions for multidimensional B-spline models
Xiang Deng, Thomas S. Denney Jr.
In this paper, we present a new method for optimizing knot positions for a multi-dimensional B-spline model. Using results from univariate polynomial approximation theory, spline approximation theory and multivariate tensor product theory, we develop the algorithm in three steps. First, we derive a local upper bound for the L∞ error in a multivariate B-spline tensor product approximation over a span. Second, we use this result to bound the approximation error for a multi-dimensional B-spline tensor product approximation. Third, we develop two knot position optimization methods based on the minimization of two global approximation errors: the L∞ global error and the L2 global error. We test our method with 2D surface fitting experiments using B-spline models defined in both 2D Cartesian and polar coordinates. Simulation results demonstrate that the optimized knots can fit a surface more accurately than fixed uniformly spaced knots.
Time-frequency analysis with best local cosine bases
We propose new best basis search algorithms for local cosine dictionaries. We provide several algorithms for dictionaries of various complexity. Our framework generalizes the classical best local cosine basis selection based on a dyadic tree.
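For context, the classical dyadic-tree best basis selection that this abstract says it generalizes can be sketched as a bottom-up comparison of costs. The sketch assumes an additive cost (for example an entropy of the local cosine coefficients) has already been computed for every dyadic interval; the cost values and the `(depth, index)` node labels below are hypothetical, and the paper's new algorithms are not reproduced here.

```python
def best_basis(costs, node=(0, 0), max_depth=2):
    """Return (total_cost, chosen_nodes) for the subtree rooted at `node`.

    `costs` maps (depth, index) to an additive cost for the dyadic interval
    that the node covers. A node is kept whenever splitting it into its two
    children would not lower the total cost.
    """
    depth, index = node
    own = costs[node]
    if depth == max_depth:
        return own, [node]
    left_cost, left_nodes = best_basis(costs, (depth + 1, 2 * index), max_depth)
    right_cost, right_nodes = best_basis(costs, (depth + 1, 2 * index + 1), max_depth)
    if left_cost + right_cost < own:
        return left_cost + right_cost, left_nodes + right_nodes
    return own, [node]

# Hypothetical costs for a three-level dyadic tree.
costs = {
    (0, 0): 10.0,
    (1, 0): 4.0, (1, 1): 5.5,
    (2, 0): 1.0, (2, 1): 2.0, (2, 2): 3.0, (2, 3): 3.0,
}
total, chosen = best_basis(costs)
```

With these costs the left half is split into its two quarter-length intervals while the right half is kept whole, so the selected segmentation is non-uniform. The search visits each node once, which is what makes the dyadic restriction cheap.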
Multigrid Processing Methods
Alternating minimization multigrid algorithms for transmission tomography
The problem of image formation for X-ray transmission tomography is formulated as a statistical inverse problem. The maximum likelihood estimate of the attenuation function is sought. Using convex optimization methods, maximizing the loglikelihood functional is equivalent to a double minimization of I-divergence, one of the minimizations being over the attenuation function. Restricting the minimization over the attenuation function to a coarse grid component forms the basis for a multigrid algorithm that is guaranteed to monotonically decrease the I-divergence at every iteration on every scale.
Registration and Mosaicing
Multiframe demosaicing and super-resolution from undersampled color images
In the last two decades, two related categories of problems have been studied independently in the image restoration literature: super-resolution and demosaicing. A closer look at these problems reveals the relation between them, and as conventional color digital cameras suffer from both low-spatial resolution and color filtering, it is reasonable to address them in a unified context. In this paper, we propose a fast and robust hybrid method of super-resolution and demosaicing, based on a maximum a posteriori (MAP) estimation technique by minimizing a multi-term cost function. The L1 norm is used for measuring the difference between the projected estimate of the high-resolution image and each low-resolution image, removing outliers in the data and errors due to possibly inaccurate motion estimation. Bilateral regularization is used for regularizing the luminance component, resulting in sharp edges and forcing interpolation along the edges and not across them. Simultaneously, Tikhonov regularization is used to smooth the chrominance component. Finally, an additional regularization term is used to force similar edge orientation in different color channels. We show that the minimization of the total cost function is relatively easy and fast. Experimental results on synthetic and real data sets confirm the effectiveness of our method.
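The claim that an L1 data term suppresses outliers can be seen in a very reduced setting. The sketch below considers one high-resolution pixel observed in several registered frames and ignores blur, downsampling, color filtering, and all regularization terms, so it is only an illustration of the norm choice and not the method of the paper. The sample values are hypothetical.

```python
import statistics

def l1_cost(x, samples):
    """Sum of absolute differences between a candidate value and the samples."""
    return sum(abs(x - s) for s in samples)

def l2_cost(x, samples):
    """Sum of squared differences between a candidate value and the samples."""
    return sum((x - s) ** 2 for s in samples)

# Hypothetical values of one pixel in five registered frames; the last frame
# is an outlier, for example caused by a motion estimation error.
samples = [10.0, 11.0, 9.0, 10.5, 60.0]

l1_estimate = statistics.median(samples)  # the median minimizes l1_cost
l2_estimate = statistics.mean(samples)    # the mean minimizes l2_cost
```

The L1 estimate stays near the four consistent frames, while the L2 estimate is pulled well away from them by the single outlier.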
Linear models for multiframe super-resolution restoration under nonaffine registration and spatially varying PSF
Multi-frame super-resolution restoration refers to techniques for still-image and video restoration which utilize multiple observed images of an underlying scene to achieve the restoration of super-resolved imagery. An observation model which relates the measured data to the unknowns to be estimated is formulated to account for the registration of the multiple observations to a fixed reference frame as well as for spatial and temporal degradations resulting from characteristics of the optical system, sensor system and scene motion. Linear observation models, in which the observation process is described by a linear transformation, have been widely adopted. In this paper we consider the application of the linear observation model to multi-frame super-resolution restoration under conditions of non-affine image registration and spatially varying PSF. Reviewing earlier results, we show how these conditions relate to the technique of image warping from the computer graphics literature and how these ideas may be applied to multi-frame restoration. We illustrate the application of these methods to multi-frame super-resolution restoration using a Bayesian inference framework to solve the ill-posed restoration inverse problem.
High-resolution video mosaicing for documents and photos by estimating camera motion
Tomokazu Sato, Sei Ikeda, Masayuki Kanbara, et al.
Digitizing paper documents and photographs has recently become very important for digital archiving and personal data transmission over the internet. Although many people wish to digitize paper documents easily, heavy and large image scanners are currently required to obtain high-quality digitization. To realize easy and high-quality digitization of documents and photographs, we propose a novel digitization method that uses a movie captured by a hand-held camera. In our method, first, the 6-DOF (degree of freedom) position and posture parameters of the mobile camera are estimated in each frame by tracking image features automatically. Next, re-appearing feature points in the image sequence are detected and stitched to minimize accumulated estimation errors. Finally, all the images are merged into a high-resolution mosaic image using the optimized parameters. Experiments have successfully demonstrated the feasibility of the proposed method. Our prototype system can acquire initial estimates of extrinsic camera parameters in real time while capturing images.
Mobile robot control for composition of seamless and high-resolution images in library
Ryuichi Ueda, Toshio Moriya, Chomchana Trevai, et al.
We are developing an assistant robot system for administration of a library. In this system, an autonomous mobile robot obtains images with a camera, and composes seamless and high-resolution images of a bookshelf by using mosaicing and super-resolution techniques. In this paper, we propose a control method for the robot in front of a bookshelf as a part of this system. To obtain images that are suitable for mosaicing, a robot should take images from the same distance and orientation to a bookshelf. Our control method utilizes horizontal edges, which are detected easily in any bookshelf. The robot modifies its orientation with the edge in camera images. We implemented a super-resolution and mosaicing algorithm. Our implementation is simple. However, it can compose a high quality image in an experiment, since the robot obtains preferable images for the image processing.
Geometric Inversion
Color feature and density-based image mosaicing using repeated application of the ICP algorithm
Samuel H Chang, Joseph Fuller, Ali Farsaie, et al.
A Color Feature and Density based (CFD) image mosaicing (IM) algorithm is presented in this paper. In the initial step, color image segmentation is used to provide a global match between an image pair. The well-known Iterated Closest Point (ICP) algorithm is used to find a transformation for global alignment. Finally, the ICP algorithm is applied again to find a transformation for local adjustment. By using this approach, it is shown that we can guarantee global alignment accuracy because the feature based method is used to find the initial matching of two image frames. We achieve local optimal pixel alignment based on the minimization of the Sum of Squares of Differences (SSD) between two images from the same overlapping area.
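The final criterion in this abstract, minimizing the sum of squared differences over the overlapping area, can be sketched with an exhaustive search over integer translations. This is not the CFD or ICP procedure itself: the search strategy, the normalization by overlap size (added here so that small overlaps are not favored), and the synthetic images are all assumptions for the example.

```python
def ssd_overlap(ref, mov, dx, dy):
    """Mean squared difference between ref[y][x] and mov[y - dy][x - dx]
    over the pixels where the two images overlap."""
    total, count = 0.0, 0
    for y in range(len(ref)):
        for x in range(len(ref[0])):
            my, mx = y - dy, x - dx
            if 0 <= my < len(mov) and 0 <= mx < len(mov[0]):
                total += (ref[y][x] - mov[my][mx]) ** 2
                count += 1
    return total / count if count else float("inf")

def best_translation(ref, mov, radius=2):
    """Exhaustively search integer shifts in [-radius, radius] for the lowest cost."""
    shifts = [(dx, dy) for dy in range(-radius, radius + 1)
              for dx in range(-radius, radius + 1)]
    return min(shifts, key=lambda s: ssd_overlap(ref, mov, s[0], s[1]))

# Hypothetical 6x6 reference image and a 5x4 crop of it taken at offset (2, 1).
ref = [[float(x * x + 10 * y * y) for x in range(6)] for y in range(6)]
mov = [[ref[y + 1][x + 2] for x in range(4)] for y in range(5)]

shift = best_translation(ref, mov)
residual = ssd_overlap(ref, mov, shift[0], shift[1])
```

In a pipeline like the one described, a feature-based global match would supply the starting alignment, so a local refinement of this kind only needs to examine a small neighborhood of shifts.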
Using shape distributions as priors in a curve evolution framework
In this paper we propose a framework of constructing and using a shape prior in estimation problems. The key novelty of our technique is a new way to use high level, global shape knowledge to derive a local driving force in a curve evolution context. We capture information about shape in the form of a family of shape distributions (cumulative distribution functions) of features related to the shape. We design a prior objective function that penalizes the differences between model shape distributions and those of an estimate. We incorporate this prior in a curve evolution formulation for function minimization. Shape distribution-based representations are shown to satisfy several desired properties, such as robustness and invariance. They also have good discriminative and generalizing properties. To our knowledge, shape distribution-based representations have only been used for shape classification. Our work represents the development of a tractable framework for their incorporation in estimation problems. We apply our framework to three applications: shape morphing, average shape calculation, and image segmentation.
Statistical-model-based identification of complete vessel-tree frames in coronary angiograms
Til Aach, Alexandru Paul Condurache, Kai Eck, et al.
Coronary angiograms are pre-interventionally recorded moving X-ray images of a patient's beating heart, where the coronary arteries are made visible by a contrast medium. They serve to diagnose, e.g., stenoses, and as roadmaps during the intervention itself. Covering about three to four heart cycles, coronary angiograms consist of three underlying states: inflow, when the contrast medium flows into the vessels, filled state, when the whole vessel tree is visible and outflow, when the contrast medium is washed out. Obviously, only that part of the sequence showing the full vessel tree is useful as a roadmap. We therefore describe methods for automatic identification of these frames. To this end, a vessel map with enhanced vessels and compressed background is first computed. Vessel enhancement is based on the observation that vessels are the locally darkest oriented structures with significant motion. The vessel maps can be regarded as containing two classes, viz. (bright) vessels and (dark) background. From a histogram analysis of each vessel map image, a time-dependent feature curve is computed in which the states inflow, filled state and outflow can already visually be distinguished. We then describe two approaches to segment the feature curve into these states: the first method models the observations in each state by a polynomial, and seeks the segmentation which allows the best fit of three polynomials as measured by a Maximum-Likelihood criterion. The second method models the state sequence by a Hidden Markov model, and estimates it using the Maximum a Posteriori (MAP) criterion. We will present results for a number of angiograms recorded in clinical routine.
Shape reconstruction of flexible objects from monocular images for industrial applications
In this paper, we will present a novel method for shape reconstruction of flexible objects, such as rubber-tubes, from monocular images. We understand shape as the three-dimensional position of a tube model in world space. Model knowledge that is available through CAD-data is used to infer parameters for an active contour algorithm "snake". Unlike traditional image-based snakes, our active contour algorithm optimizes fully three-dimensional tube-models in world space by projecting a 3d representation of itself onto the image plane. Using a novel method to estimate a 3d tangent of a curve by means of differential texture distortion, we exploit information from the monocular image that is not used in traditional edge-based active contour methods. Integrating both model and image information as energy terms into the active contour algorithm the 3d position of the tube is iteratively refined until an optimum shape of the tube is found.
A new flexible parameterization for the estimation of 3D shape structure from scattered field data
A wide range of applied imaging problems are concerned with the determination of the three dimensional structure of anomalous areas in a larger field of regard. In the context of medical imaging, there is great interest in the characterization of cancerous tumors using non-ionizing modalities such as diffuse optical tomography or quantitative ultrasonic imaging. In this work, we introduce a new, flexible approach to the modeling and estimation of 3D shapes. A complex three dimensional volume is defined by a set of 2D shape "primitives" representing the cross section of the object in essentially arbitrary planes. Each primitive is itself a 2D shape (specifically an ellipse for this paper) the structure of which is easily defined by a low dimensional vector of parameters (center location, axis lengths, and orientation angles). Given a set of primitives, we devise an interpolation scheme that correlates the structure of the individual primitives from one to the next. A nonlinear estimation algorithm is described for determining the parameters of our elliptic primitives i.e., the location of the centers in 3D, the lengths of their axes, and their orientation in space. Simulated results show the effectiveness of this method.
Subspace-based analysis of the ERT inverse problem
In a previous work, we proposed a source-type formulation to the electrical resistance tomography (ERT) problem. Specifically, we showed that inhomogeneities in the medium can be viewed as secondary sources embedded in the homogeneous background medium and located at positions associated with variation in electrical conductivity. Assuming a piecewise constant conductivity distribution, the support of equivalent sources is equal to the boundary of the inhomogeneity. The estimation of the anomaly shape takes the form of an inverse source-type problem. In this paper, we explore the use of subspace methods to localize the secondary equivalent sources associated with discontinuities in the conductivity distribution. Our first alternative is the multiple signal classification (MUSIC) algorithm which is commonly used in the localization of multiple sources. The idea is to project a finite collection of plausible pole (or dipole) sources onto an estimated signal subspace and select those with largest correlations. In ERT, secondary sources are excited simultaneously but in different ways, i.e. with distinct amplitude patterns, depending on the locations and amplitudes of primary sources. If the number of receivers is "large enough", different source configurations can lead to a set of observation vectors that span the data subspace. However, since sources that are spatially close to each other have highly correlated signatures, separation of such signals becomes very difficult in the presence of noise. To overcome this problem we consider iterative MUSIC algorithms like R-MUSIC and RAP-MUSIC. These recursive algorithms pose a computational burden as they require multiple large combinatorial searches. Results obtained with these algorithms using simulated data of different conductivity patterns are presented.
Poster Session
Computational image processing for a computer vision system using biomimetic sensors and eigenspace object models
Cameron H. G. Wright, Steven F. Barrett, Daniel J. Pack, et al.
Two challenges to an effective, real-world computer vision system are speed and reliable object recognition. Traditional computer vision sensors such as CCD arrays take considerable time to transfer all the pixel values for each image frame to a processing unit. One way to bypass this bottleneck is to design a sensor front-end which uses a biologically-inspired analog, parallel design that offers preprocessing and adaptive circuitry that can produce edge maps in real-time. This biomimetic sensor is based on the eye of the common house fly (Musca domestica). Additionally, this sensor has demonstrated an impressive ability to detect objects at subpixel resolution. However, the format of the image information provided by such a sensor is not a traditional bitmap transfer of the image format and, therefore, requires novel computational manipulations to make best use of this sensor output. The real-world object recognition challenge is being addressed by using a subspace method which uses eigenspace object models created from multiple reference object appearances. In past work, the authors have successfully demonstrated image object recognition techniques for surveillance images of various military targets using such eigenspace appearance representations. This work, which was later extended to partially occluded objects, can be generalized to a wide variety of object recognition applications. The technique is based upon a large body of eigenspace research described elsewhere. Briefly described, the technique creates target models by collecting a set of target images and finding a set of eigenvectors that span the target image space. Once the eigenvectors are found, an eigenspace model (also called a subspace model) of the target is generated by projecting target images on to the eigenspace. New images to be recognized are then projected on to the eigenspace for object recognition. 
For occluded objects, we project the image on to reduced dimensional subspaces of the original eigenspace (i.e., a “subspace of a subspace” or a “sub-eigenspace”). We then measure how close a match we can achieve when the occluded target image is projected on to a given sub-eigenspace. We have found that this technique can result in significantly improved recognition of occluded objects. In order to manage the combinatorial “explosion” associated with selecting the number of subspaces required and then projecting images on to those sub-eigenspaces for measurement, we use a variation on the A* (called “A-star”) search method. The challenge of tying these two subsystems (the biomimetic sensor and the subspace object recognition module) together into a coherent and robust system is formidable. It requires specialized computational image and signal processing techniques that will be described in this paper, along with preliminary results. The authors believe that this approach will result in a fast, robust computer vision system suitable for the non-ideal real-world environment.
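The projection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`build_eigenspace`, `residual_distance`), the power-iteration eigen-solver, and the toy four-pixel "images" are all assumptions made for illustration. It builds an eigenspace from reference vectors and scores a new vector by its distance from that subspace.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_eigenvectors(cov, k, iters=200):
    """Leading eigenvectors of a symmetric matrix (power iteration + deflation)."""
    d = len(cov)
    cov = [row[:] for row in cov]            # copy, because deflation modifies it
    basis = []
    for _ in range(k):
        v = [1.0 / math.sqrt(d)] * d         # fixed start vector (assumption)
        for _step in range(iters):
            w = [dot(row, v) for row in cov]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:                 # no variance left in this matrix
                break
            v = [x / norm for x in w]
        lam = dot(v, [dot(row, v) for row in cov])
        if lam < 1e-12:
            break
        basis.append(v)
        for i in range(d):                   # deflate: remove the found component
            for j in range(d):
                cov[i][j] -= lam * v[i] * v[j]
    return basis

def build_eigenspace(images, k):
    """Return (mean image, list of up to k eigenvectors) for flattened images."""
    n, d = len(images), len(images[0])
    mean = [sum(img[i] for img in images) / n for i in range(d)]
    centered = [[img[i] - mean[i] for i in range(d)] for img in images]
    cov = [[sum(c[i] * c[j] for c in centered) for j in range(d)] for i in range(d)]
    return mean, top_eigenvectors(cov, k)

def residual_distance(model, image):
    """Distance between an image and its projection onto the eigenspace."""
    mean, basis = model
    centered = [x - m for x, m in zip(image, mean)]
    coeffs = [dot(centered, v) for v in basis]
    recon = [sum(c * v[i] for c, v in zip(coeffs, basis)) for i in range(len(mean))]
    return math.sqrt(sum((x - r) ** 2 for x, r in zip(centered, recon)))

# Toy reference set: all variation lies in the first two "pixels".
training = [[1, 0, 0, 0], [-1, 0, 0, 0], [0, 2, 0, 0], [0, -2, 0, 0]]
model = build_eigenspace(training, k=2)
in_class = residual_distance(model, [3.0, 5.0, 0.0, 0.0])   # lies in the eigenspace
out_class = residual_distance(model, [0.0, 0.0, 4.0, 0.0])  # orthogonal to it
```

A small residual suggests the new image is well explained by the target model. Restricting `basis` to a subset of the eigenvectors would correspond loosely to the sub-eigenspace projection mentioned for occluded objects; the A*-style search over such subsets is not sketched here.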
Automatic road extraction based on cross detection in suburb
Go Koutaki, Keiichi Uchimura
The importance of acquiring geographic map data and updating existing data is increasing, and the automation of road extraction from aerial imagery has received attention. Many approaches have been considered in the past; however, existing automatic road extraction methods still need too much post-editing. In this paper, we propose a method for automatic road extraction from high-resolution color aerial images based on information such as the position and direction of road intersections. For road shape recognition, we use an active contour model, which is a kind of deformable shape model. The active contour model with a width parameter (called Ribbon Snakes) is useful as a technique for extracting the road form; however, the method has the problem of how to generate an initial contour. We generate an initial contour using the result of road tracking. Road tracking is performed using information such as the position and direction of road intersections. To detect road intersections, we use template matching with cross-shaped templates. We report experiments using high-resolution (0.5 m per pixel) color aerial imagery of a suburban residential area.
Performance analysis of color spaces for optimally fitting the active shape model
Tracking non-rigid objects such as people in video sequences is a daunting task due to computational complexity and unstable performance. Special considerations for digital image processing are required when an object of interest changes its shape between consecutive frames. Traditionally, active shape models (ASMs) have not included color information in their formulation. We present several extensions of the ASM for color images using different color-adapted objective functions. We also analyze the performance of color ASMs in the RGB, YUV, and HIS color spaces.
Global computational algebraic topology approach for diffusion
One physical process involved in many computer vision problems is heat diffusion. The partial differential equations (PDEs) that describe it are continuous and have to be discretized by some technique, usually a mathematical process such as finite differences or finite elements. The continuous domain is subdivided into sub-domains, each of which holds a single value. The diffusion equation derives from energy conservation and is therefore valid on a whole domain. We use this global equation directly instead of discretizing the PDE obtained from it by a limit process. To encode these physical global values over pixels of different dimensions, we use a computational algebraic topology (CAT)-based image model. This model was proposed by Ziou and Allili and has been used for the deformation of curves and for optical flow. It introduces the image support as a decomposition in terms of points, edges, surfaces, volumes, etc., so that images of any dimension can be handled. After decomposing the physical principles of heat transfer into basic laws, we recall the CAT-based image model and use it to encode the basic laws. We then present experimental results for nonlinear gray-level diffusion for denoising that preserves thin features.
Dynamic region-of-interest acquisition and face tracking for intelligent surveillance system
Recently, surveillance systems have attracted more attention than simple CCTV systems, especially for complicated security environments. The major purpose of the proposed system is to monitor and track intruders. More specifically, accurate identification of each intruder is more important than simply recording what they are doing. Most existing surveillance systems simply keep recording a fixed viewing area, and some others adopt tracking techniques for wider coverage. Although panning and tilting the camera can extend the viewing area, only a few automatic zoom control techniques for acquiring the optimum region of interest (ROI) have been proposed. This paper describes a system for tracking multiple faces in input video sequences using convex hull-based facial segmentation and a robust Hausdorff distance. The proposed algorithm adapts a skin color reference map in the YCbCr color space and a hair color reference map in the RGB color space for classifying face regions. Then, we obtain an initial face model with preprocessing and a convex hull. For tracking, the algorithm computes the displacement of the point set between frames using a robust Hausdorff distance, and the best possible displacement is selected. Finally, the initial face model is updated using the displacement. We provide experimental results to demonstrate the performance of the proposed tracking algorithm, which efficiently tracks rotating and zooming faces as well as multiple faces in video sequences obtained from a CCD camera.
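The displacement search described in this abstract can be sketched as follows. The abstract does not say which robust Hausdorff variant is used, so this sketch assumes the rank-based (partial) directed Hausdorff distance, in which the largest distances are ignored. The function names, the `frac` parameter, and the integer search window are hypothetical.

```python
import math

def partial_directed_hausdorff(A, B, frac=0.8):
    """Rank-based directed Hausdorff distance from point set A to point set B.

    Instead of the maximum nearest-neighbour distance, the value at the
    given fraction of the sorted distances is returned, so a few outlying
    points in A do not dominate the score.
    """
    nearest = sorted(min(math.dist(a, b) for b in B) for a in A)
    index = max(0, math.ceil(frac * len(nearest)) - 1)
    return nearest[index]

def best_displacement(model, frame_points, search=3, frac=0.8):
    """Try every integer shift in a small window and keep the best-scoring one."""
    best_score, best_shift = None, None
    for dx in range(-search, search + 1):
        for dy in range(-search, search + 1):
            shifted = [(x + dx, y + dy) for x, y in model]
            score = partial_directed_hausdorff(shifted, frame_points, frac)
            if best_score is None or score < best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift, best_score

# Four corners of a face outline plus one spurious model point.
model = [(0, 0), (4, 0), (4, 4), (0, 4), (20, 20)]
# In the next frame the outline has moved by (2, -1).
frame = [(2, -1), (6, -1), (6, 3), (2, 3)]
shift, score = best_displacement(model, frame)
```

With `frac=0.8` the single spurious model point is ignored, so the search recovers the shift of the four genuine points. A plain maximum-based Hausdorff distance would be dominated by that outlier.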
Frame interpolation of ultrasound images using optical flow
In this paper, we present a frame rate up-conversion method for ultrasound image enhancement. The inherent flexibility of ultrasound imaging and its moderate cost without known bio-effects give ultrasound a vital role in the diagnostic process compared with other methods. The conventional mechanical scan method for multi-planar images has a slow frame rate. In the proposed frame rate up-conversion method, new interpolated frames are inserted between two input frames, giving smooth renditions to human eyes. Existing methods that employ blockwise motion estimation, in which motion vectors are estimated using a block-matching algorithm (BMA), show block artifacts. We propose an optical flow-based method that finds pixelwise intensity changes and yields more accurate motion estimates for frame interpolation. Consequently, the proposed method can provide detailed and improved images without block artifacts. Interpolated frames may contain holes or overlapped regions due to covered or uncovered areas in motion compensation. Those regions can be easily eliminated by post-processing in which the similarity of pixel intensity is employed with a ray casting-based method. Experimental results with several sets of ultrasound image sequences show the effectiveness of the proposed method.
Image denoising via fundamental anisotropic diffusion and wavelet shrinkage: a comparative study
Noise removal faces a challenge: keeping the image details. Resolving the dilemma of two purposes (smoothing and keeping image features intact) working against each other was an almost impossible task until anisotropic diffusion (AD) was formally introduced by Perona and Malik (PM). AD favors intra-region smoothing over inter-region smoothing in piecewise smooth images. Many authors have regularized the original PM algorithm to overcome its drawbacks. We compared the performance of denoising using such 'fundamental' AD algorithms and one of the most powerful multiresolution tools available today, namely, wavelet shrinkage. The AD algorithms here are called 'fundamental' in the sense that the regularized versions center around the original PM algorithm with minor changes to the logic. The algorithms are tested with different noise types and levels. In addition to visual inspection, two mathematical metrics are used for performance comparison: signal-to-noise ratio (SNR) and universal image quality index (UIQI). We conclude that some of the regularized versions of the PM algorithm (AD) perform comparably with wavelet shrinkage denoising. This saves a lot of computational power. With this conclusion, we applied the better-performing fundamental AD algorithms to a new imaging modality: optical coherence tomography (OCT).
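As a rough illustration of the intra-region versus inter-region behavior described above, the following sketch applies the basic Perona-Malik update with an exponential conduction function on a 4-neighbor grid. It is not one of the regularized variants compared in the paper, and the parameter values (`kappa`, `lam`, the iteration count) and the toy image are assumptions chosen only to make the effect visible.

```python
import math

def perona_malik(image, iterations=10, kappa=10.0, lam=0.2):
    """Basic Perona-Malik diffusion on a 2D list of gray levels.

    kappa sets the gradient magnitude treated as an edge; lam is the step
    size (at most 0.25 for stability with four neighbours).
    """
    rows, cols = len(image), len(image[0])
    current = [[float(v) for v in row] for row in image]
    for _ in range(iterations):
        updated = [row[:] for row in current]
        for r in range(rows):
            for c in range(cols):
                flux = 0.0
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        diff = current[rr][cc] - current[r][c]
                        # Conduction is near 1 for small differences and
                        # near 0 across strong edges.
                        flux += math.exp(-(diff / kappa) ** 2) * diff
                updated[r][c] = current[r][c] + lam * flux
        current = updated
    return current

# Left half dark, right half bright, with one noisy pixel in the dark region.
image = [[0.0] * 3 + [100.0] * 3 for _ in range(6)]
image[2][1] = 4.0
smoothed = perona_malik(image)
```

In this toy case the isolated perturbation is spread out within the dark region, while the step between the two halves stays essentially untouched because the conduction term across a difference of 100 is negligible for `kappa = 10`.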
Computational Image Processing
A Bayesian approach to filter design: detection of compact sources
Marcos Lopez-Caniego, Diego Herranz, Rita Belen Barreiro, et al.
We consider filters for the detection and extraction of compact sources on a background. We make a one-dimensional treatment (though a generalization to two or more dimensions is possible), assuming that the sources have a Gaussian profile whereas the background is modeled by a homogeneous and isotropic Gaussian random field characterized by a scale-free power spectrum. Local peak detection is used after filtering. Then, a Bayesian generalized Neyman-Pearson test is used to define the region of acceptance, which includes not only the amplification but also the curvature of the sources and the a priori probability distribution function of the sources. We search for an optimal filter among a family of matched-type filters (MTF), modifying the filtering scale such that it gives the maximum number of real detections for a fixed number density of spurious sources. We have performed numerical simulations to test the theoretical ideas.
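The filter-then-detect pipeline in this abstract can be sketched in one dimension as follows. This is only the simplest member of such a family: a plain matched filter, which is appropriate for a white background, followed by thresholded local peak detection. The Bayesian acceptance region based on amplification and curvature is not implemented. The function names, the deterministic ripple used as a stand-in for the random background, and the threshold are assumptions.

```python
import math

def gaussian_template(sigma, half_width):
    """Sampled Gaussian source profile centred on the middle tap."""
    return [math.exp(-(k * k) / (2.0 * sigma * sigma))
            for k in range(-half_width, half_width + 1)]

def matched_filter(signal, template):
    """Correlate with the template, normalised so a source keeps its amplitude."""
    half = len(template) // 2
    energy = sum(t * t for t in template)
    filtered = []
    for i in range(len(signal)):
        acc = 0.0
        for k, t in enumerate(template):
            j = i + k - half
            if 0 <= j < len(signal):         # samples outside the signal count as zero
                acc += t * signal[j]
        filtered.append(acc / energy)
    return filtered

def local_peaks(filtered, threshold):
    """Indices of local maxima that exceed the detection threshold."""
    return [i for i in range(1, len(filtered) - 1)
            if filtered[i] > threshold
            and filtered[i] > filtered[i - 1]
            and filtered[i] >= filtered[i + 1]]

sigma, amplitude, position = 2.0, 5.0, 30
signal = [0.3 * math.sin(1.7 * i)
          + amplitude * math.exp(-((i - position) ** 2) / (2 * sigma ** 2))
          for i in range(64)]
filtered = matched_filter(signal, gaussian_template(sigma, half_width=6))
detections = local_peaks(filtered, threshold=2.5)
```

Changing `sigma` in the template while keeping the source width fixed gives a crude way to explore how the filtering scale affects the number of peaks above threshold, which is the quantity the abstract optimizes.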
Multigrid Processing Methods
Grouping and segmentation in a hierarchy of graphs
We review multilevel hierarchies under the special aspect of their potential for segmentation and grouping. The one-to-one correspondence between salient image features and salient model features is a limiting assumption that makes prototypical or generic object recognition impossible. The regions' internal properties (color, texture, shape, ...) help to identify them, and their external relations (adjacency, inclusion, similarity of properties) are used to build groups of regions having a particular consistent meaning in a more abstract context. Low-level cue image segmentation in a bottom-up way cannot and should not produce a complete final "good" segmentation. We present a hierarchical partitioning of images using a pairwise similarity function on a graph-based representation of an image. This function measures the difference along the boundary of two components relative to a measure of the components' internal differences. Two components are merged if there is a low-cost connection between them. We use this idea to find region borders quickly and effortlessly in a bottom-up way, based on local differences in a specific feature. The aim of this paper is to build a minimum weight spanning tree (MST) in order to find region borders quickly in a bottom-up 'stimulus-driven' way based on local differences in a specific feature.
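The merge rule in this abstract (boundary difference compared against internal differences) can be sketched with a union-find structure. The sketch processes edges in Kruskal order and uses the merging threshold of Felzenszwalb and Huttenlocher, internal difference plus `k / size`; both choices are assumptions and do not reproduce the hierarchy construction of the paper. The function name `segment` and the parameter `k` are hypothetical.

```python
def segment(num_nodes, edges, k=5.0):
    """Merge graph components along low-cost edges.

    edges is a list of (weight, u, v). Two components are merged when the
    connecting edge is no heavier than either component's internal
    difference (its largest merged edge so far) plus k / component size.
    """
    parent = list(range(num_nodes))
    size = [1] * num_nodes
    internal = [0.0] * num_nodes             # largest MST edge inside each component

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path halving
            x = parent[x]
        return x

    for weight, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        limit = min(internal[ru] + k / size[ru], internal[rv] + k / size[rv])
        if weight <= limit:
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
            internal[ru] = weight            # edges arrive in ascending order
    return [find(x) for x in range(num_nodes)]

# A 1D "image" with two flat regions; edges link neighbouring pixels.
values = [10, 11, 10, 12, 50, 51, 49, 50]
edges = [(abs(values[i] - values[i + 1]), i, i + 1) for i in range(len(values) - 1)]
labels = segment(len(values), edges)
```

The edges accepted by the rule form a subset of a minimum spanning tree of the graph, and the rejected heavy edge between pixels 3 and 4 marks the region border.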
Graph pyramids as models of human problem solving
Zygmunt Pizlo, Zheng Li
Prior theories have assumed that human problem solving involves estimating distances among states and performing search through the problem space. The role of mental representation in those theories was minimal. Results of our recent experiments suggest that humans are able to solve some difficult problems quickly and accurately. Specifically, in solving these problems humans do not seem to rely on distances or on search. It is quite clear that producing good solutions without performing search requires a very effective mental representation. In this paper we concentrate on studying the nature of this representation. Our theory takes the form of a graph pyramid. To verify the psychological plausibility of this theory we tested subjects in a Euclidean Traveling Salesman Problem in the presence of obstacles. The role of the number and size of obstacles was tested for problems with 6-50 cities. We analyzed the effect of experimental conditions on solution time per city and on solution error. The main result is that time per city is systematically affected only by the size of obstacles, but not by their number, or by the number of cities.
Physics-Based Inverse Methods II
Noise reduction and 3D visualization of confocal microscopy images
Yinlong Sun, Qiqi Wang, Haiying Xu, et al.
In confocal microscopy images, a common observation is that lower image stacks have lower voxel intensities and are usually blurred in comparison with the upper ones. The key reasons are light absorption and scattering by the objects and particles in the volume through which light passes. This paper proposes a new technique to reduce such noise effects by means of adaptive intensity compensation and an image-sharpening algorithm. With these image-processing procedures, advanced 3D volume-rendering techniques can be applied to more faithfully visualize confocal microscopy images.
Poster Session
Digital image interpolation using adaptive Gaussian basis functions
Digital image interpolation using Gaussian radial basis functions has been implemented by several investigators, and promising results have been obtained; however, determining the basis function variance has been problematic. Here, adaptive Gaussian basis functions fit the mean vector and covariance matrix of a non-radial Gaussian function to each pixel and its neighbors, which enables edges and other image characteristics to be more effectively represented. The interpolation is constrained to reproduce the original image mean gray level, and the mean basis function variance is determined using the expected image smoothness for the increased resolution. Test outputs from the resulting Adaptive Gaussian Interpolation algorithm are presented and compared with classical interpolation techniques.