Proceedings Volume 6496

Real-Time Image Processing 2007


View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 7 February 2007
Contents: 6 Sessions, 23 Papers, 0 Presentations
Conference: Electronic Imaging 2007
Volume Number: 6496

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 6496
  • Surveillance
  • Algorithms
  • Video and Compression
  • Hardware
  • Poster Session
Front Matter: Volume 6496
Front Matter: Volume 6496
This PDF file contains the front matter associated with SPIE Proceedings Volume 6496, including the Title Page, Copyright information, Table of Contents, and the Conference Committee listing.
Surveillance
Two-dimensional statistical linear discriminant analysis for real-time robust vehicle-type recognition
I. Zafar, E. A. Edirisinghe, S. Acar, et al.
Automatic vehicle Make and Model Recognition (MMR) systems provide useful performance enhancements to vehicle recognition systems that are based solely on Automatic License Plate Recognition (ALPR). Several car MMR systems have been proposed in the literature. However, these approaches are based on feature detection algorithms that can perform sub-optimally under adverse lighting and/or occlusion conditions. In this paper we propose a real-time, appearance-based car MMR approach using Two-Dimensional Linear Discriminant Analysis (2D-LDA) that is capable of addressing this limitation. We provide experimental results to analyse the proposed algorithm's robustness under varying illumination and occlusion conditions. We show that the best performance with the proposed 2D-LDA based car MMR approach is obtained when the eigenvectors of lower significance are ignored. For the given database of 200 car images of 25 different make-model classifications, a best accuracy of 91% was obtained with the 2D-LDA approach. We use a direct Principal Component Analysis (PCA) based approach as a benchmark to compare and contrast the performance of the proposed 2D-LDA approach to car MMR. We conclude that in general the 2D-LDA based algorithm surpasses the performance of the PCA based approach.
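The 2D-LDA projection described above can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the use of a pseudo-inverse, the scatter-matrix construction over image columns, and the nearest-class-mean classifier are our assumptions.

```python
import numpy as np

def twod_lda(images, labels, k):
    """2D-LDA sketch: scatter matrices are built directly from image
    matrices (w x w for h x w images) instead of vectorized images,
    keeping the eigenproblem small. Retaining only the k most significant
    eigenvectors mirrors the abstract's observation that dropping
    low-significance eigenvectors improves accuracy."""
    X = np.asarray(images, dtype=float)
    y = np.asarray(labels)
    mean_all = X.mean(axis=0)
    w = X.shape[2]
    Sb = np.zeros((w, w))
    Sw = np.zeros((w, w))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        d = mc - mean_all
        Sb += len(Xc) * d.T @ d           # between-class scatter
        for img in Xc:
            e = img - mc
            Sw += e.T @ e                 # within-class scatter
    vals, vecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(-vals.real)[:k]
    return vecs[:, order].real            # (w, k) projection matrix

def classify(img, W, class_means):
    # nearest class mean in the projected feature space (an assumed rule)
    f = np.asarray(img, dtype=float) @ W
    return min(class_means,
               key=lambda c: np.linalg.norm(f - class_means[c] @ W))
```

A usage sketch would train `W = twod_lda(train_images, train_labels, k)` once and then classify each detected vehicle image with `classify`.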
Determination of vehicle density from traffic images at day and nighttime
In this paper we extend our previous work [1] to address vehicle differentiation in traffic density computations. The main goal of this work is to create a vehicle density history for given roads under different weather or light conditions and at different times of the day. Vehicle differentiation is important to account for connected or otherwise long vehicles, such as trucks or tankers, which lead to over-counting with the original algorithm. Average vehicle size in pixels, given the magnification within the field of view for a particular camera, is used to separate regular cars from long vehicles. A separate algorithm and procedure have been developed to determine traffic density after dark, when vehicle headlights are turned on. Nighttime vehicle recognition utilizes blob analysis based on head/taillight images: the high-intensity vehicle lights are identified in binary images for nighttime vehicle detection. The stationary traffic image frames are downloaded from the internet as they are updated. The procedures are implemented in MATLAB. The results of both the nighttime traffic density and the daytime long-vehicle identification algorithms are described in this paper. The determination of nighttime traffic density and the identification of long vehicles in daytime are improvements over the original work [1].
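The nighttime blob-analysis step can be illustrated with a minimal sketch: threshold the frame to isolate bright head/taillights, then count connected bright regions. The threshold and minimum blob size below are placeholders, not values from the paper.

```python
from collections import deque

def headlight_blobs(frame, thresh=200, min_size=2):
    """Count bright blobs (candidate head/taillights) in a grayscale frame.

    frame: 2D list of pixel intensities. Returns the number of 4-connected
    regions at or above `thresh` containing at least `min_size` pixels.
    """
    h, w = len(frame), len(frame[0])
    seen = [[False] * w for _ in range(h)]
    blobs = 0
    for y in range(h):
        for x in range(w):
            if frame[y][x] >= thresh and not seen[y][x]:
                # flood-fill one bright connected component
                size, q = 0, deque([(y, x)])
                seen[y][x] = True
                while q:
                    cy, cx = q.popleft()
                    size += 1
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and not seen[ny][nx]
                                and frame[ny][nx] >= thresh):
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if size >= min_size:
                    blobs += 1
    return blobs
```

A density estimate would then pair nearby blobs into vehicles, a step this sketch omits.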
Dual camera system for acquisition of high resolution images
Video surveillance is ubiquitous in modern society, but surveillance cameras are severely limited in utility by their low resolution. With this in mind, we have developed a system that can autonomously take high-resolution still-frame images of moving objects. To do this, we combine a low-resolution video camera and a high-resolution still-frame camera mounted on a pan/tilt mount. To determine what should be photographed (objects of interest), we employ a hierarchical method which first separates foreground from background using a temporal median filtering technique. We then use a feed-forward neural network classifier on the foreground regions to determine whether the regions contain the objects of interest. This is done over several frames, and a motion vector is deduced for the object. The pan/tilt mount then focuses the high-resolution camera on the next predicted location of the object, and an image is acquired. All components are controlled through a single MATLAB graphical user interface (GUI). The final system is able to detect multiple moving objects simultaneously, track them, and acquire high-resolution images of them. Results demonstrate tracking and imaging performance for varying numbers of objects moving at different speeds.
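The temporal median-filtering foreground separation might look like the following minimal sketch; the difference threshold `tau` is an assumed parameter, not one reported in the abstract.

```python
def temporal_median_background(frames):
    """Per-pixel temporal median over a buffer of frames: moving objects
    occupy any given pixel only briefly, so the median recovers the
    static background."""
    h, w = len(frames[0]), len(frames[0][0])
    m = len(frames) // 2
    return [[sorted(f[y][x] for f in frames)[m] for x in range(w)]
            for y in range(h)]

def foreground_mask(frame, background, tau=30):
    # pixels far from the background model are foreground candidates
    return [[abs(p - b) > tau for p, b in zip(fr, br)]
            for fr, br in zip(frame, background)]
```

The foreground regions would then be passed to the neural network classifier, and the object's motion vector estimated from mask centroids over several frames.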
Camera position estimation method based on matching of top-view images for running vehicle
Tomoaki Teshima, Hideo Saito, Shinji Ozawa, et al.
In this paper, a method is proposed that estimates the trajectory of a vehicle from a single vehicle-mounted camera. It is a model-based method which assumes that the vehicle is driving on a planar road. Each input image is converted to a Top-View image and registered against the next Top-View image. The registration is performed for an assumed velocity parameter and repeated over all candidate parameters. A simple model and a particle filter are introduced to decrease the computational cost: the simple model constrains the registration of the Top-View images, and the particle filter reduces the number of candidate parameters. The position of the camera is obtained by accumulating the velocity parameters. Experiments show three results: a sufficient reduction of the computational cost, a suitable estimated trajectory, and a computation time small enough to estimate the trajectory of the vehicle.
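Registration of consecutive Top-View images over a set of candidate velocity parameters can be sketched as a sum-of-squared-differences search. This is a simplification: motion is reduced to a pure forward row shift, and the particle filter that prunes the candidate set is omitted.

```python
def estimate_forward_shift(prev, cur, candidates):
    """Score each candidate forward shift (in Top-View image rows) by the
    mean squared difference over the overlap of the two registered
    Top-View images, and keep the best-matching candidate."""
    best, best_err = None, float("inf")
    rows = len(prev)
    for s in candidates:
        err, n = 0.0, 0
        for y in range(rows - s):            # overlap after shifting by s
            for a, b in zip(prev[y + s], cur[y]):
                err += (a - b) ** 2
                n += 1
        if n == 0:
            continue                         # candidate shift too large
        err /= n
        if err < best_err:
            best, best_err = s, err
    return best
```

Accumulating the per-frame estimates would then give the camera trajectory, as in the paper's final step.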
Algorithms
Print-from-video: computationally efficient outlier reduction pattern filtering
Print-from-video can be achieved via super-resolution techniques, which combine information from multiple low-resolution images to generate a high-resolution image. Due to inaccuracies in sub-pixel motion estimation and motion modeling, undesired artifacts or outliers are produced when using such techniques. This paper discusses the use of the direct approach for the print-from-video application and introduces an outlier reduction algorithm, named pattern filtering, as part of the super-resolution reconstruction process. The introduced algorithm is non-iterative, making it computationally efficient for deployment on digital camera platforms.
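The direct (shift-and-add) super-resolution approach that the paper builds on can be sketched as follows. The pattern-filtering outlier reduction itself is not reproduced here, and known integer sub-pixel shifts on the high-resolution grid are an assumption for illustration.

```python
def shift_and_add(lr_frames, shifts, scale):
    """Direct super-resolution sketch: place each low-resolution pixel
    onto the high-resolution grid at its estimated offset, then average
    the contributions landing on each high-resolution pixel."""
    h, w = len(lr_frames[0]), len(lr_frames[0][0])
    H, W = h * scale, w * scale
    acc = [[0.0] * W for _ in range(H)]
    cnt = [[0] * W for _ in range(H)]
    for frame, (dy, dx) in zip(lr_frames, shifts):
        for y in range(h):
            for x in range(w):
                Y, X = y * scale + dy, x * scale + dx
                if 0 <= Y < H and 0 <= X < W:
                    acc[Y][X] += frame[y][x]
                    cnt[Y][X] += 1
    # average; holes (no contribution) are left at zero in this sketch
    return [[acc[y][x] / cnt[y][x] if cnt[y][x] else 0.0
             for x in range(W)] for y in range(H)]
```

Motion-estimation errors show up here as pixels accumulated at the wrong grid position; the paper's pattern filtering would suppress such outliers before or during this accumulation.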
A fast contour descriptor algorithm for supernova image classification
Cecilia R. Aragon, David Bradburn Aragon
We describe a fast contour descriptor algorithm and its application to a distributed supernova detection system (the Nearby Supernova Factory) that processes 600,000 candidate objects in 80 GB of image data per night. Our shape-detection algorithm reduced the number of false positives generated by the supernova search pipeline by 41% while producing no measurable impact on running time. Fourier descriptors are an established method of numerically describing the shapes of object contours, but transform-based techniques are ordinarily avoided in this type of application due to their computational cost. We devised a fast contour descriptor implementation for supernova candidates that meets the tight processing budget of the application. Using the lowest-order descriptors (F1 and F-1) and the total variance in the contour, we obtain one feature representing the eccentricity of the object and another denoting its irregularity. Because the number of Fourier terms to be calculated is fixed and small, the algorithm runs in linear time, rather than the O(n log n) time of an FFT. Constraints on object size allow further optimizations so that the total cost of producing the required contour descriptors is about 4n addition/subtraction operations, where n is the length of the contour.
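Because only F1 and F-1 are needed, the descriptors can be computed in a single O(n) pass over the contour rather than with an FFT. The sketch below uses a straightforward complex sum (the paper's 4n add/subtract optimization is not reproduced):

```python
import cmath

def low_order_descriptors(contour):
    """Compute only the lowest-order Fourier descriptors F(+1) and F(-1)
    of a closed contour in one pass. For a fixed, small set of k this is
    O(n), avoiding the O(n log n) cost of a full FFT."""
    n = len(contour)
    z = [complex(x, y) for x, y in contour]   # contour as complex samples
    out = {}
    for k in (1, -1):
        out[k] = sum(zj * cmath.exp(-2j * cmath.pi * k * j / n)
                     for j, zj in enumerate(z)) / n
    return out[1], out[-1]
```

For a circular contour |F1| equals the radius and |F-1| vanishes, so the ratio of the two magnitudes serves as an eccentricity-style feature.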
Fast distance transform computation using dual scan line propagation
Fatih Porikli, Tekin Kocak
We present two fast algorithms that approximate the distance transformation of 2D binary images. Distance transformation finds the minimum distances of all data points from a set of given object points; however, such an exhaustive search for the minimum distances is infeasible in larger data spaces. Unlike conventional approaches, we extract the minimum distances with no explicit distance computation, using either multi-directional dual scan line propagation or wave propagation. We iteratively move along a scan line in opposite directions and assign an incremental counter to the underlying data points while checking for object points. To our advantage, the precision of the dual scan propagation method can be set according to the available computational power. Alternatively, we start a wavefront from the object points and propagate it outward at each step, assigning the number of steps taken as the minimum distance. Unlike most existing approaches, the computational load of our algorithm does not depend on the number of object points.
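The wave propagation variant can be sketched as a multi-source breadth-first search, which yields a city-block (L1) approximation of the distance transform without any explicit distance computation:

```python
from collections import deque

def wavefront_distance(shape, objects):
    """Propagate a wavefront outward from all object points at once;
    the step count on arrival is each pixel's approximate minimum
    distance (city-block metric)."""
    h, w = shape
    dist = [[None] * w for _ in range(h)]
    q = deque()
    for y, x in objects:
        dist[y][x] = 0          # object points are the wavefront sources
        q.append((y, x))
    while q:
        y, x = q.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1   # one more wavefront step
                q.append((ny, nx))
    return dist
```

Each pixel is visited exactly once, so the cost depends on the image size, not on the number of object points, matching the abstract's claim.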
Measuring the complexity of design in real-time imaging software
Raghvinder S. Sangwan, Pamela A. Vercellone-Smith, Phillip A. Laplante
Due to the intricacies of the algorithms involved, the design of imaging software is considered to be more complex than that of non-image processing software (Sangwan et al., 2005). A recent investigation (Larsson and Laplante, 2006) examined the complexity of several image processing and non-image processing software packages along a wide variety of metrics, including those postulated by McCabe (1976), Chidamber and Kemerer (1994), and Martin (2003). This work found that it was not always possible to quantitatively compare the complexity between imaging applications and non-image processing systems. Newer research and an accompanying tool (Structure 101, 2006), however, provide a greatly simplified approach to measuring software complexity. Therefore, it may be possible to definitively quantify the complexity differences between imaging and non-imaging software, between imaging and real-time imaging software, and between software programs of the same application type. In this paper, we review prior results and describe the methodology for measuring complexity in imaging systems. We then apply a new complexity measurement methodology to several sets of imaging and non-imaging code in order to compare the complexity differences between the two types of applications. The benefit of such quantification is far reaching, for example, leading to more easily measured performance improvement and quality in real-time imaging code.
Video and Compression
A generic software-framework for distributed, high-performance processing of multiview video
This paper presents a software framework providing a platform for parallel and distributed processing of video data on a cluster of SMP computers. Existing video-processing algorithms can be easily integrated into the framework by treating them as atomic processing tiles (PTs). PTs can be connected to form processing graphs that model the data flow of a specific application. This graph also defines the data dependencies that determine which tasks can be computed in parallel. Scheduling of the tasks in this graph is carried out automatically using a pool-of-tasks scheme. The data format that can be processed by the framework is not restricted to image data, so that intermediate data, such as detected feature points or object positions, can also be transferred between PTs. Furthermore, the processing can optionally be carried out efficiently on special-purpose processors with separate memory, since the framework minimizes data transfers. Finally, we describe an example application, a multi-camera view-interpolation system, that we successfully implemented on the proposed framework.
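The pool-of-tasks scheduling over a processing graph can be sketched as follows. This is a sequential stand-in for the framework's parallel dispatch, and the task names are hypothetical:

```python
def run_pool_of_tasks(graph, execute):
    """graph: task -> list of prerequisite tasks (the processing graph's
    data dependencies). Repeatedly dispatch every task whose inputs are
    ready; in a real framework each ready batch would be handed to
    worker threads or cluster nodes in parallel."""
    done, order = set(), []
    while len(done) < len(graph):
        ready = [t for t in sorted(graph) if t not in done
                 and all(d in done for d in graph[t])]
        if not ready:
            raise ValueError("cyclic dependency in processing graph")
        for t in ready:          # these tasks are mutually independent
            execute(t)
            order.append(t)
        done.update(ready)
    return order
```

The dependency check is what lets independent processing tiles run concurrently while tiles that consume intermediate data wait for their producers.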
Real-time stabilization of long-range observation system turbulent video
Long-range observation systems attract interest in many fields, such as astronomy (e.g. planet exploration), geology, ecology, traffic control, remote sensing, and homeland security (surveillance and military intelligence). Ideally, image quality would be limited only by the optical setup used, but in such systems the major cause of image distortion is atmospheric turbulence. This paper presents a real-time algorithm that compensates for image distortion due to atmospheric turbulence in video sequences, while keeping the real moving objects in the video unharmed. The algorithm is based on moving-object extraction, so turbulence distortion compensation is applied only to the static areas of images. For that purpose a hierarchical decision mechanism is suggested. First, a computationally lightweight decision mechanism that extracts most of the stationary areas is applied. A second step then improves accuracy with more computationally complex algorithms. Finally, all areas in the incoming frame that were tagged as stationary are replaced with an estimate of the stationary scene. The restored videos exhibit excellent stability for stationary objects while retaining real motion. This is achieved in real time on standard computer hardware.
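The final replacement step, blending static pixels into a running estimate of the stationary scene while passing moving pixels through, might look like this sketch; the smoothing weight `alpha` is an assumed parameter, not one from the paper.

```python
def stabilize_frame(frame, ref, moving, alpha=0.1):
    """Blend static pixels into a running estimate of the stationary
    scene (suppressing turbulence jitter); pixels flagged as real motion
    pass through untouched. `ref` is updated in place."""
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if moving[y][x]:
                out[y][x] = frame[y][x]      # keep real motion unharmed
            else:
                # exponential running average of the static scene
                ref[y][x] += alpha * (frame[y][x] - ref[y][x])
                out[y][x] = ref[y][x]
    return out
```

The `moving` mask would come from the hierarchical decision mechanism described in the abstract.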
Real-time aware rendering of scalable arbitrary-shaped MPEG-4 decoder for multiprocessor systems
Milan Pastrnak, Peter H. N. de With, Jef van Meerbergen
The MPEG-4 video standard extends traditional frame-based processing with the option to compose several video objects (VOs) superimposed on a background sprite image. In our previous work, we presented a distributed, multiprocessor-based, scalable implementation of an MPEG-4 arbitrary-shaped decoder, which together with the background sprite decoder forms an essential part of further scene rendering. For control of the multiprocessor architecture, we have constructed a Quality-of-Service (QoS) management that monitors the availability of required data and distributes the processing of individual tasks over the guaranteed or best-effort services of the platform. However, the proposed architecture with combined guaranteed and best-effort services poses problems for real-time scene rendering. In this paper, we present a technique for proper run-time rendering of the final scene after decoding one VO layer. The individual video-object monitors check data availability and select the highest quality for the final scene rendering. The algorithm operates hierarchically, both at the scene level and at the task level of video object processing. Whereas the earlier work on the scalable implementation concentrated only on guaranteed services, we now introduce a new element in the system architecture for real-time control and a fallback mechanism for the best-effort services. This element is based on, first, controlling data availability at the task level and, second, introducing a propagation service into the QoS management. We present our simulation results in comparison with the standard "frame-skipping" technique, which is currently the only available solution for rendering this type of scalable processing.
Development of new image compression algorithm (Xena)
Yukio Sugita, Akira Watanabe
This paper provides an overall description of a new image compression technology, Xena, and of its strengths in lossless compression capability and speed in comparison to JPEG2000 and JPEG-LS, the world standards in the continuous-tone image compression field. Xena achieves a compression speed more than 20 times faster than that of JPEG2000, while its compression capability remains almost the same.
Hardware
Real-time 3D video conference on generic hardware
X. Desurmont, J. L. Bruyelle, D. Ruiz, et al.
Nowadays, video conferencing is increasingly advantageous because of the economic and ecological cost of transport. Several platforms exist. The goal of the TIFANIS immersive platform is to let users interact as if they were physically together. Unlike previous tele-immersion systems, TIFANIS uses generic hardware to achieve an economically realistic implementation. The basic functions of the system are to capture the scene, transmit it through digital networks to other partners, and then render it according to each partner's viewing characteristics. The image processing part should run in real time. We propose to analyze the whole system. It can be split into different services such as central processing unit (CPU) load, graphical rendering, direct memory access (DMA), and communications through the network. Most of the processing is done by the CPU; it is composed of the 3D reconstruction and the detection and tracking of faces in the video stream. However, the processing needs to be parallelized into several threads that have as few dependencies as possible. In this paper, we present these issues and the way we deal with them.
Hardware-based JPEG2000 video coding system
In this paper, we discuss a hardware-based, low-complexity JPEG 2000 video coding system. The hardware system is based on a software simulation system in which temporal redundancy is exploited by coding differential frames arranged in an adaptive GOP structure, where the GOP structure itself is determined by statistical analysis of the differential frames. We present a hardware video coding architecture that implements this inter-frame coding system on a Digital Signal Processor (DSP). The system consists mainly of a microprocessor (ADSP-BF533 Blackfin processor) and a JPEG 2000 chip (ADV202).
Three-dimensional color image processing procedures using DSP
Processing of vector image information is very important because multichannel sensors are used in many different applications. We introduce novel algorithms to process color images that are based on order statistics and vectorial processing techniques: the Video Adaptive Vector Directional (VAVDF) and Vector Median M-type K-Nearest Neighbour (VMMKNN) filters presented in this paper. It is demonstrated that the novel algorithms effectively suppress impulsive noise in 3D video color sequences in comparison with several other methods. Simulation results have been obtained using the video sequences "Miss America" and "Flowers", which were corrupted by noise. The filters KNNF, VGVDF, and VMMKNN, and finally the proposed VAVDATM, have been investigated. The PSNR, MAE and NCD criteria demonstrate that the VAVDATM filter shows the best performance on each criterion when the noise intensity is more than 7-10%. An attempt to realize real-time processing on a DSP is presented for the median-type algorithms.
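The vector median at the core of such filters picks, from the filter window, the color vector that minimizes the total distance to all the other vectors, so an impulsive outlier can never be selected as the output. A minimal sketch (plain vector median only, without the directional or M-type refinements of the paper's filters):

```python
def vector_median(window):
    """Return the color vector in the window minimizing the sum of
    Euclidean distances to all other vectors in the window."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    return min(window, key=lambda p: sum(dist(p, q) for q in window))
```

Applied over a sliding 3x3 (or 3x3x3 temporal) window of RGB pixels, this suppresses impulsive color noise while only ever outputting colors actually present in the window.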
High-speed line-scan camera with digital time delay integration
In high-speed image acquisition and processing systems, the speed of operation is often limited by the amount of available light, due to short exposure times. Therefore, high-speed applications often use line-scan cameras based on charge-coupled device (CCD) sensors with time delay integration (TDI). Synchronous shift and accumulation of photoelectric charges on the CCD chip - according to the objects' movement - result in a longer effective exposure time without introducing additional motion blur. This paper presents a high-speed color line-scan camera based on a commercial complementary metal oxide semiconductor (CMOS) area image sensor with a Bayer filter matrix and a field programmable gate array (FPGA). The camera implements a digital equivalent of the TDI effect exploited in CCD cameras. The proposed design benefits from the high frame rates of CMOS sensors and from the possibility of arbitrarily addressing the rows of the sensor's pixel array. For digital TDI, only a small number of rows are read out from the area sensor; these are then shifted and accumulated according to the movement of the inspected objects. This paper gives a detailed description of the digital TDI algorithm implemented on the FPGA. Relevant aspects for practical application are discussed and key features of the camera are listed.
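The shift-and-accumulate core of digital TDI can be sketched as follows, assuming (for illustration) that the object advances exactly one sensor row per frame, so the scene line seen by row k at frame t reappears at row k+1 at frame t+1:

```python
def digital_tdi(frames, stages):
    """frames[t][k]: pixel row k (a list of pixel values) read from the
    small region of interest at frame t. Each output line accumulates the
    same scene line across `stages` consecutive frames, boosting the
    effective exposure without motion blur."""
    T = len(frames) - stages + 1
    out = []
    for t in range(T):
        acc = [0] * len(frames[0][0])
        for k in range(stages):
            # scene line t sits at row k in frame t + k
            acc = [a + p for a, p in zip(acc, frames[t + k][k])]
        out.append(acc)
    return out
```

On the FPGA this accumulation runs as a pipeline over the few rows read out per frame; the sketch just makes the diagonal indexing explicit.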
Poster Session
Real-time speckle and impulsive noise suppression in 3D imaging based on robust linear combinations of order statistics
This paper presents an approach based on linear combinations of order statistics for speckle and impulsive noise reduction in 3-D ultrasound images. The proposed technique uses the Rank M-type (RM) estimator, adapted here to 3-D image processing applications. A real-time implementation on the TMS320C6711 DSP is presented using real clinical ultrasound images. In addition, results from known techniques are compared with the proposed method to demonstrate its performance in terms of noise suppression, fine-detail preservation, and processing time.
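A linear combination of order statistics (an L-estimator) sorts the samples in a neighborhood and weights them by rank. The sketch below instantiates it as an alpha-trimmed mean, one simple member of the family; the paper's RM-estimator adds an M-type influence function not reproduced here.

```python
def l_filter(window, weights):
    """L-estimator: sort the neighborhood samples (e.g. a 3x3x3 voxel
    window flattened to a list) and take a weighted combination of the
    order statistics."""
    s = sorted(window)
    return sum(w * v for w, v in zip(weights, s)) / sum(weights)

def alpha_trimmed_weights(n, trim):
    # discard the `trim` smallest and largest ranks, average the rest;
    # this rejects impulses while still averaging out speckle
    return [0 if i < trim or i >= n - trim else 1 for i in range(n)]
```

Because the extreme ranks get zero weight, a single impulsive outlier in the window cannot affect the output at all.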
Real-time quadtree analysis using HistoPyramids
Gernot Ziegler, Rouslan Dimitrov, Christian Theobalt, et al.
Region quadtrees are convenient tools for hierarchical image analysis. Like the related Haar wavelets, they are simple to generate within a fixed calculation time. The clustering at each resolution level requires only local data, yet they deliver intuitive classification results. Although the region quadtree partitioning is very rigid, it can be rapidly computed from arbitrary imagery. This research article demonstrates how graphics hardware can be utilized to build region quadtrees at unprecedented speeds. To achieve this, a data structure called a HistoPyramid registers the number of desired image features in a pyramidal 2D array. This HistoPyramid is then used as an implicit indexing data structure through quadtree traversal, creating lists of the registered image features directly in GPU memory and virtually eliminating bus transfers between CPU and GPU. With this novel concept, quadtrees can be applied in real-time video processing on standard PC hardware. A multitude of applications in image and video processing arises, since region quadtree analysis becomes a light-weight preprocessing step for feature clustering in vision tasks, motion vector analysis, PDE calculations, or data compression. As a side note, we outline how this algorithm can be applied to 3D volume data, effectively generating region octrees purely on graphics hardware.
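The HistoPyramid construction and index traversal can be sketched on the CPU as follows (the real method runs in GPU shaders; power-of-two image sizes and 0/1 feature flags are assumptions of this sketch):

```python
def build_histopyramid(mask):
    """levels[0] holds per-cell feature counts (0/1 here); each coarser
    level sums 2x2 blocks of the level below, up to a 1x1 apex whose
    single value is the total feature count."""
    levels = [[[1 if v else 0 for v in row] for row in mask]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        n = len(prev) // 2
        levels.append([[prev[2*y][2*x] + prev[2*y][2*x+1]
                        + prev[2*y+1][2*x] + prev[2*y+1][2*x+1]
                        for x in range(n)] for y in range(n)])
    return levels

def extract(levels, index):
    """Find the coordinates of the index-th feature by descending from
    the apex: at each level, the partial sums of the four child cells
    act as an implicit index into the quadtree."""
    y = x = 0
    for lvl in range(len(levels) - 2, -1, -1):
        y, x = 2 * y, 2 * x
        for dy, dx in ((0, 0), (0, 1), (1, 0), (1, 1)):
            c = levels[lvl][y + dy][x + dx]
            if index < c:
                y, x = y + dy, x + dx
                break
            index -= c
    return y, x
```

On the GPU, one `extract` call per output slot runs in parallel, which is how the feature list is produced directly in GPU memory.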
Tracking objects with radical color changes using modified mean shift
Inteck Whang, Kwang Nam Choi, Samuel Henry Chang
This paper presents a new algorithm for color-based tracking of objects with radical color changes using a modified mean shift. Conventional color-based object tracking using mean shift does not provide appropriate results when the initial color distribution disappears. In the proposed algorithm, mean shift analysis is first used to derive the object candidate in the direction of maximum density increase from the current position. The proposed algorithm is then applied iteratively to update the object color information when the object color changes. The implementation of the new algorithm achieves effective real-time tracking of objects whose color changes completely over time. The validity of the approach is illustrated by experimental results obtained using the methods described in the paper.
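The underlying mean shift iteration, moving a window to the centroid of the color back-projection weights until it settles on the local density maximum, can be sketched as follows; the paper's color-model update is only indicated in a comment, since its exact rule is not given in the abstract.

```python
def mean_shift(weights, cy, cx, radius, iters=20):
    """Shift a square window toward the centroid of the back-projection
    weights under it, repeating until the position is stable. In the
    modified tracker, if the object's color changes, the histogram that
    produced `weights` would be re-estimated before the next frame."""
    h, w = len(weights), len(weights[0])
    for _ in range(iters):
        sy = sx = s = 0.0
        for y in range(max(0, cy - radius), min(h, cy + radius + 1)):
            for x in range(max(0, cx - radius), min(w, cx + radius + 1)):
                wt = weights[y][x]
                sy += wt * y
                sx += wt * x
                s += wt
        if s == 0:
            break                          # no target evidence in window
        ny, nx = round(sy / s), round(sx / s)
        if (ny, nx) == (cy, cx):
            break                          # converged to the density mode
        cy, cx = ny, nx
    return cy, cx
```

With a fixed color model this is conventional mean shift tracking; the paper's contribution is deciding when to refresh the model so the weights stay meaningful after radical color changes.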
On the use of real-time agents in distributed video analysis systems
B. Lienard, A. Hubaux, C. Carincotte, et al.
Today's video analysis technologies use state-of-the-art systems and formalisms such as ontologies and data warehousing to handle the huge amounts of data generated from low-level to high-level descriptors. In the IST CARETAKER project we are developing a multi-dimensional database with distributed features to add a centric data view of the scene shared among all the sensors of a network. We propose to enhance the possibilities of this kind of system by delegating intelligence to many other entities, known as "agents": small specialized applications able to move across the network and work on dedicated sets of data related to their core domain. In other words, we can reduce or enhance the complexity of the analysis by adding or omitting feature-specific agents, and processing is limited to the data that it concerns. This article explains how to design and develop an agent-oriented system that can be used with a video analysis data warehouse. We also describe how this methodology can distribute intelligence over the system, and how the system can be extended to obtain a self-reasoning architecture using cooperative agents. We will demonstrate this approach.
A real-time hierarchical rule-based approach for scale independent human face detection
In this paper, we present a scale-independent automatic face location technique that can detect the locations of frontal human faces in images. Our hierarchical approach to knowledge-based face detection is composed of three levels. Level 1 consists of a simple but effective eyes model that generates a set of rules to judge whether or not a human face candidate exists in the current search area, in a scale-independent manner and in a single scan of the image. To utilize this model, we define a new operator, the extended projection, and two new concepts: the single projection line and the pair projection line. At level 2, an improved version of Yang's mosaic image model is applied to check the consistency of visual features with respect to the human face within each 3x3 block of a candidate face image. At the third level, we apply an SVM-based face model to eliminate the false positives obtained from level 2. Experimental results show that the combined rule-based and statistical approach works well in detecting frontal human faces in uncluttered scenes.
Digital architecture for real-time processing in vision systems for control of traffic lights
Jair Garcia-Lamont, Jose L. Gonzalez-Vidal, Marco Acavedo-Mosqueda
A digital architecture for real-time processing in vision systems for the control of traffic lights is presented. The main idea of this work is to identify cars at intersections and switch traffic lights so as to reduce traffic jams. The architecture is based on a color image segmentation algorithm that comprises three stages. Stage one is a color space transformation: in order to measure color differences properly, image colors are represented in a modified L*u*v* color space. Stage two consists of a color reduction, where image colors are projected onto a small set of prototypes using a self-organizing map (SOM). Stage three performs color clustering, where simulated annealing (SA) seeks the optimal clusters from the SOM prototypes. The proposed hardware architecture is implemented and tested in a Virtex II Pro FPGA, achieving a processing time below 25 ms per 128x128-pixel image. The implementation comprises 262,479 equivalent gates.
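Stage two's SOM-based color reduction can be sketched with a heavily simplified self-organizing map: no topological neighborhood and a fixed learning rate, both simplifications of a full SOM, and all parameter values below are illustrative.

```python
def som_quantize(pixels, prototypes, lr=0.5, epochs=5):
    """Simplified SOM color reduction: each pixel pulls its best-matching
    prototype toward it in color space; afterwards every pixel is mapped
    to the index of its nearest prototype."""
    protos = [list(p) for p in prototypes]
    for _ in range(epochs):
        for px in pixels:
            # best-matching unit by squared Euclidean distance in color space
            i = min(range(len(protos)),
                    key=lambda j: sum((a - b) ** 2
                                      for a, b in zip(protos[j], px)))
            protos[i] = [a + lr * (b - a) for a, b in zip(protos[i], px)]
    labels = [min(range(len(protos)),
                  key=lambda j: sum((a - b) ** 2
                                    for a, b in zip(protos[j], p)))
              for p in pixels]
    return protos, labels
```

The resulting prototype palette is what stage three's simulated annealing would then cluster.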