Proceedings Volume 2055

Intelligent Robots and Computer Vision XII: Algorithms and Techniques

David P. Casasent
View the digital version of this volume at SPIE Digital Library.

Volume Details

Date Published: 20 August 1993
Contents: 6 Sessions, 56 Papers, 0 Presentations
Conference: Optical Tools for Manufacturing and Advanced Automation 1993
Volume Number: 2055

Table of Contents

  • Pattern Recognition in Computer Vision
  • Computer Vision
  • Sampling and Coding in Computer Vision
  • Image Processing
  • Morphological, Wavelet, and Gabor Processing
  • Neural Nets and Fuzzy Logic in Machine Vision
Pattern Recognition in Computer Vision
Neural net selection methods for Gabor transform detection filters
David P. Casasent, John Scott Smokelin
New Gabor transform (GT) filters are described that detect candidate object locations independent of object class and object distortions, including low-contrast objects in clutter. A new neural network (NN) technique is described that automates selection of GT parameters and combines multiple Gabor functions (GFs) into one composite macro GF detection filter. Fusion of real and imaginary GT filter outputs is used to reduce false alarms (PFA) while maintaining high detection rates (PD). Test results on the TRIM-2 database are provided.
Real-time model-based vision for industrial domains
Steven B. Seida, Michael Magee
This paper describes a model-based vision system that performs model-based reasoning at real-time (or near real-time) rates and for which both the hardware and prototyping costs are low. The basic approach taken is to extract a set of useful features from observed models using a library of feature primitive operators. Scale- and orientation-invariant combinations of these features are used as indices into a hardware lookup table to establish initial correspondence between similar combinations that will be encountered when examining unknown objects. When performing initial recognition of an unknown object, evidence for an object in a particular spatial pose is accumulated, giving rise to an initial set of hypotheses. The strongest hypotheses are then refined by iteratively hypothesizing new (previously uninstantiated) model/object feature matches and computing a confidence measure associated with the current instantiation set. If confidence increases, the newly hypothesized instantiation is retained; otherwise it is discarded.
Automatic system for the classification of cellular categories in cytological images
Marinette Revenu, Abderrahim Elmoataz, Christine Porquet, et al.
In this paper, we describe research carried out within the framework of the optimization of an image analyzer dedicated to rapid detection of abnormalities of ploidy in human tumors. The system takes as its input microscopic images of dissociated cells which are to be segmented in order to extract cellular objects, calculate shape and texture measures, and identify each category of cell, by means of two classification methods that are compared and discussed: classification based on the Bayes decision rule and classification using neural networks.
Automated approach to trend monitoring based on fractal analysis
Scott A. Starks, James Hamilton, Nitin Okhade
As systems become more complex, the monitoring and interpretation of measurement data related to the health of the system becomes increasingly more difficult. Trend monitoring is an important task that involves a prediction of the future state of system health based upon past observations. In many systems, sensors or suites of sensors gather data about the state of health of the system and its processes. Analysis of the power spectrum of the time series resulting from this sort of data collection provides insight into the trends inherent. In this paper, we present a fractal-based approach to the interpretation of the power spectrum of the time series. Using fractal analysis enables the characterization of the power spectrum using a minimal set of parameters. A computational algorithm for the calculation of these parameters is presented and shows promise as a basis for trend monitoring.
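The abstract above does not name the parameters it extracts from the power spectrum. As a minimal sketch, assuming the common fractal model in which the spectrum follows a power law P(f) ≈ c / f^β, the exponent β can be estimated by a straight-line fit in log-log coordinates. The function names `periodogram` and `fit_power_law` are illustrative and do not come from the paper.

```python
import cmath
import math


def periodogram(x):
    """Naive O(n^2) DFT periodogram of a real series.

    Returns (freqs, power) for the positive frequency bins k = 1 .. n // 2,
    with frequencies expressed in cycles per sample.
    """
    n = len(x)
    mean = sum(x) / n
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        s = sum((x[t] - mean) * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
        freqs.append(k / n)
        power.append(abs(s) ** 2 / n)
    return freqs, power


def fit_power_law(freqs, power):
    """Least-squares fit of log P = log c - beta * log f.

    Bins with non-positive power are skipped because their log is undefined.
    Returns (beta, c).
    """
    pairs = [(math.log(f), math.log(p)) for f, p in zip(freqs, power) if p > 0]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    slope = sxy / sxx
    return -slope, math.exp(my - slope * mx)
```

Under this assumption, a drift in the fitted β between successive monitoring windows would be the kind of compact indicator a trend monitor could track.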
Automatic inspection of electronic surface-mount assemblies
Peter Forte, Gary P Brown, Peter Barnwell, et al.
An approach to image analysis is described consisting of five stages: edge detection, thresholding, linking, shape description and shape abstraction. The approach is illustrated by applying the steps to the problem of automatic inspection.
Vision system based on a classification connectionist algorithm
Gabriel A. Oliver, Nuria Piera
In this work, we assume that the degree of confidence in the information provided by a TV camera is a value composed of different aspects of the original scene and image. This fact affects the whole vision system, from the acquisition step to the classification or rejection of the objects observed. The vision system presented in this paper is a simple but efficient system for the classification of plane objects. Images are recorded by a TV camera. After detecting the contours of the objects present in the scene, we extract singular points on these contours, from which the descriptors of the objects are built. Because information is lost in the digitization and segmentation of the image, some uncertainty is introduced that has to be taken into account in the classification process. It is therefore useful to employ a classification algorithm that can accept some imprecision in the representation of the objects to be classified. The classification algorithm used is based on an incremental clustering methodology termed LAMDA (Learning Algorithm for Multivariate Data Analysis). Such a system takes into consideration the interaction of two processes, learning and recognition. It gives a representation of each class and enables us, if it is deemed necessary, to look for a specification or generalization of previously constructed classes. Examples and experimental results are presented to illustrate the performance of the system.
Extraction of edge-based and region-based features for object recognition
Benjamin Coutts, Srinivas Ravi, Gongzhu Hu, et al.
One of the central problems of computer vision is object recognition. A catalogue of model objects is described as a set of features such as edges and surfaces. The same features are extracted from the scene and matched against the models for object recognition. Edges and surfaces extracted from the scenes are often noisy and imperfect. In this paper algorithms are described for improving low level edge and surface features. Existing edge extraction algorithms are applied to the intensity image to obtain edge features. Initial edges are traced by following directions of the current contour. These are improved by using corresponding depth and intensity information for decision making at branch points. Surface fitting routines are applied to the range image to obtain planar surface patches. An algorithm of region growing is developed that starts with a coarse segmentation and uses quadric surface fitting to iteratively merge adjacent regions into quadric surfaces based on approximate orthogonal distance regression. Surface information obtained is returned to the edge extraction routine to detect and remove fake edges. This process repeats until no more merging or edge improvement can take place. Both synthetic (with Gaussian noise) and real images containing multiple object scenes have been tested using the merging criteria. Results appeared quite encouraging.
Locating facial features for age classification
Young Ho Kwon, Niels da Vitoria Lobo
In this paper, we outline computations for visual age classification from facial images. For now, input images can only be classified into one of three age-groups: babies, adults, and senior adults. The computations are based on cranio-facial development theory, and wrinkle analysis. In the implementation, first primary features of the face are found, followed by secondary feature analyses. Preliminary results with real data are presented.
Line grouping using perceptual saliency and structure prediction for car detection in traffic scenes
Sandra Denasi, Giorgio Quaglia
Autonomous and guide-assisted vehicles make heavy use of computer vision techniques to perceive the environment in which they move. In this context, the European PROMETHEUS program is carrying out activities to develop autonomous vehicle monitoring that helps people drive more safely. Car detection is one of the topics addressed by the program. Our contribution proposes the development of this task in two stages: the localization of areas of interest and the formulation of object hypotheses. In particular, the present paper proposes a new approach that builds structural descriptions of objects from edge segmentations by using geometrical organization. This approach has been applied to the detection of cars in traffic scenes. We have analyzed images taken from a moving vehicle in order to formulate obstacle hypotheses: preliminary results confirm the efficiency of the method.
Expert systems as design aids for artificial vision systems: a survey
Daniel Crevier
The development of software that would be to computer vision what expert system shells are to expert systems has been the subject of considerable inquiry over the last ten years; this paper reviews the pertinent publications and tries to present a coherent view of the field. We start by outlining two major differences between would be `vision shells' and conventional expert system shells. The first is the need for an intermediate level of symbolic representation between image pixels and the knowledge base. The second is that the mental operations that people perform to interpret images lie almost totally below the threshold of consciousness. Vision system designers therefore cannot, as domain experts normally do, examine their own mental processes and cast them into rules to extract information from images. The vision shell should thus contain, in addition to the usual knowledge engineering toolbox, knowledge on the pertinence of specific imaging operations towards various goals. After a review of the role of explicit knowledge in artificial vision, we examine the architecture a vision shell should have, and look at ways of facilitating the entry of domain-pertinent knowledge into it. Final remarks are made on knowledge representation and acquisition aspects particular to industrial applications.
Vision-based orientation and position detection of ICs and PCBs
Hong Kyu Chung, Rae-Hong Park
In this paper, a high-resolution algorithm for detecting the orientation and position of an IC, and an algorithm for compensating the position and skew angle of a PCB, are proposed. The proposed algorithm for the first topic consists of two parts. Its first part is a preprocessing step, in which corner points of an IC are detected and are separated into two groups. Then the coarse angle of the principal axis is obtained by line fitting. The second part is a main processing step, in which the Hough transform over the limited range of angles is applied to the corner points to detect precisely the orientation of an IC or a surface mounting device (SMD). The position of an IC or SMD is determined by using its four corner points. The proposed algorithm for the second topic detects the rotation angle and translation parameters of a PCB using a template matching method. The PCB is compensated by the detected parameters. Computer simulation shows that the parameters obtained by the proposed algorithms are more accurate than those obtained by the several conventional methods considered. The proposed algorithms can be applied to fast and accurate automatic inspection systems.
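The coarse-then-fine idea in the abstract above can be illustrated with a generic Hough search restricted to a narrow window around a coarse angle estimate. This is a minimal sketch, not the authors' implementation: the function name `refine_angle`, the bin width `rho_bin`, and the step and window sizes are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def refine_angle(points, coarse_deg, half_range_deg=5.0, step_deg=0.1,
                 rho_bin=0.25):
    """Return the line-normal angle (degrees) with the strongest Hough peak.

    Only angles within coarse_deg +/- half_range_deg are searched, which
    keeps the accumulator small while allowing a fine angular step.
    """
    steps = int(round(2 * half_range_deg / step_deg))
    best_angle, best_votes = coarse_deg, -1
    for i in range(steps + 1):
        angle = coarse_deg - half_range_deg + i * step_deg
        theta = math.radians(angle)
        c, s = math.cos(theta), math.sin(theta)
        # Each point votes for the rho bin of the line through it whose
        # normal direction is theta (rho = x cos(theta) + y sin(theta)).
        votes = Counter(round((x * c + y * s) / rho_bin) for x, y in points)
        peak = max(votes.values())
        if peak > best_votes:
            best_angle, best_votes = angle, peak
    return best_angle
```

Restricting the search to a window around the coarse estimate is what makes a fine angular step affordable; a full 180-degree sweep at the same resolution would need many times more accumulator rows.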
Establishing identity and pose of objects from a library using reciprocal basis set and direction of arrival techniques
Charles R. Wright, Richard F. Vaz, David Cyganski
In many object recognition problems, the object to be identified is one of a fixed set (library) of objects. The problem of identifying which object is present then shares characteristics of the signal detection and parameter estimation problem: which signal is present and what are its parameters? The Reciprocal Basis Set/Direction of Arrival (RBS/DOA) technique is a recently developed technique for object pose determination. It uses a single, comprehensive analytic object model representing a suite of views of an object. Object orientation can be directly established from single 2-D views of the object, without a costly search of the pose parameter space, and without need for the views to be related by a geometric image transformation. This paper describes how one can construct reciprocal basis sets to simultaneously determine object identity and pose from a single 2-D image. Results are presented which demonstrate this ability for a single unknown pose parameter using synthetic and camera-acquired images.
Computer Vision
Beyond pure static shape in function-based object recognition
Kevin W. Bowyer, Louise Stark
There has recently been growing interest in exploiting the concept of reasoning about function for object recognition. In a function-based approach to object recognition, recognition of an object means labeling it as belonging to some category of objects according to the function that it could serve. The few function-based recognition systems which have so far been described in the literature have all assumed that the input to the problem is a pure static shape description. By `pure' shape we mean that the only object property that the systems have reasoned about is their abstract shape. By `static' shape we mean that the systems have reasoned about an object from only a single (assumed rigid) abstract shape instance. This paper discusses some of the issues which must be addressed in extending the function-based approach to handle non-shape properties (such as material properties) and dynamic shape descriptions.
Finding landmark features under a broad range of lighting conditions
Claude L. Fennema Jr.
Whether computer vision is used to steer a robot, to determine the location of an object, to model the environment, or to perform recognition, it is usually necessary to have a simple, yet robust method for finding features. Over the last few decades many methods have been devised for locating these features using line or region based approaches. A problem that faces most approaches, however, is that variations in lighting due to changes in ambient light or to shadows make the procedure very complex or error prone. This paper describes a relatively simple method for finding features that is tolerant of wide variations in ambient lighting and works well in scenes containing shadows. The method makes use of correlation based template matching but derives most of its strength from the way it transforms the image data as matching is performed. In addition to a description of the method, the paper presents results of its use in experiments performed under a broad variety of conditions and discusses its role in model-based navigation and stereo matching.
Progress in high-level exploratory vision
Matthew Brand
We have been exploring the hypothesis that vision is an explanatory process, in which causal and functional reasoning about potential motion plays an intimate role in mediating the activity of low-level visual processes. In particular, we have explored two of the consequences of this view for the construction of purposeful vision systems: Causal and design knowledge can be used to (1) drive focus of attention, and (2) choose between ambiguous image interpretations. An important result of visual understanding is an explanation of the scene's causal structure: How action is originated, constrained, and prevented, and what will happen in the immediate future. In everyday visual experience, most action takes the form of motion, and most causal analysis takes the form of dynamical analysis. This is even true of static scenes, where much of a scene's interest lies in how possible motions are arrested. This paper describes our progress in developing domain theories and visual processes for the understanding of various kinds of structured scenes, including structures built out of children's constructive toys and simple mechanical devices.
Grounding language in perception
Jeffrey Mark Siskind
We describe an implemented computer program that recognizes the occurrence of simple spatial motion events in simulated video input. The program receives an animated line-drawing as input and produces as output a semantic representation of the events occurring in that movie. We suggest that the notions of support, contact, and attachment are crucial to specifying many simple spatial motion event types and present a logical notation for describing classes of events that incorporates such notions as primitives. We then suggest that the truth values of such primitives can be recovered from perceptual input by a process of counterfactual simulation, predicting the effect of hypothetical changes to the world on the immediate future. Finally, we suggest that such counterfactual simulation is performed using knowledge of naive physical constraints such as substantiality, continuity, gravity, and ground plane. We describe the algorithms that incorporate these ideas in the program and illustrate the operation of the program on sample input.
Integrating task-directed planning with reactive object recognition
Sven J. Dickinson, Suzanne Stevenson, Eugene Amdur, et al.
We describe a robot vision system that achieves complex object recognition with two layers of behaviors, performing the tasks of planning and object recognition, respectively. The recognition layer is a pipeline in which successive stages take in images from a stereo head, recover relevant features, build intermediate representations, and deposit 3-D objects into a world model. Each stage is an independent process that reacts automatically to output from the previous stage. This reactive system operates continuously and autonomously to construct the robot's 3-D model of the environment. Sitting above the recognition pipeline is the planner which is responsible for populating the world model with objects that satisfy the high-level goals of the system. For example, upon examination of the world model, the planner can decide to direct the head to another location, gating new images into the recognition pipeline, causing new objects to be deposited into the world model. Alternatively, the planner can alter the recognition behavior of the pipeline so that objects of a certain type or at a certain location appear in the world model.
Behaviors for active object recognition
David R. Wilkes, John K. Tsotsos
The concept of active object recognition is introduced and a proposal for its solution is described. It is argued that single-view object recognition is fraught with problems, mainly due to viewpoint-related ambiguities, occlusions and coincidences. Recognition which is active, that is, that has the ability to vary viewpoint according to the interpretation status, is both more viable and more closely related to the manner in which humans recognize different objects. An active system may be achieved via a simple modification of the recognition hardware. The camera is mounted on the end of a robot arm on a mobile base. The system exploits the mobility of the camera by using low-level image data to drive the camera to a special viewpoint with respect to an unknown object. From such a viewpoint, the object recognition task is reduced to a two-dimensional pattern recognition problem. This paper describes the behavior-based approach to camera motion, that ensures robust acquisition of special views of the object to be recognized.
Fast, cheap, and easy system for outside vision on Mars
Andrew S. Gavin, Masaki Yamamoto
In the design and construction of mobile robots, vision has always been one of the most potentially useful sensory systems. In practice however, it has also become the most difficult to successfully implement. At the MIT Mobile Robotics (Mobot) Lab we have designed a small, light, cheap, and low power Mobot Vision System that can be used to guide a mobile robot in a constrained environment. The target environment is the surface of Mars, although we believe the system should be applicable to other conditions as well. It is our belief that the constraints of the Martian environment will allow the implementation of a system that provides vision based guidance to a small mobile rover. The purpose of this vision system is to process realtime visual input and provide as output information about the relative location of safe and unsafe areas for the robot to go. For the first part of the project, which is nearly complete, we have built a small self contained vision system. As to the second half of the project, it is our hypothesis that a simple vision algorithm does not require huge amounts of computation and that goals such as constructing a complete three dimensional map of the environment are difficult, wasteful, and possibly unreachable.
Real-time vision system for a mobile robot using cheap hardware
Ian Horswill
In this paper I describe work in progress on a low cost vision-based robot designed to give primitive tours. The system is very simple, robust and efficient, and runs on a hardware platform which could be duplicated for less than $10K. Much of the system's efficiency is due to its implicit knowledge about the structure and appearance of its environment. This knowledge is not represented explicitly but rather is encoded in the structure of the system. I give an overview of the system, discuss the properties of its environment, and show how they can be used to simplify the design of the system.
High-speed low-latency portable visual-sensing system
Anne Wright
My work in the field of computer vision has focused on building high speed visual tracking systems to provide position feedback for the control of unstable tasks. In order to make a juggling robot or a robotic helicopter stable, three qualities are essential -- high update rate, low latency, and decent accuracy and resolution. We want to put multiple vision systems on small robotic devices with limited payloads. Therefore the system must be very small, light, and cheap. However, we allow marking of the environment so the sophistication of the vision task can be extremely low. These constraints guided the development of AVS (which, due to lack of a better name, stands for Anne's Vision System).
General visual robot controller networks via artificial evolution
David Cliff, Inman Harvey, Philip Husbands
We discuss recent results from our ongoing research concerning the application of artificial evolution techniques (i.e., an extended form of genetic algorithm) to the problem of developing `neural' network controllers for visually guided robots. The robot is a small autonomous vehicle with extremely low-resolution vision, employing visual sensors which could readily be constructed from discrete analog components. In addition to visual sensing, the robot is equipped with a small number of mechanical tactile sensors. Activity from the sensors is fed to a recurrent dynamical artificial `neural' network, which acts as the robot controller, providing signals to motors governing the robot's motion. Prior to presentation of new results, this paper summarizes our rationale and past work, which has demonstrated that visually guided control networks can arise without any explicit specification that visual processing should be employed: the evolutionary process opportunistically makes use of visual information if it is available.
Automatic tissue characterization from ultrasound imagery
Yasser M. Kadah, Aly A. Farag, Abou-Bakr M. Youssef, et al.
In this work, feature extraction algorithms are proposed to extract the tissue characterization parameters from liver images. Then the resulting parameter set is further processed to obtain the minimum number of parameters representing the most discriminating pattern space for classification. This preprocessing step was applied to over 120 pathology-investigated cases to obtain the learning data for designing the classifier. The extracted features are divided into independent training and test sets and are used to construct both statistical and neural classifiers. The optimal criteria for these classifiers are set to have minimum error, ease of implementation and learning, and the flexibility for future modifications. Various algorithms for implementing various classification techniques are presented and tested on the data. The best performance was obtained using a single layer tensor model functional link network. Also, the voting k-nearest neighbor classifier provided comparably good diagnostic rates.
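For readers unfamiliar with the voting k-nearest neighbor classifier mentioned above, the following is a minimal sketch of the general technique. The two-element feature vectors and the labels in the usage note are hypothetical placeholders, not the tissue parameters used in the paper.

```python
import math
from collections import Counter


def knn_vote(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_vector, label) pairs; distances are
    Euclidean. Ties in the vote go to the label seen first among the
    nearest neighbors.
    """
    ranked = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

With a training list such as `[((0, 0), "normal"), ((5, 5), "abnormal"), ...]`, a call like `knn_vote(train, (0.2, 0.1))` returns the label held by most of the three closest samples.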
Improving visibility method for underwater robots
Shangqing Liu, Yu-Xing Xia, Jian Bao, et al.
To improve underwater visibility for robots, a method is presented that uses a special intervallic series of light pulses. Emission and reception are performed by two specially designed optical shutters whose control signals are delay-synchronized. Calculations show that the received backscattered light can be reduced and the visible distance thereby increased. The method has several advantages that could make it practical for underwater robots and other real-world uses. Its feasibility has been demonstrated experimentally, though only in principle because of the limitations of the present experimental conditions.
Sampling and Coding in Computer Vision
Image analysis of photochromic ink for security applications
Bruce G. Batchelor, Nelson M. Stephens
Photochromic materials exist in two different color states, with switching between states achieved by irradiation with ultraviolet and visible light. By printing patterns and data using both photochromic ink and ordinary ink, it is possible to create a document that is difficult to forge and easy to authenticate. Security is achieved only through public ignorance about, and the relative rarity of, photochromic materials. Very high levels of security are possible, using modern data encipherment techniques. These are so secure that no known algorithmic method exists for breaking them in a practical amount of time. It should be understood that encipherment algorithms provide a way of protecting a message. Guaranteeing the authenticity of a complete document is better achieved using photochromic materials. This article describes a scheme which employs both techniques to achieve higher overall security than either can provide individually. Central to this idea is the ability to sense the presence of photochromic materials using machines, prior to recognizing specified patterns and reading text.
Convex shape refinement using dynamic programming for surface defect classification on wooden materials
This paper deals with a two-step segmentation algorithm for 2-D convex objects. First the objects are approximated by an elliptic shape description, and then the boundary of the object is refined using dynamic programming. The reason for refinement is accurate shape classification.
Image coding and image activity measurement
Farhad Keissarian, Mohammad K. Ibrahim, Mohammad Farhang Daemi
In this paper, a novel image analysis technique is proposed, which may be performed prior to coding in order to decide which information is most significant to encode. In the proposed system, the image to be coded is first partitioned into a large number of sub-blocks of N × N pixels. The blocks can then be sorted into two major classes according to the level of visual activity present. The classification is based on analyzing the local histogram within each sub-block. We first analyze the image blocks to separate uniform blocks from non-uniform blocks. Adjacent uniform blocks with the same statistics are merged to form large blocks. These blocks can then be coded by their mean values. It is also shown that the non-uniform blocks may be further classified into three categories with different levels of activity.
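The abstract above does not state the histogram criterion that separates uniform from non-uniform blocks. As a minimal sketch, assume a block counts as uniform when the occupied range of its local histogram (maximum minus minimum gray level) stays within a small threshold. The function name `classify_blocks` and the parameter `max_spread` are illustrative assumptions; merging of adjacent uniform blocks and the three-way split of non-uniform blocks are omitted.

```python
def classify_blocks(image, n, max_spread=8):
    """Split `image` (a list of equal-length rows) into n x n blocks.

    Returns a dict mapping (block_row, block_col) to ("uniform", mean) or
    ("non-uniform", None). A block is treated as uniform when its gray-level
    range, max - min, does not exceed max_spread (an assumed rule).
    Partial blocks at the right and bottom edges are ignored.
    """
    result = {}
    rows, cols = len(image), len(image[0])
    for r in range(0, rows - n + 1, n):
        for c in range(0, cols - n + 1, n):
            pixels = [image[r + i][c + j] for i in range(n) for j in range(n)]
            key = (r // n, c // n)
            if max(pixels) - min(pixels) <= max_spread:
                # A uniform block can be represented by its mean value alone.
                result[key] = ("uniform", sum(pixels) / len(pixels))
            else:
                result[key] = ("non-uniform", None)
    return result
```

Coding a uniform block by one mean value instead of N × N pixels is where the bit savings come from; the non-uniform blocks are the ones that still need a more detailed code.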
Image Processing
Acquisition of range images in an integrated vision system environment
M. Arif Wani, Bruce G. Batchelor
The paper presents a simple laser based triangulation system to acquire range images. The problems of high speed serpentine/raster scanning and sensor-output bottle-neck are addressed here by using a thin plane fan of laser light, and a linear variable neutral density filter. It also proposes a Prolog based integrated vision system environment for acquisition, processing, and understanding of range images.
Intelligent control of robotic paint stripping using color-vision feedback
Dennis N. Harvey, Thomas W. Rogers
A robotic work cell for stripping paint from both large and small aircraft is currently under development at Pratt & Whitney - Waterjet Systems (formerly USBI Co., Advanced Programs). The primary objectives for this system are to reduce the time required to strip an aircraft, improve the quality and consistency of the stripping operation, and reduce the need for environmentally objectionable stripping processes such as chemical methods. One of the keys to the overall success of this automated paint stripping system is the integration of a sensor that will provide for adaptive process control. This sensor must be able to indicate when a layer of paint has been completely stripped from the aircraft, and measure the approximate size of any remaining unstripped patches. The sensor should also be able to indicate when surface roughening is occurring on the material beneath the paint.
Color image enhancement using color constancy based on modified IHS coordinate system
Jeong Yeop Kim, Yeong-Ho Ha
Color image enhancement to restore natural color by excluding the effect of the ambient illumination is important in recent image processing. In this paper, a new color image enhancement method using color constancy based on pseudo-linearly modified IHS coordinate system is proposed. Since the color constancy processing preserves only hue while reducing the dynamic range of lightness and saturation, the technique of dynamic range increase is used to compensate them. The proposed method, which analyzes the relationship between the RGB and modified IHS coordinate system, transforms and increases lightness and saturation simultaneously to avoid the complexity in the related transformation.
Robust line extraction and matching algorithm
Bassam Hussien, Banavar Sridhar
This paper presents an algorithm for extracting straight lines from intensity images and describes a line matching algorithm for solving the line correspondence problem. The line extraction process begins by detecting edges in the intensity image. Next, line support regions are formed where image points (pixels) have similar gradient orientation. A least-mean-squares fitting algorithm is then used to fit a line to the points in each line support region. Finally, line segments are linked together to form the final lines by using an adaptive line linking method; this results in much stronger lines and a smaller set of lines to be considered. Once the lines are detected in a sequence of images, a line matching algorithm is used to match lines in one image to the lines in the other image. The images are either from a motion or stereo sequence. The matched lines may then be used with the sensor position and orientation data to estimate range to objects corresponding to the lines. We present results based on applying the line extraction and line matching algorithms to a synthetic image and an outdoor scene captured by a camera on a helicopter.
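The line-fitting step described above can be sketched as follows. The abstract does not say which least-squares formulation is used, so this example assumes the orthogonal (total) least-squares variant, which handles vertical lines without special cases. The function name `fit_line` is illustrative.

```python
import math


def fit_line(points):
    """Fit a line to the pixels of one line support region.

    Uses orthogonal least squares: the line passes through the centroid and
    follows the principal axis of the point scatter. Returns (cx, cy, theta)
    with theta in [0, pi) measured from the x axis, in radians.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points)
    syy = sum((y - cy) ** 2 for _, y in points)
    sxy = sum((x - cx) * (y - cy) for x, y in points)
    # Orientation of the principal axis of the 2x2 scatter matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return cx, cy, theta % math.pi
```

Representing each fitted segment by a centroid and an orientation also gives the later linking and matching stages two simple quantities to compare.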
Informed edge linking using a directional potential function
Victoria Riordan, Quiming Zhu
Low level edge detection operators usually do not generate contiguous edges, leaving objects in images with discontinuous borders. This, coupled with inherent signal noise, makes it difficult to identify objects in images. Here we describe a new algorithm that connects disjoint edge pixels to form continuous object boundaries. We model the edge images as potential fields with energies deployed at the edge pixel positions throughout the images. Pixels at the edge disjoint positions are charged by the combined forces of these edge pixels in proportion to the relative distances and directions of these pixels. An intrinsic part of the process is the identification of terminal edge pixels (TEPs), accompanied by a classification of edge pixels in terms of the pixel connection patterns, to provide critical information for possible connectivity of edge segments. The algorithm applies a potential evaluation function to measure the likelihood of edge linking in certain directions for given TEPs. To reduce the computational overhead and improve the efficiency of the algorithm, an informed search method is used to locate significant edge pixels that present the strongest linking forces to a given TEP. The potential value for the TEP is calculated with respect to the edge directions dominated by the linking forces. When the potential value exceeds a given threshold in a direction, an extension is made at the TEP position in that direction. The process iterates until desired results are attained, using a global edge pattern evaluation scheme.
New automatic threshold selection algorithm for edge detection
Amar Aggoun, Mohammad K. Ibrahim, Mohammad Farhang Daemi
In this paper, a novel approach is proposed for selecting the thresholds of edge strength maps from their local histograms. This threshold selection technique is based on finding the threshold for small blocks of the edge map. For each block the threshold is chosen using an iterative procedure. The effect of the choice of the size of the block is discussed. The edge strength map is also quantized to reduce both the computation of the iterative threshold selection algorithm and the memory requirement. It is shown that the quantization of the edge map improves the performance of the local iterative threshold selection algorithm. Typical examples of the tests carried out are presented.
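The block-wise scheme above can be sketched as follows. The abstract does not state which iterative procedure is used, so this sketch assumes the widely known mean-of-class-means iteration (often attributed to Ridler and Calvard); the function names, the tolerance `tol`, and the block layout are hypothetical.

```python
def iterative_threshold(values, tol=0.5, max_iter=100):
    """Iterate t = average of the means below and above t (assumed procedure)."""
    values = list(values)
    t = sum(values) / len(values)  # start from the global mean of the block
    for _ in range(max_iter):
        low = [v for v in values if v <= t]
        high = [v for v in values if v > t]
        if not low or not high:
            break  # block is uniform with respect to t; keep current t
        new_t = 0.5 * (sum(low) / len(low) + sum(high) / len(high))
        if abs(new_t - t) < tol:
            t = new_t
            break
        t = new_t
    return t

def block_thresholds(edge_map, block):
    """Return {(row0, col0): threshold} for each block of a 2-D edge map."""
    rows, cols = len(edge_map), len(edge_map[0])
    out = {}
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            vals = [edge_map[r][c]
                    for r in range(r0, min(r0 + block, rows))
                    for c in range(c0, min(c0 + block, cols))]
            out[(r0, c0)] = iterative_threshold(vals)
    return out
```

Quantizing the edge strength map beforehand, as the abstract describes, would shrink the set of distinct values each block has to iterate over.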
Adaptive direction filtering and continuous variable threshold image segmentation method for speckle pattern
Yibing Yang, Zhenya He
A new adaptive directional filtering and continuous variable threshold image segmentation method for speckle patterns is proposed in this paper. In accordance with its characteristics, a speckle pattern is first divided into several subimages, each containing fringes of a single direction, and these subimages are then filtered along their respective fringe directions. Furthermore, a fringe enhancement method based on local statistical characteristics is used to improve fringe contrast. On the basis of the directional filtering and enhancement, a continuous variable threshold image segmentation method is presented, in which an N × N subimage is divided into many N × 7 or N × 9 narrow image regions perpendicular to the local fringe direction, and each region is then segmented by a corresponding threshold curve. Because the threshold curves can follow changes in the fringe extrema, the method avoids the effect of the inhomogeneous grey level distribution caused by the diffraction halo and segments the fringe image accurately. Moreover, binary directional filtering is applied to the segmented speckle pattern.
New entropy operator for edge abstraction
Xiao-Hua Xie, Xue-Qin Tian, Limin Luo
This paper presents a new exponential behavior based entropy operator for extracting image edges. Edges can be detected by computing the entropy of brightness or hue in a local region of a picture. The entropy depends not only on the rate of change of brightness or hue, but also on the average brightness or hue. The experimental result verifies the efficiency of the new entropy operator.
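The idea of detecting edges through local entropy can be illustrated with a small sketch. It computes the ordinary Shannon entropy of the gray-level histogram in a square window, which is an assumption made for illustration: it is not the exponential operator proposed in the paper and, unlike that operator, it does not depend on the average brightness. The function name and the `radius` parameter are hypothetical.

```python
import math
from collections import Counter

def local_entropy(image, r, c, radius=1):
    """Shannon entropy (bits) of gray levels in a window centered at (r, c).

    `image` is a list of rows of integer gray levels. The window is
    clipped at the image border.
    """
    rows, cols = len(image), len(image[0])
    window = [image[i][j]
              for i in range(max(0, r - radius), min(rows, r + radius + 1))
              for j in range(max(0, c - radius), min(cols, c + radius + 1))]
    counts = Counter(window)
    n = len(window)
    # Uniform windows give zero entropy; mixed windows give higher values.
    return -sum((k / n) * math.log2(k / n) for k in counts.values())
```

Pixels whose window straddles a brightness step yield a larger entropy than pixels inside a flat region, so thresholding the entropy map marks candidate edges.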
Morphological, Wavelet, and Gabor Processing
Wavelet image representation and applications in computer vision
Aly A. Farag, Bin Wang
In this paper, a number of spatial/spatial-frequency image representations are reviewed. Wavelets have recently generated much interest, both in applied areas and in more theoretical ones. The wavelet transform relative to some basic wavelets provides a flexible time-frequency window which automatically narrows when observing high frequency phenomena and widens when studying low frequency environments. As a result, it is suitable for visual information representation. Applications in computer vision such as image compression and image enhancement are examined.
Wavelet transform in depth recovery
MawKae Hor, Jemmy Y.M. Chen, Kuo-Shen Chen
... space, a projection operator is used in the spatial-variant deconvolution. Nevertheless, experimental results show that this approximation mechanism can generate the depth map of different images successfully.
Morphological granulometric shape recognition in noise
Ying-Chong Cheng, Edward R. Dougherty
Shape classification via linear granulometric moments is examined for patterns suffering varying degrees of edge noise. It is seen that recognition is quite poor even for modest amounts of noise and remains poor even when the patterns are first filtered by a close-open filter. Recognition accuracy is greatly improved, for both unfiltered and filtered images, by employing exterior granulometries. These are constructed by applying the various linear structuring-element sequences to the corresponding linear convex hulls of the noisy patterns. The resulting granulometric distributions are then not corrupted by noise-induced probability mass at the left of the pattern spectrum, thereby greatly diminishing the detrimental effects on the pattern spectrum moments.
Morphological segmentation of multiprobe fluorescence images for immunophenotyping in melanoma tissue sections
Alasdair I. Dow, Steven A. Shafer, Alan S. Waggoner
A fundamental task in studying the action of cancer chemotherapy is to determine the quantity and spatial relationship of tumor-infiltrating lymphocyte populations. Classically this is performed by staining thin tissue sections with antibodies by immunoperoxidase amplification. The staining technique is practically limited to locating a single cell type per tissue section. Full immunophenotyping requires successive staining of serial sections, using statistical analysis to correlate the results. This paper describes a system that brings together multi-parameter fluorescence imaging and morphological segmentation techniques to provide a fast, accurate, and automatic analysis of the lymphocyte infiltrate in tissue sections. With fluorescence techniques a single section can be stained with up to four distinct fluorescently labelled antibodies to determine cell phenotypes. To harness this potential, computer vision techniques are required to analyze the images. A routine based on the watershed algorithm has been developed that segments the nuclei image with an accuracy of greater than 90%. By matching the nuclei boundaries to the local peak fluorescence, cell boundary estimates are obtained in the antigen images. By then extracting two measurements from the boundary signal, the cells can be classified according to their antigen expression. Determining cell expression of multiple antigens simultaneously provides a more detailed and accurate picture of the tumor infiltrate than single parameter analysis, and increases understanding of the immune response associated with the chemotherapy.
Adaptation of Gabor filters for simulation of human preattentive mechanism for a mobile robot
Naren Kulkarni, Golshah A. Naghdy
Vision guided mobile robot navigation is complex and requires analysis of tremendous amounts of information in real time. In order to simplify the task and reduce the amount of information, the human preattentive mechanism can be adapted [Nag90]. During the preattentive search the scene is analyzed rapidly but in sufficient detail for the attention to be focused on the `area of interest.' The `area of interest' can then be scrutinized in more detail for recognition purposes. This `area of interest' can be a text message to facilitate navigation. Gabor filters and an automated turning mechanism are used to isolate the `area of interest.' These regions are subsequently processed with optimal spatial resolution for perception tasks. This method has clear advantages over the global operators in that, after an initial search, it scans each region of interest with optimum resolution. This reduces the volume of information for recognition stages and ensures that no region is over- or underestimated.
Texture characterization: morphological approach
Venu Mahadasa, T. Ch. Malles Rao, B. S. Prakasa Rao, et al.
Textural filters designed in the texture spectrum domain are used with slightly changed rules for calculating the texture unit number: the elements of the texture unit are ordered with respect to the positions of the maximum and minimum gray level values as their initial positions in the 3 × 3 subimage. Following that, the maximum and minimum values, in addition to the average value of the standardized nine elements, are also considered in the transformation processes. A conditional morphological filter is applied to the texture filtered image of 200 × 300 pixel data. The results show promising potential for geological feature recognition.
Neural Nets and Fuzzy Logic in Machine Vision
General learning scheme for robot coordinate transformations using dynamic neural network
Madan M. Gupta, Dandina Hulikunta Rao
By virtue of their functional approximation, learning and adaptive capabilities, the computational neural networks can be suitably employed for learning robot coordinate transformations. The major drawback of conventional static feedforward neural networks based on back-propagation learning algorithm is in their very large convergence time for a given task. Any attempts to accelerate the learning process by increasing the values of learning constants in the algorithm often result in unstable systems. The intent of this paper is to describe a neural network structure called dynamic neural processor (DNP), and examine briefly how it can be used in developing a learning scheme for computing robot inverse kinematic transformations. The architecture and learning algorithm of the proposed dynamic neural network structure, the DNP, are described. Computer simulations are provided to demonstrate the effectiveness of the proposed learning scheme using the DNP.
Spatiotemporal topology and temporal sequence identification with an adaptive time-delay neural network
Daw-Tung Lin, Panos A. Ligomenides, Judith E. Dayhoff
Inspired by the time delays that occur in neurobiological signal transmission, we describe an adaptive time delay neural network (ATNN), which is a powerful dynamic learning technique for spatiotemporal pattern transformation and temporal sequence identification. The dynamic properties of this network are formulated through the adaptation of time-delays and synapse weights, which are adjusted on-line based on gradient descent rules according to the evolution of observed inputs and outputs. We have applied the ATNN to examples that possess spatiotemporal complexity, with temporal sequences that are completed by the network. The ATNN can be applied to pattern completion. Simulation results show that the ATNN learns the topology of circular and figure-eight trajectories within 500 on-line training iterations, and reproduces the trajectory dynamically with very high accuracy. The ATNN was also trained to model the Fourier series expansion of the sum of different odd harmonics. The resulting network provides more flexibility and efficiency than the TDNN and allows the network to seek optimal values for time-delays as well as optimal synapse weights.
Sensor fusion of 2D and 3D data for the processing of images of dental imprints
Jean-Francois Methot, Marielle Mokhtari, Denis Laurendeau, et al.
This paper presents a computer vision system for the acquisition and processing of 3-D images of wax dental imprints. The ultimate goal of the system is to measure a set of 10 orthodontic parameters that will be fed to an expert system for automatic diagnosis of occlusion problems. An approach for the acquisition of range images of both sides of the imprint is presented. Range is obtained from a shape-from-absorption technique applied to a pair of grey-level images obtained at two different wavelengths. The accuracy of the range values is improved using sensor fusion between the initial range image and a reflectance image from the pair of grey-level images. The improved range image is segmented in order to find the interstices between teeth and, following further processing, the type of each tooth on the profile. Once each tooth has been identified, its accurate location on the imprint is found using a region-growing approach and its shape is reconstructed with third degree polynomial functions. The reconstructed shape will be later used by the system to find specific features that are needed to estimate the orthodontic parameters.
Pyramidal neurovision architecture for vision machines
Madan M. Gupta, George K. Knopf
The vision system employed by an intelligent robot must be active; active in the sense that it must be capable of selectively acquiring the minimal amount of relevant information for a given task. An efficient active vision system architecture that is based loosely upon the parallel-hierarchical (pyramidal) structure of the biological visual pathway is presented in this paper. Although the computational architecture of the proposed pyramidal neuro-vision system is far less sophisticated than the architecture of the biological visual pathway, it does retain some essential features such as the converging multilayered structure of its biological counterpart. In terms of visual information processing, the neuro-vision system is constructed from a hierarchy of several interactive computational levels, whereupon each level contains one or more nonlinear parallel processors. Computationally efficient vision machines can be developed by utilizing both the parallel and serial information processing techniques within the pyramidal computing architecture. A computer simulation of a pyramidal vision system for active scene surveillance is presented.
Pattern recognition using stochastic neural networks
Ying Liu
In this paper, we study pattern recognition using stochastic artificial neural networks (SANN). A learning system can be defined by three rules: the encoding rule, the rule of internal change, and the quantization rule. In our system, the data encoding is to store an image in a stable distribution of a SANN. Given an input image f ∈ F, one can find a SANN t ∈ T such that the equilibrium distribution of this SANN is the given image f. Therefore, the input image, f, is encoded into a specification of a SANN, t. This mapping from F (image space) to T (parameter space of SANN) defines SANN transformation. SANN transformation encodes an input image into a relatively small vector which catches the characteristics of the input vector. The internal space T is the parameter space of SANN. The internal change rule of our system uses a local minima algorithm to encode the input data. The output data of the encoding stage is a specification of a stochastic dynamical system. The quantization rule divides the internal data space T by sample data.
VLSI design of a dynamic neural processor for vision applications
Madan M. Gupta, Dean Hockley
A programmable dynamic neuron with a biologically motivated design has been implemented using analog current-mode techniques. The circuit can exhibit several different processing modes with only slight adjustment in the model parameters. The relevant parameters are varied using electronically modifiable current mirrors. The circuit is very compact, and requires only around 20 transistors. In addition the circuit can be configured as a traditional neuron simply by leaving the sampling switches closed. The initial design allows that all the model parameters can be adjusted. Presently the dynamic neuron is restricted to only the nearest neighborhood connections on a chip, but this lack of connectivity can be alleviated by adopting a multiple chip processing scheme in a pyramidal architecture.
VLSI implementation of a reduced symmetric fuzzy singleton set
Yi-Chieh Chang, Kung Chris Wu
Fuzzy logic controllers (FLCs) have been proposed and implemented in many control systems to deliver smoother and more reliable outputs than traditional control systems. In most of the existing VLSI FLC chips, the architectures are based on a general purpose microcontroller structure tailored to fuzzy logic implementation. The drawbacks of these types of FLC VLSI chips are low speed, high cost, and long design time. Moreover, an expensive development system is also needed to program a general purpose microcontroller for a specific fuzzy logic control system. In order to alleviate the drawbacks in existing VLSI fuzzy logic circuits, a reduced symmetric fuzzy singleton set (RSFSS) is proposed in this paper. The proposed RSFSS system can handle three input variables, nine rules for each input variable, and produces two output values. Each rule is based on a symmetric triangular membership function. The triangular membership functions of each state variable are defined symmetrically with respect to the centroid of the universe of discourse. Since the hardware complexity is greatly reduced, the entire FLC based on the RSFSS structure can be implemented on a VLSI chip with a dimension of 2.22 mm × 2.22 mm.
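A symmetric triangular membership function, and a set of such functions placed symmetrically about a centroid, can be sketched as follows. This is a generic software illustration of the concept, not the RSFSS hardware design; the function names, the `half_width` parameter, and the example offsets are hypothetical.

```python
def triangular_membership(x, center, half_width):
    """Degree of membership of x in a symmetric triangular fuzzy set.

    The value is 1.0 at `center` and falls linearly to 0.0 at
    center - half_width and center + half_width.
    """
    if half_width <= 0:
        raise ValueError("half_width must be positive")
    return max(0.0, 1.0 - abs(x - center) / half_width)

def mirrored_centers(positive_offsets, centroid=0.0):
    """Centers placed symmetrically about `centroid`, in ascending order."""
    offsets = sorted(positive_offsets)
    return ([centroid - d for d in reversed(offsets)]
            + [centroid]
            + [centroid + d for d in offsets])
```

Because the set is mirrored about the centroid, only the offsets on one side need to be stored, which hints at why a symmetric definition can reduce hardware complexity.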
Design and implementation of a fuzzy logic yaw controller
Kung Chris Wu, Andrew H. Swift, W. Lionel Craver Jr., et al.
This paper describes a fuzzy logic controller (FLC) designed and implemented to control the yaw angle of a 10 kW fixed speed teetered-rotor wind turbine presently being commissioned at the University of Texas at El Paso. The technical challenge of this project is that the wind turbine represents a highly stochastic nonlinear system. The problems associated with wind turbine yaw control are similar in nature to those experienced with position control of high inertia equipment such as tracking antennas, gun turrets, and overhead cranes. Furthermore, the wind turbine yaw controller must be extremely cost-effective and highly reliable in order to be economically viable compared to fossil fueled power generators.
Classifier neural net with complex-valued weights and square-law nonlinearities
David P. Casasent, Sanjan Natarajan
A new pattern recognition classifier neural net (NN) is described that uses complex-valued weights and square-law nonlinearities. We show that these weights and nonlinearities inherently produce higher-order decision surfaces and thus we expect better classification performance (PC). We refer to this as the piecewise hyperquadratic neural net (PQNN) since each hidden layer neuron inherently provides a hyperquadratic decision surface and the combination of neurons provides piecewise hyperquadratic decision surfaces. We detail the learning algorithm for this PQNN and provide initial results on synthetic data showing its advantages over the backpropagation and other NNs. We also note a new technique to provide improved classification results when there are significantly different numbers of samples per class.
Image Processing
Vertices and corners: a maximum likelihood approach
Raashid Malik, Siu So
Scenes of polyhedral objects may be accurately represented in 2-D using line sketches. An aim of low level image processing is to generate useful binary images from grey scale images. The binary images generated by enhancement/threshold edge detectors are usually unrefined outlines of the underlying 3-D scene. Such images must be further processed to isolate and identify region boundaries, which, in the case of polyhedra, consist of line segments. The intersection or connection points of these line segments are known as vertices or corners. The work reported in this paper employs a decision theoretic approach to detect vertices in grey scale images.
Vertices and corners: normalized average detection
Raashid Malik, Hui Ren
Most of the information regarding the shape of polyhedral objects is preserved in the edges and the vertices of these objects. Gray level images of scenes containing such objects are often processed to extract edge and vertex information to produce equivalent line sketches. An accurate line sketch of a scene serves as an effective input to high level vision systems concerned with scene understanding or object recognition. The performance of these systems is therefore greatly dependent on the accuracy of the line sketch. The work reported in this paper addresses the issues associated with generating accurate line sketches from gray level images. The methods described here have been implemented and tested with real and synthetic images and are compared to other vertex or corner detection techniques. The performance of the vertex detector is assessed using simulation runs on images with varied signal-to-noise ratios. The computational performance of this algorithm is evaluated and assessed by operating directly on the gray-scale image.
Crest lines detection in gray-level images
Nazha Selmaoui, C. Leschi, Hurbert Emptoz
In this paper we propose a new process for extreme-line detection based on the pretopological model. We present a new algorithm to analyze images of lines (images of the third type); it partly uses the operating principle of the clustering algorithms proposed by H. Emptoz in his thesis, but follows a different `philosophy' in interpreting the results.
Neural Nets and Fuzzy Logic in Machine Vision
Computing statistical properties of hue distributions for color image analysis
Daniel Crevier
Color images can be analyzed using two kinds of coordinate systems: rectangular systems based on primary colors (RGB), and cylindrical systems based on hue, saturation, and intensity (HSI). HSI systems match our intuitive understanding of colors and make it possible to name colors in knowledge bases, a significant advantage given the mushrooming use of declarative knowledge for image analysis. On the other hand, HSI systems give rise to singularities which result in undesirable instabilities, notably with respect to the statistical properties of hue distributions. Computing the mean and variance of a split distribution in the conventional manner would yield an unrealistically large variance and a mean hue in the blue-green region. The paper presents alternative ways of computing means and variances that avoid these effects. At the cost of a relatively slight numerical overhead, these computations generate results in agreement with our intuitive understanding of colors in split peak situations, and reduce to the standard definitions in well-behaved histograms. Recursive formulas are given for the calculation of these statistics, and an efficient algorithm is presented. Equivalence conditions between the results of the introduced procedures and conventional calculations are stated. Examples are given using actual color images.
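The split-distribution problem described above can be illustrated with a short sketch. It uses the standard directional-statistics definitions (mean direction of unit vectors, and circular variance as one minus the mean resultant length); this is an assumption for illustration and not necessarily the formulas or the recursive algorithm given in the paper. The function name is hypothetical, and hues are assumed to be in degrees.

```python
import math

def circular_hue_stats(hues_deg):
    """Return (mean hue in degrees, circular variance in [0, 1]).

    Each hue is treated as a unit vector on the hue circle, so a
    distribution split across the 0/360 boundary is handled correctly.
    """
    n = len(hues_deg)
    if n == 0:
        raise ValueError("need at least one hue")
    s = sum(math.sin(math.radians(h)) for h in hues_deg) / n
    c = sum(math.cos(math.radians(h)) for h in hues_deg) / n
    mean = math.degrees(math.atan2(s, c)) % 360.0
    variance = 1.0 - math.hypot(s, c)  # 0 = concentrated, 1 = fully spread
    return mean, variance
```

For two reddish hues at 350 and 10 degrees, the arithmetic mean is 180 degrees (blue-green), whereas the circular mean falls at the 0/360 boundary with a small variance, matching the intuitive result.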
Pattern Recognition in Computer Vision
Review of automated visual inspection 1983-1993, Part I: conventional approaches
Eduardo J. Bayro-Corrochano
This review represents an extensive and systematic survey of the state of the art of automated visual inspection. This is a multidisciplinary research field, comprising aspects of physics, mathematics, computer science, artificial intelligence and engineering. With the aim of achieving a comprehensive overview of the subject, the reviewer has examined computer vision algorithms for inspection, expert systems and neural networks, the elements of vision inspection systems, advances in hardware for image processing and many diverse industrial applications. This review is divided into two parts: part I covers conventional methods and part II considers approaches to intelligent systems. After a discussion of the major factors involved in industrial machine vision, each part indicates some problems and trends in current research and suggests possible areas for future investigation on the basis of the surveyed literature.
Review of automated visual inspection 1983-1993, Part II: approaches to intelligent systems
Eduardo J. Bayro-Corrochano
This review represents an extensive and systematic survey of the state of the art of automated visual inspection. This review is divided into two parts: part I covers conventional methods and part II considers approaches to intelligent systems. This complementary paper considers advanced computer vision algorithms for inspection, expert systems, fuzzy logic, neural networks and diverse industrial applications using intelligent approaches. Finally, some problems and trends in current research are indicated, and possible areas for future investigation are suggested on the basis of the surveyed literature.