Proceedings Volume 1661

Machine Vision Applications in Character Recognition and Industrial Inspection

Donald P. D'Amato, Wolf-Ekkehard Blanz, Byron E. Dom, et al.
View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 1 August 1992
Contents: 7 Sessions, 38 Papers, 0 Presentations
Conference: SPIE/IS&T 1992 Symposium on Electronic Imaging: Science and Technology
Volume Number: 1661

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.

Sessions:
  • Image Enhancement, Segmentation, and Pre-Recognition Analysis
  • Character Recognition I
  • Character Recognition II
  • Contextual Analysis, Control Structures, and Parallel Processing
  • IC Inspection and Measurement
  • Ancillary Techniques
  • Packaging
Image Enhancement, Segmentation, and Pre-Recognition Analysis
Characteristics of digitized images of technical articles
Mahesh Viswanathan, George Nagy
A characterization of document image blocks using projection profiles is proposed. We have collected statistical information on scanned pages of technical articles as a by-product of digitized document analysis. Specifically, in our hierarchical (syntactic) block segmentation and labeling approach, 65 training and test pages from two publications were used. Additional information on compression and profiles was also used. Pixel-level information is required as input whether the analyzing tool is an expert system or something else. The issues covered are: (1) profile characteristics of document objects such as text, line drawings, tables, and half-tones, and the variation of these profiles with block size, type size, and direction of scan; (2) speckle noise: sizes and distribution; (3) for the hierarchical (syntactic) approach, the number of tree nodes at each level along with their areas, and a comparison with node areas derived from transition-cut trees; (4) CCITT Group 4 compression statistics on document sub-blocks and whole pages; (5) the size of PostScript files and the PostScript commands used in printing these page files. We believe that these results allow prediction of some characteristics of a printed page digitized at any specified sampling rate.
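As a rough illustration of the projection profiles used above, the sketch below (assuming a NumPy binary array with foreground pixels set to 1) computes the horizontal and vertical profiles of a block; the statistics collected in the paper go well beyond this.

import numpy as np

def projection_profiles(block):
    # block: binary image of a document block (1 = ink, 0 = background).
    # horizontal[i] = number of ink pixels in row i; vertical[j] = ink pixels in column j.
    block = np.asarray(block, dtype=np.uint8)
    horizontal = block.sum(axis=1)
    vertical = block.sum(axis=0)
    return horizontal, vertical

# Example: a synthetic block containing a single "text line" band
demo = np.zeros((10, 40), dtype=np.uint8)
demo[3:7, 5:35] = 1
h, v = projection_profiles(demo)
print(h)  # peaks over rows 3..6, zero elsewhere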
Using morphology and associative memories to associate salt-and-pepper noise with OCR error rates in document images
Vicente P. Concepcion, Matthew P. Grzech
For each scanned text image, we generate the morphological pattern spectrum which captures image object shape and size information. We use the spectrum to characterize the noise content of a text document image by considering only the region of the spectrum near the origin. Noise is known to affect many image processing operations and we chose to consider optical character recognition (OCR) in this experiment. We associate noise that is characterized by a partial pattern spectrum with OCR performance as measured by an error rate by using a linear distributed associative memory (DAM). The DAM is trained to recognize the spectra of three classes of images: with high, medium, and low OCR error rates. The DAM is not forced to make a classification every time. It is allowed to reject as unknown a spectrum presented that does not closely resemble any that has been stored in the DAM. The DAM was fairly accurate with noisy images but conservative (i.e., rejected several text images as unknowns) when there was little noise.
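The following is a minimal sketch of a binary pattern spectrum computed by successive morphological openings; the square structuring element and size range are illustrative assumptions, and the DAM classifier itself is not reproduced here.

import numpy as np
from scipy import ndimage

def pattern_spectrum(binary_img, max_size=10):
    # spectrum[n] = area removed when the opening size grows from n to n+1,
    # so small n captures small objects and speckle (the region near the origin).
    img = np.asarray(binary_img, dtype=bool)
    areas = []
    for n in range(max_size + 1):
        if n == 0:
            opened = img
        else:
            se = np.ones((2 * n + 1, 2 * n + 1), dtype=bool)  # assumed square element
            opened = ndimage.binary_opening(img, structure=se)
        areas.append(int(opened.sum()))
    return -np.diff(np.asarray(areas))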
Contrast enhancement of mail piece images
Yong-Chul Shin, Ramalingam Sridhar, Victor Demjanenko, et al.
A new approach to contrast enhancement of mail piece images is presented. The contrast enhancement is used as a preprocessing step in the real-time address block location (RT-ABL) system. The RT-ABL system processes a stream of mail piece images and locates destination address blocks. Most of the mail pieces (classified as letters) show high contrast between background and foreground. As an extreme case, however, seasonal greeting cards usually use colored envelopes, which results in reduced contrast between background and foreground. The proposed method enhances such low-contrast images and handles background and foreground degradations without affecting the nondegraded images. This approach provides local enhancement which adapts to local features. In order to simplify the computation of A and σ, a dynamic programming technique is used. Implementation details, performance, and results on test images are presented in this paper.
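A generic sketch of local adaptive contrast enhancement in the spirit described above; the gain formula, window size, and parameters below are assumptions and do not reproduce the paper's dynamic-programming computation of A and σ.

import numpy as np
from scipy import ndimage

def local_contrast_enhance(img, win=15, target_std=50.0, gain_cap=3.0):
    # Wallis-type local stretch: rescale each pixel about its local mean so that
    # the local standard deviation approaches target_std, with the gain capped.
    img = img.astype(np.float64)
    mean = ndimage.uniform_filter(img, size=win)
    sq_mean = ndimage.uniform_filter(img * img, size=win)
    sigma = np.sqrt(np.maximum(sq_mean - mean * mean, 1e-6))
    gain = np.minimum(target_std / sigma, gain_cap)
    return np.clip(mean + gain * (img - mean), 0, 255).astype(np.uint8)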
Extraction of text boxes from engineering drawings
Ian Chai, Dov Dori
Textboxes are minimum size rectangles enclosing blocks of text in engineering drawings. Their separation from the graphics surrounding them is a first step in character recognition, and is a part of the Machine Drawing Understanding System, currently under development. Textbox extraction is preceded by orthogonal zig-zag vectorization, arc segmentation, and arrowhead recognition. It is done by clustering the remaining short bars that are located close to each other through a region growing process. Further refinements follow which improve the ability of the process to outline the real textboxes.
Document understanding using layout styles of title page images
Louis H. Sharpe II, Basil Manns
An important problem in the application of compound document architectures is the input of data from raster images. One technique is to use visual, syntactic cues found in the layout of the raster document to infer its logical structure or semantics. Another is to use context derived from characters recognized within a given block of raster data. Both character- and image- based information are considered here. A well-constrained environment is defined for use in developing rules that can be applied to basic book title page understanding. This paper identifies the attributes of title page layout objects which aid in mapping them into the fields of a simple bibliographic format. Using as input the raster images of the title page and the verso of the title page along with the ASCII output of a generic character recognition engine from these same images, a system of rules is defined for generating a marked-up text wherein key bibliographic fields may be identified.
Segmenting handwritten text lines into words using distance algorithms
Giovanni Seni, Edward Cohen
This paper explores different distance algorithms that can group connected components of a handwritten text line into words. A binarized handwritten text image normally consists of many connected components, where each component is a character fragment, an isolated character, or a group of characters. When the writing style is unconstrained, recognition of individual components is unreliable, so the components must be grouped into words before recognition algorithms (which may require dictionaries) can be used. Algorithms that compute the distance between connected components can indicate how the connected components should be clustered into words. We show that fast straightforward distance algorithms (such as using the horizontal distance between the components' bounding boxes) have mediocre performance. Euclidean distance algorithms perform well but are computationally slow. This paper describes original methods of computing distances. These algorithms include combining a set of horizontal distances between components (applied to each pixel row) with the Euclidean and bounding box methods to achieve high performance and reasonable speed. We examine six distance algorithms and each is tested on unconstrained handwritten address images.
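To make the distance notions concrete, here is a simplified sketch of two of the measures discussed: the horizontal gap between bounding boxes and a row-wise gap combined over shared pixel rows. The combination rules actually evaluated in the paper are not reproduced.

def bbox_gap(box_a, box_b):
    # Horizontal gap between two bounding boxes given as (x_min, x_max); 0 if they overlap.
    (a0, a1), (b0, b1) = box_a, box_b
    return max(b0 - a1, a0 - b1, 0)

def rowwise_gap(comp_a, comp_b, height):
    # For each pixel row shared by the two components, take the horizontal gap
    # between their nearest pixels, then combine (here: minimum over rows).
    # comp_a, comp_b are lists of (row, col) foreground pixels.
    gaps = []
    for r in range(height):
        cols_a = [c for (row, c) in comp_a if row == r]
        cols_b = [c for (row, c) in comp_b if row == r]
        if cols_a and cols_b:
            gaps.append(min(abs(ca - cb) for ca in cols_a for cb in cols_b))
    return min(gaps) if gaps else None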
Evaluation system for handwritten characters
Takahito Kato
One of the most difficult problems in handwritten Kanji (Chinese) character recognition is the variety of writing styles. In Japan, children are taught to write beautiful characters in elementary school. However, the shape of handwritten characters can vary in such proportions that the recognition rate greatly depends on character quality. Therefore, developing an automatic evaluation system for handwritten characters is very effective in improving our writing and also helps to build an accurate character recognition system. For this purpose, a real-time evaluation system consisting of a new algorithm for evaluating handwritten character quality has been developed. This objective evaluation is carried out by assuming that character quality can be expressed as a linear function of quantitative measures. Such measures were derived from the results of an analysis of character evaluation factors. The scanner digitizes handwritten characters and the workstation calculates their quality. The accuracy of this system was confirmed by comparing it to subjective evaluation by human observers. The possibility of objective evaluation of handwritten character quality was demonstrated.
Recognition of characteristic symbols in engineering drawings
Chan Pyng Lai, Rangachar Kasturi
An algorithm to recognize characteristic symbols in engineering drawings is presented. Characteristic symbols help in the classification of dimensioning text into one of the following four classes: linear, angular, diametric, and radial. Recognition of characteristic symbols is an important tool for the extraction of dimension-sets in a machine drawing understanding system. Text in drawings is used to provide dimensioning and tolerance information, and to describe part details such as material, finish, etc. A text/graphics separation algorithm followed by a post-processing step is used to separate potential dimensioning text from graphics. The characteristic symbol in each dimensioning text block is recognized by the algorithm described here. This algorithm includes extraction of end points and critical points for each character/symbol in the text block, and a rule-based recognition system based on simple heuristics. Experimental results are presented.
Character Recognition I
Estimation of linear stroke parameters using iterative total least squares methods
Jan A. Van Mieghem, Hadar I. Avi-Itzhak, Roger D. Melen
In this paper we present an algorithm to enhance the accuracy of the estimation of the parameters of linear stroke segments in a two-dimensional printed character image. The algorithm achieves high accuracy in less computational time than most traditional methods. It is invariant under rotation and translation, and no a priori information about the image is required. The Iterative Total Least Squares (ITLS) method begins at a randomly assigned initial approximation of the line parameters. A rectangular window is centered using the current stroke approximation, and a new line estimate is generated by making a total least squares fit through the pixels contained within the window. This is then repeated until convergence is reached. Adaptive adjustments of the window size and choice of profile can further improve the obtained accuracy. In addition, a 'fast' ITLS method has been developed.
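A minimal sketch of the ITLS idea as described: a total least squares fit (here via SVD) over the pixels inside a rectangular window that is re-centered on the current line estimate. Window dimensions, iteration count, and the convergence test are illustrative assumptions.

import numpy as np

def tls_line(points):
    # Total least squares fit: returns (centroid, unit direction). The direction is
    # the principal axis of the points, minimizing squared perpendicular distances.
    pts = np.asarray(points, dtype=float)
    c = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - c, full_matrices=False)
    return c, vt[0]

def iterative_tls(pixels, init_c, init_d, half_len=20.0, half_width=3.0, iters=10):
    # Re-center a rectangular window on the current line estimate, keep the pixels
    # inside it, refit by TLS, and repeat (a fixed number of iterations here).
    pts = np.asarray(pixels, dtype=float)
    c, d = np.asarray(init_c, float), np.asarray(init_d, float)
    for _ in range(iters):
        rel = pts - c
        along = rel @ d
        across = rel @ np.array([-d[1], d[0]])
        inside = (np.abs(along) <= half_len) & (np.abs(across) <= half_width)
        if inside.sum() < 2:
            break
        c, d = tls_line(pts[inside])
    return c, d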
Gray-scale character recognition using boundary features
Stephen W. Lam, Anthony C. Girardin, Sargur N. Srihari
Optical character recognition (OCR) traditionally applies to binary-valued imagery, although text is always scanned and stored in gray scale. Binarization of a multivalued image may remove important topological information from characters and introduce noise to the character background. Low quality imagery, produced by poor-quality print and improper image lift, magnifies the shortcomings of this process. A character classifier is proposed to recognize gray-scale characters by extracting structural features from character outlines. A fast local contrast based gray-scale edge detector has been developed to locate character boundaries. A pixel is considered an edge pixel if its gray value is below a threshold and it has a neighbor whose gray value is above the threshold. Edges are then thinned to one pixel wide. Extracting structural features from edges is performed by convolving the edges with a set of feature templates. Currently, 16 features, such as strokes, curves, and corners, are considered. Extracted features are compressed to form a binary vector of 576 features, which is used as input to a classifier. This approach is being tested on machine-printed characters which are extracted from mail address blocks. Characters are sampled at 300 ppi and quantized with 8 bits. Experimental results also demonstrate that recognition rates can be improved by enhancing image quality prior to boundary detection.
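The edge rule quoted above translates directly into a few lines; this sketch assumes an 8-connected neighborhood and omits the subsequent thinning and feature-template convolution.

import numpy as np
from scipy import ndimage

def boundary_pixels(gray, threshold):
    # A pixel is an edge pixel if its gray value is below the threshold and
    # at least one 8-neighbor is above the threshold.
    dark = gray < threshold
    bright = ~dark
    has_bright_neighbor = ndimage.maximum_filter(bright.astype(np.uint8), size=3).astype(bool)
    return dark & has_bright_neighbor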
Morphological approach to machine-printed character recognition: a feasibility study
Radovan V. Krtolica, Brian Warner
A connected skeleton is obtained from the image of a character on a rectangular grid with binary values, by use of discrete morphological processing with respect to a set of structuring elements. The skeletonizing procedure is based on sequential thinning, a well known morphological operation involving multiple use of the hit-miss transform. However, the sequence of thinning operations is carefully chosen to provide for the robustness of the resulting skeleton and its characteristic points (intersections and extremes), which are identified subsequently. Connectivity graphs of the intersections and extremes of a character image are an affine invariant feature useful for character recognition. Ambiguities in character classification based on this feature are due to the fact that the graph adjacency (connectivity) matrix does not tell the difference between characteristic points connected by a stroke representing a straight line and a stroke representing a curve. An orthogonal fitting technique is proposed that discriminates between curved and straight strokes. Straight lines are then represented by graph edges, while the curves are replaced by a few additional mutually connected graph vertices. Experimental results show good discrimination properties of the extended connectivity graphs on 12-point Courier font characters.
Recognition of poorly printed text by direct extraction of features from gray scale
Theo Pavlidis, Li Wang, Jiangying Zhou, et al.
Omnifont optical character recognition proceeds by computing features on the input image and then classifying the image. Past omnifont optical character recognition techniques that use features have always binarized the image by comparing the brightness of an input pixel with a threshold level, labeling it as 'black' or 'white', and then computing the features for each character. However, for poorly printed text such binarization results in broken or merged characters and consequently incorrect features. We propose a method for computing geometrical features, such as strokes, directly from the gray-scale image. To this end we use a model of the image forming process, namely the convolution of the original binary image with the point spread function of the digitizer. We also estimate how printing distortions and noise affect the result so that we can deduce how different parts of a printed character should appear under those conditions. Detected features are then clustered for each set of samples of the training set. The clustering guides the selection of prototypes, and the final classification is made by graph matching between prototypes and new (unknown) characters.
Noisy Hangul character recognition with fuzzy tree classifier
Seong-Whan Lee
Decision trees have been applied to solve a wide range of pattern recognition problems. In a tree classifier, a sequence of decision rules is used to assign an unknown sample to a pattern class. The main advantage of a decision tree over a single stage classifier is that the complex global decision making process can be divided into a number of simpler and local decisions at different levels of the tree. At each stage of the decision process, the feature subset best suited for that classification task can be selected. It can be shown that this approach provides better results than the use of the best feature subset for a single decision classifier. In addition, in large set problems where the number of classes is very large, the tree classifier can make a global decision much more quickly than the single stage classifier. However, a major weak point of a tree classifier is its error accumulation effect when the number of classes is very large. To overcome this difficulty, a fuzzy tree classifier with the following characteristics is implemented: (1) fuzzy logic search is used to find all 'possible correct classes,' and some similarity measures are used to determine the 'most probable class;' (2) global training is applied to generate extended terminals in order to enhance the recognition rate; (3) both the training and search algorithms have been given a lot of flexibility, to provide tradeoffs between error and rejection rates, and between the recognition rate and speed. Experimental results for the recognition of the 520 most frequently used noisy Hangul character categories revealed a very high recognition rate of 99.8 percent and a very high speed of 100 samples/sec, when the program was written in C and run on a general purpose SUN4 SPARCstation.
Regression approach to combination of decisions by multiple character recognition algorithms
Tin Kam Ho, Jonathan J. Hull, Sargur N. Srihari
A regression method is proposed to combine decisions of multiple character recognition algorithms. The method computes a weighted sum of the rank scores produced by the individual classifiers and derives a consensus ranking. The weights are estimated by a logistic regression analysis. Two experiments are discussed where the method was applied to recognize degraded machine-printed characters and handwritten digits. The results show that the combination outperforms each of the individual classifiers.
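A hedged sketch of the combination scheme: rank scores from the individual classifiers are combined by a weighted sum whose weights come from a logistic regression fit. The feature layout and the use of scikit-learn here are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_combiner(rank_scores, y):
    # rank_scores: (n_samples, n_classifiers) rank scores for candidate classes;
    # y: 1 if the candidate class is the true class, 0 otherwise.
    model = LogisticRegression()
    model.fit(rank_scores, y)
    return model

def consensus_ranking(model, candidate_scores):
    # candidate_scores: (n_candidates, n_classifiers) rank scores for one test image.
    # Returns candidate indices sorted by the combined (weighted-sum) score.
    combined = candidate_scores @ model.coef_.ravel()
    return np.argsort(-combined)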
Use of a priori knowledge for character recognition
Gilles Houle, Kie Bum Eom
Research and applications in recognition of machine-printed characters have been active for more than 30 years. It is believed to be a solved problem when the quality of the text is acceptable (minimal fragmented or touching characters). However, when characters are broken or touching, character segmentation still poses a challenge in machine reading. Yet, humans are capable of recognizing very degraded characters with and without context. This paper presents a system for recognizing degraded machine-printed characters. This system relies on a priori knowledge of character shapes. Because classification performance is strongly dependent on the input feature set, this paper focuses on the creation of features from curvature estimation. A set of 'clean' characters from multiple fonts was used to emphasize our belief that a set of clean characters can be used to build an inference engine to recognize noisy characters. The word 'noisy' is used in the generic sense to indicate variations not found in the training set, such as shape variation of new fonts, or broken and touching characters. In addition, we present some concepts on the design of an inference engine that can recognize very degraded images. The inference engine is an assembly of networks (neural and knowledge-based) in which each network stores a flexible representation of a character. The topological features used allow approximate matching for position, direction, curvature, and piecewise incomplete strokes. An image begins to be recognized as soon as some features are detected. Excited networks help to focus attention on uncovering, reinforcing, or ignoring neighboring features. Ultimately, the networks' activities are stabilized, and the output consists of a ranked list of possible candidates.
Character Recognition II
Character recognition: a unified approach
Nassrin Tavakoli
Three primary processes are utilized in most pattern recognition systems. (1) The representation process in which the raw digitized data is mapped into a higher level form. (2) The storage process which generates a known base containing the high-level representation of all known patterns in a problem domain. (3) The identification process which classifies the unknown pattern, given its high-level representation and the known base. This paper introduces a new and unified approach to character recognition in which the generation of the known base and the identification process are independent of the representation process. The concept of learning is applied where additional known data can be added to the known base thus allowing for better recognition. The concept of fuzziness is applied to the identification process to find the best match. Algorithms have been developed and applied to digitized characters. The processes of representation, storage, and identification used for this implementation along with results are discussed.
One view of the methodology in handwriting character recognition
Leon A. Pintsov
The paper deals with the general methodology of structural handwriting character recognition systems. It is based on personal observations in developing such systems for commercial and postal applications. A model of handwriting is formulated as a human-to-human communication model, and various implications of this model for handwriting recognition algorithms are considered, including optimal criteria for digitization, the size and representativeness of the training database, and the ultimate performance level. Arguments are given in favor of considering handwritten characters as a general class of geometrical curves bounded by some natural constraints, as opposed to the considerably redundant character shapes typical of machine printed characters. This approach favors a generative model of character formation rather than a transformative model and leads to a natural description of character shapes. Features and feature selection criteria are presented based on psychophysiological and information-theoretic ideas. The significance of the classification technique is examined. The advantages and limitations of artificial neural networks for handwritten character recognition are briefly discussed. Various aspects of the complexity of character recognition algorithms are exposed.
Training feed-forward neural networks using conjugate gradients
James L. Blue, Patrick J. Grother
Neural networks for optical character recognition are still being trained using back propagation, even though conjugate gradient methods have been shown to be much faster. Most multilayer perceptron network training results in the literature are obtained for small and unrealistic problems or from data sets that are proprietary and not available for comparison testing. We present results on a large realistic pattern set containing 2000 training and 1434 testing exemplars. Each pattern is composed of 32 Gabor coefficients obtained from a 32 by 32 pixel binary image of a handwritten digit segmented from the NIST Handwriting Image Data Base. These sets are believed to have approximately 1% segmentation errors. Comparative results for Moller's scaled conjugate gradient method and for standard back propagation are presented for runs on a serial scientific workstation and a highly parallel computer. Typical training on a network with 32 inputs, 32 hidden nodes, and 10 output nodes gives 98% recognition for the training set and 95% for the test set. Training with conjugate gradients requires fewer than 200 iterations; times are about 20 to 40 minutes on a scientific workstation and 6 minutes on the highly parallel computer. Testing (classification) is done at the rate of 600 to 1600 patterns per second on the scientific workstation and the highly parallel computer, respectively. These results suggest that commercial handwritten character recognition systems with great economic potential are feasible.
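For readers unfamiliar with conjugate-gradient training, the sketch below minimizes the cross-entropy loss of a 32-32-10 network with a general-purpose CG optimizer and an analytic gradient. It is not Moller's scaled conjugate gradient method, and the data are random placeholders rather than Gabor coefficients.

import numpy as np
from scipy.optimize import minimize

def unpack(theta, n_in=32, n_hid=32, n_out=10):
    i = 0
    W1 = theta[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b1 = theta[i:i + n_hid]; i += n_hid
    W2 = theta[i:i + n_hid * n_out].reshape(n_hid, n_out); i += n_hid * n_out
    b2 = theta[i:i + n_out]
    return W1, b1, W2, b2

def loss_and_grad(theta, X, Y):
    # Forward pass (tanh hidden layer, softmax output) and analytic back-propagated gradient.
    W1, b1, W2, b2 = unpack(theta)
    H = np.tanh(X @ W1 + b1)
    Z = H @ W2 + b2
    Z -= Z.max(axis=1, keepdims=True)
    P = np.exp(Z); P /= P.sum(axis=1, keepdims=True)
    n = X.shape[0]
    loss = -np.log(P[np.arange(n), Y]).mean()
    dZ = P.copy(); dZ[np.arange(n), Y] -= 1; dZ /= n
    dW2 = H.T @ dZ; db2 = dZ.sum(0)
    dH = dZ @ W2.T * (1 - H * H)
    dW1 = X.T @ dH; db1 = dH.sum(0)
    return loss, np.concatenate([dW1.ravel(), db1, dW2.ravel(), db2])

rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 32)), rng.integers(0, 10, 200)   # placeholder training data
theta0 = rng.normal(scale=0.1, size=32 * 32 + 32 + 32 * 10 + 10)
res = minimize(loss_and_grad, theta0, args=(X, y), jac=True, method="CG",
               options={"maxiter": 200})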
Comparison of neural network classifiers for optical character recognition
Thomas E. Baker, Hal McCartor
The recognition of handwritten characters is an important technology for document processing and for advanced user interfaces. Recent advances in artificial neural network (ANN) classifiers have shown impressive pattern recognition results when using noisy data. One advantage of ANN algorithms is that they are parallel by design, which allows a natural implementation on high-speed parallel architectures. The availability of standard databases of handwritten characters permits a fair comparison between different OCR classifiers. This paper compares the classification performance of two popular ANN algorithms: Back Propagation and Learning Vector Quantization. A set of digits from the National Institute of Standards and Technology's Handwritten Database is used to test the two classifiers. Each algorithm's execution time and memory efficiency is also compared, based on an implementation for Adaptive Solutions' highly parallel CNAPS architecture. We also show that a fair comparison cannot be made between OCR research efforts that do not use the same set of characters for testing.
Automated optical recognition of degraded handwritten characters
Emade Darwiche, Abhijit S. Pandya, Anil D. Mandalia
This paper reports on a new approach in the field of automated optical recognition of handwritten characters. The approach combines geometrical and topological features, the distribution of points, and an Alopex-based neural network to achieve a high recognition rate. A considerable enhancement in speed is achieved by implementing the process on a compressed image. Distortion-tolerant features along with noise removal and region merging permit the handling of degraded documents and characters. A software implementation of the system, tested on the NIST database, yields a recognition rate of 92.4% for numerals and upper-case letters.
Syntactic neural network for character recognition
Viktor A. Jaravine
This article presents a synergism of syntactic 2-D parsing of images and multilayered, feed-forward network techniques. This approach makes it possible to build a written text reading system with an absolute recognition rate for unambiguous text strings. The Syntactic Neural Network (SNN) is created during the image parsing process by capturing the higher order statistical structure in the ensemble of input image examples. Acquired knowledge is stored in the form of a hierarchical image element dictionary and a syntactic network. The number of hidden layers and neuron units is not fixed and is determined by the structural complexity of the teaching set. The proposed syntactic neuron differs from a conventional numerical neuron by its symbolic input/output and its use of the dictionary for determining the output. This approach guarantees exact recognition of an image that is a combinatorial variation of the images from the training set. The system is taught to generalize and to make stochastic parsings of distorted and shifted patterns. This generalization enables the system to perform continuous incremental optimization of its work. New image data learned by the SNN does not interfere with previously stored knowledge, thus leading to unlimited storage capacity of the network.
Offline recognition of handwritten cursive words
John T. Favata, Sargur N. Srihari
A robust algorithm for offline cursive script recognition is described. The algorithm uses a generate-and-test paradigm to analyze cursive word images. The generate phase of the algorithm intelligently segments the word after analyzing certain structural features present in the word. The test phase determines the most likely character candidates among the segmentation points by using a recognition algorithm trained on generalized cursive letter shapes. In a sense, word recognition is done by sliding a variable sized window across the word looking for recognizable characters and strokes. The output of this system is a list of all plausible interpretations of the word. This list is then analyzed by a two-step contextual post-processor which first matches all of the interpretations to a supplied dictionary using a string matching algorithm. This eliminates the least likely interpretations. The remaining candidates are then analyzed for certain character spatial relationships (local reference line finder) to finally rank the dictionary. The system has the advantage of not requiring explicit word training yet is able to recognize many handwriting styles. This system is being successfully tested on a database of handwritten words extracted from live mail with dictionary sizes of up to 300 words. Planned extensions include developing a multilevel generate-and-test paradigm which can handle any type of handwritten word.
Contextual Analysis, Control Structures, and Parallel Processing
System for line drawings interpretation
L. Boatto, Vincenzo Consorti, Monica Del Buono, et al.
This paper describes an automatic system that extracts information from line drawings in order to feed CAD or GIS systems. The line drawings that we analyze contain interconnected thin lines, dashed lines, text, and symbols. Characters and symbols may overlap with lines. Our approach is based on the properties of the run representation of a binary image, which allow the image to be given a graph structure. Using this graph structure, several algorithms have been designed to identify, directly in the raster image, straight segments, dashed lines, text, symbols, hatching lines, etc. Straight segments and dashed lines are converted into vectors, with high accuracy and good noise immunity. Characters and symbols are recognized by means of a recognizer, specifically developed for this application, designed to be insensitive to rotation and scaling. Subsequent processing steps include an 'intelligent' search through the graph in order to detect closed polygons, dashed lines, text strings, and other higher-level logical entities, followed by the identification of relationships (adjacency, inclusion, etc.) between them. These relationships are further translated into a formal description of the drawing. The output of the system can be used as input to a Geographic Information System package. The system is currently used by the Italian Land Register Authority to process cadastral maps.
Model-based control strategy for document image analysis
Frank Fein, Frank Hoenes
Generally, document analysis and understanding involves many processing steps, like unskewing, segmentation, logical labeling, text recognition, and text analysis. Most of these steps can be subdivided into different tasks depending on the problem-solving methods available. All of the techniques are more or less specialized to certain input, but some are also competitive. As a consequence, a document analysis system incorporating many analysis methods must properly schedule and control these methods to obtain an optimal result. In this paper, we present a model for the control strategy of a document image analysis system, as well as mechanisms for its interpretation, that describes three important aspects: which specialist can be applied to which object in which analysis state. The analysis model comprises all possible sequences of processing steps which are relevant for the analysis tasks. The underlying document architecture supports the analysis specialists with corresponding knowledge and provides a framework for representing the analysis results.
Contextual analysis of machine-printed addresses
Peter B. Cullen, Tin Kam Ho, Jonathan J. Hull, et al.
The assignment of a nine digit ZIP Code (ZIP + 4 Code) to the digital image of a machine printed address block is a problem of central importance in automated mail sorting. This problem is especially difficult since most addresses do not contain ZIP + 4 Codes, and often the information that must be read to match an address to one of the 28 million entries in the ZIP + 4 file is either erroneous, incomplete, or missing altogether. This paper discusses a system for interpreting a machine printed address and assigning a ZIP + 4 Code that uses a constraint satisfaction approach. Words in an address block are first segmented and parsed to assign probable semantic categories. Word images are then recognized by a combination of digit, character, and word recognition algorithms. The control structure uses a constraint satisfaction problem solving approach to match the recognition results to an entry in the ZIP + 4 file. It is shown how this technique can both determine correct responses and compensate for incomplete or erroneous information. Experimental results demonstrate the success of this system. In a recent test on over 1000 machine printed address blocks, the ZIP + 4 encode rate was over 73 percent. This compares to the success rate of current postal OCRs, which is about 45 percent. Additionally, the word recognition algorithm recognizes over 92 percent of the input images (over 98 percent in the top 10 choices).
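A toy sketch of the constraint-satisfaction idea: candidate ZIP + 4 records are eliminated by a hard constraint (here a house-number range) and the survivors are scored against the word recognizer's ranked street candidates. The field names and two-record directory are invented for illustration; the real ZIP + 4 file has 28 million entries and richer structure.

directory = [
    {"street": "MAIN ST", "low": 1, "high": 99, "zip4": "12345-6789"},
    {"street": "MAPLE AVE", "low": 100, "high": 198, "zip4": "12345-4321"},
]

def match_zip4(parsed, street_candidates):
    # parsed: {'number': int}; street_candidates: ranked (street_string, confidence)
    # pairs from word recognition.  Returns the best consistent (zip4, confidence).
    best = None
    for rec in directory:
        if not (rec["low"] <= parsed["number"] <= rec["high"]):
            continue                        # house-number constraint eliminates the record
        for street, conf in street_candidates:
            if street == rec["street"] and (best is None or conf > best[1]):
                best = (rec["zip4"], conf)
    return best

print(match_zip4({"number": 42}, [("MAIN ST", 0.9), ("MAIN RD", 0.4)]))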
Massively parallel implementation of character recognition systems
Michael D. Garris, Charles L. Wilson, James L. Blue, et al.
A massively parallel character recognition system has been implemented. The system is designed to study the feasibility of the recognition of handprinted text in a loosely constrained environment. The NIST handprint database, NIST Special Database 1, is used to provide test data for the recognition system. The system consists of eight functional components. The loading of the image into the system and storing the recognition results from the system are I/O components. In between are components responsible for image processing and recognition. The first image processing component is responsible for image correction for scale and rotation, data field isolation, and character data location within each field; the second performs character segmentation; and the third does character normalization. Three recognition components are responsible for feature extraction and character reconstruction, neural network-based character recognition, and low-confidence classification rejection. The image processing to load and isolate 34 fields on a scientific workstation takes 900 seconds. The same processing takes only 11 seconds using a massively parallel array processor. The image processing components, including the time to load the image data, use 94% of the system time. The segmentation time is 15 ms/character and segmentation accuracy is 89% for handprinted digits and alphas. Character recognition accuracy for medium quality machine print is 99.8%. On handprinted digits, the recognition accuracy is 96% and recognition speeds of 10,100 characters/second can be realized. The limiting factor in the recognition portion of the system is feature extraction, which occurs at 806 characters/second. Through the use of a massively parallel machine and neural recognition algorithms, significant improvements in both accuracy and speed have been achieved, making this technology effective as a replacement for key data entry in existing data capture systems.
Neural network approach to text processing
S. Sunthankar
There is a great need for fast, accurate text retrieval systems to support many intelligent activities. The text search problem can be broken down into two main tasks: database searching and message routing. Database searching consists of searching through a large database of text for certain key words, phrases, or other simple functions of strings. Message routing is classifying incoming messages and sending them to the appropriate 'mail box.' These are actually very similar tasks; both are really just pattern matching tasks. What matters are the methods used. In addition to searching and classifying, it would be desirable to perform other tasks such as inferencing and prediction, so these are discussed briefly. We discuss and compare current leading-edge solutions to this problem and introduce some new ideas based on recent neural network theories and experiments. All text-search and retrieval technology is predicated on the assumption that the semantic content of text can be predicted from its syntactic properties: specifically, the existence, frequency, or absence of certain character strings or words; the relationships and clustering among words and phrases; and the occurrence of particular patterns in particular fields within the document.
IC Inspection and Measurement
Recent advances in inspecting integrated circuits for pattern defects
Virginia H. Brecher, Byron E. Dom
Automatic inspection has become an essential part of manufacturing technology for integrated circuit chips. Three trends today in the geometries of integrated circuits and the chips they comprise have serious implications for inspection. The individual devices are getting smaller, with smallest features on some advanced products already crossing the optical resolution threshold; the chip areas are getting larger; and the chips consist of more layers and undergo more processing steps. Not only are the smallest defects harder to see due to the resolution limit, they are much rarer because the tolerable defect density decreases as chip area increases. This paper addresses automated integrated circuit inspection, surveying recent advances and future challenges. An overview of all inspection operations performed on integrated circuit chips during the manufacturing process is followed by a detailed discussion of pattern defect inspection (PDI) and its unique requirements, such as detection probability, false alarm rate, throughput, and minimum defect size. The core material of the paper consists of a discussion of approaches and systems for PDI, emphasizing recent developments but reviewing older work to set the proper context. Both work reported in the literature and commercial systems are covered.
Minimization of false defect reporting in a patterned silicon wafer inspection system
John R. Dralla, John C. Hoff
The detection of defects in sub-micron semiconductor devices has reached new limits in recent years. Detection sensitivity limits are now at 0.25 microns and are being driven to 0.1 microns for 256M DRAM production. At these levels of defect detection sensitivity, the need for discrimination between true defects and false defects becomes extremely important. Users of such systems cannot afford to manually sort through the large numbers of total reported defects, true and false, to isolate which defects will cause yield loss. This paper discusses a system that has been developed for semiconductor in-process wafer inspection which incorporates a proprietary 'statistical image' technique and other user-adjustable system parameters to minimize the frequency of false positives. The results of this study indicate that this is particularly important for the later stages of device manufacture, where the thin films contain random texture which results in the potential for many false positives.
Wafer examination and critical dimension estimation using scattered light
Richard H. Krukar, Susan M. Gaspar, Scott R. Wilson, et al.
We have applied optical scatter techniques to improve several aspects of microelectronic manufacturing. One technique involves characterizing light scattered from two-dimensional device structures, such as those from VLSI circuitry etched on a wafer, using a frosted dome which is imaged by a CCD camera. Previously, the limited dynamic range available from affordable digital imaging systems has prevented the study of two-dimensional scatter patterns. We have demonstrated a simple technique to increase the dynamic range by combining multiple images taken at different intensities. After the images have been acquired, image processing techniques are used to find and catalog the diffraction orders. Techniques such as inverse least squares, principal component analysis, and neural networks are then used to evaluate the dependence of the light scatter on a particular wafer characteristic under examination. Characterization of surface planarization over a VLSI structure and measurement of line edge roughness of diffraction gratings are presented as examples.
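A generic sketch of extending dynamic range by merging frames acquired at different illumination intensities: for each pixel, the highest-gain unsaturated measurement is kept and rescaled to a common reference. The saturation threshold and merging rule are assumptions, not the authors' procedure.

import numpy as np

def merge_exposures(images, gains, saturation=250):
    # images: list of frames taken at illumination gains relative to a reference.
    # Visit frames from highest gain (best signal for weak diffraction orders) down;
    # pixels saturated in every frame are left at zero.
    merged = np.zeros(images[0].shape, dtype=np.float64)
    filled = np.zeros(images[0].shape, dtype=bool)
    for img, g in sorted(zip(images, gains), key=lambda t: -t[1]):
        ok = (~filled) & (img < saturation)
        merged[ok] = img[ok] / g
        filled |= ok
    return merged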
Wafer pattern inspection using a Coherent Optical Processor
Xian-Yang Cai, Frank Kvasnik
A microscope coherent optical processor (M-COP) has been configured and used to inspect the micro-patterns on a silicon wafer in real time. A technique for the measurement of the scale change of this pattern has been devised. Theoretical and experimental results showing the viability of this technique are presented.
Efficient Fourier image analysis algorithm for aligned rectangular and trapezoidal wafer structures
Chiao-Fe Shu, Ramesh C. Jain
We present an efficient algorithm to compute the critical dimensions of aligned rectangular and trapezoidal wafer structures using images generated by a Fourier imaging system. We show that the Fourier images of aligned rectangular and trapezoidal structures are separable functions. This allows us to project them onto the x and y coordinates and simplifies the computation process. We compute the critical dimensions of rectangular structures by estimating the distance between either peaks or zeros, and of trapezoidal structures by estimating the distance between zeros. For each projected one-dimensional signal, we apply a zero-crossing technique to find the peaks or zeros, and then compute the critical dimensions at sub-pixel resolution.
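A minimal sketch of the projection and zero-crossing steps: the separable Fourier image is projected onto the x and y axes, zeros are located to sub-pixel accuracy by linear interpolation at sign changes, and a critical dimension is taken from the spacing of successive zeros. The interpolation rule stands in for the paper's zero-crossing technique.

import numpy as np

def projected_profiles(fourier_img):
    # Project the 2-D Fourier image onto the x and y coordinates.
    return fourier_img.sum(axis=0), fourier_img.sum(axis=1)

def zero_crossings(signal):
    # Sub-pixel zero locations via linear interpolation at sign changes.
    s = np.asarray(signal, dtype=float)
    idx = np.where(np.sign(s[:-1]) * np.sign(s[1:]) < 0)[0]
    return idx + s[idx] / (s[idx] - s[idx + 1])

def critical_dimension(signal, pixel_pitch):
    # Estimate a dimension from the mean spacing of successive zeros.
    z = zero_crossings(signal)
    return np.diff(z).mean() * pixel_pitch if len(z) > 1 else None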
Ancillary Techniques
Calibration, setup, and performance evaluation in an IC inspection system
Byron E. Dom, Virginia H. Brecher
Many papers on automatic inspection systems ignore the issues of calibration, setup, and performance evaluation, assuming (apparently) that they merely involve 'straightforward engineering.' In reality, developing effective and robust procedures and algorithms to implement these features can be a demanding process. In fact, unbeknownst to the developers or users, the performance of many inspection systems could be significantly improved through better setup and calibration routines. In this tutorial paper we discuss both theoretical and practical issues. We start by reviewing the statistical framework underlying performance evaluation. Next we examine possible sources of inspection performance degradation. Last, we describe calibration, setup, and performance evaluation procedures and associated image analysis algorithms for an automated IC inspection system. While these procedures are specific to a particular system, we attempt to generalize them wherever possible.
Three-dimensional inspection of integrated circuits: a depth from focus approach
Xavier Binefa, Jordi M. Vitria, Juan Jose Villanueva
Optical inspection of integrated circuit images is an application of image understanding which is becoming an important field in the microelectronics industry. When the inspection is performed by optical microscopy, the small depth of field of the optical system causes focusing problems when structures with large depth variations are imaged. We propose a method which takes advantage of this effect to make a depth map of the circuit surface. The method is based on the analysis of defocusing in a series of images obtained by continuously varying the distance between the circuit surface and the center of the optical system. Morphological algorithms working in three-dimensional space are used to ensure the coherence of the results.
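For comparison, a conventional depth-from-focus sketch: a per-pixel focus measure is evaluated across the image stack and the depth is the position of the sharpest frame. The Laplacian-based focus measure and window size are generic assumptions; the paper instead applies morphological algorithms to the three-dimensional stack.

import numpy as np
from scipy import ndimage

def depth_from_focus(stack, z_positions, win=9):
    # stack: list of gray-level frames taken at the distances in z_positions.
    focus = []
    for img in stack:
        lap = ndimage.laplace(img.astype(np.float64))
        focus.append(ndimage.uniform_filter(lap * lap, size=win))  # local sharpness
    best = np.argmax(np.stack(focus, axis=0), axis=0)   # index of sharpest frame per pixel
    return np.asarray(z_positions)[best]                # map frame index to physical depth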
CCD photoresponse calibration and contrast adjustment for reliable material discrimination in the inspection of electronic packages
Inspection of complex electronic packages requires discrimination between the various materials used in such packages. Variations in the appearance of these materials and in the equipment's illumination complicate the segmentation process. In addition, some materials have similar reflectance and absorption characteristics. As a result, the segmentation process is sensitive to small variations in the illumination settings, photoresponse nonuniformity, and contrast fluctuations. In this paper, we present two techniques that reduce these variations: (1) a new method to calibrate and correct the photoresponse characteristics of optical inspection systems, and (2) a method to automatically correct for contrast variations between the inspected packages. This results in a more consistent appearance of the packaging materials used, which in turn results in improved segmentation performance. The photoresponse correction procedure models the output of each photosite as a linear function of input illumination, and the parameters of the model are measured. The response is corrected using image processing hardware. Experimental results show that the nonuniformity is corrected to within +/- 1% of the A/D dynamic range, which agrees with the error analysis. The contrast adjustment method adjusts the image contrast based on histogram features, using vendor and custom-developed hardware. The relationship between the two techniques is also discussed.
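A sketch of the per-photosite linear model and its correction, which amounts to classic dark-frame/flat-field correction; the paper applies an equivalent correction in image processing hardware, and the frame names here are assumptions.

import numpy as np

def calibrate_photoresponse(dark_frame, flat_frame):
    # Per-photosite model: response = gain * illumination + offset.
    # offset comes from a dark frame, gain from a uniformly illuminated frame.
    offset = dark_frame.astype(np.float64)
    gain = flat_frame.astype(np.float64) - offset
    gain[gain <= 0] = 1.0          # guard against dead photosites
    return offset, gain

def correct(raw, offset, gain):
    # After correction every photosite reports the same value for the same illumination.
    scale = gain.mean()            # restore the original intensity scale
    return np.clip((raw.astype(np.float64) - offset) / gain * scale, 0, 255)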
Alignment mark detection using signed-contrast gradient edge maps
John Raymond Jordan III
Fast, accurate alignment plays an important role in microelectronics manufacturing. As a result, the automation of alignment mark detection via image processing algorithms has been widely investigated. Algorithms for automatic alignment mark detection traditionally fall into one of two classes -- correlation-based or contour-based. Despite considerable success in these investigations, current algorithms are often time-consuming and sensitive to changes in illumination and rotation. In this paper, a novel algorithm for detecting alignment marks in grey-level images is presented. The fundamental advantage of this algorithm is the use of signed-contrast gradient edge maps, which represent intensity edges in terms of locally normalized intensity gradients. Appropriate use of these signed-contrast gradient edge maps yields alignment mark detection which is faster and more robust than traditional methods such as normalized cross-correlation. The efficacy of this algorithm is demonstrated by applying it to the automatic detection and location of alignment marks for semiconductor wafer alignment and directly comparing its performance against that of normalized cross-correlation in terms of accuracy, invariance to changes in mark rotation, and algorithm speed.
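A rough sketch of what a signed-contrast gradient edge map might look like: gradient components are normalized by a local contrast estimate and reduced to their signs, and a mark template of signs is matched by counting agreements. The normalization, threshold, and matching score are assumptions standing in for the paper's definitions.

import numpy as np
from scipy import ndimage

def signed_contrast_map(gray, win=15, k=1.0):
    # Keep only the sign (-1, 0, +1) of gradients that exceed k times the local contrast.
    g = gray.astype(np.float64)
    gx = ndimage.sobel(g, axis=1)
    gy = ndimage.sobel(g, axis=0)
    local_std = np.sqrt(np.maximum(
        ndimage.uniform_filter(g * g, win) - ndimage.uniform_filter(g, win) ** 2, 1e-6))
    sx = np.where(np.abs(gx) > k * local_std, np.sign(gx), 0).astype(np.int8)
    sy = np.where(np.abs(gy) > k * local_std, np.sign(gy), 0).astype(np.int8)
    return sx, sy

def match_score(map_region, template):
    # Count sign agreements between an image region and a mark template of equal shape;
    # cheaper and less illumination-sensitive than normalized cross-correlation.
    return int(np.sum(map_region == template))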
Packaging
Advanced Via Inspection Tool
Douglas Y. Kim, Kurt Muller, Lawrence D. Thorp, et al.
The AVIT inspects vias on unfired ceramic layers used in microelectronic packaging for paste fill, depth, spacing, and size. Four shadow images from different angles and one vertical image determine the fill status of 100 µm and 89 µm vias in an area of 160 mm by 160 mm. Each via is filled with a conductive paste. The cycle time to inspect a greensheet is 3.75 seconds. The paste depth is determined by the area of shadow given by each oblique direction; the greater area of shadow is used to determine via depth. In order to achieve the cycle time of 3.75 seconds, we incorporate a laser scanner rotating at 28 rpm, a telecentric scan lens, and dedicated signal processing; one of the electronic racks uses ECL running at 87 MHz. The other electronics are predominantly Fast-TTL.
Automated vision system for inspection of wedge bonds
Koduri K. Sreenivasan, Mandayam D. Srinath, Alireza R. Khotanzad
One of the problems in increasing reliability in the manufacture of integrated circuit devices is inspection of the bonds connecting the bond pads to the lead fingers to the device. The continuing increase in packing density of VLSI circuits requires that the inspection process be completely automated. Here we present a method for visual inspection of bonds which is intended to automatically extract parameters of significance in determining their quality froni two-dimensional images taken from the top of the IC wafer.