Proceedings Volume 6384

Intelligent Robots and Computer Vision XXIV: Algorithms, Techniques, and Active Vision

David P. Casasent, Ernest L. Hall, Juha Röning
View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 1 October 2006
Contents: 8 Sessions, 35 Papers, 0 Presentations
Conference: Optics East 2006
Volume Number: 6384

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Intelligent Robots: Invited Papers I
  • Intelligent Robots: Invited Papers II
  • Visualization, Modeling, and Human Interaction in Intelligent Robots
  • Pattern Recognition and Applications in Intelligent Robots
  • Autonomous Robots I
  • Autonomous Robots II
  • Intelligent Ground Vehicle Competition (IGVC)
  • Poster Session
Intelligent Robots: Invited Papers I
Automated synthesis of distortion-invariant filters: AutoMinace
David Casasent, Rohit Patnaik
This paper presents our automated filter-synthesis algorithm for the minimum noise and correlation energy (MINACE) distortion-invariant filter (DIF). We discuss use of this autoMinace filter in face recognition and automatic target recognition (ATR), in which we consider both true-class object classification and rejection of non-database objects (impostors in face recognition and confusers in ATR). We use at least one Minace filter per object class to be recognized; a separate Minace filter or a set of Minace filters is synthesized for each object class. The Minace parameter c trades off distortion tolerance (recognition) against discrimination (impostor/confuser/clutter rejection) performance. Our automated Minace filter-synthesis algorithm (autoMinace) automatically selects the Minace filter parameter c and selects the training set images to be included in the filter, so that we achieve both good recognition and good impostor/confuser and clutter rejection performance; this is achieved using a training and validation set. No impostor/confuser, clutter or test set data is present in the training or validation sets. Use of the peak-to-correlation energy (PCE) ratio is found to perform better than the correlation peak height metric. The use of circular versus linear correlations is addressed; circular correlations require less storage and fewer online computations and are thus preferable. Representative test results for three different databases - visual face, IR ATR, and SAR ATR - are presented. We also discuss an efficient implementation of Minace filters for detection applications, where the filter template is much smaller than the input target scene.
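As a rough illustration of the peak-to-correlation energy metric mentioned above, the sketch below computes a PCE ratio using a circular correlation implemented in the frequency domain. It is a minimal sketch, not the authors' autoMinace code; the function names and the exact energy normalization are assumptions.

```python
import numpy as np

def pce_circular(image, filt):
    """Peak-to-correlation-energy (PCE) ratio via circular correlation.

    Minimal sketch (not the paper's implementation): the filter is applied
    in the frequency domain, which implicitly yields the circular
    correlation that the paper prefers for its lower storage and compute cost.
    """
    F = np.fft.fft2(image)
    H = np.fft.fft2(filt, s=image.shape)          # zero-pad filter to scene size
    corr = np.real(np.fft.ifft2(F * np.conj(H)))  # circular correlation plane
    peak = corr.max()
    energy = np.mean(corr ** 2)                   # average correlation-plane energy
    return peak ** 2 / energy                     # higher = sharper, more reliable peak
```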
Emerging directions in human-robotic space exploration technologies: remote operation of complex systems
Through the Vision for Space Exploration (hereafter, the Vision), announced by President George W. Bush in February 2004, NASA has been chartered to conduct progressively staged human-robotic (H/R) exploration of the Solar System. This exploration includes autonomous robotic precursors that will pave the way for a later durable H/R presence, first at the Moon, then Mars and beyond. We discuss operations architectures and integrative technologies that are expected to enable these new classes of space missions, with an emphasis on open design issues and R&D challenges.
A telepresence robot system realized by embedded object concept
This paper presents the Embedded Object Concept (EOC) and a telepresence robot system which is a test case for the EOC. The EOC utilizes common object-oriented methods used in software by applying them to combined Lego-like software-hardware entities. These entities represent objects in object-oriented design methods, and they are the building blocks of embedded systems. The goal of the EOC is to make the design of embedded systems faster and easier. This concept enables people without comprehensive knowledge of electronics design to create new embedded systems, and for experts it shortens the design time of new embedded systems. We present the current status of a telepresence robot created with second-generation Atomi-objects, which is the name for our implementation of the embedded objects. The telepresence robot is a relatively complex test case for the EOC. The robot has been constructed using incremental device development, which is made possible by the architecture of the EOC. The robot contains video and audio exchange capability and a controlling system for driving with two wheels. The robot is built in two versions, the first consisting of a PC device and Atomi-objects, and the second consisting of only Atomi-objects. The robot is currently incomplete, but most of it has been successfully tested.
Intelligent Robots: Invited Papers II
Pedestrian detection in crowded scenes with the histogram of gradients principle
This paper describes a close-to-real-time, scale-invariant implementation of a pedestrian detector system which is based on the Histogram of Oriented Gradients (HOG) principle. Salient HOG features are first selected from a manually created very large database of samples with an evolutionary optimization procedure that directly trains a polynomial Support Vector Machine (SVM). Real-time operation is achieved by a cascaded 2-step classifier which first uses a very fast linear SVM (with the same features as the polynomial SVM) to reject most of the irrelevant detections and then computes the decision function with a polynomial SVM on the remaining set of candidate detections. Scale invariance is achieved by running the detector of constant size on scaled versions of the original input images and by clustering the results over all resolutions. The pedestrian detection system has been implemented in two versions: i) full body detection, and ii) upper body only detection. The latter is especially suited for very busy and crowded scenarios. On a state-of-the-art PC it is able to run at a frequency of 8 - 20 frames/sec.
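The two-stage idea described above (a cheap linear SVM rejecting most windows, a polynomial SVM verifying the rest) can be sketched as follows. This is an illustrative reconstruction using scikit-image and scikit-learn, not the paper's code; window size, HOG parameters and the rejection threshold are assumptions.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC, SVC

def extract_hog(window):
    # grayscale detection window (e.g. 128x64) -> HOG descriptor
    return hog(window, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))

def train_cascade(windows, labels):
    X = np.array([extract_hog(w) for w in windows])
    fast = LinearSVC().fit(X, labels)                    # stage 1: cheap linear reject
    slow = SVC(kernel="poly", degree=2).fit(X, labels)   # stage 2: polynomial verify
    return fast, slow

def detect(window, fast, slow):
    x = extract_hog(window).reshape(1, -1)
    if fast.decision_function(x)[0] < 0.0:   # most non-pedestrian windows stop here
        return False
    return slow.predict(x)[0] == 1           # remaining candidates get the full SVM
```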
Remote imagery for unmanned ground vehicles: the future of path planning for ground robotics
Remote Imagery for Unmanned Ground Vehicles (RIUGV) uses a combination of high-resolution multi-spectral satellite imagery and advanced commercial off-the-shelf (COTS) object-oriented image processing software to provide automated terrain feature extraction and classification. This information, along with elevation data, infrared imagery, a vehicle mobility model and various meta-data (local weather reports, Zobler Soil map, etc.), is fed into automated path planning software to provide a stand-alone ability to generate rapidly updateable dynamic mobility maps for Manned or Unmanned Ground Vehicles (MGVs or UGVs). These polygon-based mobility maps can reside on an individual platform or a tactical network. When new information is available, change files are generated and ingested into existing mobility maps based on user-selected criteria. Bandwidth concerns are mitigated by the use of shape files for the representation of the data (e.g. each object in the scene is represented by a shape file and thus can be transmitted individually). User input (desired level of stealth, required time of arrival, etc.) determines the priority in which objects are tagged for updates. This paper will also discuss the planned July 2006 field experiment.
New experimental diffractive-optical data on E. Land's Retinex mechanism in human color vision: Part I
A correlator-optical imaging system with three-dimensional nano- and micro-structured diffraction gratings in aperture and in image space allows an adaptive optical correlation of local RGB data in image space with global RGB data (light from overall illumination in the visual field scattered from the aperture of the optical imaging system into image space), diffracted together into reciprocal grating space (photoreceptor space). This correlator-optical hardware seems to be a decisive part of the human eye and leads to new interpretations of color vision and of adaptive color constancy performances in human vision. In Part I, the data available to date and their corresponding interpretation, which together explain paradoxically colored shadows as well as the data from E. Land's Retinex experiments, are described. They will serve as premises for the planned experimental setup. In Part II these premises will be tested experimentally as part of an R&D project now starting, and the results will be described in 2007.
Visualization, Modeling, and Human Interaction in Intelligent Robots
Embodying a cognitive model in a mobile robot
D. Paul Benjamin, Damian Lyons, Deryle Lonsdale
The ADAPT project is a collaboration of researchers in robotics, linguistics and artificial intelligence at three universities to create a cognitive architecture specifically designed to be embodied in a mobile robot. There are major respects in which existing cognitive architectures are inadequate for robot cognition. In particular, they lack support for true concurrency and for active perception. ADAPT addresses these deficiencies by modeling the world as a network of concurrent schemas, and modeling perception as problem solving. Schemas are represented using the RS (Robot Schemas) language, and are activated by spreading activation. RS provides a powerful language for distributed control of concurrent processes. Also, the formal semantics of RS provides the basis for the semantics of ADAPT's use of natural language. We have implemented the RS language in Soar, a mature cognitive architecture originally developed at CMU and used at a number of universities and companies. Soar's subgoaling and learning capabilities enable ADAPT to manage the complexity of its environment and to learn new schemas from experience. We describe the issues faced in developing an embodied cognitive architecture, and our implementation choices.
Visualization of pallets
Roger Bostelman, Tsai Hong, Tommy Chang
The National Institute of Standards and Technology (NIST) has been studying pallet visualization for the automated guided vehicle (AGV) industry. Through a cooperative research and development agreement with Transbotics, an AGV manufacturer, NIST has developed advanced sensor processing and world modeling algorithms to verify pallet location and orientation with respect to the AGV. Sensor processing utilizes two single-scan-line laser-range units on board the AGV. The "Safety" sensor is a safety unit located at the base of a forktruck AGV, and the "Panner" sensor is a panning laser-ranger rotated 90 degrees and mounted on a rotating motor at the top front of the AGV. The Safety sensor, typically used to detect obstacles such as humans, was also used to detect pallets and their surrounding area, such as the walls of a truck to be loaded with pallets. The Panner was used to acquire many scan-lines of range data, which were processed into a 3D point cloud from which the pallet was segmented using a priori, approximate pallet-load or remaining-truck volumes. A world model was then constructed and output to the vehicle for pallet/truck volume verification. This paper explains this joint government/industry project and the results of using LADAR imaging methods.
Indoor environment modeling for interactive robot security application
Sangwoo Jo, Qonita M. Shahab, Yong-Moo Kwon, et al.
This paper presents our simple and easy-to-use method for obtaining a 3D textured model. To express reality, we need to integrate 3D models with real scenes. Most other 3D modeling methods use two data acquisition devices: one for obtaining the 3D model and another for obtaining realistic textures. In such a case, the former would be a 2D laser range-finder and the latter a common camera. Our algorithm consists of building a measurement-based 2D metric map acquired by a laser range-finder, texture acquisition/stitching, and texture-mapping onto the corresponding 3D model. The algorithm is implemented with a laser sensor for obtaining the 2D/3D metric map and two cameras for gathering texture. Our geometric 3D model consists of planes that model the floor and walls. The geometry of the planes is extracted from the 2D metric map data. Textures for the floor and walls are generated from the images captured by two 1394 cameras with a wide field of view. Image stitching and image cutting are used to generate textured images corresponding to the 3D model. The algorithm is applied to two cases: a corridor and a four-walled space such as a room of a building. The generated 3D map model of the indoor environment is exported in VRML format and can be viewed in a web browser with a VRML plug-in. The proposed algorithm can be applied to a 3D model-based remote surveillance system over the WWW.
Biomimetic sensory abstraction using hierarchical quilted self-organizing maps
Jeffrey W. Miller, Peter H. Lommel
We present an approach for abstracting invariant classifications of spatiotemporal patterns presented in a high-dimensionality input stream, and apply an early proof-of-concept to shift and scale invariant shape recognition. A model called Hierarchical Quilted Self-Organizing Map (HQSOM) is developed, using recurrent self-organizing maps (RSOM) arranged in a pyramidal hierarchy, attempting to mimic the parallel/hierarchical pattern of isocortical processing in the brain. The results of experiments are presented in which the algorithm learns to classify multiple shapes, invariant to shift and scale transformations, in a very small (7×7 pixel) field of view.
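The recurrent self-organizing map (RSOM) building block mentioned above can be sketched roughly as below. This is a minimal, assumed reconstruction of a standard RSOM update rule, not the HQSOM code; the 1-D neighborhood, leak rate and learning rate are illustrative.

```python
import numpy as np

class RSOM:
    """Minimal recurrent self-organizing map sketch (not the paper's HQSOM).

    Each unit leaky-integrates its difference to the input, so the best
    matching unit depends on recent temporal context, which is the property
    the hierarchical quilted arrangement exploits for spatiotemporal patterns.
    """
    def __init__(self, n_units, dim, alpha=0.3, lr=0.1, sigma=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(size=(n_units, dim))   # unit weight vectors
        self.y = np.zeros((n_units, dim))          # leaky difference memory
        self.alpha, self.lr, self.sigma = alpha, lr, sigma

    def step(self, x):
        self.y = (1 - self.alpha) * self.y + self.alpha * (x - self.w)
        bmu = int(np.argmin(np.linalg.norm(self.y, axis=1)))
        dist = np.abs(np.arange(len(self.w)) - bmu)        # 1-D neighborhood
        h = np.exp(-dist ** 2 / (2 * self.sigma ** 2))
        self.w += self.lr * h[:, None] * self.y            # move units along y
        return bmu                                         # winning unit = abstraction
```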
A comparison of two- and three-dimensional imaging
Three dimensional visual recognition and measurement are important in many machine vision applications. In some cases, a stationary camera base is used and a three-dimensional model will permit the measurement of depth information from a scene. One important special case is stereo vision for human visualization or measurements. In cases in which the camera base is also in motion, a seven dimensional model may be used. Such is the case for navigation of an autonomous mobile robot. The purpose of this paper is to provide a computational view and introduction of three methods to three-dimensional vision. Models are presented for each situation and example computations and images are presented. The significance of this work is that it shows that various methods based on three-dimensional vision may be used for solving two and three dimensional vision problems. We hope this work will be slightly iconoclastic but also inspirational by encouraging further research in optical engineering.
Pattern Recognition and Applications in Intelligent Robots
Feature optimization and creation of a real time pattern matching system
State of the art algorithms for people or vehicle detection should not only be accurate in terms of detection performance and low false alarm rate, but also fast enough for real time applications. Accurate algorithms are usually very complex and tend to have a lot of calculated features to be used or parameters available for adjustment. So one big goal is to decrease the number of features used for object detection while increasing the speed of the algorithm and overall performance by finding an optimum set of classifier variables. In this paper we describe algorithms for feature selection, parameter optimisation and pattern matching, especially for the task of pedestrian detection based on Histograms of Oriented Gradients and Support Vector Machine classifiers. Shape features were derived with the Histogram of Oriented Gradients algorithm, which resulted in a feature vector of 6318 elements. To decrease computation time to an acceptable limit for real-time detection we reduced the full feature vector to sizes of 1000, 500, 300, 200, and 160 elements with a genetic feature selection method. With the remaining features a Support Vector Machine classifier was built and its classification parameters further optimized to result in fewer support vectors for further improvements in processing speed. This paper compares the classification performance of the different SVMs on real videos (with some sample images), visualizes the chosen features (which histogram bins at which locations in the search window), and analyses the performance of the final system with respect to execution time and frame rate.
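A genetic search over fixed-size feature subsets, as described above, might look roughly like the sketch below. It is an assumed illustration, not the authors' method: the population size, mutation scheme and linear-SVM cross-validation fitness are all placeholder choices.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

def genetic_feature_selection(X, y, n_keep=200, pop=20, gens=30, seed=0):
    """Sketch of a genetic search for a fixed-size HOG feature subset.
    X: (samples x 6318) feature matrix, y: labels."""
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]
    population = [rng.choice(n_feat, n_keep, replace=False) for _ in range(pop)]

    def fitness(subset):
        # cross-validated accuracy of a cheap linear SVM on the subset
        return cross_val_score(LinearSVC(), X[:, subset], y, cv=3).mean()

    for _ in range(gens):
        scores = [fitness(s) for s in population]
        order = np.argsort(scores)[::-1]
        parents = [population[i] for i in order[: pop // 2]]   # keep the best half
        children = []
        for p in parents:
            child = set(p.tolist())
            child.discard(int(rng.choice(p)))        # mutate: drop one feature...
            while len(child) < n_keep:
                child.add(int(rng.integers(n_feat)))  # ...and add a new random one
            children.append(np.array(sorted(child)))
        population = parents + children
    return max(population, key=fitness)               # best surviving subset
```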
Vehicle detection methods for surveillance applications
The efficient monitoring of traffic flow as well as related surveillance and detection applications demand an increasingly robust recognition of vehicles in image and video data. This paper describes two different methods for vehicle detection in real world situations: Principal Component Analysis and the Histogram of Gradients principle. Both methods are described and their detection capabilities as well as advantages and disadvantages are compared. A large sample dataset which contains images of cars from the backside and frontside in day and night conditions is the basis for creating and optimizing both variants of the proposed algorithms. The resulting two detectors allow recognition of vehicles in frontal view ±30° and in views from behind ±30°. The paper demonstrates that both detection methods can operate effectively even under difficult lighting situations with high detection rates and a low number of false positives.
High-quality and small-capacity e-learning video featuring lecturer-superimposing PC screen images
Yoshihiko Nomura, Michinobu Murakami, Ryota Sakamoto, et al.
Information processing and communication technology are progressing quickly, and are prevailing throughout various technological fields. Therefore, the development of such technology should respond to the need for improved quality in e-learning education systems. The authors propose a new video-image compression processing system that ingeniously employs the features of the lecturing scene. While the dynamic lecturing scene is shot by a digital video camera, screen images are electronically stored by PC screen-capture software at relatively long intervals during an actual class. Then, a lecturer and a lecture stick are extracted from the digital video images by pattern recognition techniques, and the extracted images are superimposed on the appropriate PC screen images by off-line processing. Thus, we have succeeded in creating high-quality and small-capacity (HQ/SC) video-on-demand educational content featuring three advantages: high image sharpness, small electronic file capacity, and realistic lecturer motion.
A parallel unmixing algorithm for hyperspectral images
Stefan A. Robila, Lukasz G. Maciak
We present a new algorithm for feature extraction in hyperspectral images based on source separation and parallel computing. In source separation, given a linear mixture of sources, the goal is to recover the components by producing an unmixing matrix. In hyperspectral imagery, the mixing transform and the separated components can be associated with endmembers and their abundances. Source separation based methods have been employed for target detection and classification of hyperspectral images. However, these methods usually involve restrictive conditions on the nature of the results, such as orthogonality of the endmembers (in Principal Component Analysis - PCA - and Orthogonal Subspace Projection - OSP) or statistical independence of the abundances (in Independent Component Analysis - ICA), and they do not fully satisfy all the conditions included in the Linear Mixing Model. Compared to this, our approach is based on the Nonnegative Matrix Factorization (NMF), a less constraining unmixing method. NMF has the advantage of producing positively defined data and, with several modifications that we introduce, also ensures that the abundances sum to one. The endmember vectors and the abundances are obtained through a gradient based optimization approach. The algorithm is further modified to run in a parallel environment. The parallel NMF (P-NMF) significantly reduces the time complexity and is shown to port easily to a distributed environment. Experiments with in-house and Hydice data suggest that NMF outperforms ICA, PCA and OSP for unsupervised endmember extraction. Coupled with its parallel implementation, the new method provides an efficient way for unsupervised unmixing, further supporting our efforts in the development of a real time hyperspectral sensing environment with applications to industry and life sciences.
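To make the NMF-based unmixing idea concrete, the sketch below factors a (bands x pixels) matrix into endmembers and abundances and renormalizes the abundances to sum to one. It is a hedged illustration: the paper uses a gradient-based constrained optimization, whereas this sketch uses the classic Lee-Seung multiplicative updates plus a renormalization step.

```python
import numpy as np

def unmix_nmf(X, n_end, n_iter=200, eps=1e-9, seed=0):
    """Sketch of NMF unmixing with a sum-to-one abundance constraint.

    X: nonnegative (bands x pixels) data. Returns endmember spectra W
    (bands x n_end) and abundances H (n_end x pixels).
    """
    rng = np.random.default_rng(seed)
    bands, pixels = X.shape
    W = rng.random((bands, n_end)) + eps
    H = rng.random((n_end, pixels)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)       # multiplicative abundance update
        W *= (X @ H.T) / (W @ H @ H.T + eps)       # multiplicative endmember update
        H /= H.sum(axis=0, keepdims=True) + eps    # abundances sum to one per pixel
    return W, H
```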
Autonomous Robots I
An embedded vision system for an unmanned four-rotor helicopter
Kirt Lillywhite, Dah-Jye Lee, Beau Tippetts, et al.
In this paper an embedded vision system and control module is introduced that is capable of controlling an unmanned four-rotor helicopter and processing live video for various law enforcement, security, military, and civilian applications. The vision system is implemented on a newly designed compact FPGA board (Helios). The Helios board contains a Xilinx Virtex-4 FPGA chip and memory, making it capable of implementing real time vision algorithms. A Smooth Automated Intelligent Leveling daughter board (SAIL), attached to the Helios board, collects attitude and heading information to be processed in order to control the unmanned helicopter. The SAIL board uses an electrolytic tilt sensor, compass, voltage level converters, and analog to digital converters to perform its operations. While level flight can be maintained, problems stemming from the characteristics of the tilt sensor limit the maneuverability of the helicopter. The embedded vision system has proven to give very good results in its performance of a number of real-time robotic vision algorithms.
A framework for autonomy
The development of autonomous planning and control system software often results in a custom design concept and software specific to a particular control application. This paper describes a software framework for orchestrating the planning and execution of autonomous activities of an unmanned vehicle, or a group of cooperating vehicles, that can apply to a wide range of autonomy applications. The framework supports an arbitrary span of autonomous capability, ranging from simple low level tasking, requiring much human intervention, to higher level mission-oriented tasking, requiring much less. The approach integrates the four basic functions of all intelligent devices or agents (plan development, plan monitoring, plan diagnosing, and plan execution) with the mathematical discipline of hierarchical planning and control. The result is a domain-independent software framework, to which domain-dependent modules for planning, monitoring, and diagnosing are easily added. This framework for autonomy, combined with the requisite logic for vehicle control, can then be deployed to realize the desired level of autonomous vehicle operation.
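One way to read the plan develop/monitor/diagnose/execute separation described above is as a small skeleton like the one below. This is an interpretation for illustration only, not the authors' interfaces; all class and method names are assumptions.

```python
from abc import ABC, abstractmethod

class PlanningDomain(ABC):
    """Domain-dependent plug-ins: the framework stays domain-independent
    while modules like these are swapped in per vehicle (hypothetical API)."""
    @abstractmethod
    def develop_plan(self, goal, state): ...     # plan development
    @abstractmethod
    def monitor(self, plan, state): ...          # plan monitoring: list of anomalies
    @abstractmethod
    def diagnose(self, anomalies, state): ...    # plan diagnosing: revised goal or None

def autonomy_loop(domain, executor, goal, sense):
    """Generic plan / execute / monitor / diagnose cycle."""
    state = sense()
    plan = domain.develop_plan(goal, state)
    for step in plan:
        executor(step)                           # plan execution
        state = sense()
        anomalies = domain.monitor(plan, state)
        if anomalies:
            new_goal = domain.diagnose(anomalies, state)
            if new_goal is not None:             # replan against the revised goal
                return autonomy_loop(domain, executor, new_goal, sense)
    return state
```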
Moving target detection through omni-orientational vision fixed on AGV
The extremely wide field of view of omni-vision is highly advantageous for vehicle navigation and target detection. However, detecting moving targets with omni-vision fixed on an AGV (Automatic Guided Vehicle) involves more complex environments, in which both the targets and the vehicle are moving, so moving targets must be detected against a moving background. After analyzing the characteristics of omni-orientational vision and its images, we propose estimating optical flow fields and applying a Gabor filter over the optical flow fields to detect moving objects. Because the polar angle θ and polar radius R change as the targets move, we improved the optical flow approach so that it can be calculated in polar coordinates about the omni-orientational center. We constructed a Gabor filter bank with 24 orientations spaced every 15°, and filtered the optical flow fields at all 24 orientations. By contrasting the Gabor-filtered images at the same orientation and the same AGV position between a situation with no moving targets in the environment and a situation with moving targets in the same environment, the moving targets' optical flow fields can be recognized. Experimental results show that the proposed approach is feasible and effective.
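A rough sketch of the Gabor-bank-over-optical-flow step is given below, using OpenCV. It is an assumption-laden illustration: it works on Cartesian flow magnitude rather than the paper's polar-coordinate formulation, and the kernel parameters are placeholders.

```python
import cv2
import numpy as np

def gabor_bank(n_orient=24, ksize=31, sigma=4.0, lambd=10.0, gamma=0.5):
    """24 Gabor kernels spaced 15 degrees apart, as in the paper's bank."""
    thetas = [i * np.pi / n_orient for i in range(n_orient)]
    return [cv2.getGaborKernel((ksize, ksize), sigma, t, lambd, gamma, 0,
                               ktype=cv2.CV_32F) for t in thetas]

def flow_responses(prev_gray, curr_gray, bank):
    """Dense optical flow magnitude filtered at every orientation
    (a sketch; the paper computes flow about the omni-vision center)."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=2).astype(np.float32)
    return [cv2.filter2D(mag, cv2.CV_32F, k) for k in bank]
```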
Autonomous Robots II
Intelligent robot control using an adaptive critic with a task control center and dynamic database
The purpose of this paper is to describe the design, development and simulation of a real time controller for an intelligent, vision guided robot. The use of a creative controller that can select its own tasks is demonstrated. This creative controller uses a task control center and dynamic database. The dynamic database stores both global environmental information and local information, including the kinematic and dynamic models of the intelligent robot. The kinematic model is very useful for position control and simulations. However, models of the dynamics of the manipulators are needed for tracking control of the robot's motions. Such models are also necessary for sizing the actuators, tuning the controller, and achieving superior performance. Simulations of various control designs are shown. Much of the model has also been used for the actual prototype Bearcat Cub mobile robot. This vision guided robot was designed for the Intelligent Ground Vehicle Competition. A novel feature of the proposed approach is that the method is applicable to both robot arm manipulators and robot bases such as wheeled mobile robots. This generality should encourage the development of more mobile robots with manipulator capability, since both models can be easily stored in the dynamic database. The multi-task controller also permits a wide range of applications. The use of manipulators and mobile bases with high-level control is potentially useful for space exploration, certain rescue robots, defense robots, and medical robotics aids.
Obstacle avoidance using predictive vision based on a dynamic 3D world model
D. Paul Benjamin, Damian Lyons, Tom Achtemichuk
We have designed and implemented a fast predictive vision system for a mobile robot based on the principles of active vision. This vision system is part of a larger project to design a comprehensive cognitive architecture for mobile robotics. The vision system represents the robot's environment with a dynamic 3D world model based on a 3D gaming platform (Ogre3D). This world model contains a virtual copy of the robot and its environment, and outputs graphics showing what the virtual robot "sees" in the virtual world; this is what the real robot expects to see in the real world. The vision system compares this output in real time with the visual data. Any large discrepancies are flagged and sent to the robot's cognitive system, which constructs a plan for focusing on the discrepancies and resolving them, e.g. by updating the position of an object or by recognizing a new object. An object is recognized only once; thereafter its observed data are monitored for consistency with the predictions, greatly reducing the cost of scene understanding. We describe the implementation of this vision system and how the robot uses it to locate and avoid obstacles.
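The core predict-then-compare step described above can be illustrated with a simple block-wise image comparison. This is a minimal sketch under assumptions, not the paper's implementation; in the paper the "expected" image would come from the Ogre3D render of the virtual world, and the block size and threshold below are placeholders.

```python
import numpy as np

def flag_discrepancies(expected, observed, block=32, threshold=25.0):
    """Return block coordinates where the rendered expectation and the real
    camera frame (same-size grayscale arrays) differ strongly, i.e. regions
    the cognitive layer should plan to attend to and resolve."""
    diff = np.abs(expected.astype(np.float32) - observed.astype(np.float32))
    flagged = []
    for r in range(0, diff.shape[0] - block + 1, block):
        for c in range(0, diff.shape[1] - block + 1, block):
            if diff[r:r + block, c:c + block].mean() > threshold:
                flagged.append((r, c))          # top-left corner of anomalous block
    return flagged
```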
A design approach for small vision-based autonomous vehicles
Barrett B. Edwards, Wade S. Fife, James K. Archibald, et al.
This paper describes the design of a small autonomous vehicle based on the Helios computing platform, a custom FPGA-based board capable of supporting on-board vision. Target applications for the Helios computing platform are those that require lightweight equipment and low power consumption. To demonstrate the capabilities of FPGAs in real-time control of autonomous vehicles, a 16 inch long R/C monster truck was outfitted with a Helios board. The platform provided by such a small vehicle is ideal for testing and development. The proof of concept application for this autonomous vehicle was a timed race through an environment with obstacles. Given the size restrictions of the vehicle and its operating environment, the only feasible on-board sensor is a small CMOS camera. The single video feed is therefore the only source of information from the surrounding environment. The image is then segmented and processed by custom logic in the FPGA that also controls direction and speed of the vehicle based on visual input.
High-speed-tracking active cameras for obtaining clear object image
Hiroshi Oike, Haiyuan Wu, Chunsheng Hua, et al.
In this paper, we propose a high-performance object tracking system for obtaining high quality images of a high-speed moving object at video rate by controlling a pair of active cameras mounted on two fixed-view-point pan-tilt-zoom units. Here, a 'high quality object image' means that the image of the object is in focus and not blurred, the S/N ratio is high enough, the size of the object in the image is kept unchanged, and the object appears at the image center. To achieve our goal, we use the K-means tracker algorithm for tracking the object in the image sequence taken by the active cameras. We use the result of the K-means tracker to control the angular position and speed of each pan-tilt-zoom unit with a PID control scheme. By using two cameras, a binocular stereo vision algorithm can be used to obtain the 3D position and velocity of the object. These results are used to adjust the focus and the zoom. Moreover, our system lets the two cameras fixate on one point in 3D space. However, this system may become unstable when the time response degrades because the two control loops interfere with each other too much, or because camera motion is severely restricted. In order to solve these problems, we introduced a concept of reliability into the K-means tracker, and propose a method for controlling the active cameras by using relative reliability. We produced a prototype system. Through extensive experiments we confirmed that we can obtain in-focus and motion-blur-free images of a high-speed moving object at video rate.
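The PID control scheme used to drive each pan-tilt-zoom axis toward the tracker's target position is standard; a textbook sketch is shown below. The gains and the usage pattern are placeholders, not the authors' tuning.

```python
class PID:
    """Textbook PID loop of the kind used to servo a pan-tilt axis toward
    the K-means tracker's target position (illustrative gains)."""
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        deriv = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv

# usage sketch: error = target pixel position minus image center, per axis
# pan_cmd  = pan_pid.update(err_x, dt)
# tilt_cmd = tilt_pid.update(err_y, dt)
```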
Intelligent Ground Vehicle Competition (IGVC)
The 14th Annual Intelligent Ground Vehicle Competition: intelligent teams creating intelligent ground robots
Bernard L. Theisen, Dmitri Nguyen
The Intelligent Ground Vehicle Competition (IGVC) is one of three unmanned-systems student competitions founded by the Association for Unmanned Vehicle Systems International (AUVSI) in the 1990s. The IGVC is a multidisciplinary exercise in product realization that challenges college engineering student teams to integrate advanced control theory, machine vision, vehicular electronics, and mobile platform fundamentals to design and build an unmanned system. Teams from around the world focus on developing a suite of dual-use technologies to equip ground vehicles of the future with intelligent driving capabilities. Over the past 14 years, the competition has challenged undergraduate, graduate and Ph.D. students with real world applications in intelligent transportation systems, the military and manufacturing automation. To date, teams from over 50 universities and colleges have participated. This paper describes some of the applications of the technologies required by this competition and discusses the educational benefits. The primary goal of the IGVC is to advance engineering education in intelligent vehicles and related technologies. The employment and professional networking opportunities created for students and industrial sponsors through a series of technical events over the three-day competition are highlighted. Finally, an assessment of the competition based on participant feedback is presented.
Autonomous robot navigation using vision- and sensor-based algorithm
Susmita Bhandari, Allison Mathis, Kashif Mohiuddin, et al.
ALVIN-VII is an autonomous vehicle designed to compete in the AUVSI Intelligent Ground Vehicle Competition (IGVC). The competition consists of two events, the Autonomous Challenge and the Navigation Challenge. Using a tri-processor control architecture, the information from sonar sensors, cameras, GPS and compass is effectively integrated to map out the path of the robot. In the Autonomous Challenge, the real time data from two Firewire web cameras and an array of four sonar sensors are plotted on a custom-defined polar grid to identify the position of the robot with respect to the obstacles in its path. Depending on the position of the obstacles in the grid, a state number is determined and a command of action is retrieved from the state table. The image processing algorithm comprises a series of steps involving plane extraction, morphological analysis, edge extraction and interpolation, all of which are statistically based, allowing optimum operation at varying ambient conditions. In the Navigation Challenge, data from GPS and sonar sensors are integrated on a polar grid with flexible distance thresholds, and a state table approach is used to drive the vehicle to the next waypoint while avoiding obstacles. Both algorithms are developed and implemented using National Instruments (NI) hardware and LabVIEW software. The task of collecting and processing information in real time can be time consuming and hence not reactive enough for moving robots. Using three controllers, the image processing is done separately for each camera while a third controller integrates the data received through an Ethernet connection.
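The polar-grid / state-table idea above can be illustrated with a small sketch: obstacle points are binned into front sectors, the occupied sectors form a state number, and the state indexes an action table. All thresholds, sector counts and actions below are illustrative, not the team's values.

```python
import math

def polar_state(obstacles, n_sectors=4, near=1.0):
    """obstacles: (x, y) points in the robot frame, x forward.
    Each front sector contributes one bit, set when anything falls inside
    the 'near' radius; the result indexes an action table."""
    state = 0
    for x, y in obstacles:
        r, theta = math.hypot(x, y), math.atan2(y, x)      # theta in [-pi, pi]
        if r < near and -math.pi / 2 <= theta <= math.pi / 2:
            sector = int((theta + math.pi / 2) / (math.pi / n_sectors))
            state |= 1 << min(sector, n_sectors - 1)
    return state

# illustrative state table: bit pattern of blocked sectors -> drive command
ACTIONS = {0: "go straight", 0b0011: "steer left", 0b1100: "steer right"}
command = ACTIONS.get(polar_state([(0.5, 0.3)]), "slow down")
```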
A simple, inexpensive, and effective implementation of a vision-guided autonomous robot
Beau Tippetts, Kirt Lillywhite, Spencer Fowers, et al.
This paper discusses a simple, inexpensive, and effective implementation of a vision-guided autonomous robot. This implementation is a second-year entry by Brigham Young University students in the Intelligent Ground Vehicle Competition. The objective of the robot was to navigate a course constructed of white boundary lines and orange obstacles for the autonomous competition. A used electric wheelchair served as the robot base; it was purchased from a local thrift store for $28. The base was modified to include Kegresse tracks using a friction drum system. This modification allowed the robot to perform better on a variety of terrains, resolving issues with last year's design. In order to control the wheelchair while retaining its robust motor controls, the joystick was simply removed and replaced with a printed circuit board that emulated joystick operation and was capable of receiving commands through a serial port connection. Three different algorithms were implemented and compared: a purely reactive approach, a potential fields approach, and a machine learning approach. Each of the algorithms used color segmentation methods to interpret data from a digital camera in order to identify the features of the course. This paper will be useful to those interested in implementing an inexpensive vision-based autonomous robot.
A method for sensor processing, sensor integration, and navigation in mobile autonomous ground vehicles
Robert N. Riggins, Bruce V. Mutter, Scott Baker, et al.
This paper presents an algorithm for solving three challenges of autonomous navigation: sensor signal processing, sensor integration, and path-finding. The algorithm organizes these challenges into three steps. The first step involves converting the raw data from each sensor to a form suitable for real-time processing; emphasis in this step is on image processing. In the second step, the processed data from all sensors is integrated into a single map. Using this map as input, during the third step the algorithm calculates a goal and finds a suitable path from the robot to the goal. The method presented in this paper completes these steps in this order, and the steps repeat indefinitely. The robotic platform designed for testing the algorithm is a six-wheel mid-wheel drive system using differential steering. The robot, called Anassa II, has an electric wheelchair base and a custom-built top, and is designed to participate in the Intelligent Ground Vehicle Competition (IGVC). The sensors consist of a laser scanner, a video camera, a Differential Global Positioning System (DGPS) receiver, a digital compass, and two wheel encoders. Since many intelligent vehicles have similar sensors, the approach presented here is general enough for many types of autonomous mobile robots.
Obstacle recognition using region-based color segmentation techniques for mobile robot navigation
Robert T. McKeon, Mohan Krishnan, Mark Paulik
This work has been performed in conjunction with the ECE Department's autonomous vehicle entry in the 2006 Intelligent Ground Vehicle Competition (www.igvc.org). The course to be traversed in the competition consists of a lane demarcated by paint lines on grass along with other challenging artifacts such as a sandpit, a ramp, potholes, colored tarps, and obstacles set up using orange and white construction barrels. In this paper an enhanced obstacle detection and mapping algorithm based on region-based color segmentation techniques is described. The main purpose of this algorithm is to detect obstacles that are not properly identified, due to "shadowing", by the LADAR (Laser Detection and Ranging) system optimally mounted close to the ground, a problem that occasionally results in bad navigation decisions. The camera that is primarily used to detect the lane lines, on the other hand, is mounted at a height of 6 feet. In this work we concentrate on the identification of orange/red construction barrels. This paper proposes a generalized color segmentation technique which is potentially more versatile and faster than traditional full or partial color segmentation approaches. The developed algorithm identifies the shadowed items within the camera's field of vision and uses this to complement the LADAR information, thus facilitating an enhanced navigation strategy. The identification of barrels also aids in deleting bright objects from images which contain lane lines, which improves lane line identification.
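A simple color-segmentation pass for orange barrels might look like the OpenCV sketch below. It is an illustration only, not the paper's generalized technique; the HSV thresholds, kernel size and minimum area are assumptions.

```python
import cv2
import numpy as np

def find_orange_barrels(bgr, min_area=500):
    """Return bounding boxes (x, y, w, h) of large orange regions, which could
    then be fused with the LADAR map to cover its 'shadowed' blind spots."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (0, 120, 80), (15, 255, 255))       # red-orange hues
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN,
                            np.ones((5, 5), np.uint8))           # remove speckle
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    return [tuple(stats[i, :4]) for i in range(1, n)
            if stats[i, cv2.CC_STAT_AREA] >= min_area]
```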
Lane identification and path planning for autonomous mobile robots
Robert T. McKeon, Mark Paulik, Mohan Krishnan
This work has been performed in conjunction with the University of Detroit Mercy's (UDM) ECE Department autonomous vehicle entry in the 2006 Intelligent Ground Vehicle Competition (www.igvc.org). The IGVC challenges engineering students to design autonomous vehicles and compete in a variety of unmanned mobility competitions. The course to be traversed in the competition consists of a lane demarcated by painted lines on grass with the possibility of one of the two lines being deliberately left out over segments of the course. The course also consists of other challenging artifacts such as sandpits, ramps, potholes, and colored tarps that alter the color composition of scenes, and obstacles set up using orange and white construction barrels. This paper describes a composite lane edge detection approach that uses three algorithms to implement noise filters enabling increased removal of noise prior to the application of image thresholding. The first algorithm uses a row-adaptive statistical filter to establish an intensity floor followed by a global threshold based on a reverse cumulative intensity histogram and a priori knowledge about lane thickness and separation. The second method first improves the contrast of the image by implementing an arithmetic combination of the blue plane (RGB format) and a modified saturation plane (HSI format). A global threshold is then applied based on the mean of the intensity image and a user-defined offset. The third method applies the horizontal component of the Sobel mask to a modified gray scale of the image, followed by a thresholding method similar to the one used in the second method. The Hough transform is applied to each of the resulting binary images to select the most probable line candidates. Finally, a heuristics-based confidence interval is determined, and the results sent on to a separate fuzzy polar-based navigation algorithm, which fuses the image data with that produced by a laser scanner (for obstacle detection).
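The third detection path described above (Sobel gradient, mean-plus-offset threshold, Hough transform) can be sketched roughly as follows with OpenCV. The gradient direction, offset and Hough parameters are placeholder assumptions, not the authors' tuned values.

```python
import cv2
import numpy as np

def lane_candidates(bgr, offset=30.0):
    """Gray-scale image -> one Sobel gradient component -> mean+offset
    threshold -> probabilistic Hough transform for line candidates."""
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    edges = np.abs(cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3))   # one Sobel component
    binary = (edges > edges.mean() + offset).astype(np.uint8) * 255
    lines = cv2.HoughLinesP(binary, 1, np.pi / 180, threshold=80,
                            minLineLength=60, maxLineGap=10)
    return [] if lines is None else [tuple(l[0]) for l in lines]  # (x1, y1, x2, y2)
```

In the paper these candidates then feed a heuristics-based confidence estimate and a separate fuzzy polar navigation algorithm that fuses them with laser-scanner data.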
Poster Session
Analysis on the jumping of a spherical rolling robot
Hanxu Sun, Liangqing Wang, Qingxuan Jia
The movement of a spherical rolling robot before jumping is analyzed, using the phase plane, on the basis of kinematics and dynamics, and the jumping condition is obtained. The dynamic model of the spherical rolling robot after jumping is developed through the d'Alembert principle. The model is simulated and an experiment is completed. The simulation and the experiment demonstrate the feasibility and validity of the theoretical analysis for the spherical rolling robot in both climbing and jumping.
Virtual performer: single camera 3D measuring system for interaction in virtual space
Kunio Sakamoto, Shoto Taneji
The authors developed interaction media systems in 3D virtual space. In these systems, the musician virtually plays an instrument such as the theremin in the virtual space, or the performer plays a show using a virtual character such as a puppet. This interactive virtual media system consists of image capture, measuring the performer's position, detecting and recognizing motions, and synthesizing the video image using a personal computer. In this paper, we propose some applications of interaction media systems: a virtual musical instrument and a superimposed CG character. Moreover, this paper describes the method for measuring the positions of the performer, his/her head and both eyes using a single camera.
A method of 3D measuring for finger pointing using a single camera
Hironobu Nakayama, Kunio Sakamoto
A finger pointing system is described that can specify the position of a pointed target. A 3D position is generally detected and measured using the stereoscopic images of a multi-lens camera system; such a pair of stereo images allows us to obtain 3D information about the object. However, it is possible to estimate the position of the fingertip from a single image, without stereo images. A geometric model of the human body gives estimates of the arm length and the 3D position of the fingertip. We approximately determine the arm length from the stature and shoulder lengths, and the 3D position of the fingertip is then estimated using the geometric model. We have developed a prototype finger pointing system using a single camera and a personal computer. This paper proposes the method for measuring the positions of the finger and the pointed target using a single camera.
Functions of images
Juha Lehtonen, Alexey Andriyashin, Jussi Parkkinen, et al.
The visual quality of images is important in image presentation, compression and analysis. Depending on the use, the quality of images may give more information or more experiences to the viewer. However, the relations between mathematical and human methods for grouping images are not obvious. For example, different humans think differently and so they make the grouping differently. However, there may be some connections between the mathematical features of images and human selections. Here we try to find such relations, which could give more possibilities for developing the actual quality of images for different purposes. In this study, we present some methods and preliminary results that are based on psychological tests on humans, MPEG-7 based features of the images, and face detection methods. We also show some notes and questions related to this problem and plans for future research.
A new algorithm for fruit shape classification based on level set
In this research, a new algorithm for fruit shape classification is proposed. Level set representations based on signed distance transforms were used; these are a simple, robust, rich and efficient way to represent shapes. Based on these representations, a rigid transform was adopted to align shapes within the same class, and the simplest possible criterion, the sum of squared differences, was considered. After the alignment procedure, the average shape representations can easily be derived, and shape classification was performed by the nearest neighbor method. Promising results were obtained in experiments, showing the efficiency and accuracy of our algorithm. Key words: machine vision, shape classification, fruit sorting, level set
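The signed-distance representation and the SSD nearest-neighbor step can be sketched as below with SciPy. It is a minimal sketch, assuming precomputed per-class average maps; unlike the paper, it omits the rigid alignment step.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def signed_distance(mask):
    """Level-set style representation of a boolean mask (True inside the
    shape): distance to the boundary, negative inside, positive outside."""
    inside = distance_transform_edt(mask)
    outside = distance_transform_edt(~mask)
    return outside - inside

def classify(query_mask, class_means):
    """Nearest-neighbor classification by the sum of squared differences
    between the query's signed-distance map and each class's average map."""
    q = signed_distance(query_mask)
    ssd = {name: float(np.sum((q - mean) ** 2))
           for name, mean in class_means.items()}
    return min(ssd, key=ssd.get)          # class with the smallest SSD
```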
A 2D CMAC neural net algorithm for a positioning system of automated agriculture vehicle
In a machine vision-based guidance system, a camera must be calibrated precisely to calculate the position of the vehicle; however, it is not easy to obtain the intrinsic and extrinsic parameters of the camera, whereas neural nets have the advantage of establishing a mapping relationship for a nonlinear system. We used the CMAC neural net to construct two mapping relationships: from image coordinates to the offset of the vehicle, and from image coordinates to the heading angle of the vehicle. The net inputs were the coordinates of the top and bottom points of the detected guidance line in the image coordinate system. The outputs were offsets and heading angles. The verification results show that the RMS of the inferred offset is 10.5 mm with an STD of 11.3 mm, and the RMS of the inferred heading is 1.1° with an STD of 0.99°.
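For reference, a generic CMAC (tile-coding) regressor of the kind described above might be sketched as follows. This is an assumed illustration, not the paper's network: the number of tilings, bins, hashing scheme and learning rate are placeholders, and inputs are expected to be scaled to [0, 1].

```python
import numpy as np

class CMAC:
    """Minimal CMAC sketch mapping image-coordinate inputs (e.g. the top and
    bottom points of the guidance line, 4 values) to one scalar output such
    as vehicle offset or heading."""
    def __init__(self, n_tilings=8, n_bins=16, n_inputs=4, memory=4096, lr=0.1, seed=0):
        self.n_tilings, self.n_bins, self.memory, self.lr = n_tilings, n_bins, memory, lr
        rng = np.random.default_rng(seed)
        self.offsets = rng.random((n_tilings, n_inputs))   # per-tiling shift
        self.w = np.zeros(memory)

    def _tiles(self, x):
        """Active memory cells for input x (values scaled to [0, 1])."""
        idx = []
        for t in range(self.n_tilings):
            bins = np.floor((x + self.offsets[t] / self.n_bins) * self.n_bins).astype(int)
            idx.append(hash((t, tuple(bins))) % self.memory)
        return idx

    def predict(self, x):
        return sum(self.w[i] for i in self._tiles(x))

    def train(self, x, target):
        tiles = self._tiles(x)
        err = target - sum(self.w[i] for i in tiles)
        for i in tiles:                       # spread the correction over active tiles
            self.w[i] += self.lr * err / self.n_tilings
```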
Measurement and control system for agricultural robot
Automation of agricultural equipment in the near term appears both economically viable and technically feasible. This paper describes a measurement and control system for an agricultural robot. It consists of a computer, a pair of NIR cameras, one inclinometer, one potentiometer and two encoders. The inclinometer, potentiometer and encoders are used to measure the obliquity of the camera, the turning angle of the front wheel and the velocity of the rear wheel, respectively. These sensor data are filtered before being sent to the PC. Tests show that the system can measure the turning angle of the front wheel and the velocity of the rear wheel accurately whether the robot is stationary or in motion.