Proceedings Volume 9528

Videometrics, Range Imaging, and Applications XIII

Purchase the printed version of this volume at proceedings.com or access the digital version at SPIE Digital Library.

Volume Details

Date Published: 18 May 2015
Contents: 9 Sessions, 36 Papers, 0 Presentations
Conference: SPIE Optical Metrology 2015
Volume Number: 9528

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 9528
  • Light Field Videometry: Point Cloud Analysis
  • Structured Light and Fringe Analysis
  • Calibration and Accuracy
  • Metrology Applications
  • Image Sequences and Tracking: UAV Applications
  • Image Matching and Surface Models
  • Range Imaging Modelling and Analysis
  • Poster Session
Front Matter: Volume 9528
Front Matter: Volume 9528
This PDF file contains the front matter associated with SPIE Proceedings Volume 9528, including the Title Page, Copyright information, Table of Contents, Invited Panel Discussion, and Conference Committee listing.
Light Field Videometry: Point Cloud Analysis
Light-field camera design for high-accuracy depth estimation
M. Diebold, O. Blum, M. Gutsche, et al.
Light-field imaging is a research field with applicability in a variety of imaging areas including 3D cinema, entertainment, robotics, and any task requiring range estimation. In contrast to binocular or multi-view stereo approaches, capturing light fields means densely observing a target scene through a window of viewing directions. A principal benefit of light-field imaging for range computation is that one can eliminate the error-prone and computationally expensive process of establishing correspondence. The nearly continuous space of observation allows highly accurate and dense depth maps to be computed free of matching. Here, we discuss how to structure the imaging system for optimal ranging over a defined volume - what we term a bounded frustum. We detail the process of designing the light-field setup, including how practical issues such as camera footprint and component size influence the depth of field and the lateral and range resolution. Both synthetic and real captured scenes are used to analyze the depth precision resulting from a design, and to show how unavoidable inaccuracies such as camera position and focal length variation limit depth precision. Finally, we discuss which inaccuracies can be sufficiently well compensated through calibration and which must be eliminated at the outset.
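Matching-free depth estimation of this kind is commonly implemented by measuring line slopes in epipolar-plane images (EPIs) with a local structure tensor. The following minimal sketch illustrates that general idea only; the orientation-to-disparity convention and all names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def epi_disparity(epi, inner_sigma=0.8, outer_sigma=2.0):
    """Estimate per-pixel disparity from an epipolar-plane image (EPI).

    epi : 2D array, axis 0 = view index s, axis 1 = image column x.
    A scene point traces a line in the EPI; its slope dx/ds is the
    disparity between adjacent views (depth = focal length * baseline / disparity).
    """
    smooth = gaussian_filter(epi.astype(np.float64), inner_sigma)
    gx = sobel(smooth, axis=1)  # derivative along x
    gs = sobel(smooth, axis=0)  # derivative along s (view axis)

    # 2D structure tensor components, averaged over a local neighbourhood.
    Jxx = gaussian_filter(gx * gx, outer_sigma)
    Jxs = gaussian_filter(gx * gs, outer_sigma)
    Jss = gaussian_filter(gs * gs, outer_sigma)

    phi = 0.5 * np.arctan2(2.0 * Jxs, Jxx - Jss)  # dominant gradient orientation
    disparity = -np.tan(phi)                      # line slope dx/ds
    # Coherence in [0, 1] serves as a per-pixel confidence measure.
    coherence = np.sqrt((Jxx - Jss) ** 2 + 4.0 * Jxs ** 2) / (Jxx + Jss + 1e-12)
    return disparity, coherence
```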
Knowledge guided object detection and identification in 3D point clouds
Modern instruments like laser scanners and 3D cameras, or image-based techniques like structure from motion, produce huge point clouds as the basis for further object analysis. This has considerably changed the way data are compiled, away from selective, manually guided processes towards automatic, computer-supported strategies. However, there is still a long way to go to achieve the quality and robustness of manual processes, as the data sets are mostly very complex. Existing strategies for 3D data processing for object detection and reconstruction rely heavily on either data-driven or model-driven approaches. These approaches are limited by their strong dependence on the nature of the data and their inability to handle deviations. Furthermore, the lack of capability to integrate other data or information between the processing steps further exposes their limitations. This restricts the approaches to execution with a strict predefined strategy and does not allow deviation when and if new, unexpected situations arise. We propose a solution that induces intelligence in the processing activities through the usage of semantics. The solution binds the objects, along with other related knowledge domains, to the numerical processing to facilitate the detection of geometries, and then uses experts' inference rules to annotate them. The solution was tested within the prototypical application of the research project "Wissensbasierte Detektion von Objekten in Punktwolken für Anwendungen im Ingenieurbereich (WiDOP)" (knowledge-based detection of objects in point clouds for engineering applications). The flexibility of the solution is demonstrated through two entirely different use-case scenarios: Deutsche Bahn (the German railway system) for the outdoor scenario and Fraport (Frankfurt Airport) for the indoor scenario. Apart from the difference in their environments, they provide different conditions which the solution needs to consider: while the locations of the objects at Fraport were known beforehand, those at DB were not known at the beginning.
The analysis of selected orientation methods of architectural objects’ scans
Jakub S. Markiewicz, Irmina Kajdewicz, Dorota Zawieska
Terrestrial laser scanning is commonly used in different areas, inter alia in modelling architectural objects. One of the most important parts of TLS data processing is scan registration, which significantly affects the accuracy of the generated high-resolution photogrammetric documentation. This process is time-consuming, especially in the case of a large number of scans, and is mostly based on automatic detection and semi-automatic measurement of control points placed on the object. In the case of complicated historical buildings, it is sometimes forbidden to place survey targets on an object, or it may be difficult to distribute the targets in an optimal way. Such problems encourage the search for new methods of scan registration which make it possible to eliminate the step of placing survey targets on the object. In this paper the results of the target-based registration method are presented. The survey targets placed on the walls of historical chambers of the Museum of King Jan III's Palace at Wilanów and on the walls of the ruins of the Bishops' Castle in Iłża were used for scan orientation. Several variants of orientation were performed, taking into account different placements and numbers of survey marks. Afterwards, during further research, raster images were generated from the scans, and the SIFT and SURF algorithms were used to automatically search for corresponding natural points. The use of automatically identified points for TLS data orientation was analysed. The results of both methods of TLS data registration are summarized and presented in numerical and graphical form.
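As a rough sketch of the targetless variant, corresponding natural points between two scan-derived raster images can be found with OpenCV's SIFT implementation and Lowe's ratio test; function and parameter choices below are illustrative, not the authors' code:

```python
import cv2

def match_scan_rasters(img1, img2, ratio=0.75):
    """Find corresponding natural points between two 8-bit raster images
    generated from TLS scans (e.g. intensity or orthoimages)."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    # Two nearest neighbours per descriptor; keep unambiguous matches only.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    pairs = matcher.knnMatch(des1, des2, k=2)
    good = [m for m, n in pairs if m.distance < ratio * n.distance]

    pts1 = [kp1[m.queryIdx].pt for m in good]
    pts2 = [kp2[m.trainIdx].pt for m in good]
    return pts1, pts2  # raster coordinates; mapped back to 3D via the scan geometry
```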
Structured Light and Fringe Analysis
3D measurement with active triangulation for spectacle lens optimization and individualization
Julia Gehrmann, Markus Tiemann, Peter C. Seitz
We present for the first time an active triangulation technique for video centration. This technique requires less manual selection than current methods and thus enables faster measurements while providing the same resolution. Its suitability for measuring physiological parameters is demonstrated in a measurement series. The active triangulation technique uses a laser line for illumination, positioned such that it intersects the pupils of the subject to be measured. For the illumination of human eyes, the wavelength and output power were carefully investigated to ensure photobiological safety at all times and to reduce irritation of the subject being measured. A camera with a known orientation to the laser line images the subject. Physiological features on the subject and the frame are then selected in the acquired image, directly yielding a 3D position for points lying on the illuminated laser line. Distances to points off the laser line can be estimated by scaling at the same depth. The focus is on two parameters: the interpupillary distance (PD) and the corneal face form angle (FFA). In our study we examined the repeatability of the measurements and found it to be excellent, with small deviations from the reference value. Furthermore, a physiological study carried out with the setup shows the applicability of this method for video centration measurements. A comparison to a reference measurement system shows only small differences.
Detection of defects in a transparent polymer with high resolution tomography using white light scanning interferometry and noise reduction
A. Leong-Hoï, R. Claveau, M. Flury, et al.
Transparent layers such as polymers are complex and can contain defects which are not detectable with classical optical inspection techniques. With an interference microscope, tomographic analysis can be used to obtain initial structural information over the depth of the sample by scanning the fringes along the Z axis and performing appropriate signal processing to extract the fringe envelope. When observing the resulting XZ section, low-contrast, sub-μm sized defects can be lost in the noise present in images acquired with a CCD camera. It is possible to reduce the temporal and spatial noise of the camera by applying image processing methods such as image averaging, dark frame subtraction or flat field division. In this paper, we present first results obtained by this means with a white light scanning interferometer on a Mylar polymer, currently used as an insulator in electronics and micro-electronics. We show that sub-μm sized structures contained in the layer, initially lost in noise and barely observable, can be detected by applying a combination of image processing methods to each of the scanned XY images along the Z-axis. In addition, errors from optical imperfections such as dust particles on the lenses or components of the system can be compensated for with this method. We thus demonstrate that XZ section images of a transparent sample can be denoised by improving each of the XY acquisition images. A quantitative study of the noise reduction is presented in order to validate the performance of this technique.
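The noise-reduction recipe named above (frame averaging, dark-frame subtraction, flat-field division) can be sketched in a few lines; the processing order and names follow generic camera-calibration practice rather than the authors' exact pipeline:

```python
import numpy as np

def denoise_frame(frames, dark_frames, flat_frames):
    """Reduce temporal and spatial camera noise for one XY image.

    frames      : repeated exposures of the same scene, shape (N, H, W)
    dark_frames : exposures with the shutter closed (offset, hot pixels)
    flat_frames : exposures of a uniform target (vignetting, dust on optics)
    """
    img = frames.astype(np.float64).mean(axis=0)        # temporal averaging
    dark = dark_frames.astype(np.float64).mean(axis=0)  # dark frame
    flat = flat_frames.astype(np.float64).mean(axis=0) - dark

    return (img - dark) / (flat / flat.mean())          # flat-field division
```

Applying such a correction to every XY frame of the Z-scan before envelope extraction is what denoises the resulting XZ sections.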
3D reconstruction with single image pairs and structured light projection for short-term ultra-high-speed applications
Christian Bräuer-Burchardt, Stefan Heist, Patrick Dietrich, et al.
A new approach for a 3D reconstruction algorithm using a single image pair from a stereo-camera setup and a structured light projection based on spatial correlation is introduced. In comparison to existing methods using sequences of temporally consecutive images, sufficient 3D-reconstruction quality is achieved even in the case of ultra-high-speed cameras. This is obtained by iterative application of correspondence finding and filtering operators. The calculation effort of the evaluation, filling, filtering, and outlier-removal operators is relatively high and may prevent permanent application of the algorithm to high-resolution long-term recordings. The favored application scenario of the new method is the rough 3D reconstruction and motion tracking of quickly moving objects in short-term processes (a few seconds), e.g. in the analysis of crash-test situations. Here, the complete recorded image sequence can be analyzed off-line, which allows subsequent optimization of the parameters. An advantage of the new technique regarding high-speed applications is that fixed single patterns instead of pattern sequences can be used for moving objects, and hence no synchronization between projection and cameras is necessary.
Handheld underwater 3D sensor based on fringe projection technique
Christian Bräuer-Burchardt, Matthias Heinze, Ingo Schmidt, et al.
A new, handheld 3D surface scanner was developed especially for underwater use down to a diving depth of about 40 meters. Additionally, the sensor is suitable for outdoor use in bad weather conditions such as splashing water, wind, and poor illumination. The optical components of the sensor are two cameras and one projector. The measurement field is about 250 mm x 200 mm. The depth resolution is about 50 μm and the lateral resolution is approximately 150 μm. The weight of the scanner is about 10 kg. The housing was produced from synthetic powder using a 3D printing technique. The measurement time for one scan is between one third and one half of a second. The computer for measurement control and data analysis is integrated into the housing of the scanner. A display on the back presents the results of each measurement graphically, enabling real-time evaluation by the user during the recording of the measurement data.
Profilometry of discontinuous solids by means of co-phased demodulation of projected fringes with RGB encoding
J. M. Padilla, M. Servin, G. Garnica
Here we describe a 2-projector, 1-camera setup for profilometry of discontinuous solids by means of co-phased demodulation of projected fringes and red, green, and blue (RGB) multichannel operation. The dual-projection configuration for this profilometer is proposed to deal efficiently with specular regions and self-occluding shadows due to discontinuities, which are the main drawbacks of a 1-projector, 1-camera configuration. This is because the regions where shadows and specular reflections are generated, and where the fringe contrast drops to zero, are in general different for each projection direction; thus, the resulting fringe patterns have complementary phase information. Multichannel RGB operation allows us to work simultaneously with both projectors and to record the complementary fringe patterns, phase-modulated by the 3D profile of the object under study, independently. In other words, color encoding/decoding reduces the acquisition time with respect to one-at-a-time grayscale operation and, in principle, enables the study of dynamic phenomena. The co-phased demodulation method implemented in this work benefits from the complex (analytic) nature of the output signals estimated with most phase demodulation methods (such as the Fourier method and temporal phase-shifting algorithms). This allowed us to straightforwardly generate a single phase map well-defined over the entire area of interest. Finally, we assessed our proposed profilometry setup by measuring a fractured spherical cap made of (uncoated) expanded polystyrene. The results were satisfactory, but in the authors' opinion this must be considered a preliminary report.
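A minimal sketch of the decoding and demodulation steps might look as follows; the carrier handling is simplified and the complex sum used to combine the two channels is only an approximation of the co-phased combination described above:

```python
import numpy as np

def fourier_demodulate(fringe, carrier):
    """Single-image Fourier-method demodulation: band-pass the +carrier
    lobe of the spectrum and return the complex (analytic) signal.

    fringe  : 2D fringe pattern from one colour channel.
    carrier : carrier frequency in cycles/pixel along x.
    """
    F = np.fft.fftshift(np.fft.fft2(fringe - fringe.mean()))
    h, w = fringe.shape
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    mask = np.exp(-((fx - carrier) ** 2 + fy ** 2) / (2.0 * (carrier / 3.0) ** 2))
    return np.fft.ifft2(np.fft.ifftshift(F * mask))

def cophased_phase(rgb, carrier=0.1):
    """Decode the red and blue projector channels and sum their analytic
    signals: regions where one projector casts shadow (near-zero fringe
    contrast, hence near-zero magnitude) are filled by the other."""
    a1 = fourier_demodulate(rgb[..., 0].astype(np.float64), carrier)
    a2 = fourier_demodulate(rgb[..., 2].astype(np.float64), carrier)
    return np.angle(a1 + a2)  # wrapped phase map over the whole area of interest
```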
Calibration and Accuracy
Development of orientation method with constraint conditions using vector data
Takashi Fuse, Keita Kamiya
Recently, various kinds of vector data have come into wide use. Images as raster data have also become popular, and applications using vector data and images simultaneously are attracting increasing interest. Such applications require registration of these data in the same coordinate system. This paper proposes an orientation method combining vector data with images based on bundle adjustment. Since the vector data can be regarded as constraint conditions, the bundle adjustment is extended to a constrained non-linear optimization method. The constraint conditions express the coincidence between lines extracted from the images and the corresponding ones in the vector data. For the formulation, a representative point is set as the midpoint of a projected line of the vector data on the image. By using the representative points, the coincidence condition is expressed as the distance between the point and the lines extracted from the image. According to these conditions, the proposed method is formulated using Lagrange's method of undetermined multipliers. The proposed method is applied to synthetic and real data (compared with laser scanner data). The experiments with both synthetic and real data show that the proposed method is more robust against errors caused by low accuracy of the coordinates of feature points than a method without constraint conditions. Based on these experiments, the significance of the proposed method is confirmed.
Development, comparison, and evaluation of software for radial distortion elimination
A. I. Papadaki, A. Georgopoulos
Lately the interest of the Computer Vision and Photogrammetry community has been focused on automating the processes of identifying and eliminating radial distortion, with the aim of correcting the image coordinates and finally obtaining digital images with reliable geometric information. This effort has reached the point of the development of commercial or free image-processing software claiming that it can automatically identify and remove radial distortion from an image. In this paper, in-depth research has been conducted on radial distortion and the methods of its identification and elimination. Specifically, an attempt has been made to evaluate such software in terms of its effectiveness, accuracy and applicability to the elimination of radial distortion from images. To attain the desired aim, four different methods of comparing and evaluating the performance of the software with respect to the correction of an image have been employed: (i) optical evaluation of the produced digital images, (ii) subtraction of the images, (iii) comparison of the curves of the remaining radial distortion in the images and (iv) comparison of the results from the orientation of an image pair. However, it was really important to have a benchmark for the evaluation, in order to ensure the objectivity and accuracy of the comparison. Therefore, a new reliable algorithm of known and controllable accuracy has been developed. The results of these comparisons are presented and evaluated for their reliability and usefulness.
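For reference, the radial (Brown) model that such software implements removes distortion roughly as follows; the coefficient names and the fixed-point inversion are standard conventions, not tied to any of the evaluated packages:

```python
import numpy as np

def undistort_points(xy, k1, k2, cx, cy, f, iters=10):
    """Remove Brown-model radial distortion from pixel coordinates.

    xy : (N, 2) distorted pixel coordinates; (cx, cy) principal point,
    f focal length in pixels, k1/k2 radial coefficients. The forward
    model x_d = x_u * (1 + k1 r^2 + k2 r^4) is inverted by iteration.
    """
    xd = (xy[:, 0] - cx) / f
    yd = (xy[:, 1] - cy) / f
    xu, yu = xd.copy(), yd.copy()
    for _ in range(iters):
        r2 = xu ** 2 + yu ** 2
        scale = 1.0 + k1 * r2 + k2 * r2 ** 2
        xu, yu = xd / scale, yd / scale
    return np.column_stack([xu * f + cx, yu * f + cy])
```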
Relevance of ellipse eccentricity for camera calibration
W. Mordwinzew, B. Tietz, F. Boochs, et al.
Plane circular targets are widely used in calibrations of optical sensors through photogrammetric set-ups. Due to this popularity, their advantages and disadvantages are well studied in the scientific community. One main disadvantage occurs when the projected target is not parallel to the image plane. In this geometric constellation, the target has an elliptic geometry with an offset between its geometric and its projected center. This difference is referred to as ellipse eccentricity and is a systematic error which, if not treated accordingly, has a negative impact on the overall achievable accuracy. The magnitude and direction of eccentricity errors depend on various factors, the most important of which is the target size: the bigger an ellipse in the image is, the bigger the error will be. Although correction models dealing with eccentricity have been available for decades, it is mostly treated as a planning task in which the aim is to choose the target size small enough that the resulting eccentricity error remains negligible. Besides the fact that advanced mathematical models are available and that the influence of this error on camera calibration results is still not completely investigated, there are various additional reasons why bigger targets cannot or should not be avoided. One of them is the growing image resolution as a by-product of advancements in sensor development. Here, smaller pixels have a lower S/N ratio, necessitating more pixels to assure geometric quality. Another scenario might need bigger targets due to larger scale differences, where distant targets should still contain enough information in the image. In general, bigger ellipses contain more contour pixels and therefore more information. This helps target-detection algorithms to perform better even under non-optimal conditions, such as data from sensors with a high noise level. In contrast to rather simple measuring situations in a stereo or multi-image mode, the impact of ellipse eccentricity on image blocks cannot be modeled in a straightforward fashion. Instead, simulations can help make the impact visible and distinguish critical from less critical situations. In particular, this might be of importance for calibrations, as an undetected influence on the results will affect further projects in which the same camera is used. This paper therefore aims to point out the influence of ellipse eccentricities on camera calibrations by using two typical calibration bodies: a planar and a cube-shaped calibration body. In the first step, their relevance and influence on the image measurements and on the object and camera geometry is shown with numeric examples. Differences and similarities between the two calibration bodies are identified and discussed. In the second step, the practical relevance of a correction is proven in a real calibration. Finally, a conclusion is drawn, followed by recommendations for handling ellipse eccentricity in practice.
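The eccentricity error itself is easy to reproduce numerically: project the rim of a tilted circle through a pinhole model and compare the centre of the resulting ellipse with the direct projection of the circle centre. The following self-contained simulation (assumed geometry: target centred on the optical axis, tilted about the x-axis) illustrates the growth with target size; it is not the correction model discussed in the paper:

```python
import numpy as np

def eccentricity_px(radius, tilt_deg, depth, f_px):
    """Pixel offset between the projection of a circular target's centre
    and the centre of its projected ellipse (pinhole camera)."""
    t = np.radians(tilt_deg)
    # Extreme image v-coordinates of the projected rim (at phi = +/-90 deg):
    v_far = f_px * radius * np.cos(t) / (depth + radius * np.sin(t))
    v_near = f_px * radius * np.cos(t) / (depth - radius * np.sin(t))
    ellipse_centre_v = (v_far - v_near) / 2.0  # symmetric ellipse, axis on u = 0
    return abs(ellipse_centre_v)               # true centre projects to v = 0

# The error grows roughly with the square of the target radius:
for r_mm in (5, 20, 50):
    print(r_mm, "mm ->", round(eccentricity_px(r_mm / 1000.0, 45.0, 1.0, 2000.0), 4), "px")
```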
Self-calibration of a structured light based scanner for use in archeological applications
Adam Jahraus, Derek Lichti, Peter Dawson
It is frequently necessary in archaeology to map excavated features so their structure can be recorded before they are dismantled in order for the excavation to continue. This process can be time consuming, error prone and manually intensive. Three-dimensional recording devices, which have the advantage of being faster, less labor intensive and more detailed, present an attractive alternative method of mapping. A small, portable hand scanner such as the DotProduct DPI-7 could be used for this purpose. However, the three-dimensional data collected from this device contain systematic distortions that cause errors in the recorded shape of the features being mapped. The performance of the DPI-7 scanner is evaluated in this paper using self-calibration based techniques. A calibration field consisting of spherical targets rigidly mounted on a planar background was imaged from multiple locations, and the target deviations from expected locations are used to quantify the performance of the device. The largest source of systematic error in the DPI-7 data was found to be a scale error affecting dimensions orthogonal to the depth. These in-plane distortions were modeled using a single scale factor parameter in the self-calibration solution, resulting in a 54% reduction in the RMS coordinate errors.
Metrology Applications
Assessment of the accuracy of 3D models obtained with DSLR camera and Kinect v2
E. Lachat, H. Macher, T. Landes, et al.
3D modeling of objects such as statues, moldings or ornaments answers a need for documentation and analysis in the field of cultural heritage. Several sensors based on different technologies are used to obtain information about the geometry of an object in the form of point clouds: laser scanners, digital cameras or, more recently, RGB-D cameras. Among them, the recent Kinect v2 sensor looks promising, and its use has therefore been studied in this paper. The aim of this paper is to compare two methodologies for 3D model acquisition: photogrammetry-based models and models obtained using an RGB-D camera. Since the quality of the meshed model is obviously correlated with the quality of the point cloud, the result will be more or less faithful to reality. To quantify this reliability, several comparisons to a reference model have been carried out. Based on the results of these comparisons, we are able to draw conclusions about the strengths and weaknesses of photogrammetry and RGB-D cameras for 3D modeling of complex objects.
Improving automated 3D reconstruction methods via vision metrology
Isabella Toschi, Erica Nocerino, Mona Hess, et al.
This paper aims to provide a procedure for improving automated 3D reconstruction methods via vision metrology. The 3D reconstruction problem is generally addressed using two different approaches. On the one hand, vision metrology (VM) systems try to accurately derive 3D coordinates of few sparse object points for industrial measurement and inspection applications; on the other, recent dense image matching (DIM) algorithms are designed to produce dense point clouds for surface representations and analyses. This paper strives to demonstrate a step towards narrowing the gap between traditional VM and DIM approaches. Efforts are therefore intended to (i) test the metric performance of the automated photogrammetric 3D reconstruction procedure, (ii) enhance the accuracy of the final results and (iii) obtain statistical indicators of the quality achieved in the orientation step. VM tools are exploited to integrate their main functionalities (centroid measurement, photogrammetric network adjustment, precision assessment, etc.) into the pipeline of 3D dense reconstruction. Finally, geometric analyses and accuracy evaluations are performed on the raw output of the matching (i.e. the point clouds) by adopting a metrological approach. The latter is based on the use of known geometric shapes and quality parameters derived from VDI/VDE guidelines. Tests are carried out by imaging the calibrated Portable Metric Test Object, designed and built at University College London (UCL), UK. It allows assessment of the performance of the image orientation and matching procedures within a typical industrial scenario, characterised by poor texture and known 3D/2D shapes.
Determining the coordinates of lamps in an illumination dome
Lindsay W. MacDonald, Ali Hosseininaveh Ahmadabadian, Stuart Robson
The UCL Dome consists of an acrylic hemisphere of nominal diameter 1030 mm, fitted with 64 flash lights, arranged in three tiers of 16, one tier of 12, and one tier of 4 lights at approximately equal intervals. A Nikon D200 digital camera is mounted on a rigid steel frame at the ‘north pole’ of the dome pointing vertically downwards with its optical axis normal to the horizontal baseboard in the ‘equatorial’ plane. It is used to capture sets of images in pixel register for visualisation and surface reconstruction. Three techniques were employed for the geometric calibration of flash light positions in the dome: (1) the shadow cast by a vertical pin onto graph paper; (2) multi-image photogrammetry with retro-reflective targets; and (3) multi-image photogrammetry using the flash lights themselves as targets. The precision of the coordinates obtained by the three techniques was analysed, and it was found that although photogrammetric methods could locate individual targets to an accuracy of 20 μm, the uncertainty of locating the centroids of the flash lights was approximately 1.5 mm. This result was considered satisfactory for the purposes of using the dome for photometric imaging, and in particular for the visualisation of object surfaces by the polynomial texture mapping (PTM) technique.
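Technique (1) reduces to simple ray geometry: the lamp lies on the ray from the shadow tip through the pin tip, and intersecting that ray with the hemisphere gives its coordinates. A sketch under those assumptions (all names illustrative):

```python
import numpy as np

def lamp_from_shadow(pin_base, shadow_tip, pin_height, dome_radius):
    """Estimate a flash-lamp position from the shadow of a vertical pin.

    pin_base   : (x, y) of the pin base on the baseboard (z = 0 plane)
    shadow_tip : (x, y) of the shadow of the pin tip on the baseboard
    """
    p = np.array([pin_base[0], pin_base[1], pin_height])  # pin tip in 3D
    s = np.array([shadow_tip[0], shadow_tip[1], 0.0])     # shadow tip in 3D
    d = p - s
    d /= np.linalg.norm(d)                                # ray towards the lamp

    # Intersect the ray s + t*d (t > 0) with a sphere of radius R at the origin.
    b = 2.0 * np.dot(s, d)
    c = np.dot(s, s) - dome_radius ** 2
    t = (-b + np.sqrt(b * b - 4.0 * c)) / 2.0
    return s + t * d

# e.g. a 50 mm pin at the origin, shadow tip at (120, 40) mm, R = 515 mm:
print(lamp_from_shadow((0.0, 0.0), (120.0, 40.0), 50.0, 515.0))
```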
Image Sequences and Tracking: UAV Applications
Tracking of object deformations in color and depth video: deformation models and applications
Andreas Jordt, Stefan Reinhold, Reinhard Koch
Research on deformation tracking based on color image data has continuously gained interest over the last 15 years. In addition, using depth sensors such as the Microsoft Kinect makes it possible to mitigate the ambiguity problems that arise when trying to solve deformation-tracking tasks on color images alone. However, the fusion of color and depth data is not straightforward, and the deformation-tracking task remains ill-posed due to the lack of a general deformation model. The problem is usually circumvented by providing special deformation functions for the task at hand, e.g., skeleton-based functions for reconstructing people or triangle-based ones for tracking planar surfaces. In this article we summarize the Analysis by Synthesis (AbS) approach to deformation tracking in depth and color video and show some successful applications of specialized deformation functions. To overcome the issues with NURBS-based deformation tracking, we propose a new geodesic RBF-based deformation model which can adapt to any surface topology and shape while keeping the number of deformation parameters low. Example deformations for objects of different topologies are given, showing the versatility and efficiency of the proposed model.
Comparison between single and multi-camera view videogrammetry for estimating 6DOF of a rigid body
Erica Nocerino, Fabio Menna, Fabio Remondino
Motion capture (MOCAP) systems are used in many fields of application (e.g., machine vision, navigation, industrial measurement, medicine) for tracking and measuring the 6DOF (Degrees-Of-Freedom) of bodies. A variety of systems has been developed in the commercial as well as the research domain, exploiting different sensors and techniques, among which optical methods, based on multi-epoch photogrammetry, are the most common. The authors have developed an off-line, low-cost MOCAP system made up of three consumer-grade video cameras, i.e. a multi-view camera system. The system was employed in two different case studies for measuring the motion of personnel working onboard a fishing boat and of a ship model in a towing tank (or model basin) subjected to different sea conditions. In this contribution, the same three single cameras are processed separately to evaluate the performance of a sequential space resection method for estimating the 6DOF of a rigid body (a ship model during high-frequency tests in a model basin). The results from each video camera are compared with the motion estimated using the multi-view approach, with the aim of providing a quantitative assessment of the obtainable performance.
Fast instantaneous center of rotation estimation algorithm for a skid-steered robot
Skid-steered robots are widely used as mobile platforms for machine vision systems. However, it is hard to achieve stable motion of such robots along a desired trajectory due to unpredictable wheel slip. It is possible to compensate for the unpredictable wheel slip and stabilize the motion of the robot using visual odometry. This paper presents a fast optical-flow-based algorithm for estimating the instantaneous center of rotation and the angular and longitudinal speed of the robot. The proposed algorithm is based on the Horn–Schunck variational optical flow estimation method. The instantaneous center of rotation and the motion of the robot are estimated by back projection of the optical flow field onto the ground surface. The developed algorithm was tested using a skid-steered mobile robot based on a mobile platform that includes two pairs of differentially driven motors and a motor controller. A monocular visual odometry system consisting of a single-board computer and a low-cost webcam is mounted on the platform. A state-space model of the robot was derived using standard black-box system identification; the input (commands) and the output (motion) were recorded using a dedicated external motion capture system. The obtained model was used to control the robot without visual odometry data. The paper concludes with an estimation of the algorithm's quality, comparing the trajectories estimated by the algorithm with data from the motion capture system.
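Once the flow field has been back-projected to the ground plane, the planar twist (and hence the instantaneous center of rotation, ICR) follows from a linear least-squares fit. The sketch below shows only this step and assumes the ground-plane flow is already available from any Horn–Schunck implementation:

```python
import numpy as np

def fit_planar_twist(x, y, u, v):
    """Fit a rigid planar motion to ground-plane flow samples.

    Model: u = vx - w*y,  v = vy + w*x  (vx, vy translation, w yaw rate).
    Returns the twist (vx, vy, w) and the ICR, i.e. the point where the
    fitted velocity field vanishes.
    """
    n = x.size
    A = np.zeros((2 * n, 3))
    b = np.empty(2 * n)
    A[0::2, 0], A[0::2, 2], b[0::2] = 1.0, -y, u
    A[1::2, 1], A[1::2, 2], b[1::2] = 1.0, x, v
    (vx, vy, w), *_ = np.linalg.lstsq(A, b, rcond=None)
    icr = (-vy / w, vx / w) if abs(w) > 1e-9 else None  # pure translation: no ICR
    return (vx, vy, w), icr

# Synthetic check: rotation at w = 0.1 rad/frame about the point (2, 1).
xs, ys = np.meshgrid(np.linspace(-1, 1, 20), np.linspace(0.5, 2.5, 20))
u = -0.1 * (ys - 1.0)
v = 0.1 * (xs - 2.0)
print(fit_planar_twist(xs.ravel(), ys.ravel(), u.ravel(), v.ravel()))
```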
Investigating influence of UAV flight patterns in multi-stereo view DSM accuracy
Dimitrios P. Skarlatos, Marinos Vlachos, Vasilis Vamvakousis
Current advancements in photogrammetric software, along with the affordability and wide availability of autonomous unmanned aerial vehicles (AUAVs), allow rapid, timely and accurate 3D modelling and mapping of small to medium-sized areas. Although the importance of flight patterns and large overlaps in aerial triangulation and Digital Surface Model (DSM) production from large-format aerial cameras is well documented in the literature, this is not the case for AUAV photography. This paper assesses the DSM accuracy of models created using different flight patterns and compares them against check points and Lidar data. Three UAV flights took place, with 70%-65% forward and side overlaps, in West-East (W-E), North-South (N-S) and Northwest-Southeast (NW-SE) directions. Blocks with different flight patterns were created and processed to create raster DSMs with 0.25 m ground pixel size using Multi-View Stereo (MVS). Using the Lidar data as reference, difference maps and statistics were calculated for each block in order to evaluate their overall accuracy. The combined scenario performed slightly better than the rest. Because of their lower spatial resolution, the Lidar data proved to be an inadequate reference data set, although according to their internal vertical precision they are superior to the UAV DSMs. Point cloud noise from MVS is considerable in contrast to Lidar data. A Lidar data set from a lower-flying platform such as a helicopter might have been a better match to the low-flying UAV data.
Image Matching and Surface Models
Multi-image semi-global matching in object space
F. Bethmann, T. Luhmann
Semi-Global Matching (SGM) is a widespread algorithm for image matching which is used for very different applications, ranging from real-time applications (e.g. generating 3D data for driver assistance systems) to aerial image matching. Originally developed for stereo-image matching, several extensions have been proposed to use more than two images within the matching process (multi-baseline matching, multi-view stereo). These extensions still perform the image matching in (rectified) stereo images and combine the pairwise results afterwards to create the final solution. This paper proposes an alternative approach which is suitable for the introduction of an arbitrary number of images into the matching process and performs the image matching on non-rectified images. The new method differs from the original SGM method mainly in two aspects: firstly, the cost calculation is formulated in object space within a dense voxel raster by using the grey (or colour) values of all images, instead of pairwise cost calculation in image space. Secondly, the semi-global (path-wise) minimization process is transferred into object space as well, so that the result of the semi-global optimization is a set of index maps (instead of disparity maps) which directly indicate the 3D positions of the best matches. Altogether, this yields an essential simplification of the matching process compared to multi-view stereo (MVS) approaches. After a description of the new method, results achieved on two different datasets (close-range and aerial) are presented and discussed.
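The path-wise minimization is the same dynamic-programming recurrence whether it runs over disparities or, as here, over the level axis of an object-space voxel raster. A minimal sketch of one aggregation path (the P1/P2 penalties follow the usual SGM convention; everything else is an illustrative simplification):

```python
import numpy as np

def aggregate_path(cost, p1=10.0, p2=120.0):
    """SGM-style aggregation of matching costs along one path.

    cost : (n_cells, n_levels) costs of consecutive raster cells along the
           path; in object space the levels index voxel heights rather
           than disparities.
    """
    n, m = cost.shape
    agg = np.empty_like(cost)
    agg[0] = cost[0]
    for i in range(1, n):
        prev = agg[i - 1]
        best = prev.min()
        # Classic recurrence: same level, +/- one level (P1), or any jump (P2).
        cand = np.minimum.reduce([
            prev,
            np.roll(prev, 1) + p1,
            np.roll(prev, -1) + p1,
            np.full(m, best + p2),
        ])
        cand[0] = min(prev[0], prev[1] + p1, best + p2)    # fix roll wrap-around
        cand[-1] = min(prev[-1], prev[-2] + p1, best + p2)
        agg[i] = cost[i] + cand - best                     # keep values bounded
    return agg
```

Summing the aggregated costs over several path directions and taking the per-cell argmin yields the index maps described above.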
3D city models completion by fusing lidar and image data
L. Grammatikopoulos, I. Kalisperakis, E. Petsa, et al.
A fundamental step in the generation of visually detailed 3D city models is the acquisition of high-fidelity 3D data. Typical approaches employ DSM representations usually derived from Lidar (Light Detection and Ranging) airborne scanning or from image-based procedures. In this contribution, we focus on the fusion of data from both these methods in order to enhance or complete them. In particular, we combine an existing Lidar and orthomosaic dataset (used as reference) with a new aerial image acquisition (including both vertical and oblique imagery) of higher resolution, which was carried out in the area of Kallithea, in Athens, Greece. In a preliminary step, a digital orthophoto and a DSM are generated from the aerial images in an arbitrary reference system, by employing a Structure from Motion and dense stereo matching framework. The image-to-Lidar registration is performed by 2D feature (SIFT and SURF) extraction and matching between the two orthophotos. The established point correspondences are assigned 3D coordinates through interpolation on the reference Lidar surface, are then backprojected onto the aerial images, and finally matched with 2D image features located in the vicinity of the backprojected 3D points. Consequently, these points serve as Ground Control Points with appropriate weights for the final orientation and calibration of the images through a bundle adjustment solution. By these means, the aerial imagery, optimally aligned to the reference dataset, can be used for the generation of an enhanced and more accurately textured 3D city model.
DTM generation from STC-SIMBIO-SYS images
The research group responsible for the STereo Camera (STC) for the ESA BepiColombo mission to Mercury has realized an innovative and compact camera design in which the light collected independently by two optical channels at ±20° with respect to the nadir direction converges on a single bidimensional detector. STC will provide 3D mapping of the surface of Mercury, acquiring images from two different perspectives. A stereo validation setup has been developed in order to give much greater confidence to the novel instrument design and to obtain on-ground verification of the actual accuracy in deriving elevation information from stereo pairs. A series of stereo pairs of an anorthosite stone sample (a good analogue of the hermean surface) and of a modelled piece of concrete, acquired in a calibration clean room by means of an auxiliary optical system, have been processed in the photogrammetric pipeline using image correlation for the 3D model generation. The stereo reconstruction has been validated by comparing the STC DTMs (Digital Terrain Models) to a high-resolution laser-scanning 3D model of the stone samples used as reference data. The latter has a much higher precision (ca. 20 μm) than the expected in-lab STC DTM precision (190 μm). Processing parameters have been varied in order to test their influence on DTM generation accuracy. The main aim is to define the best illumination conditions and process settings in order to obtain the best DTMs in terms of accuracy and completeness, seeking the best match between the mission constraints and the specific matching aspects that could affect the mapping process.
Stereo matching based on census transformation of image gradients
C. Stentoumis, L. Grammatikopoulos, I. Kalisperakis, et al.
Although multiple-view matching provides certain significant advantages regarding accuracy, occlusion handling and radiometric fidelity, stereo-matching remains indispensable for a variety of applications; these involve cases where image acquisition requires fixed geometry, a limited number of images, or speed. Such instances include robotics, autonomous navigation, reconstruction from a limited number of aerial/satellite images, industrial inspection and augmented reality through smart-phones. As a consequence, stereo-matching is a continuously evolving research field with a growing variety of applicable scenarios. In this work a novel multi-purpose cost for stereo-matching is proposed, based on the census transformation of image gradients and evaluated within a local matching scheme. It is demonstrated that when the census transformation is applied to gradients, the invariance of the cost function to (non-linear) changes in illumination is significantly strengthened. The calculated cost values are aggregated through adaptive support regions, based both on cross-skeletons and on basic rectangular windows. The matching algorithm is tuned for the parameters in each case. The described matching cost has been evaluated on the Middlebury stereo-vision 2006 datasets, which include changes in illumination and exposure. The tests verify that the census transformation on image gradients indeed results in a more robust cost function, regardless of aggregation strategy.
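The core of the proposed cost can be sketched by census-transforming gradient-magnitude images instead of intensities and scoring candidates by Hamming distance; the window size, the 64-bit packing and the border handling via wrap-around are illustrative shortcuts:

```python
import numpy as np
from scipy.ndimage import sobel

def census(img, r=3):
    """Census transform: per pixel, a bit string recording which neighbours
    in the (2r+1)^2 window are darker than the centre (48 bits for r=3)."""
    out = np.zeros(img.shape, dtype=np.uint64)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dx == 0 and dy == 0:
                continue
            nb = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
            out = (out << np.uint64(1)) | (nb < img).astype(np.uint64)
    return out

def gradient_census_cost(left, right, d):
    """Per-pixel Hamming distance between census codes of the image
    *gradients* at disparity d -- robust to non-linear illumination change."""
    gl = np.hypot(sobel(left.astype(np.float64), 0), sobel(left.astype(np.float64), 1))
    gr = np.hypot(sobel(right.astype(np.float64), 0), sobel(right.astype(np.float64), 1))
    xor = census(gl) ^ np.roll(census(gr), d, axis=1)
    # Popcount of the 64-bit codes gives the matching cost.
    bits = np.unpackbits(xor.view(np.uint8).reshape(*xor.shape, 8), axis=-1)
    return bits.sum(axis=-1)
```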
Range Imaging Modelling and Analysis
Single-plane versus three-plane methods for relative range error evaluation of medium-range 3D imaging systems
Within the context of the ASTM E57 working group WK12373, we compare the two methods that had been initially proposed for calculating the relative range error of medium-range (2 m to 150 m) optical non-contact 3D imaging systems: the first is based on a single plane (single-plane assembly) and the second on an assembly of three mutually non-orthogonal planes (three-plane assembly). Both methods are evaluated for their utility in generating a metric to quantify the relative range error of medium-range optical non-contact 3D imaging systems. We conclude that the three-plane assembly is comparable to the single-plane assembly with regard to quantification of relative range error while eliminating the requirement to isolate the edges of the target plate face.
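Both target configurations ultimately reduce to fitting planes to segmented patches of the point cloud and deriving a range-error statistic from the fit residuals; the sketch below uses the residual RMS along the plane normal as a simple stand-in for the metric under discussion:

```python
import numpy as np

def plane_fit_residuals(points):
    """Total-least-squares plane fit to an (N, 3) patch of a point cloud.

    The plane normal is the right singular vector of the centred data with
    the smallest singular value; residuals are signed normal distances.
    """
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]
    return normal, centroid, (points - centroid) @ normal

def relative_range_error(points):
    """RMS of plane-fit residuals for one face of the target assembly."""
    *_, res = plane_fit_residuals(points)
    return np.sqrt(np.mean(res ** 2))
```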
Extracting the MESA SR4000 calibrations
Time-of-flight range imaging cameras are capable of acquiring depth images of a scene. Some algorithms require these cameras to be run in 'raw mode', where any calibrations from the off-the-shelf manufacturers are lost. The calibration of the MESA SR4000 is herein investigated, with an attempt to reconstruct the full calibration. Possession of the factory calibration enables calibrated data to be acquired and manipulated even in 'raw mode'. This work is motivated by the problem of motion correction, in which the calibration must be separated into component parts to be applied at different stages in the algorithm. There are also other applications in which multiple frequencies are required, such as multipath interference correction. The other frequencies can be calibrated in a similar way, using the factory calibration as a base. A novel technique for capturing the calibration data is described: a retro-reflector is used on a moving platform, which acts as a point source at a distance, resulting in planar waves on the sensor. A number of calibrations are retrieved from the camera, and are then modelled and compared to the factory calibration. When comparing the factory calibration to both the 'raw mode' data and the calibration described herein, a root mean squared error improvement of 51.3 mm was seen, with a standard deviation improvement of 34.9 mm.
Enhancing swimming pool safety by the use of range-imaging cameras
D. Geerardyn, S. Boulanger, M. Kuijk
Drowning is the cause of death of 372,000 people each year worldwide, according to the November 2014 report of the World Health Organization [1]. Currently, most swimming pools only use lifeguards to detect drowning people. In some modern swimming pools, camera-based detection systems are nowadays being integrated. However, these systems have to be mounted underwater, mostly as a replacement of the underwater lighting. In contrast, we are interested in range imaging cameras mounted on the ceiling of the swimming pool, allowing us to distinguish swimmers at the surface from drowning people underwater, while keeping the large field-of-view and minimizing occlusions. However, we have to take into account that the water surface of a swimming pool is not flat but mostly rippled, and that the water is transparent to visible light but less transparent to infrared or ultraviolet light. We investigated the use of different types of 3D cameras to detect objects underwater at different depths and with different amplitudes of surface perturbation. Specifically, we performed measurements with a commercial Time-of-Flight camera, a commercial structured-light depth camera and our own Time-of-Flight system. Our own system uses pulsed Time-of-Flight and emits light at 785 nm. The measured distances between the camera and the object are influenced by the perturbations on the water surface. Due to the timing of our Time-of-Flight camera, our system is theoretically able to minimize the influence of the reflections of a partially reflecting surface. The combination of a post-acquisition filter compensating for the perturbations and the use of a light source with shorter wavelengths to enlarge the depth range can improve on the current commercial cameras. As a result, we conclude that low-cost range imagers can increase swimming pool safety with the addition of a post-processing filter and the use of another light source.
Evaluating the capability of time-of-flight cameras for accurately imaging a cyclically loaded beam
Time-of-flight cameras are used for diverse applications ranging from human-machine interfaces and gaming to robotics and earth topography. This paper aims at evaluating the capability of the Mesa Imaging SR4000 and the Microsoft Kinect 2.0 time-of-flight cameras for accurately imaging the top surface of a concrete beam subjected to fatigue loading in laboratory conditions. Whereas previous work has demonstrated the success of such sensors for measuring the response at point locations, the aim here is to measure the entire beam surface in support of the overall objective of evaluating the effectiveness of reinforcing concrete beams with steel fibre reinforced polymer sheets. After applying corrections for lens distortions to the data and differencing images over time to remove systematic errors due to internal scattering, the periodic deflections experienced by the beam have been estimated for the entire top surface of the beam and at attached witness plates. The results have been assessed by comparison with measurements from highly accurate laser displacement transducers. This study concludes that both the Microsoft Kinect 2.0 and the Mesa Imaging SR4000 are capable of sensing a moving surface with sub-millimeter accuracy once the image distortions have been modeled and removed.
Poster Session
Precise deformation measurement of prestressed concrete beam during a strain test using the combination of intersection photogrammetry and micro-network measurement
Rudolf Urban, Jaroslav Braun, Martin Štroner
Prestressed thin-walled concrete elements enable bridges with relatively large spans. These structures are advantageous in economic and environmental terms due to their thinness and lower consumption of materials. The bending moments can be effectively influenced by using the pre-stress. An experiment was carried out to monitor the deformation of the beam under load. During the experiment, discrete points were monitored. To determine a large number of points, intersection photogrammetry combined with a precise micro-network was chosen.
A simple and flexible calibration method of non-overlapping camera rig
Banglei Guan, Yang Shang, Qifeng Yu, et al.
A simple and flexible method for non-overlapping camera rig calibration, comprising camera calibration and relative pose calibration, is presented. The proposed algorithm yields the camera parameters and the relative poses simultaneously by using nonlinear optimization. Firstly, the intrinsic and extrinsic parameters of each camera in the rig are estimated individually. Then, a linear solution derived from a hand-eye calibration scheme is used to compute an initial estimate of the relative poses inside the camera rig. Finally, a combined non-linear refinement of all parameters is performed, which optimizes the intrinsic parameters, the extrinsic parameters and the relative poses of the coupled cameras at the same time. We develop and test a novel approach for calibrating the parameters of a non-overlapping camera rig using camera calibration and hand-eye calibration methods. The method is designed, inter alia, for the purpose of deformation measurement using the calibrated rig. Compared with performing camera calibration and hand-eye calibration separately, our joint calibration is more convenient in practical applications. Experimental data show that our algorithm is feasible and effective.
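An initial estimate of the relative pose from the hand-eye equation AX = XB can be obtained with the classic linear construction (Park-Martin style rotation fit plus a linear solve for translation); motion-pair conventions and all names below are assumptions for illustration, not the authors' formulation:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def hand_eye_linear(RA, tA, RB, tB):
    """Linear solution of AX = XB for the fixed transform X = (R, t)
    between two rigidly coupled cameras, from paired relative motions
    (RA_i, tA_i) and (RB_i, tB_i) of the two cameras."""
    # Rotation: align the rotation-vector (log-map) pairs via Kabsch.
    M = sum(np.outer(Rotation.from_matrix(rb).as_rotvec(),
                     Rotation.from_matrix(ra).as_rotvec())
            for ra, rb in zip(RA, RB))
    u, _, vt = np.linalg.svd(M)
    d = np.sign(np.linalg.det(vt.T @ u.T))          # guard against reflections
    R = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Translation: stack (RA_i - I) t = R tB_i - tA_i and solve least squares.
    A = np.vstack([ra - np.eye(3) for ra in RA])
    b = np.concatenate([R @ tb - ta for ta, tb in zip(tA, tB)])
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return R, t
```

The result then serves as the starting point for the combined non-linear refinement described above.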
The phase correlation algorithm for stabilization of capillary blood flow video frames
Recovery of capillary blood flow parameters is one of the objectives of videocapillaroscopy. The position of the capillaries can vary across the recorded video sequences due to the way capillary blood flow is registered. A stabilization algorithm for capillary blood flow video frames based on advanced phase correlation is proposed and investigated, and compared to advanced versions of known video-frame stabilization algorithms based on full-frame superposition and on key-point detection. Programs based on the discussed algorithms are compared by processing experimentally recorded video sequences of human capillaries and computer-simulated frame sequences with a specified offset. The full-frame superposition algorithm provides high stabilization quality; however, the program based on this algorithm requires significant computational resources. The software implementation of the key-point detection algorithm is characterized by good performance but provides low stabilization quality for capillary blood flow video sequences. The algorithm based on the phase correlation method provides high stabilization quality, and its program implementation requires minimal computational resources. It is shown that the phase correlation algorithm is the most useful for stabilization of capillary blood flow video sequences. The obtained results can be used in software for biomedical diagnostics.
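The phase-correlation step can be sketched directly with OpenCV, which recovers the sub-pixel translation between frames from the Fourier cross-power spectrum; the translation-only model and the windowing choice here are illustrative:

```python
import cv2
import numpy as np

def stabilize(frames):
    """Align grayscale frames to the first frame using phase correlation
    (translation-only, adequate for the small shifts of capillaroscopy video)."""
    ref = frames[0].astype(np.float32)
    win = cv2.createHanningWindow(ref.shape[::-1], cv2.CV_32F)  # reduce edge leakage
    out = [frames[0]]
    for frame in frames[1:]:
        (dx, dy), _resp = cv2.phaseCorrelate(ref, frame.astype(np.float32), win)
        M = np.float32([[1, 0, -dx], [0, 1, -dy]])              # undo the shift
        out.append(cv2.warpAffine(frame, M, frame.shape[::-1]))
    return out
```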
Efficient estimation of orthophoto images using visibility restriction
Orthophoto images generated from aerial photos are used in river management, road design and various other fields, since an orthophoto can visualize land use together with position information. However, image distortion often occurs in the orthorectification process. This distortion is usually assessed manually by an evaluator, at great cost in time, so for efficiency it should be estimated automatically. With this motivation, this paper focuses on the angle V formed between the view vector at the exposure point and the normal vector at the centre point of a patch area. In order to evaluate the relation between image distortion and the formed angle V, DMC images acquired at 2000 m altitude were used, and the formed angle V for 10 m x 10 m patches was adopted for computing a visibility restriction. It was confirmed that image distortion occurred for patches in which the formed angle V exceeded 69 degrees. Therefore, it is concluded that efficient orthophoto distortion estimation can be performed using the formed angle V as a visibility restriction.
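Computing the formed angle V per patch is a single dot product between the normalized view vector and the patch normal; the layout below is an assumption for illustration:

```python
import numpy as np

def formed_angle_deg(exposure_xyz, patch_centre_xyz, patch_normal):
    """Angle V (degrees) between the view vector from the patch centre to
    the exposure point and the patch surface normal; patches with V above
    a threshold (about 69 degrees in the experiment above) are flagged."""
    view = np.asarray(exposure_xyz, float) - np.asarray(patch_centre_xyz, float)
    view /= np.linalg.norm(view)
    n = np.asarray(patch_normal, float)
    n /= np.linalg.norm(n)
    return np.degrees(np.arccos(np.clip(np.dot(view, n), -1.0, 1.0)))

# A nadir view over a 45-degree slope gives V = 45 degrees:
print(formed_angle_deg([0, 0, 2000], [0, 0, 0],
                       [0, np.sin(np.pi / 4), np.cos(np.pi / 4)]))
```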
Action cameras and low-cost aerial vehicles in archaeology
M. Ballarin, C. Balletti, F. Guerra
This research focuses on analysing the potential of a close-range aerial photogrammetry system which is accessible both in economic terms and in terms of simplicity of use. In particular, the GoPro Hero3 Black Edition and the Parrot AR.Drone 2.0 were studied. There are essentially two limitations to the system, one for each of the instruments used. The frames captured by the GoPro are subject to great distortion and consequently pose numerous calibration problems, while the drone makes it difficult to maintain a flight configuration suitable for photogrammetric purposes in unfavourable environmental conditions. The aim of this research is to analyse how far these limitations influence the precision of the survey and, consequently, the quality of the results obtained. To this end, the integrated GoPro and Parrot system was used during a survey campaign at the Altilia archaeological site in Molise. The data obtained were compared with those gathered by more traditional methods, such as laser scanning. The system was employed in the field of archaeology because here the question of cost often has considerable importance, and the metric aspect is frequently subordinate to the qualitative and interpretative aspects. Herein one of the products of these systems, the orthophoto, is analysed; it is particularly useful in archaeology, especially in situations such as this dig, in which there are not many structures in elevation. The proposed system has proven to be an accessible solution for producing aerial documentation of excellent quality together with metric data of known precision.
Miniaturized 3D microscope imaging system
We designed and assembled a portable 3D miniature microscopic image system with a size of 35 x 35 x 105 mm. By integrating a microlens array (MLA) into the optical train of a handheld microscope, an image of the biological specimen can be captured in a single shot for ease of use. With the light-field raw data and software, the focal plane can be changed digitally and the 3D image can be reconstructed after the image has been taken. To localize an object in a 3D volume, an automated data-analysis algorithm that precisely distinguishes depth position is needed. The ability to create focal stacks from a single image allows moving specimens to be recorded. Applying the light-field microscope algorithm to these focal stacks produces a set of cross sections, which can be visualized using 3D rendering. Furthermore, we have developed a series of design rules to enhance pixel usage efficiency and reduce the crosstalk between microlenses in order to obtain good image quality. In this paper, we demonstrate a handheld light field microscope (HLFM) that distinguishes two fluorescent particles of different colors separated by a cover glass within a 600 μm range, and show its focal stacks and 3D positions.
Improving depth estimation from a plenoptic camera by patterned illumination
Richard J. Marshall, Chris J. Meah, Massimo Turola, et al.
Plenoptic (light-field) imaging is a technique that allows a simple CCD-based imaging device to acquire both spatially and angularly resolved information about the light field from a scene. It requires a microlens array to be placed between the objective lens and the sensor of the imaging device [1], and the images under each microlens (which typically span many pixels) can be computationally post-processed to shift perspective, refocus digitally, extend the depth of field, manipulate the aperture synthetically and generate a depth map from a single image. Some of these capabilities are rigid functions that do not depend upon the scene and work by manipulating and combining a well-defined set of pixels in the raw image. However, depth mapping requires specific features in the scene to be identified and registered between consecutive microimages. This process requires that the image has sufficient features for the registration; in the absence of such features the algorithms become less reliable and incorrect depths are generated. The aim of this study is to investigate the generation of depth maps from light-field images of scenes with insufficient features for accurate registration, using projected patterns to impose a texture on the scene that provides sufficient landmarks for the registration methods.
Frequency-spatial cues based sea-surface salient target detection from UAV image
Xiaoliang Sun, Xiaolin Liu, Qifeng Yu, et al.
This paper proposes an algorithm for salient target detection from Unmanned Aerial Vehicle (UAV) sea-surface images using frequency and spatial cues. The algorithm consists of three parts: background suppression in the frequency domain, adaptive smoothing of the background-suppressed image, and salient target detection via adaptive thresholding, region growing and clustering. The sea-surface background in a UAV image is modeled as non-salient components which correspond to the spikes of the amplitude spectrum in the frequency domain. The background suppression is achieved by removing the spikes using a low-pass Gaussian kernel of proper scale. In order to eliminate the negative effects brought by complex textures, a Gaussian blur kernel is introduced to process the background-suppressed image, and its scale is determined by the entropy of the background-suppressed image. The salient target is detected using adaptive thresholding, region growing and clustering performed on the blurred background-suppressed image. Experiments on a large number of images indicate that the proposed algorithm can detect sea-surface salient targets accurately and efficiently.
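The background-suppression step can be sketched directly: low-pass filter the amplitude spectrum to flatten the clutter spikes and reconstruct with the original phase. Kernel scales are fixed here for brevity, whereas the paper selects the smoothing scale adaptively from the image entropy:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def suppress_sea_background(img, spectrum_sigma=3.0, blur_sigma=2.5):
    """Frequency-domain background suppression for a grayscale UAV image.

    Periodic sea clutter shows up as sharp spikes in the amplitude
    spectrum; low-pass filtering the spectrum flattens those spikes, and
    reconstruction with the original phase suppresses the background.
    """
    F = np.fft.fft2(img.astype(np.float64))
    amp, phase = np.abs(F), np.angle(F)

    smoothed_amp = gaussian_filter(amp, spectrum_sigma)        # flatten the spikes
    suppressed = np.abs(np.fft.ifft2(smoothed_amp * np.exp(1j * phase))) ** 2

    # The paper chooses the blur scale adaptively (by entropy); fixed here.
    return gaussian_filter(suppressed, blur_sigma)
```

Adaptive thresholding, region growing and clustering on the returned map then yield the detections.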