Proceedings Volume 5006

Stereoscopic Displays and Virtual Reality Systems X

Andrew J. Woods, John O. Merritt, Stephen A. Benton, et al.
View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 30 May 2003
Contents: 14 Sessions, 61 Papers, 0 Presentations
Conference: Electronic Imaging 2003
Volume Number: 5006

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Stereoscopic Display Systems
  • Autostereoscopic Displays I: Integral Imaging
  • Autostereoscopic Displays II
  • Autostereoscopic Displays III
  • Stereoscopic Video
  • Stereoscopic Image Coding
  • Human Factors I
  • Human Factors II
  • Stereoscopic Image Processing
  • Poster Pop Session
  • Techniques and Applications
  • Video-based Image Techniques and Emerging Work
  • Focused Research
  • Augmented Reality
Stereoscopic Display Systems
Light loss reduction of LCD polarized stereoscopic projection
Victor A. Elkhov, Yuri N. Ovechkis
The overwhelming majority of LCD projectors used for polarized stereoscopic projection have linearly polarized output with two colors in one direction and the third color in an orthogonal direction (e.g. green horizontal, red and blue vertical). During standard conversion of the two projectors' light to orthogonal linear or to clockwise and counter-clockwise circular polarization, more than fifty percent of the light energy is lost. This paper considers a method of polarized stereoscopic projection with lower light loss. For this purpose the light of each LCD projector is converted to circular polarization using quarter-wave retarder plates. The orientations of their optical axes are chosen so that green light from the first projector and red and blue light from the second projector are circularly polarized in the clockwise direction; the remaining colors are circularly polarized in the counter-clockwise direction. Simultaneously, a color transformation of the stereo-pair pictures is performed: the green component of the first picture is mixed with the red and blue components of the second picture, and vice versa. This method enables the observation of good-quality stereoscopic images using glasses with circular polarizers. In the case of glasses with linear polarizers, half-wave plates are used to convert the LCD projectors' light.
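The channel regrouping described above can be sketched in a few lines; this is a minimal illustration assuming 8-bit RGB frames held as NumPy arrays (the function and variable names are illustrative, not from the paper):

```python
import numpy as np

def regroup_channels(left: np.ndarray, right: np.ndarray):
    """Swap the green channel between a stereo pair so that each
    projector emits color components sharing one circular
    polarization state (per the scheme sketched in the abstract).

    left, right: H x W x 3 uint8 RGB frames of the stereo pair.
    Returns the frames to feed to projector 1 and projector 2.
    """
    proj1 = right.copy()           # red/blue of one eye's picture...
    proj1[..., 1] = left[..., 1]   # ...mixed with green of the other
    proj2 = left.copy()
    proj2[..., 1] = right[..., 1]
    return proj1, proj2
```

With matching retarder orientations, each eye's analyzer then reassembles a complete RGB image from the two projectors.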
Random dot stereograms generated with ray tracing as a visualization tool for evaluating stereoscopic camera systems
Tools for evaluation of candidate stereoscopic camera systems, including subjective impression are invaluable during the development of the system architecture. Ray tracing has long been used to predict the performance of optical elements used in camera systems. Stray light ray tracing analysis software utilizing a Monte Carlo method for generating random rays provides ray-intercept maps for two stereoscopic camera image planes that are used to build a random dot stereogram (RDS). The visual fusion of the RDS produced by random rays traced through a model of the candidate system provides an impression of the system quality. The impact of system parameters as well as imperfections in the optical design can thus be visualized and even quantified by the observer's ability to separate objects modeled at different distances from the stereoscopic camera(s). This paper describes the technique for generating an RDS using Lambda Research Corporation's TracePro software and provides examples of system performance with and without the introduction of optical imperfections.
Building a large-scale high-resolution tiled rear-projected passive stereo display system based on commodity components
Glenn Bresnahan, Raymond Gasser, Augustinas Abaravichyus, et al.
The Boston University Deep Vision Display Wall is a large-scale, high-resolution, tiled, rear-projected, passive stereo display system based on commodity components. Using Linux on PC workstations provides an affordable infrastructure for distributed processing and high-end graphics. Passive stereo eliminates the need to genlock the display video cards and allows the use of very inexpensive glasses and inexpensive projectors. By careful selection of projectors, polarizing filters, and screen material, problems such as cross-talk, chromatic aberration, and low luminance are minimized. The 8'x15' display surface at Boston University is installed as one wall of the viewing room. The final installation will use 24 workstations driving 24 projectors to produce a 4096x2304 stereo image. This paper discusses development issues including synchronization of displays, alignment of projectors, blending of overlapping projected images, and constraints on projectors and projection surface to support passive stereo. Performance comparisons are given for configurations including workstations driving one vs. two projectors, Ethernet vs. Myrinet for intermachine communication, and overall display performance. Also discussed are software issues involved in putting the system together and in providing an application environment for our user community.
Investigation into screenless 3D TV
Christian Moller, Oliver S. Cossairt, Stephen A. Benton, et al.
If a three-dimensional image is to be projected into mid-air in a room with bare walls, then light must follow a curving path. Since this does not happen in a vacuum, a gradient must be introduced into the refractive index of the air itself, which can be done by varying either the temperature or the pressure of the air. A reduction from 300°C to room temperature across the front of a 1 mm wide ray will bend it with a radius of curvature of 3 m. However, the temperature gradient cannot be sustained without an unacceptably aggressive mechanism for cooling. The pressure gradients delivered by sound waves are dynamically sustainable, but even powers as extreme as 175 dBm at 25 kHz deliver a radius of curvature of only 63 m. It appears that something will have to be added to the air if such displays are to be possible.
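The quoted radii follow from the standard ray-curvature relation for a graded-index medium (a textbook formula, not one taken from the paper): for a ray travelling perpendicular to the index gradient,

```latex
R \;\approx\; \frac{n}{\lvert \partial n / \partial y \rvert} \;\approx\; \frac{1}{\lvert \partial n / \partial y \rvert} \qquad (n \approx 1 \ \text{for air})
```

Since the refractivity of air scales roughly with density, an index change of order 10^-4 sustained across a 1 mm beam gives a gradient of order 0.1 m^-1 and hence R of order metres, consistent with the scale of the figures quoted above.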
Binocular retinal scanning laser display with integrated focus cues for ocular accommodation
Brian T. Schowengerdt, Eric J. Seibel, John P. Kelly, et al.
In natural vision, the oculomotor processes of accommodation (focus) and vergence (angle between lines of sight of two eyes) are reflexively linked such that a change in one drives a matching change in the other. Conventional stereoscopic displays require viewers to decouple these processes, and accommodate at a fixed distance while dynamically varying vergence to view objects at different stereoscopic distances. This decoupling generates eye fatigue and compromises image quality. We describe a binocular display that generates variable accommodation cues that match vergence and stereoscopic retinal disparity demands, better approximating natural vision and leading to decreased eye fatigue. In our display, a luminance modulated laser beam is reflected from a deformable membrane mirror and raster scanned. The scan is converged at the entrance pupil of the viewer’s eye, creating a Maxwellian view of the displayed image. As the beam is scanned, the deformable membrane mirror dynamically changes the beam divergence angle to present images at different focal distances. The display has a large range of focus (closer than the viewer’s near point to infinity) and presents images at 60 Hz. The accommodation response of a viewer to the display was measured objectively using an infrared autorefractor.
Autostereoscopic Displays I: Integral Imaging
Integral three-dimensional television based on superhigh-definition video system
Jun Arai, Makoto Okui, Masaki Kobayashi, et al.
In an integral three-dimensional television (integral 3-D TV) system, 3-D images are reconstructed by integrating the light beams from elemental images captured by a pickup system. In this system, 160(H) x 118(V) elemental images are used for reconstruction. We use a camera with 2000 scanning lines for the pickup system and a high-resolution liquid crystal display for the display system, and have achieved an integral 3-D TV system with approximately 3000(H) x 2000(V) effective pixels. Comparisons with theoretical resolution and viewing angle are performed, and it is shown that the resolution and viewing angle of the 3-D images are improved about 2 times and 1.5 times respectively compared to the previous system. The accuracy of alignment of the microlenses is another factor that should be considered for an integral 3-D TV system. If the lens array of the pickup system or display system is not aligned accurately, positional errors of the elemental images may occur, which cause the 3-D image to be reconstructed at an incorrect position. The relation between positional errors of the elemental images and the reconstructed image is also shown. As a result, 3-D images reconstructed far from the lens array are greatly influenced by such positional errors.
Full parallax images with a diamond shape pixel cell
Jung-Young Son, Vladmir V. Saveljev, Yong-Jin Choi, et al.
Characteristics and two building methods of a diamond-shaped pixel cell are introduced. It provides a wider viewing zone in the horizontal direction than its corresponding square or rectangular pixel cell and reduces the pseudoscopic effect. The two building methods are named integer and non-integer, depending on the number of different view pixels involved in the pixel cell. The full-parallax images generated by these two methods show that the integer method provides better image quality than the non-integer one.
Computer generation of integral 3D images with maximum effective viewing zone
Jinsong Ren, Amar Aggoun, Malcolm McCormick
For computer-generated integral images, a transition line can be observed when the viewer shifts parallel to the lens sheet and reaches the edge of the current viewing zone. This is due to the transition from the current viewing zone to the next. Images generated using conventional algorithms suffer from a large transition zone, which degrades the replayed visual effect and greatly decreases the effective viewing width. This phenomenon is especially apparent for large images. Conventional computer-generation algorithms for integral images use the same boundary configuration as the micro-lenses, which is straightforward and easy to implement, but is the cause of the large transition zone and narrow viewing angle. This paper presents a novel micro-image configuration and algorithm to solve the problem. In the new algorithm, the boundaries of the micro-images are not confined to the physical lens boundaries but are normally larger. To achieve the maximum effective viewing width, each micro-image is arranged according to rules determined by several constraints. The considerations in the selection of optimal parameters are discussed, and new definitions related to this issue are given.
Integral 3D imaging that has an enhanced viewing angle along full directions with no mechanical movement
Integral 3D imaging provides full motion parallax, unlike other conventional stereoscopy-based techniques. To make the most of this advantage, a 3D system with a wide view in all directions is required. In this paper, we propose and demonstrate a method to enhance the viewing angle of integral imaging in both the horizontal and vertical directions. The proposed system consists of two sub-systems with properly designed polarizing masks. The viewing-angle enhancement in all directions is achieved by elemental lens switching, which combines time and spatial multiplexing of different elemental image arrays. Experimental results and a method to avoid cross-talk are shown and discussed.
Digital three-dimensional object reconstruction and correlation based on integral imaging
Integral images contain multiple views of a scene obtained from slightly different points of view. They therefore include three-dimensional (3D) information - including depth - about the scenes they represent. In this paper, we propose to use the depth information contained in integral images to recognize 3D objects. The integral images are first used to estimate the longitudinal distances of the objects composing the 3D scene. Using this information, a 3D model of the scene is reconstructed in the computer. These models are then used to compute digital 3D correlations between various scenes and objects. For better discrimination we use a nonlinear 3D correlation. We present experimental results for digital 3D reconstruction of real 3D scenes containing several objects at various distances. With these experimental data, we demonstrate the recognition and 3D localization of objects through nonlinear correlation. We investigate the effect of the nonlinearity strength in the correlation. We finally present experiments showing that the three-dimensional correlation is more discriminant than the two-dimensional correlation.
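Nonlinear correlation of this kind is commonly implemented as a k-th law correlator in the Fourier domain; the following is a minimal generic sketch (the default k and the function name are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def nonlinear_correlation(f: np.ndarray, g: np.ndarray, k: float = 0.3) -> np.ndarray:
    """k-th law nonlinear correlation of two arrays (works for 2-D or 3-D).

    k = 1 reproduces classical linear correlation; k -> 0 approaches
    phase-only correlation, which is more discriminant.
    """
    F = np.fft.fftn(f)
    G = np.fft.fftn(g)
    # Raise the spectral magnitudes to the k-th power, keep the phase.
    Fk = np.abs(F) ** k * np.exp(1j * np.angle(F))
    Gk = np.abs(G) ** k * np.exp(1j * np.angle(G))
    return np.abs(np.fft.ifftn(Fk * np.conj(Gk)))
```

Lowering k de-emphasizes the strong low-frequency content, which is why the nonlinear correlator discriminates better between similar objects.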
Autostereoscopic Displays II
High-resolution autostereoscopic immersive imaging display using a monocentric optical system
Joshua M. Cobb, David Kessler, John A. Agostinelli, et al.
An autostereoscopic display system was designed and constructed. The design, which uses pupil imaging with a curved mirror, is described. It employs a monocentric configuration to enable a wide field of view and large pupils while keeping the lens diameters small enough to fit within the interocular separation. For each eye, image formation was accomplished using 1920 x 1200 liquid crystal on silicon (LCOS) spatial light modulators in a 3-panel configuration. The design employs custom curved diffusers, which were developed to optimize throughput, contrast, and pupil illumination uniformity.
Real-image-based autostereoscopic display using LCD, mirrors, and lenses
The author presents a new version of the FLOATS (Fresnel-Lens-based Optical Apparatus for Touchable-distance Stereoscopy) system. The autostereoscopic display FLOATS combines parallax presentation and real-image generation by lenses to show realistic 3-D images within the reach of the viewer. In the conventional FLOATS, polarizing filters or liquid crystal shutters are used to separate the images projected to the eyes, which causes cross-talk noise and a reduction of brightness. This paper proposes the use of combined mirrors, instead of filters or shutters, to realize the parallax presentation. In this system the images for the right eye and the left eye are displayed side by side on the screen; the combined mirrors then reflect each image to shift it to the center, so that the optical geometry of the system is exactly the same as that of the conventional FLOATS display. This new method avoids both cross-talk noise and the reduction of brightness, and enables presentation of a more realistic 3-D image that causes less eyestrain for the viewer. In addition, an LCD panel can be used as the screen of this system, because it is only required to show two images side by side. As a result, the size of the system is reduced compared with the conventional system.
Desktop autostereoscopic display using compact LED projector
Hiroki Kaneko, Tetsuya Ohshima, Osamu Ebina, et al.
A stereoscopic display using a curved directional reflection (CDR) screen and projectors is a promising approach towards realizing an immersive three-dimensional (3D) display system. The CDR screen consists of a corner-reflective mirror sheet for horizontal focusing and an anisotropic diffuser sheet for vertical diffusion. The CDR 3D display can provide bright, large images without the need for special glasses. In this paper, we bring this immersive 3D display technique to the desktop. To realize this concept, we have developed compact projectors with light-emitting diode (LED) light sources and liquid crystal on silicon (LCOS) panels. These have allowed the realization of 65 mm wide projectors that can be placed side by side at the interocular distance. The efficient optical system of an LED array as area light sources, combined with an ultra-high-gain (>100) CDR screen, has allowed a desktop autostereoscopic display whose luminance is more than 100 cd/m2 with only 9 W power consumption. This system provides immersive 3D images for the observer alone, preserving privacy.
Second version of 3D display system by fan-like array of projection optics
Toshio Honda, Masaya Shimomatsu, H. Imai, et al.
We have been developing a Super Multi-View 3-D display system that enables natural 3-D vision using super multi-perspective images with a fine perspective pitch, narrower than half the pupil diameter of the viewer's eye. We call the display system the Fan-like Array of Projection Optics (FAPO) system. The first version of the FAPO system was too long, so we tried downsizing it using small LCD panels; consequently, its length became about half that of the first version. To widen the horizontal viewing zone, we have introduced an eye-position tracking system. The observer's eye position is measured precisely in real time by processing a face image of the observer illuminated by infrared light. The position of the viewing zone is then changed by rotating the concave mirror used in the display and, at the same time, the displayed 3-D images are changed according to the measured eye-position data. The eye-position tracking allows the observer to watch 3-D images naturally.
Autostereoscopic Displays III
Special features of stereo visualization in multichannel autostereoscopic display from 4D vision
Our company, 4D-Vision, develops technology based on a wavelength-selective filter array that allows stereo images and animations to be observed on TFT or plasma displays of any size, from 3.9 to 50 inches and even more, at relatively low cost. Another advantage of our technology is that stereo images may be viewed on our 3D displays by many users simultaneously, without any additional viewing aids. In this paper we present, in matrix form, the different types of stereo-image encoding that may be realized on our 3D display. In particular, we show that a colour stereo image on our 3D display may also be obtained from a set of grey-scale 2D images. We also present the results of an investigation of the stability of the stereo image with respect to certain perturbations that depend on the angle between neighbouring perspectives. General relations for evaluating the distribution of the 2D images within one stereo image are also presented; some of them have already been realized in our 3D display. At this time 4D-Vision manufactures 3D displays using up to 40 channels for stereo-image representation. We show that the presented results may also be used in stereo projection devices based on 4D-Vision technology.
Full-time full-resolution dual stereoscopic/autostereoscopic display or rock-solid 3D on a flat screen: with glasses or without!
Paul Kleinberger, Ilan Kleinberger, Hillel Goldberg, et al.
A stereoscopic or autostereoscopic display based on this technology provides full resolution and freedom of movement, but with no flicker. Simply put, the display is neither spatially nor temporally multiplexed. It sounds unbelievable, but it's true -- an autostereoscopic display where each eye sees every displayable pixel on the screen at all times. This technology is designed for flat-panel displays, such as LCDs, and has the following characteristics: (1) The display is not spatially multiplexed. Each eye sees the full native resolution of the entire screen. (2) The display is not temporally multiplexed. The image for each eye is visible continuously, i.e., at all times. (3) In its simplest form, this technology provides a full-time, full-resolution stereoscopic display for multiple viewers wearing passive polarizing glasses. (4) A variation of this technology can be used to make a full-time, full-resolution stereoscopic projection system for viewers wearing passive polarizing glasses using just a single projector. (5) With the addition of a dynamic aiming mechanism, and an adjustment in the display's output, we can create a single-user, full-time, full-resolution autostereoscopic display requiring no glasses and providing full freedom of movement. Software applications can use the same information about viewer position to provide natural, full "look-around." (6) A hybrid version of the display can alternate between autostereoscopic (single-user, no glasses) and stereoscopic (multi-user, passive glasses) modes.
Position- and velocity-dependent subpixel correction for spatially multiplexed autostereoscopic displays
Markus Andiel, Siegbert Hentschke
In recent years there has been growing demand for stereoscopic displays that give realistic 3D presentations of objects with depth perception. Our Person-adaptive Autostereoscopic Display (PAM) is easy to use and cost-effective to build, and thus well suited for the mass market. For successful acceptance in a wide range of applications, accurate and efficient image generation and presentation on the spatially multiplexed autostereoscopic display is absolutely necessary. This includes the simulation of motion parallax as well as the presentation of two perspectives on the display with low cross-talk. This paper first describes the image-generation steps, then the software-executed assignment of perspective-image pixels to the display screen, into which the x, y, z positions of the observer's eyes are incorporated. Both steps are encapsulated in an extension to the OpenGL 3D graphics API. A stereo buffer on the graphics board is not necessary for generating the two perspectives.
Three-dimensional volumetric display by inclined-plane scanning
Daisuke Miyazaki, Takuma Eto, Yasuhiro Nishimura, et al.
A volumetric display system based on three-dimensional (3-D) scanning that uses an inclined two-dimensional (2-D) image is described. In the volumetric display system a 2-D display unit is placed obliquely in an imaging system into which a rotating mirror is inserted. When the mirror is rotated, the inclined 2-D image is moved laterally. A locus of the moving image can be observed by persistence of vision as a result of the high-speed rotation of the mirror. Inclined cross-sectional images of an object are displayed on the display unit in accordance with the position of the image plane to observe a 3-D image of the object by persistence of vision. Three-dimensional images formed by this display system satisfy all the criteria for stereoscopic vision. We constructed the volumetric display systems using a galvanometer mirror and a vector-scan display unit. In addition, we constructed a real-time 3-D measurement system based on a light section method. Measured 3-D images can be reconstructed in the 3-D display system in real time.
SOLID FELIX: a static volume 3D-laser display
The two basic classes of volumetric displays are swept-volume and static-volume techniques. During several years of investigation of swept-volume displays within the FELIX 3D Project we learned about some significant disadvantages of rotating screens, one of them being the presence of hidden zones, and therefore started investigating static-volume displays two years ago with a new group of high-school students. Systems able to create space-filling imagery without any moving parts are classified as static-volume displays. A static setup, e.g. a transparent crystal, comprises the complete volume of the display and is doped with optically active rare-earth ions. These ions are excited in two steps by two intersecting IR laser beams of different wavelengths (two-frequency, two-step upconversion) and afterwards emit visible photons. Suitable host materials are crystals, various special glasses and, in the future, even polymers. The advantage of this approach is that there are very few hidden zones, which leads to a larger field of view and a larger viewing zone; the main disadvantage is the small size of the currently used fluoride crystals. Recently we started working with yttrium lithium fluoride (YLiF4) crystals, which are still very small but offer bright voxels with less laser power than is necessary in CaF2 crystals. Potential applications are, for example, in medical imaging, entertainment and computer-aided design.
Stereoscopic Video
Parallax Player: a stereoscopic format converter
The Parallax Player is a software application that is, in essence, a stereoscopic format converter: a wide variety of formats may be input and output, and content in any of these formats can be played back on many different kinds of PCs and display screens. The Parallax Player also has built into it the capability to produce ersatz stereo from a planar still or movie image. The player handles two basic forms of digital content: still images and movies. It is assumed that all data is digital, either created by means of a photographic film process and later digitized, or directly captured or authored in digital form. In its current implementation, running on a number of Windows operating systems, the Parallax Player reads in a broad selection of contemporary file formats.
Development of the 960p stereoscopic video format
With high definition video tools still out of reach for all but the best funded projects, there exists a need and a desire to maximize the usefulness of standard definition video for the creation of stereoscopic motion pictures. Through digital signal processing, NTSC video can be used to record progressive scan images of 720 x 480 resolution. The 960p system further enhances the quality of standard definition video as a stereoscopic production format. By utilizing two full resolution standard definition video streams, 960p allows the viewer’s brain to perceive even greater resolution. Dual progressive scan DVDs are employed along with dual DLP projectors and passive sheet polarizers. The system represents a breakthrough in cost and performance for stereoscopic motion picture production.
Measurement of parallax distribution and its application to the analysis of visual comfort for stereoscopic HDTV
Yuji Nojiri, Hirokazu Yamanoue, Atsuo Hanazato, et al.
The relationship between visual comfort and parallax distribution for stereoscopic HDTV has been studied. In this study, we first examined a method for measuring this parallax distribution. As it is more important to understand the characteristics of the distribution in a frame, or temporal changes of those characteristics, than to have detailed information on the parallax at every point, we propose a method to measure the parallax based on phase correlation. It includes a way of reducing the measurement error of the phase-correlation method. The method was used to measure stereoscopic HDTV images with good results. Secondly, we conducted a subjective evaluation test of visual comfort and sense of presence using 48 different stereoscopic HDTV pictures, and compared the results with the parallax distributions in these pictures measured by the proposed method. The comparison showed that the range of the parallax distribution and the average parallax significantly affect visual comfort when viewing stereoscopic HDTV images. It is also suggested that the range of parallax distribution in many of the images that were judged comfortable to view lies within approximately 0.3 diopters.
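The phase-correlation core of such a measurement can be sketched as follows (a generic implementation for rectangular image blocks; it omits the paper's error-reduction refinements):

```python
import numpy as np

def phase_correlation_shift(a: np.ndarray, b: np.ndarray) -> tuple:
    """Estimate the dominant translation between two image blocks
    from the peak of their phase-correlation surface."""
    A = np.fft.fft2(a)
    B = np.fft.fft2(b)
    cross = A * np.conj(B)
    cross /= np.abs(cross) + 1e-12          # keep phase only
    surface = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(surface), surface.shape)
    # Wrap indices beyond the half-size point to negative shifts.
    shifts = [p - s if p > s // 2 else p for p, s in zip(peak, surface.shape)]
    return tuple(shifts)                    # (dy, dx); dx = horizontal parallax
```

Applying this block-wise over a stereo frame yields the per-region parallax values whose distribution the study analyzes.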
Stereoscopic Image Coding
Low-bandwidth stereoscopic image encoding and transmission
Julien Flack, Philip V. Harman, Simon Fox
Encoding 3D information using depth maps is quickly becoming the dominant technique for rendering high quality stereoscopic images. This paper describes how depth maps can be highly compressed and transmitted alongside 2D images with minimal additional bandwidth. The authors have previously described a rapid 2D to 3D conversion system for generating depth maps. This system, which relies on Machine Learning algorithms, effectively encodes the relationships between a 2D source image and the associated depths of objects within the image. These relationships, which are expressed in terms of the colour and position of objects, may be exploited to provide an effective compression mechanism. This paper describes the practical implementation of this technology in an integrated 2D to 3D conversion system. We demonstrate the advantages of the encoding scheme relative to other industry standard compression techniques, examining issues relating to bandwidth, decoding performance and the effect of compression artifacts on stereoscopic image quality.
Perceptual evaluation of JPEG-coded stereoscopic images
JPEG compression of the left and right components of a stereo image pair is a way to save valuable bandwidth when transmitting stereoscopic images. This paper presents results on the effects of camera-base distance and JPEG-coding on overall image quality, perceived depth, perceived sharpness and perceived eye-strain. In the experiment, two stereoscopic still scenes were used, varying in depth (three different camera-base distances: 0, 8 and 12 cm) and compression ratio (4 levels: original, 1:30, 1:40 and 1:60). All levels of compression were applied to both the left and right stereo image, resulting in a 4x4 matrix of all possible symmetric and asymmetric coding combinations. We applied the single stimulus method for subjective testing according to the ITU 500-10 recommendations. The observers were asked to assess image quality, sharpness, depth and eye-strain. Results showed that JPEG coding had a negative effect on image quality, sharpness and eye-strain but had no effect on perceived depth. An increase in camera-base distance increased perceived depth and reported eye-strain but had no effect on perceived sharpness. Furthermore, both sharpness and eye-strain correlated highly with perceived image quality.
Effect of the compression of the depth map image on depth-fused 3D image quality
Kazutake Uehira, Keiichiro Kono, Kazumi Komiya, et al.
A depth-fused 3-D (DFD) display, which is composed of two 2-D images displayed at different depths, is a recently proposed 3-D display that enables an observer to perceive an apparent 3-D image without any extra equipment. The original data for it are a 2-D image of the objects and a depth-map image of the objects. The two displayed 2-D images are formed by dividing the luminance of the original 2-D image between them according to the depth data of the objects at each pixel. This paper presents the effect of compressing the depth-map image on the DFD image. We studied still pictures using JPEG as the compression algorithm. After decoding the depth-map image, 3-D images were displayed by forming the two 2-D images. The main result obtained from subjective evaluations is that compression noise appearing in the decoded depth map shows up as depth-position errors in the DFD image; however, a higher compression rate is possible for the depth-map image than for a conventional 2-D image. This result shows that it is advantageous to transmit or store the original data before forming the two 2-D images.
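The luminance division that forms the two display planes can be sketched as below; this is a minimal illustration assuming a depth map normalized to [0, 1] (0 = rear plane, 1 = front plane) and a linear split, which is consistent with the description above but not necessarily the paper's exact rule:

```python
import numpy as np

def split_luminance(image: np.ndarray, depth: np.ndarray):
    """Divide each pixel's luminance between the front and rear display
    planes in proportion to its normalized depth.

    image: H x W luminance image (float), depth: H x W map in [0, 1].
    The fused percept appears between the planes, nearer to the plane
    that received the larger share of the luminance.
    """
    front = image * depth
    rear = image * (1.0 - depth)
    return front, rear
```

This also makes the reported artifact intuitive: JPEG noise in the depth map perturbs the split ratio, which the viewer sees as a depth-position error rather than a luminance error.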
Progressive coding of stereo images using a hybrid scheme
Torsten Palfner, Erika Mueller
In this paper, we propose a coder based on the Discrete Multiwavelet Transform, block-based disparity estimation, and interpolation. By using the Discrete Multiwavelet Transform instead of the popular Discrete Cosine Transform we avoid artifacts which appear at low bit rates in the reconstructed stereo image and which affect not only the subjective quality of each image individually, but also the depth perception. Our existing coder consists of an adapted state-of-the-art still-image coder. The correlation between the two images is exploited by disparity compensation using overlapping blocks. Since the full-search block-matching algorithm is very time-consuming, our coder has so far used an interpolation factor of two. However, to improve the disparity compensation for certain images, the coder is upgraded by increasing the interpolation factor. To prevent an immense slow-down, faster search algorithms could be implemented. These faster algorithms can reduce the search space considerably. They do not give optimal compensation results, but in conjunction with a higher interpolation factor they can still improve performance. The results published in this paper are competitive.
Human Factors I
Determinants of perceived image quality: ghosting vs. brightness
Laurie M. Wilcox, Jeffrey A. D. Stewart
The physical specifications of stereoscopic eyewear are routinely documented. However, their effects on the appearance or perceived quality of 3D images are most often evaluated superficially, if at all. Here we apply psychophysical techniques to assess the influence of ghosting and perceived brightness on judgements of image quality. To determine which of these variables has the larger impact, we simulated several levels of ghosting and brightness in a digital version of a 70mm 3D image sequence. We then presented these image sequences in a large-format 3D theatre and used a magnitude estimation task to assess image quality. The data were clear in showing a significant effect of ghosting on perceived quality but no effect of image brightness. From this we argue that image ghosting is a critical determinant of perceived image quality and should be a primary consideration in relevant technology decisions.
Improving the visual comfort of stereoscopic images
Lew B. Stelmach, Wa James Tam, Filippo Speranza, et al.
We compared the visual comfort and apparent depth of stereoscopic images for three camera configurations: parallel (without image shift), image-shifted and converged. In the parallel and image-shifted configurations, the stereo cameras were pointed straight ahead. In the converged configuration the cameras were toed-in. In the image-shifted configuration the image frame was shifted perpendicularly with respect to the line of sight of the camera. The parallel configuration produces images with uncomfortably large disparities for objects near the camera. By converging the cameras or by shifting the image, these large disparities can be reduced and visual comfort can be improved. However, the converged configuration introduces keystone distortions into the image, which can produce visual discomfort. The image-shifted configuration does not introduce keystone distortions, but affects the width of the image frame. It also requires unusual camera hardware or computer post-processing to shift the images. We found that converged and image-shifted configurations improved the visual comfort of stereoscopic images by an equivalent amount, without affecting the apparent depth. Keystone distortions in the converged configuration had no appreciable negative effect on visual comfort.
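For the parallel and image-shifted geometries, screen disparity follows from simple similar-triangle geometry; the sketch below uses the usual textbook symbols (baseline b, focal length f, image shift h), which are assumptions for illustration rather than quantities from the paper. Choosing h = f·b/Zc places the convergence plane at depth Zc without toeing in the cameras:

```python
def disparity(Z: float, b: float, f: float, Zc: float) -> float:
    """Sensor disparity (same length units as f) of a point at depth Z
    for a parallel stereo rig with baseline b whose images are shifted
    to converge at depth Zc. Zc = inf reproduces the plain parallel rig."""
    h = f * b / Zc if Zc != float("inf") else 0.0   # horizontal image shift
    return f * b / Z - h

# Example: converging at 2 m removes most of the disparity that a
# near object (0.5 m) would otherwise produce.
print(disparity(0.5, 0.065, 0.025, float("inf")))   # parallel: 3.25 mm
print(disparity(0.5, 0.065, 0.025, 2.0))            # image-shifted: ~2.44 mm
```

Because the shift is a pure translation of the image frame, it introduces none of the keystone distortion that camera toe-in does, which is the trade-off the experiment examines.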
Enhancement of stereoscopic comfort by fast control of frequency content with wavelet transform
Nicolas Lemmer, Guillaume Moreau, Philippe Fuchs
As the scope of virtual-reality applications including stereoscopic imaging becomes wider, it is clear that not every designer of a VR application considers its constraints in order to make correct use of stereo. Stereoscopic imagery, though not always required, can be a useful tool for depth perception. It is possible to limit the depth of field, as shown by Perrin, who has also studied the link between the ability to fuse stereoscopic images (stereopsis) and local disparity and spatial-frequency content. We show how this work can be extended and enhanced, especially from the computational-complexity point of view. Wavelet theory allows us to define a local spatial frequency and hence a local measure of stereoscopic comfort. This measure is based on local spatial frequency and disparity, as well as on the observations made by Wöpking. Local comfort estimation allows us to propose several filtering methods to enhance this comfort. The idea is to modify the images so that they satisfy a "stereoscopic comfort condition", defined as a threshold on the comfort measure. More technically, we seek to limit high spatial-frequency content where disparity is high, thanks to the use of fast algorithms.
Human Factors II
Evaluating accuracy and precision in a stereoscopic display: perception of 3D object motion
Stereoscopic depth is often included in the design of tele-operation or Virtual Reality (VR) systems, with the expectation that it will enhance a participant’s feeling of presence in a scene, and improve perceptual accuracy. Our aim here was to test the latter assertion: is human stereoscopic depth perception accurate? We examined how well humans can use stereoscopic information to perceive and respond to a simple object undergoing three dimensional (3-D) motion. Observers viewed a scene containing a stationary reference point and a target point that moved towards them in depth, along a range of trajectories, to the left or right of straight towards their nose. How good should performance be? Simple geometry can be used to show that the average and difference of the left and right eye’s projections can be used to estimate trajectory angles. How good is human performance? In several different tasks, results suggested that although observers could distinguish between different trajectories precisely, their accuracy of perception was very poor. Angles were perceived as up to 3-5 times wider than was physically specified. This suggests that stereoscopic depth does not provide accurate perception in simple environments and has implications for the design of 3-D Virtual Environments.
Comparison of stereoscopic and nonstereoscopic video images for visual telephone systems
Wa James Tam, Andre Vincent, Ronald Renaud, et al.
Possible differences in perceptual qualities between stereoscopic and non-stereoscopic images were investigated using a field-sequential stereoscopic display. A total of forty non-expert viewers were asked to rate overall image quality, sharpness and sense of presence using the double-stimulus continuous quality scale (ITU-R Recommendation 500). Viewers rated a set of five video sequences (stereoscopic and non-stereoscopic) each presented at four levels of image quality that were obtained by varying the quantization level (Q=0, 32, 36, and 39) of a generic H.264 video compression codec. Each sequence was 8 seconds long at 30 frames per second and the spatial resolution of each frame was common image format (CIF, 352 x 240 pixels). Image size was 15.5 cm x 11.6 cm and scene contents were representative of visual telephone systems with one, two or three individuals. The experimental results showed that viewers' ratings depended on the sequences and that there was no reliable difference between stereoscopic and non-stereoscopic sequences in terms of image quality and perceived sharpness. However, binocular disparity tended to improve ratings of sense of presence. We conclude that incorporating stereoscopic information into visual telephone systems can be useful for enhancing sense of presence.
A survey of perceptual quality issues in three-dimensional television systems
Three-dimensional television (3DTV) is often mentioned as a logical next step following high-definition television (HDTV). A high quality 3-D broadcast service is becoming increasingly feasible based on various recent technological developments combined with an enhanced understanding of 3-D perception and human factors issues surrounding 3DTV. In this paper, perceptually relevant issues, in particular stereoscopic image quality and visual comfort, in relation to 3DTV systems are reviewed. We discuss how the principles of a quantitative measure of image quality for conventional 2-D images, based on identifying underlying attributes of image quality and quantifying the perceived strengths of each attribute, can be applied in image quality research for 3DTV. In this respect, studies are reviewed that have focussed on the relationship between subjective attributes underlying stereoscopic image quality and the technical parameters that induce them (e.g. parameter choices in image acquisition, compression and display). More specifically, artifacts that may arise in 3DTV systems are addressed, such as keystone distortion, cross-talk, cardboard effect, puppet theatre effect, and blur. In conclusion, we summarize the perceptual requirements for 3DTV that can be extracted from the literature and address issues that require further investigation in order for 3DTV to be a success.
Stereoscopic Image Processing
Stereoscopic visualization and reconstruction of turbulent flames
Wen B. Ng, Yang Zhang
The 3D surface topology of turbulent diffusion flames is reconstructed and visualized using a stereoscopic methodology. The basic stereo apparatus used in the present study consists of a high-resolution digital camera and a stereo adapter mounted on the front filter ring of the camera. A pair of stereo images can therefore be formed through the same lens system and recorded simultaneously on the same charge-coupled device (CCD) chip. The digitally reconstructed 3D results have also been validated by optical stereoscopic viewing using a pair of electronic shutter glasses synchronized with the computer monitor. The results demonstrate that the technique is a very powerful diagnostic tool in combustion studies.
Artifact reduction in lenticular multiscopic 3D displays by means of anti-alias filtering
Janusz Konrad, Philippe Agniel
This paper addresses the issue of artifact visibility in automultiscopic 3-D lenticular displays. A straightforward extension of the two-view lenticular autostereoscopic principle to M views results in an M-fold loss of horizontal resolution due to the subsampling needed to properly multiplex the views. In order to circumvent the imbalance between the horizontal and vertical resolution, a tilt can be applied to the lenticules to orient them at a small angle to the vertical direction, as is done in the SynthaGram display from Stereographics Corp. In either case, to avoid aliasing the subsampling should be preceded by suitable lowpass pre-filtering. Although for purely vertical lenticules a sufficiently narrowband lowpass horizontal filtering suffices, the situation is more complicated for diagonal lenticules; the subsampling of each view is no more orthogonal, and more complex sampling models need to be considered. Based on multidimensional sampling theory, we have studied multiview sampling models based on lattices. These models approximate pixel positions on a lenticular automultiscopic display and lead to optimal anti-alias filters. In this paper, we report results for a separable approximation to non-separable 2-D anti-alias filters based on the assumption that the lenticule slant is small. We have carried out experiments on a variety of images, and different filter bandwidths. We have observed that the theoretically-optimal bandwidth is too restrictive; aliasing artifacts disappear, but some image details are lost as well. Somewhat wider bandwidths result in images with almost no aliasing and largely preserved detail. For subjectively-optimized filters, the improvements, although localized, are clear and enhance the 3-D viewing experience.
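For context, the subpixel-to-view assignment for a slanted lenticular sheet is commonly written in the van Berkel form; the SynthaGram's exact constants are proprietary, so the following is a generic sketch with illustrative parameter names:

```python
import numpy as np

def view_index(k: int, l: int, n_views: int, x_off: float,
               tan_a: float, pitch: float) -> int:
    """Generic slanted-lenticular view assignment (van Berkel form).

    k, l    : subpixel column and pixel row on the panel
    n_views : number of interleaved views
    x_off   : horizontal offset of the lenticule array, in subpixels
    tan_a   : tangent of the lenticule slant angle
    pitch   : lenticule pitch measured in subpixels
    """
    frac = ((k + x_off - 3 * l * tan_a) % pitch) / pitch
    return int(np.floor(frac * n_views))
```

It is exactly this non-orthogonal sampling pattern, swept diagonally by the slant term, that makes the optimal anti-alias filter non-separable and motivates the separable small-slant approximation studied in the paper.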
Producing anaglyphs from synthetic images
Distance learning and virtual laboratory applications have motivated the use of inexpensive visual stereo solutions for computer displays. The anaglyph method is such a solution. Several techniques have been proposed for the production of anaglyphs. We discuss three approaches: the Photoshop algorithm and its variants, the least-squares algorithm proposed by Eric Dubois that optimizes in the CIE color space, and the midpoint algorithm that minimizes the sum of the distances between the anaglyph color and the left- and right-eye colors in CIE L*a*b*. Our results show that each method has its advantages and disadvantages in faithful color representation and in stereo quality as it relates to region merging and ghosting.
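The simplest of the three, the Photoshop-style anaglyph, just reassembles channels from the two views; a minimal sketch for red-cyan glasses (left eye behind the red filter), assuming 8-bit RGB arrays:

```python
import numpy as np

def simple_anaglyph(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Photoshop-style red-cyan anaglyph: the red channel comes from
    the left view; green and blue come from the right view."""
    out = right.copy()
    out[..., 0] = left[..., 0]
    return out
```

The least-squares and midpoint methods replace this direct channel copy with a per-pixel optimization in a CIE color space, trading computation for more faithful color reproduction.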
Hardware-accelerated autostereogram rendering for interactive 3D visualization
Single Image Random Dot Stereograms (SIRDS) are an attractive way of depicting three-dimensional objects using conventional display technology. Once trained in decoupling the eyes' convergence and focusing, autostereograms of this kind are able to convey the three-dimensional impression of a scene. We present in this work an algorithm that generates SIRDS at interactive frame rates on a conventional PC. The presented system allows rotating a 3D geometry model and observing the object from arbitrary positions in real-time. Subjective tests show that the perception of a moving or rotating 3D scene presents no problem: The gaze remains focused onto the object. In contrast to conventional SIRDS algorithms, we render multiple pixels in a single step using a texture-based approach, exploiting the parallel-processing architecture of modern graphics hardware. A vertex program determines the parallax for each vertex of the geometry model, and the graphics hardware's texture unit is used to render the dot pattern. No data has to be transferred between main memory and the graphics card for generating the autostereograms, leaving CPU capacity available for other tasks. Frame rates of 25 fps are attained at a resolution of 1024x512 pixels on a standard PC using a consumer-grade nVidia GeForce4 graphics card, demonstrating the real-time capability of the system.
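A stripped-down version of the classic CPU autostereogram algorithm shows the pixel-linking idea that the texture-based GPU pipeline above parallelizes (hidden-surface handling is omitted, and the eye-separation and depth-flattening constants are illustrative):

```python
import numpy as np

E, MU = 80, 1.0 / 3.0   # eye separation in pixels, depth flattening factor

def sirds_row(depth_row: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Generate one row of a random-dot autostereogram.

    depth_row: float array in [0, 1]; 0 = far plane, 1 = near plane.
    """
    w = depth_row.size
    same = np.arange(w)                       # each pixel initially unconstrained
    for x in range(w):
        z = depth_row[x]
        s = int(round((1 - MU * z) * E / (2 - MU * z)))   # stereo separation
        left, right = x - s // 2, x - s // 2 + s
        if 0 <= left and right < w:
            same[right] = left                # constrain the pair to match
    row = np.empty(w, dtype=np.uint8)
    for x in range(w):                        # resolve constraints left to right
        row[x] = rng.integers(0, 2) * 255 if same[x] == x else row[same[x]]
    return row
```

Stacking one such row per scanline of the depth map yields the full autostereogram; in the paper's system the per-pixel separation is computed by a vertex program and the dot pattern applied by the texture unit instead.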
Adaptive disparity estimation scheme using balanced stereo image sequences
Kyung-Hoon Bae, Yong-Ok Kim, Sang-Woo Lee, et al.
In this paper, an adaptive stereo matching method using a sequence of balanced stereo image pairs is proposed. The balanced stereo image pair is acquired by applying a balance-compensation scheme to the input stereo image pair, and stereo matching is then carried out using the disparity vectors estimated from this balanced pair, in which the matching window size for disparity estimation is adaptively selected depending on the magnitude of the feature values. Because the balance compensation alleviates the problem of luminance imbalance between the input stereo image pair, the performance of the disparity estimation can be improved. Moreover, since the image-smoothing effect occurring in the balanced stereo image helps to reduce not only unreliable intensity matching but also ambiguous matching in the disparity estimation process, effective reconstruction of the stereo image can be expected as well. Experiments using the 'Piano' and 'Claude' stereo image pairs show that the proposed method improves the PSNR of the reconstructed image by up to 6.08 dB on average within search ranges of ±30, compared with conventional algorithms.
Synthesizing stereo 3D views from focus cues in monoscopic 2D images
In this paper we propose a monoscopic 2D to stereoscopic 3D conversion system. Producing stereo 3D from 2D images requires estimating a relative depth map of the objects in the image, which encodes the real-world 3D geometry of the scene initially captured. Subsequently, we map the estimated depth into two perspective image views, left and right, with an artificially synthesized parallax between them. We present a depth estimation method based on measuring focus cues, consisting of a local spatial-frequency measurement using multiresolution wavelet analysis and a Lipschitz regularity estimation of significant edges, resulting in a pixel-resolution depth map. Based on this relative depth map, the stereo 3D image is synthesized with a method that uses interpolated image row sections to artificially generate parallax in the left and right perspective views, and thus, when viewed with a stereo 3D display system, induce a sense of stereopsis in the observer.
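The parallax-synthesis step can be sketched as a depth-driven horizontal pixel shift; this is a simplified forward-mapping illustration with a crude hole fill (the paper's interpolated row sections are more elaborate), and the parameter names are assumptions:

```python
import numpy as np

def synthesize_view(image: np.ndarray, depth: np.ndarray, max_px: int) -> np.ndarray:
    """Warp a 2-D image into one perspective view by shifting each pixel
    horizontally in proportion to its estimated depth.

    image: H x W x 3 array, depth: H x W in [0, 1], max_px: largest shift.
    Use +max_px for one eye and -max_px for the other.
    """
    h, w = depth.shape
    out = np.zeros_like(image)
    shift = np.rint(depth * max_px).astype(int)
    for y in range(h):
        xs = np.clip(np.arange(w) + shift[y], 0, w - 1)
        out[y, xs] = image[y]                 # forward map
        for x in range(1, w):                 # crude hole fill:
            if not out[y, x].any():           # propagate the previous pixel
                out[y, x] = out[y, x - 1]
    return out
```

Rendering the pair as `synthesize_view(img, d, +s)` and `synthesize_view(img, d, -s)` gives left and right views with a parallax budget of 2s pixels.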
Poster Pop Session
Parallel-axis stereoscopic camera with vergence control and multiplexing functions
Gwangsoon Lee, Namho Hur, Chung-Hyun Ahn, et al.
Among the 3D stereoscopic cameras used to acquire stereo views, the parallel-axis stereo camera is considered the simplest binocular stereo camera. However, it cannot control vergence, since its left and right imaging sensors are fixed. To overcome this limitation, we propose a parallel-axis stereoscopic camera with simultaneous vergence-control and video-multiplexing functions, which can be implemented by simple, real-time processing without image deterioration. In this paper, we simulate the effects of vergence control according to the proposed method, which is accomplished by over-sampling at the ADC and by the disparity extracted with the help of the multiplexing function. It is confirmed that the stereoscopic images processed by the proposed parallel-axis stereoscopic camera (PASC) are very comfortable to view on a 3D display within a limited disparity range.
Human Factors I
How crosstalk affects stereopsis in stereoscopic displays
KuoChung Huang, Jy-Chyi Yuan, Chao-Hsu Tsai, et al.
The ghost-image issue induced by crosstalk in stereoscopic, and especially autostereoscopic, display systems has been believed to be a major factor jeopardizing stereopsis. Nevertheless, it is found that in some cases stereopsis remains effective even with serious crosstalk. In fact, many other factors, such as contrast ratio, disparity, and monocular cues in the images, play important roles in the fusion of stereo images. In this paper, we study the factors in an image that may affect stereo fusion, and provide a macroscopic point of view from which to derive a reasonable criterion for system crosstalk. Both natural and computer-generated images are used for detailed evaluation. Image processing techniques are adopted to produce the desired characteristics. The results of this research should be of reference value to content makers for stereoscopic displays, in addition to their designers.
Examination of a stereoscopic 3D display system using a correction lens
This paper describes an examination of a stereoscopic 3-D display system using a correction lens. The purpose of the system is to reduce the difference between accommodation and convergence when viewing stereoscopic 3-D images, using a simple technique. The correction lens is a mono-focal lens added to the polarized filter glasses. In this study, the authors carried out three experiments to examine the appropriate conditions of use and the effects of the correction lens. In experiment 1, the refractive power of the correction lens was examined under six conditions in which the distances of accommodation and convergence were theoretically equal. In experiment 2, the presentation condition of stereoscopic 3-D images suitable for the correction lens was examined by measuring refraction while viewing a visual target that moved in the depth direction. In experiment 3, the effectiveness of the correction lens was examined using the conditions obtained in experiments 1 and 2. From the results of the experiments, the following conclusions were drawn. (1) Correction lenses shift the accommodation distance. (2) Using a correction lens with the appropriate refractive power, and setting appropriate conditions for presenting stereoscopic 3-D images, reduced the difference between accommodation and convergence. (3) The use of a correction lens affected the subjective symptoms of asthenopia.
Poster Pop Session
Pioneering block-based stereo image CODEC in wavelet domain
Eran Anusha Edirisinghe, M. Yunus Nayan, Helmut E. Bez
In this paper we propose a wavelet-domain implementation of our original pioneering block-based stereo image compression algorithm and compare its performance with traditional DCT-based and state-of-the-art DWT-based stereo image compression algorithms. Due to the special requirements of the pioneering block-based CODEC and the properties of DWT-based multi-resolution decomposition, the implementation of the original algorithm in the wavelet domain is not straightforward and thus yields significantly novel knowledge and understanding. Experiments were performed on a set of eight stereo image pairs representing natural, synthetic, indoor and outdoor images. We show that for the same bit rates, objective quality gains of up to 5 dB (PSNR) are obtained compared to the benchmark algorithms. One significant property of the proposed CODEC is its ability to produce reconstructed right images of up to 25 dB at right-image bit rates as low as 0.1 bpp. Significant gains in subjective image quality are also obtained compared to the benchmark methods.
Large-scale projection using integral imaging techniques
Rohit Kotecha, Malcolm McCormick, Neil A. Davies
Currently, several 3D stereoscopic projection systems exist in which the audience is required to wear some sort of visual aid. A number of research groups are investigating autostereoscopic displays, as these systems are more acceptable to the casual observer because they require no additional visual aids. Most autostereoscopic projection displays are stereoscopic or employ multiview techniques. Both approaches are limited in their ability to present realistic 3D images with natural viewing attributes. The paper reports the results of experiments carried out to evaluate large-scale 3D integral projection. Two projection arrangements (single-lens uncorrected optics and double-lens corrected optics) are reported and their advantages and disadvantages described. Live-capture and computer-generated integral images have been projected back to "life-size." The observer can interact with the 3D image by reaching into the presented volumetric image space.
Automatic control of parallel stereoscopic camera by disparity compensation
Ki-Chul Kwon, Young-Soo Choi, Nam Kim, et al.
A parallel stereoscopic camera has a linear relationship between vergence and focus control. We introduce an automatic control method for a stereoscopic camera system that uses this relationship. The method uses disparity compensation of the image pair acquired from the stereoscopic camera. For faster extraction of disparity information, a binocular disparity estimation method based on a one-dimensional cepstral-filter algorithm is investigated. The suggested system substantially reduces the control time and error ratio, making it possible to achieve natural and clear images.
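Cepstral disparity estimation exploits the fact that a stereo pair placed side by side behaves like a signal plus a shifted echo, which appears as a peak in the cepstrum at the echo lag. A 1-D sketch along matching scanlines (window size and search range are illustrative assumptions, not the paper's parameters):

```python
import numpy as np

def cepstral_disparity(left_row: np.ndarray, right_row: np.ndarray,
                       max_d: int) -> int:
    """Estimate horizontal disparity between two scanlines via the
    power cepstrum of their side-by-side concatenation."""
    w = left_row.size
    sig = np.concatenate([left_row, right_row]).astype(float)
    spectrum = np.abs(np.fft.fft(sig)) ** 2
    cepstrum = np.abs(np.fft.ifft(np.log(spectrum + 1e-12)))
    # The echo peak sits at lag w + disparity (w = scanline width),
    # so search the window of lags just beyond w.
    return int(np.argmax(cepstrum[w:w + max_d + 1]))
```

A single FFT pair per window makes this considerably cheaper than exhaustive block matching, which is the speed advantage the abstract refers to.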
Techniques and Applications
New visibility computing algorithm for three-dimensional indoor walkthroughs
At present, Cybercity has introduced the visualization of 3D buildings, and further development necessarily includes various applications of 3D scenes, from outdoor to indoor. Nevertheless, as a large furnished 3D architectural model is usually made up of millions of polygons, ideal frame rates for smooth interactive walkthroughs are hardly attainable. Visibility processing to compute the potentially visible set of polygons (PVS) is of great importance in improving performance for interactive indoor walkthroughs. A novel algorithm for constructing room-to-room and view-to-room PVSs for various architectural structures, which may be concave, non-axis-aligned, etc., is proposed in this paper. Tests show it can drastically improve the performance of real-time interactive walkthroughs in a large furnished architectural 3D model.
Toward enhanced data consistency in distributed virtual environments
Distributed virtual environments are rapidly gaining in popularity for the implementation of intuitive and collaborative workspaces. In distributed virtual environments, geographically dispersed user sites possess considerable capabilities for computing and for cooperation with other user sites. The primary challenges that have to be addressed by these systems are compensating for network latency jitter, keeping system-wide data consistent, and enabling fair resource sharing and interaction between the users. This paper reviews a global-timestamp-based approach, developed by the authors, to enhance fairness and consistency across distributed virtual environments. The approach is described in combination with three different implementation philosophies: a centralized approach similar to the client-server model, a decentralized approach similar to the peer-to-peer model, and a combined approach consisting of hierarchical layers of the centralized and decentralized approaches. Based on a new object-oriented real-time programming methodology called the time-triggered message-triggered object (TMO) programming scheme, two different implementations were tested and compared.
Virtual immersive review for car design
Damien Paillot, Fred Merienne, Marc Neveu, et al.
This paper proposes a method for linking CAD models to an immersive virtual environment. CAD models cannot be viewed directly in a real-time visualization environment; they must be adapted for immersive viewing with high-quality rendering. The proposed method enables design review in applications requiring high-quality visualization of complex scenes in an immersive virtual environment. Our application is dedicated to an immersive room called the MoVE (Mobile Virtual Environment). This display places the user inside the virtual world, a position that brings the user's peripheral vision into play.
Video-based Image Techniques and Emerging Work
Virtual reality applied to teletesting
Thomas J.T.P. van den Berg, Roland J. M. Smeenk, Alain Mazy, et al.
The activity "Virtual Reality applied to Teletesting" is related to a wider European Space Agency (ESA) initiative of cost reduction, in particular the reduction of test costs. Reduction of costs of space related projects have to be performed on test centre operating costs and customer company costs. This can accomplished by increasing the automation and remote testing ("teletesting") capabilities of the test centre. Main problems related to teletesting are a lack of situational awareness and the separation of control over the test environment. The objective of the activity is to evaluate the use of distributed computing and Virtual Reality technology to support the teletesting of a payload under vacuum conditions, and to provide a unified man-machine interface for the monitoring and control of payload, vacuum chamber and robotics equipment. The activity includes the development and testing of a "Virtual Reality Teletesting System" (VRTS). The VRTS is deployed at one of the ESA certified test centres to perform an evaluation and test campaign using a real payload. The VRTS is entirely written in the Java programming language, using the J2EE application model. The Graphical User Interface runs as an applet in a Web browser, enabling easy access from virtually any place.
Focused Research
Three-dimensional techniques for capturing and building virtual models of complex objects for use in scientific and industrial applications, data archiving, and the entertainment industry
The past 10 years have seen remarkable improvements in the capture of 3-dimensional data. Both scanning speeds and accuracy have increased by an order of magnitude. Software and increasingly powerful computers allow larger databases and faster post-processing. CT, laser, and optical scanners are finding increased use in the medical, manufacturing, scientific, and entertainment industries. CT (computerized tomography) is generally used to capture internal as well as external surfaces. Medical (hospital) scanners are the most common and can be of service in industrial applications, but true industrial scanners serve a much wider range of sizes and materials. Laser and optical scanners are line-of-sight devices, available in portable and permanent CMM mounting arrangements. Scanners exist to capture a wide range of objects, from entire buildings to fingernail-sized parts. For solid objects requiring multiple scans, each scan must be registered to the others to complete the part. The collected data are exported as a “point cloud” and can be used to digitally inspect complex parts, surface them for tooling and reverse engineering, or export surfaces to animation software.
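As a small illustration of multi-scan registration, the sketch below merges several scans into one point cloud given a rigid transform per scan. Estimating those transforms (for example, by iterative closest point over the overlap regions) is the hard part and is not shown; all names are illustrative.

import numpy as np

def merge_scans(scans, poses):
    # scans: list of (N_i, 3) arrays in scanner coordinates.
    # poses: list of (R, t), with R a 3x3 rotation and t a length-3
    #        translation mapping each scan into the common world frame.
    clouds = [pts @ R.T + t for pts, (R, t) in zip(scans, poses)]
    return np.vstack(clouds)   # single registered point cloud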
Studying extinct animals using three-dimensional visualization, scanning, animation, and prototyping
Technology provides an important means for studying the biology of extinct animals. Skeletons of these species must be reconstructed virtually by scanning individual bones and building a virtual model of each. These models are then used to produce physical prototypes of the bones at varying scales, allowing construction of a starter skeleton configuration and analysis of movement at each joint. The individual virtual bones are then assembled into a starter virtual skeleton, using landmark points digitized on the physical starter skeleton to place them in three-dimensional space. This virtual skeleton is then refined by analyzing the movement at each joint using the prototype bones. Once this is done, the movement is constrained further by animating the whole skeleton and noting areas of impossible overlap between bones or unreasonable movement. The problems are corrected and new animations attempted until the movement is perfected. This provides a means for understanding locomotion and mastication in these extinct animals.
Wearable augmented reality system using an IrDA device and a passometer
Ryuhei Tenmoku, Masayuki Kanbara, Naokazu Yokoya
This paper describes a wearable augmented reality system with an IrDA device and a passometer. To realize augmented reality systems, the position and orientation of the user's viewpoint must be obtained in real time to align the real and virtual coordinate systems. In the proposed system, the orientation of the user's viewpoint is measured by an inertial sensor attached to the user's glasses, and the position is measured using an IrDA device and a passometer. First, the user's position is specified exactly when the user comes into the infrared range of IrDA markers set up at appointed points. When the user leaves the infrared range, the user's position is estimated using the passometer, which consists of an electronic compass and acceleration sensors: the former detects the user's walking direction, and the latter counts the user's steps. These data, together with the user's pace, make it possible to estimate the user's position in the neighborhood of the IrDA markers. We have developed a navigation system based on these techniques and have proven the feasibility of the system with experiments.
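The position update described above amounts to simple dead reckoning. A minimal Python sketch, with an assumed fixed pace and compass convention (0 degrees = north, angles increasing clockwise), might look like this; the paper calibrates pace per user, so the default here is purely illustrative.

import math

def estimate_position(last_fix, step_count, heading_deg, pace_m=0.7):
    # Dead reckoning from the last exact IrDA fix: distance walked is
    # step count times pace, along the compass heading.
    x, y = last_fix                       # metres east, metres north
    d = step_count * pace_m
    theta = math.radians(heading_deg)
    return (x + d * math.sin(theta), y + d * math.cos(theta))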
Vision-based registration for augmented reality system using monocular and binocular vision
Steve Vallerand, Masayuki Kanbara, Naokazu Yokoya
In vision-based augmented reality systems, the relation between the real and virtual worlds must be estimated in order to register virtual objects. This paper proposes a vision-based registration method for video see-through augmented reality systems using binocular cameras, which increases the quality of the registration performed from three points of a known marker. The originality of this work is the combined use of monocular and stereoscopic vision-based techniques to complete the registration. A method that corrects the 2D image positions of the marker points is also proposed; the correction improves the stability and accuracy of the registration. The stability of the registration obtained with the proposed method, with and without the correction, is compared to that of standard stereoscopic registration.
Calibration method for an omnidirectional multicamera system
Sei Ikeda, Tomokazu Sato, Naokazu Yokoya
Telepresence systems using an omnidirectional image sensor enable us to experience a remote site. An omnidirectional multi-camera system is more useful than a monocular camera system for acquiring outdoor scenes, because it can easily capture high-resolution omnidirectional images. However, exact calibration of the camera system is necessary to virtualize the real world accurately. In this paper, we describe geometric and photometric camera calibration and a panoramic movie generation method for an omnidirectional multi-camera system. In the geometric calibration, the intrinsic and extrinsic parameters of each camera are estimated using a calibration board and a laser measurement system called a total station. In the photometric calibration, limb darkening and the color balance among the cameras are corrected. The results of the calibration are used in panoramic movie generation. In experiments, we calibrated the multi-camera system and generated spherical panoramic movies using the estimated camera parameters. A telepresence system was prototyped to confirm that the panoramic movies are well suited to telepresence. In addition, we evaluated the discontinuity in the generated panoramic images.
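As an example of the photometric step, the sketch below divides out an assumed polynomial radial-falloff (limb-darkening) model. The paper fits each camera's actual falloff during calibration, so the model form and coefficients here are purely illustrative.

import numpy as np

def correct_limb_darkening(img, k1, k2):
    # Divide out a radial gain model gain(r) = 1 + k1*r^2 + k2*r^4,
    # with r normalised so r = 1 at the image corners. img is (H, W, 3).
    h, w = img.shape[:2]
    y, x = np.mgrid[0:h, 0:w]
    r2 = ((x - w / 2) ** 2 + (y - h / 2) ** 2) / ((w / 2) ** 2 + (h / 2) ** 2)
    gain = 1.0 + k1 * r2 + k2 * r2 ** 2
    return np.clip(img / gain[..., None], 0, 255).astype(img.dtype)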
Augmented Reality
Onboard camera pose estimation in augmented reality space for direct visual navigation
Zhencheng Hu, Keiichi Uchimura
This paper presents a dynamic solution to the registration problem for on-road navigation applications via a 3D-2D parameterized model-matching algorithm. Traditional algorithms for estimating a camera's three-dimensional (3D) position and pose employ fixed models of known structure, together with depth information, to obtain 3D-2D correspondences; such models are unavailable for on-road navigation, since there are no fixed models in a general road scene. Exploiting the constraints of road structure and on-road navigation features, this paper presents a road-shape modeling algorithm based on a 2D digital road map. Dynamically generated multi-lane road-shape models are matched against the real road scene to estimate the camera's 3D position and pose. Our algorithm thus simplifies the 3D-2D correspondence problem to 2D-2D road-model matching on the projective image. The algorithms proposed in this paper are validated with experimental results from real road tests under different conditions and road types.
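The core step reduces to projecting candidate road-shape model points into the image under a hypothesized camera pose and scoring the fit against detected lane features. The following pin-hole projection sketch shows the shape of that computation; the function names and the per-pixel lane-likelihood map are illustrative assumptions, not the paper's implementation.

import numpy as np

def project_points(pts_w, K, R, t):
    # Pin-hole projection: u = K (R X + t), then divide by depth.
    cam = R @ pts_w.T + t[:, None]       # world -> camera coordinates
    uv = K @ cam
    return (uv[:2] / uv[2]).T            # (N, 2) pixel coordinates

def pose_score(pts_w, K, R, t, lane_map):
    # Sum lane-feature evidence under the projected road model;
    # lane_map is a 2-D array of per-pixel lane likelihoods.
    uv = np.round(project_points(pts_w, K, R, t)).astype(int)
    h, w = lane_map.shape
    ok = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return lane_map[uv[ok, 1], uv[ok, 0]].sum()

Pose estimation then amounts to searching over (R, t) for the highest-scoring projection of the dynamically generated road model.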
Flexible augmented reality architecture applied to environmental management
Nuno Manuel Rob Correia, Teresa Romao, Carlos Santos, et al.
Environmental management often requires in loco observation of the area under analysis. Augmented reality (AR) technologies allow real-time superimposition of synthetic objects on real images, providing augmented knowledge about the surrounding world. Users of an AR system can visualize the real surrounding world together with additional data generated contextually in real time. The work reported in this paper was done in the scope of ANTS (Augmented Environments), an AR project that explores the development of an augmented reality technological infrastructure for environmental management. This paper presents the architecture and the most relevant modules of ANTS. The system's architecture follows the client-server model and is based on several independent but functionally interdependent modules. Its flexible design allows some modules to be moved to and from the client side, according to the processing capacity of the client device and the application's requirements. It combines several techniques to identify the user's position and orientation, allowing the system to adapt to the particular characteristics of each environment. Determining the data associated with a given location involves both a 3D model of the location and a multimedia geo-referenced database.
Real-time 3D hand tracking in a virtual environment
The development of a reliable untethered interactive virtual environment has long been a goal of the VR community. Several nonmagnetic tracking systems have been developed in recent years based on optical, acoustic, and mechanical solutions. However, an inexpensive, effective, and unobtrusive tracking solution remains elusive. This paper presents a camera-based three-dimensional hand-tracking system implemented in the PARIS augmented reality environment and used to drive a demonstration application.
Hierarchical depth estimation for image synthesis in mixed reality
Mixed reality differs from virtual reality in that the user feels immersed in a space composed not only of virtual but also of real objects. It is therefore essential to realize seamless integration and mutual occlusion of the virtual and real worlds, which requires depth information about the real scene. We propose a depth estimation algorithm with sharp object boundaries for mixed reality systems, based on hierarchical disparity estimation. Initial disparity vectors are obtained from downsampled stereo images using a region-dividing disparity estimation technique. The background region is then detected and flattened. Starting from these initial vectors, dense disparities are estimated and regularized with a shape-adaptive window in the full-resolution images. Finally, depth values are calculated from stereo geometry and the camera parameters. Virtual objects can then be mixed into the image of the real world by comparing the calculated depth values with the depth of the generated virtual objects. Experimental results show that occlusion between the virtual and real objects is correctly established, with sharp boundaries in the synthesized images, so that the user observes the mixed scene with a considerably natural sensation.
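The final geometric step is the standard rectified-stereo relation Z = f·B/d (focal length times baseline over disparity). A one-line Python helper, with illustrative names, makes the conversion explicit:

def depth_from_disparity(d_pixels, focal_px, baseline_m):
    # Rectified parallel stereo: Z = f * B / d.
    if d_pixels <= 0:
        return float('inf')       # zero disparity corresponds to infinity
    return focal_px * baseline_m / d_pixels

Occlusion resolution then reduces to a per-pixel comparison of this Z against the virtual object's depth buffer.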
Video-based Image Techniques and Emerging Work
Experimental system of free viewpoint television
In this paper, we propose a new real-time dynamic ray-data acquisition and rendering system named “Free Viewpoint Television” (FTV). With this system, the user can freely control the viewpoint position in any dynamic real-world scene in real time. The basic idea is based on the ray-space method, in which an arbitrary photo-realistic view can be generated from a collection of real view images. Since the system is aimed at real-time operation, the collection of images is obtained from an array of cameras, and the missing ray information is generated by interpolating data between cameras. The prototype system includes 16 CCD cameras forming a camera array. The interpolation is based on an adaptive-filtering ray-space data interpolation technique. Between each pair of cameras, up to 15 interpolated views can be generated to ensure that no aliasing occurs. The system runs entirely on consumer-class hardware, and the results achieved are good in terms of both image quality and rendering speed.
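To show where the up-to-15 intermediate views per camera pair come from, the sketch below generates an in-between view by blending the two nearest camera images. This plain cross-fade is a deliberate simplification: the paper's adaptive-filter ray-space interpolation also aligns corresponding rays before blending, which is omitted here.

import numpy as np

def interpolate_view(img_a, img_b, alpha):
    # Cross-fade between the two nearest camera images; alpha in [0, 1]
    # is the fractional position between the cameras.
    out = (1.0 - alpha) * img_a.astype(float) + alpha * img_b.astype(float)
    return out.astype(img_a.dtype)

# e.g. the 15 in-between views for one camera pair: alpha = k/16, k = 1..15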
Techniques and Applications
INPRES (intraoperative presentation of surgical planning and simulation results): augmented reality for craniofacial surgery
Tobias Salb, Jakob Brief, Thomas Welzel, et al.
In this paper we present recent developments and pre-clinical validation results of our approach to augmented reality (AR, for short) in craniofacial surgery. A commercial Sony Glasstron display is used for optical see-through overlay of surgical planning and simulation results on a patient inside the operating room (OR). For tracking the glasses, the patient, and various medical instruments, an NDI Polaris system is used as the standard solution. A complementary inside-out navigation approach has been realized with a panoramic camera mounted on the surgeon's head to track fiducials placed on the walls of the OR. Further tasks described include calibration of the head-mounted display (HMD), registration of virtual objects with the real world, and detection of occlusions in the object overlay with the help of two miniature CCD cameras. The evaluation of our work took place in a laboratory environment and showed promising results. Future work will concentrate on optimizing the technical features of the prototype and on developing a system for everyday clinical use.
Video-based Image Techniques and Emerging Work
Depth keying
Ronen Gvili, Amir Kaplan, Eyal Ofek, et al.
We present a new solution to the known problem of video keying in a natural environment. We segment foreground from background objects using their relative distance from the camera, which makes it possible to dispense with color-based keying. To do so, we developed and built a novel depth video camera capable of producing RGB and D signals, where D stands for the distance to each pixel. The new RGBD camera enables a whole new gallery of effects and applications, such as multi-layer background substitution. This modality makes possible the production of real-time mixed-reality video, as well as post-production manipulation of recorded video. We address the problem of color spill, in which the color of the foreground object mixes along its boundary with the background color. This problem prevents accurate separation of the foreground object from its background and is most visible when compositing the foreground objects onto a new background. Most existing techniques are limited to a constant background color; we offer a novel general approach that enables the use of the natural background, based upon the D channel generated by the camera.
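In its simplest form, depth keying replaces the chroma test of a color keyer with a threshold on the D channel. A minimal numpy sketch of a hard matte, with illustrative names, is shown below; a production matte would soften the threshold edge, which is exactly where the color-spill problem discussed above appears.

import numpy as np

def depth_key(rgb_fg, depth, rgb_bg, z_key_m):
    # Binary matte from the D channel: pixels nearer than z_key_m keep
    # the live foreground, everything else takes the new background.
    matte = (depth < z_key_m)[..., None]       # (H, W, 1) boolean
    return np.where(matte, rgb_fg, rgb_bg)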
Interaction devices for hands-on desktop design
Wendy Ju, Sally Madsen, Jonathan Fiene, et al.
Starting with a list of typical hand actions, such as touching or twisting, a collection of physical input-device prototypes was created to study better ways of engaging the body and mind in the computer-aided design process. These devices were interchangeably coupled with a graphics system to allow rapid exploration of the interplay between the designer's intent, body motions, and the resulting on-screen design. User testing showed that a number of key considerations should influence the future development of such devices: coupling between the physical and virtual worlds, tactile feedback, and scale. It is hoped that these explorations contribute to the greater goal of creating user-interface devices that increase the fluency, productivity, and joy of computer-augmented design.