Proceedings Volume 5664

Stereoscopic Displays and Virtual Reality Systems XII


View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 22 March 2005
Contents: 18 Sessions, 70 Papers, 0 Presentations
Conference: Electronic Imaging 2005
Volume Number: 5664

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Convergence Accommodation Issues
  • Human Factors
  • Stereoscopic Image Processing
  • Autostereoscopic Displays
  • 2D to 3D Conversion
  • Stereoscopic Video
  • Stereoscopic Developments
  • Depth Mapping
  • Volumetric 3D Displays
  • Integral 3D Displays
  • Telemanipulator and Telepresence Technologies
  • Stereoscopic Display Applications
  • Systems I
  • Mixed Realities
  • Systems II
  • Systems III
  • Virtual Reality Works: Demonstration and Panel Discussion
  • Poster Session
Convergence Accommodation Issues
Stereoscopic 3D display with dynamic optical correction for recovering from asthenopia
Takashi Shibata, Takashi Kawai, Masaki Otsuki, et al.
The purpose of this study was to consider a practical application of a newly developed stereoscopic 3-D display that solves the problem of discrepancy between accommodation and convergence. The display uses dynamic optical correction to reduce the discrepancy, and can present images as if they are actually remote objects. The authors thought the display may assist in recovery from asthenopia, which is often caused when the eyes focus on a nearby object for a long time, such as in VDT (Visual Display Terminal) work. In general, recovery from asthenopia, and especially accommodative asthenopia, is achieved by focusing on distant objects. In order to verify this hypothesis, the authors performed visual acuity tests using Landolt rings before and after presenting stereoscopic 3-D images, and evaluated the degree of recovery from asthenopia. The experiment led to three main conclusions: (1) Visual acuity rose after viewing stereoscopic 3-D images on the developed display. (2) Recovery from asthenopia was particularly effective for the dominant eye in comparison with the other eye. (3) Interviews with the subjects indicated that the Landolt rings were particularly clear after viewing the stereoscopic 3-D images.
Creating a comfortable stereoscopic viewing experience: effects of viewing distance and field of view on fusional range
Elaine W. Jin, Michael E. Miller, Serguei Endrikhovski, et al.
In stereoscopic display systems, there is always a balance between creating a “wow factor,” using large horizontal disparities, and providing a comfortable viewing environment for the user. In this paper, we explore the range of horizontal disparities, which can be fused by a human observer, as a function of the viewing distance and the field of view of the display. Two studies were conducted to evaluate the performance of human observers in a stereoscopic viewing environment. The viewing distance was varied in the first study using a CRT with shutter glasses. The second study employed a large field-of-view display with infinity focus, and the simulated field of view was varied. The recorded responses included fusion/no fusion, fusion time, and degree of convergence. The results show that viewing distance has a small impact on the angular fusional range. In contrast, the field of view has a much stronger impact on the angular fusional range. A link between the degree of convergence and the fusional range is demonstrated. This link suggests that the capability of the human observer to perform eye vergence movements to achieve stereoscopic fusion may be the limiting factor in fusing large horizontal disparities presented in stereoscopic displays.
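As a worked illustration of why the comparison above is made in angular rather than screen units, the short Python sketch below converts an on-screen horizontal disparity into the vergence-angle difference it induces at a given viewing distance; the 65 mm interocular distance is an assumed typical value, not a figure from the paper.

    import math

    def angular_disparity(screen_disparity_m, viewing_distance_m, ipd_m=0.065):
        """Vergence-angle difference (radians) between fixating the screen plane
        and fixating the virtual point created by an uncrossed screen disparity."""
        # Vergence angle when both eyes fixate a point on the screen plane.
        vergence_screen = 2.0 * math.atan(ipd_m / (2.0 * viewing_distance_m))
        # Virtual point distance for an uncrossed disparity d (valid for d < IPD):
        # by similar triangles, Z = IPD * D / (IPD - d).
        z = ipd_m * viewing_distance_m / (ipd_m - screen_disparity_m)
        vergence_point = 2.0 * math.atan(ipd_m / (2.0 * z))
        return vergence_screen - vergence_point

    # Example: a 10 mm uncrossed screen disparity seen from 1 m is roughly 0.57 degrees.
    print(math.degrees(angular_disparity(0.010, 1.0)))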
A fixed-viewpoint volumetric stereoscopic 3D display using adaptive optics
The design, implementation, and preliminary evaluation of a volumetric stereoscopic 3D display are discussed. Pixels are rendered from different ranges of distance, or depth fields, in the 3D scene and are displayed field-sequentially. An adaptive optics element is used to modulate wavefront curvature for each field such that its optical distance matches its depth in the 3D scene. This allows the observer to accommodate (focus) to various depths in the scene in the same way as they do under natural viewing conditions. The enabling of appropriate accommodation is particularly useful in stereoscopic 3D displays, which are prone to the problem of accommodation-convergence conflict, hypothesised to be a leading cause of visual discomfort. The system has been implemented in a binocular design, i.e. fixed-viewpoint rather than autostereoscopic, using commercially available liquid crystal microdisplays and deformable mirror adaptive optics components.
Natural 3D display with 128 directional images used for human-engineering evaluation
A natural 3D display is required for next-generation display techniques. We have already demonstrated that accommodation responses are evoked when a large number of directional images, which are orthographic projections of 3D scenes, are projected into the corresponding directions with a very small angle pitch. We have also constructed a three-dimensional display which displayed 64 directional images into different horizontal directions with a horizontal angle pitch of 0.34°. With high-density directional images, not parallax images (perspective projections of 3D scenes), very smooth motion parallax is obtained in addition to the accommodation evocation. In this study we develop a new 3D display which is used to investigate natural 3D display conditions on commission from the Japanese Ministry of Internal Affairs and Communications. It can display 128 directional images simultaneously by using 128 small LCD panels. The horizontal viewing angle is 29.6° and the horizontal display angle pitch is 0.23°. The screen size is 13.2 inches. The 3D display is controlled by a PC cluster consisting of 16 PCs. We are also developing equipment which can measure the dynamic responses of the accommodation, the vergence, and the pupil diameters of both eyes. This equipment will be used to investigate the optimum 3D display conditions.
Predicting individual fusional range from optometric data
Serguei Endrikhovski, Elaine Jin, Michael E. Miller, et al.
A model was developed to predict the range of disparities that can be fused by an observer from optometric measurements. This model uses parameters, such as dissociated phoria and fusional reserves, to predict an individual’s fusional range (i.e., the disparities that can be fused on stereoscopic displays) when the user views a stereoscopic stimulus from various distances. This model is validated by comparing its output with data from a previous study in which the individual fusional range of a group of observers was quantified while they viewed a stereoscopic display from distances of 0.5, 1.0, and 2.0 meters. Overall, the model provides good data predictions for the majority of the participants and can be generalized for other viewing conditions. The model may, therefore, be used within a customized stereoscopic system, which would render stereoscopic information in a way that accounts for the individual differences in fusional range. Because the comfort of an individual user also depends on the user’s ability to fuse stereo images, such a system is described that may, consequently, improve the comfort level and viewing experience for people with different stereoscopic fusional capabilities.
Human Factors
Stereo-foveation for anaglyph imaging
Arzu Coltekin
For 1:1 displays and network visualization in stereo imaging, we suggest that foveation is a feasible and efficient compression method which also gives a good basis for Level of Detail (LOD) control. Highest resolution in the area(s) of interest is a generally desirable feature, and it is particularly important for photogrammetric 3D modelling, where high precision may be required depending on the project. Particularly for 1:1 stereo-viewing on large screen displays such as panoramic screens or caves, the actual area of interest is much smaller than the whole screen. Instead of loading the whole image pair, we foveate both images and project them after the stereo-foveation. This gives us the best possible resolution in the area of interest while retaining a good overview of the neighbouring areas for navigating and locating other areas of interest, much as the human eyes do across the non-uniform 3D image. We test the idea with an anaglyph pair and create a hybrid model by combining algorithms for anaglyph imaging, disparity maps, and foveation, and we create an LOD function for resolution control along the z axis.
Perceived smoothness of viewpoint transition in multi-viewpoint stereoscopic displays
Filippo Speranza, Wa James Tam, Taali Martin, et al.
In this study, we conducted three experiments to investigate the perceived smoothness of multiview images. Different viewpoints of a stereoscopic scene were generated in real-time. The left-eye and right-eye views of each viewpoint were viewed stereoscopically, from a distance of 120 cm, with shutter glasses synchronized to the display. In Experiment 1, new and different vantage points of the scene were displayed as the viewer moved his/her head left and right in front of the display. Viewers rated the perceived smoothness of the scene for different viewpoint densities, i.e., number of viewpoints displayed per unit of amplitude of lateral movement, and extent of look-around, i.e., angular separation between the leftmost and rightmost rendered viewpoints. The second and third experiments were similar with the exception that the change in displayed viewpoint was either controlled by the viewer’s hand (Experiment 2) or occurred without any intervention on the part of the viewer (Experiment 3). Perceived smoothness improved with increasing viewpoint density up to about 4-6 views per cm in all three experiments. Smoothness ratings were somewhat lower in Experiments 1 and 2 than in 3. The perceived smoothness of viewpoint transition was affected by the extent of look-around in Experiments 1 and 2 only.
Stereoscopic Image Processing
Camera system for arbitrary viewpoint 3D display system
Hideya Takahashi, Yoshinori Nakano, Kenji Yamada
We have developed an arbitrary viewpoint 3D display system. This system consists of a multiview camera system for 3D data capture and a real-time color autostereoscopic 3D display system. The multiview camera system is based on the image-based rendering technique. To reconstruct a 3D image of a real 3D object, this 3D display system requires many multiview images, which makes a multiview camera system difficult to implement. In order to overcome this problem, we propose an interpolation method that decreases the number of required multiview images. This paper describes the multiview camera system that obtains the ray data of real 3D objects for the arbitrary viewpoint 3D display system. The camera system consists of 9 CMOS cameras whose pixels are randomly accessible. Thus, the ray data needed to reconstruct 3D images of captured real 3D objects can be synthesized from the 9 CMOS cameras directly. Each ray used to reconstruct the captured real 3D objects is calculated by an FPGA board, and the pixel corresponding to each ray is passed from the FPGA board to the PC. Therefore, the arbitrary viewpoint 3D display with the proposed camera system can display 3D objects captured in real time. In experiments, the 3D image of a real object was successfully reconstructed.
Real-time stereo imaging of gaseous phenomena
Gaseous phenomena such as clouds, fog, and mist have been difficult to render in realistic monoscopic imaging environments. Such phenomena are transparent, cast shadows, have dynamic behavior, and are of variable density. This paper describes a method based on splatting, billboarding, and alpha-blending that works well in a realistic real-time stereo imaging environment. Splatting is used to reconstruct a discretely sampled 3D volume to produce a 2D image with appropriate density. Efficient reconstruction is gained through the use of texture-mapped billboards whose transparencies are determined by a Gaussian reconstruction kernel. To achieve the fastest rendering, it is possible to orient all billboards to face the viewplane rather than the viewpoint. The parallax error introduced by this approach is analyzed. The authors give examples to illustrate how the number, position, and size of the billboards in a scene can be used to create different effects. The discussion does not treat the problems of self-shadowing or dynamic behavior although the approach could be used as the basis for simulating both.
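The following Python sketch is an assumed CPU-side illustration (not the authors' GPU implementation) of the two ingredients named above: a Gaussian reconstruction kernel baked into a billboard alpha texture, and back-to-front alpha blending of the resulting splats.

    import numpy as np

    def gaussian_billboard_alpha(size=64, sigma=0.25):
        """Alpha texture for one splat; sigma is expressed in billboard-width units."""
        u = np.linspace(-0.5, 0.5, size)
        xx, yy = np.meshgrid(u, u)
        return np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))

    def composite_back_to_front(splat_colors, splat_alphas):
        """'Over'-composites same-sized splats already sorted far to near.
        splat_colors: list of HxWx3 arrays; splat_alphas: list of HxW arrays."""
        out = np.zeros_like(splat_colors[0], dtype=np.float64)
        for color, alpha in zip(splat_colors, splat_alphas):
            out = color * alpha[..., None] + out * (1.0 - alpha[..., None])
        return out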
Stereoscopic image rendering based on depth maps created from blur and edge information
Wa James Tam, A. Soung Yee, J. Ferreira, et al.
Depth image based rendering (DIBR) is suited for 3D-TV and for autostereoscopic multiview displays. With DIBR, each 2D image captured with a camera at a given position has an associated depth map. This map is used to process the original 2D image so as to generate new images as if they were taken from different camera viewpoints. In the present study we examined the depth and image quality of stereoscopic 3D images that were generated using surrogate depth maps, that is, maps that were created using blur and edge information from the original 2D images. Depth maps were created with three different methods. Formal subjective assessments indicated that the stereoscopic images thus created have enhanced depth quality, with a marginal loss in image quality, when compared to the original non-stereoscopic images. This finding of enhanced depth is surprising because the surrogate depth maps contained limited depth information and mainly at object boundaries. We speculate that the visual system combines the information from pictorial depth cues and from depth interpolation between object boundaries and edges to arrive at an overall perception of depth. The methods for creating the depth maps for stereoscopic imaging that were investigated in this study might be used in applications where depth accuracy is not critical.
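The sketch below illustrates the generic DIBR warping step the abstract relies on, under assumed conventions (depth normalized to [0, 1] and a maximum-disparity parameter); the surrogate depth-map creation from blur and edge information and the hole-filling stage are omitted, and the function name is hypothetical.

    import numpy as np

    def render_view(image, depth, max_disparity_px=8):
        """image: HxWx3 source view; depth: HxW in [0, 1] with 1 = nearest.
        Returns a horizontally shifted view; holes are left as zeros."""
        h, w = depth.shape
        out = np.zeros_like(image)
        disparity = np.round(depth * max_disparity_px).astype(int)
        for y in range(h):
            for x in range(w):
                xs = x + disparity[y, x]   # nearer pixels shift further
                if 0 <= xs < w:
                    out[y, xs] = image[y, x]
        return out   # in practice holes are filled by interpolation or inpainting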
An extended H.264 CODEC for stereoscopic video coding
We propose an extension to the H.264 video coding standard, which is capable of efficiently coding stereoscopic video sequences. In contrast to previous techniques, the proposed stereoscopic video CODEC uses a single modified H.264 encoder and a single modified H.264 decoder in its design. The left (reference) and right (predicted) sequences are fed alternately to the encoder. The modified H.264 encoder uses a Decoded Picture Buffer Store (DPBS) in addition to the regular DPB of the original H.264 encoder. An effective buffer management strategy between the DPBS and DPB is used so that left-sequence frames are coded based only on previously coded left frames, while right frames are coded based on previously coded frames from both the left and right sequences. We show that the proposed CODEC has the capability of exploiting worldline correlation present in stereo video sequences, in addition to the exploitation of joint spatial-temporal-binocular correlation. Further, we show that the coded bit stream fully conforms to a standard H.264 bit stream, and a standard H.264 decoder will be able to decode the left video stream while ignoring the right. We provide experimental results on two popular test stereoscopic video sequences to demonstrate the efficiency of the proposed CODEC.
Recovery of a missing color component in stereo images (or helping NASA find little green Martians)
The current exploration of Mars by the National Aeronautics and Space Administration (NASA) has produced a large number of images of its surface. Two rovers, "Spirit" and "Opportunity", are each equipped with a pair of high-resolution cameras, called "PanCam". While most commercial cameras are sensitive to three spectral bands, typically red (R), green (G) and blue (B), the "PanCam" is sensitive to many more bands since it was designed to deliver additional information to geologists. This is achieved by means of a filter wheel in front of each camera lens. It turns out that slightly different filters are used in the two cameras; while the left camera is equipped with red, green and blue filters, among others, the right camera does not have a green filter on its color wheel. Therefore, since the G component of the right image is missing, it is currently not possible to view a 3D image of the Mars surface in color. In this paper, we develop a method to reconstruct one missing color component of an image given its remaining color components and all three components of the other image of a stereo pair. The method relies on disparity-compensated prediction. In the first step, a disparity field is estimated using the two available components (R and B). In the second step, the missing component is recovered using disparity-compensated prediction from the same component (G) in the other image of the stereo pair. In ground-truth experiments, we have obtained high PSNR values of the reconstruction error, confirming the efficacy of the approach. Similar reconstructions using images transmitted by the rovers yield a comfortable 3D experience when viewed with shutter glasses.
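A hedged Python sketch of the two-step approach described above: block-matching disparity estimation on the shared R and B channels, followed by disparity-compensated copying of G from the left view. The block size, search range, and cost function are illustrative assumptions, not the authors' settings.

    import numpy as np

    def recover_green(right_rb, left_rgb, max_d=32, block=7):
        """right_rb: HxWx2 (R and B of the right view); left_rgb: HxWx3 rectified left view.
        Returns an HxW estimate of the right view's missing G channel."""
        h, w, _ = right_rb.shape
        pad = block // 2
        g = np.zeros((h, w), dtype=left_rgb.dtype)
        for y in range(pad, h - pad):
            for x in range(pad, w - pad):
                patch = right_rb[y-pad:y+pad+1, x-pad:x+pad+1].astype(np.float32)
                best_cost, best_d = np.inf, 0
                # Search matching positions in the left view along the same row.
                for d in range(0, min(max_d, w - 1 - pad - x) + 1):
                    cand = left_rgb[y-pad:y+pad+1, x+d-pad:x+d+pad+1, [0, 2]]
                    cost = np.abs(patch - cand.astype(np.float32)).sum()
                    if cost < best_cost:
                        best_cost, best_d = cost, d
                # Disparity-compensated prediction of G from the left view.
                g[y, x] = left_rgb[y, x + best_d, 1]
        return g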
Autostereoscopic Displays
Autostereoscopic desktop display: an evolution of technology
A new technology for creating a large stereoscopic image has been developed and has evolved over several years. The optical apparatus for creating a large, distortion-free image has changed from a bulky, immersive viewing system to a display that can sit on a desktop and creates a comfortable stereo image that can be viewed for long periods of time without eyestrain. The central idea of creating the images with a monocentric optical system has remained constant; however, the application of monocentricity has changed over several designs. A monocentric design is one where multiple spherical optical surfaces share the same center of curvature. The advantage of this type of system is that it allows the image quality to be corrected over a very wide field of view with a large pupil. The first system was presented at the Stereoscopic Displays and Applications Conference in 2003. This system was based upon custom digital projectors creating images on two curved diffusers, which were then imaged by a ball lens. The final collimation of the images was done with a 36-inch radius mirror. This system was designed as proof of a concept for the technology, and it was not practical to market it as a product solution. This led to a desktop solution that utilized twin LCD displays with monocentric imaging engines that had separate collimating mirrors. There were various improvements to this configuration that ultimately resulted in a high-resolution, bright, low-distortion stereo image. After a brief review of the previous technology, the various embodiments of the desktop display will be discussed.
Time-multiplexed autostereoscopic flat panel display using an optical wedge
Christian Moller, Adrian Travis
Time-multiplexed autostereoscopic displays are often associated with complex optical designs and have not yet been made in a flat panel format, mainly because high-bandwidth image sources are not available as flat panel displays. The optical Wedge developed at Cambridge University compresses the optics into a single flat waveguide, which allows for a flat panel time-multiplexed autostereoscopic display. Using an active shutter synchronized with a custom-built high-frame-rate DLP light engine, we suggest two approaches for creating a flat panel 3D display. The size of the display is limited only by the shutter size; however, we also suggest a solution in which a small shutter can be used to create a large display.
Multi-view image integration system for glass-less 3D display
Takahisa Ando, Ken Mashitani, Masahiro Higashino, et al.
We have developed a multi-view image integration system which combines seven parallax video images into a single video image so that it fits the parallax barrier. The apertures of this barrier are not stripes but tiny rectangles arranged in the shape of stairs. Commodity hardware is used to satisfy a specification which requires that the resolution of each parallax video image is SXGA (1645×800 pixel resolution), the resulting integrated image is QUXGA-W (3840×2400 pixel resolution), and the frame rate is fifteen frames per second. The point is that the system can deliver the QUXGA-W video image, which corresponds to 27 MB per frame, at 15 fps, that is, about 2 Gbps. Using the integration system and an LCD display with the parallax barrier, we can enjoy an immersive live video image which supports seven viewpoints without special glasses. In addition, since the system can superimpose CG images of the relevant seven viewpoints onto the live video images, it is possible to communicate with remote users by sharing a virtual object.
Three-dimensional multiview large projection system
Ingo Relke, Bernd Riemann
Over the last several years our company has investigated and produced autostereoscopic 3D displays of different sizes. Our multi-view 3D display consists of an ordinary flat TFT LCD or plasma display in front of which a special optical filter is placed. In this paper we describe our latest achievements in the creation of large 3D systems consisting of a set of identical display panels. In particular, we consider the construction of an autostereoscopic back-projection system. In this case the situation is more complicated, because the colour value of every R, G, and B subpixel is projected onto the same place, which decreases the 3D resolution and the quality of the observed stereo image. Another problem is the adjustment of the large 3D system, so special approaches are discussed. The corresponding equations for the calculation of the stereo image for the projection wall are presented. We also demonstrate that, for the same optical filter structure, different numbers of views may be used. Based on the presented principles and approaches, our company has created the world's first large 3D projection system of this kind. Future directions for the development of large 3D displays are also discussed.
Correction of aberrations in lens-based 3D displays
Aberrations have been a persistent limiting factor in 3D displays that are based on lens arrays. This paper describes layered polymer microlens and lenticular arrays that can be used to cost-effectively correct for lens aberrations. Arrays are described which have embedded apertures of non-circular cross-section, i.e. aspheric or acylindrical surfaces, that correct for prominent aberrations across a significant angular viewing field (typically 30-60°). Corrected lens systems are described, and their relative theoretical and practical performance shown. Aberrations, particularly spherical aberration and field curvature, have historically forced a tradeoff between viewing angle and view resolution. It will be demonstrated that the corrected arrays can surpass existing lens arrays in their attainable visual depth, their ability to reduce optical cross-talk, and their capacity for displaying large numbers of views within multiview systems. In addition, it will be demonstrated how these particular lens geometries may be used to exclude light from neighboring lens cells due to their equiangular relationship to certain concave surface geometries at the critical angle of internal reflection (θ). Lens cells in the equiangular design do not need to be partitioned from one another. This dual utility makes the designs applicable to many lens-array-based 3D displays, and potentially to lens-array-based 3D acquisition systems as well.
2D to 3D Conversion
Automatic video to stereoscopic video conversion
Efrat Rotem, Karni Wolowelsky, David Pelz
In this paper a method to convert a monoscopic video movie into a stereoscopic video movie is presented. The method is based on passively acquired video images from a single camera and may be applied in a PC-based real-time system. Current methods for generating single-camera stereoscopic video, and ad-hoc standards, are based on creating a depth map, with the depth map calculation based on structure-from-motion methods. To work properly, the depth map should be very dense and accurate; otherwise, local deformations may occur. Our proposed method is based on calculating a planar transformation between images in the sequence and relies on the human capability to sense the residual parallax; therefore it does not depend on calculation of a depth map. This advantage is of great importance when deformation of even small objects is forbidden, as in reconnaissance and medical systems. The proposed method generates stereoscopic image pairs. Each pair consists of the original image and a transformed image. The transformed image is generated from another image of the original sequence, selected by the algorithm such that a considerable parallax has developed between the two original frames. The chosen image is then warped using the planar transformation. Our algorithm assumes a good planar-transformation approximation between the images; thus, residual disparities are related to the distance from the average plane. As such, zooming and rotation are eliminated after the warping process, and the eyes integrate the images into a 3D scene based on the disparity. The algorithm has been tested and is demonstrated on aerial scenes as well as terrestrial scenes captured with a hand-held camera.
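The following sketch shows one plausible way to realize the core step described above, pairing a current frame with a homography-warped earlier frame. It uses standard OpenCV calls (ORB features, RANSAC homography estimation, perspective warp) as an assumed stand-in for the paper's own estimation details.

    import cv2
    import numpy as np

    def make_stereo_pair(frame_now, frame_past):
        """Pairs the current frame with a planar-transformed earlier frame.
        Both frames are BGR images of the same size."""
        gray_now = cv2.cvtColor(frame_now, cv2.COLOR_BGR2GRAY)
        gray_past = cv2.cvtColor(frame_past, cv2.COLOR_BGR2GRAY)
        orb = cv2.ORB_create(1000)
        k1, d1 = orb.detectAndCompute(gray_now, None)
        k2, d2 = orb.detectAndCompute(gray_past, None)
        matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
        pts_now = np.float32([k1[m.queryIdx].pt for m in matches])
        pts_past = np.float32([k2[m.trainIdx].pt for m in matches])
        # Planar transformation (homography) estimated robustly with RANSAC;
        # at least four good matches are needed.
        H, _ = cv2.findHomography(pts_past, pts_now, cv2.RANSAC, 3.0)
        h, w = frame_now.shape[:2]
        warped = cv2.warpPerspective(frame_past, H, (w, h))
        # Zoom and rotation are removed by the warp; residual parallax remains.
        return frame_now, warped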
Interactive 2D to 3D stereoscopic image synthesis
Advances in stereoscopic display technologies, graphics card devices, and digital imaging algorithms have opened up new possibilities in synthesizing stereoscopic images. The power of today's DirectX/OpenGL-optimized graphics cards, together with new and creative imaging tools found in software products such as Adobe Photoshop, provides a powerful environment for converting planar drawings and photographs into stereoscopic images. The basis for such a creative process is the focus of this paper. This article presents a novel technique which uses advanced imaging features and custom Windows-based software built on the DirectX 9 API to provide the user with an interactive stereo image synthesizer. By creating an accurate and interactive world scene with movable, flexible, depth-map-altered textured surfaces and perspective stereoscopic cameras with visible frustums and zero-parallax planes, a user can precisely model a virtual three-dimensional representation of a real-world scene. Current versions of Adobe Photoshop provide a creative user with a rich assortment of tools needed to highlight elements of a 2D image, simulate hidden areas, and creatively shape them for a 3D scene representation. The technique described has been implemented as a Photoshop plug-in and thus allows for a seamless transition of these 2D image elements into 3D surfaces, which are subsequently rendered to create stereoscopic views.
Stereoscopic Video
New version of HD stereoscopic camera and its picture quality assessment concerning the camera parameters
Jun-Yong Lee, Seung-Jin Nam, Jae-Ho Lee, et al.
We have improved the HD stereoscopic camera system reported last year. Although the previous version showed good performance in many respects, various types of trial shooting in the field convinced us that a more accurate control mechanism was needed, so we changed several parts of the system. For controlling the separation between the two cameras and the convergence of the parallel-axis stereoscopic camera system, we replaced the linear motor system of the first version with small DC motors. By changing to fully digitally controlled HD lenses, we can control both lenses more accurately. In preparation for real-time image composition with computer graphics, namely mixed reality, in this version we fixed the updating frequency of the camera parameters at 60 Hz. In addition, for better interlocked zoom-convergence control, we built the look-up table with many more steps, so even smoother zoom-convergence operation is achieved. We have also carried out subjective evaluation tests on the acquired pictures. Since we have implemented functions for storing and retrieving the major parameters of the stereoscopic camera, we could precisely analyze the relationship between the results of the picture quality assessment and the camera parameters.
Pre-rendered stereoscopic movies for commodity display systems
John Moreland, Laura Arns, W. Scott Meador
Immersive stereoscopic display systems built from commodity PCs and equipment are becoming increasingly common. Although such systems are generally used for interactive experiences, it is occasionally useful to instead display stereoscopic movies. This paper discusses a method of creating pre-rendered stereoscopic movies for display with an inexpensive, commodity-based, passive stereoscopic display wall. Playback methods, possible uses, and experiences with early movies are also discussed. We first describe our display system, which uses the spanned desktop mode of video cards with dual video outputs, and a pair of projectors with polarizing filters. We then explain our method for creating stereo pair movies (left/right or top/bottom) using video compositing software. While the concept is straightforward, one difficulty that must be addressed is using an appropriate video codec that can be displayed in full screen, across two displays, at the desired resolution, file size, playback rate, and visual quality. A number of tests were performed to find video codecs that would be suitable for stereo movies based on these attributes. Results are provided in a comparison of multiple codecs that can aid in the successful implementation of this method. Several codecs are recommended, but specific codecs should be chosen based on individual needs.
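A minimal sketch, assuming the general workflow rather than the authors' compositing setup, of packing rendered left/right frames into the stereo-pair layouts mentioned above before they are handed to a codec:

    import numpy as np

    def pack_stereo(left, right, layout="side_by_side"):
        """left/right: HxWx3 frames; returns one packed frame for encoding."""
        if layout == "side_by_side":        # spans the two projector outputs
            return np.concatenate([left, right], axis=1)
        if layout == "top_bottom":
            return np.concatenate([left, right], axis=0)
        raise ValueError("unknown layout: " + layout)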
OpenGL hardware accelerated algorithms for autostereoscopic monitor pattern creation
Autostereoscopic monitors generally require complicated image pattern creation based on reprocessing of multiple scene views. The computational power necessary for such reprocessing is very high when real-time output is required at refresh rates higher than 24 fps and at high output resolutions from 1600×1200 up to 3840×2400. The optimal method is to perform such reprocessing with the help of graphics card hardware rather than the CPU. We solved output creation for three types of autostereoscopic monitors: generic autostereoscopic monitors requiring a column-interlaced pattern, the Sharp RD3D autostereoscopic notebook, and monitors based on StereoGraphics SynthaGram principles. OpenGL stencil buffer operations were used to implement the output for monitors requiring column-interlaced patterns as well as the output for the Sharp RD3D notebook. We tested three different implementations of SynthaGram-like pattern creation: a pure fixed-pipeline OpenGL 1.2 method, an nVidia Cg-based GPU programming method, and a method using a mixture of both approaches. All methods were benchmarked on different nVidia graphics card models. During development we focused primarily on multi-view stereoscopic video processing in the DepthQ Stereoscopic Media Server software, but the same methods used for video could be employed for real-time CG scene reprocessing.
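As a CPU-side illustration of the column-interlaced pattern that the paper generates on the GPU with OpenGL stencil operations, the following assumed sketch interleaves two views by pixel column:

    import numpy as np

    def column_interlace(left, right):
        """left/right: HxWx3 views of equal size; even pixel columns take the
        left view, odd columns the right view."""
        out = left.copy()
        out[:, 1::2, :] = right[:, 1::2, :]
        return out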
Integral 3D imaging system using monocular 2D video and depth data
Koya Suehiro, Hiroya Nakamura, Kunio Yamada, et al.
An autostereoscopic imaging system that displays moving real-scene images created from 2D video and depth data is proposed. The system consists of a depth camera, signal processing algorithms, and an integral display prototype. The specifications are as follows: (a) Depth camera: a monocular VGA video camera with an optical distance sensor captures the 2D texture image and its depth map simultaneously. (b) Signal processing: 10×10 parallax images are created from the video stream, and the angle between two neighboring parallax images is 0.8 degrees. (c) Display: the display device is contact type (LCD) or projection type (LCOS), and the optical device is a microlens array. The 2D resolution of the 3D image is 720 cycles/radian at the viewing distance, and the viewing zone is about 8 degrees. Noticeable defects due to occlusions are not found within the viewing zone. Transmission of texture and depth data has higher compatibility with conventional coding technologies than transmission using compressed parallax images, and it allows a choice among various types of displays and of how the contents are displayed. Therefore we believe that this system is a candidate for a future 3D television system.
Stereoscopic Developments
Tri-stack 3D LCD monitor
Andrew Loukianitsa, Andrew Yarovoy, Konstantin Kanashin
Currently, most 3D device developers realize that autostereoscopic displays require separate images for the right and left eyes. The consequence is a reduction in 3D image quality because of the decrease in resolution and viewing angle. For the best-known schemes, such as parallax barriers and lenticular screens, the resolution decreases by up to 50%. Additionally, because of these devices' inherently rather narrow viewing angle, head-tracking devices are required. In contrast, the neuro-stereo display proposed by the authors of this paper increases resolution without reducing viewing angles. This advantage exists because neuro-stereo displays create one continuous 3D image, which uses all the information contained in the initial stereo pair. The presented modification of this neuro-stereo display, SmartrON, consists of three LCD panels. The value of further increasing the number of LCD panels is limited by the power of the backlight; moreover, it is unclear whether additional LCDs would improve the quality of the 3D effect. In the three-LCD device, the images displayed on each panel are processed with a special neural network, so that the resulting luminous flux exactly corresponds to the amount of light from the scene's objects, over a rather wide viewing angle. On the mathematical side, a new method is proposed for coding the part of 3D space that is superimposed on two or three 2D fields, so we can speak of a new method of holography in incoherent light. The neural network can be emulated either on serial processors or on standard graphics cards, which provide real-time operation. The results of numerical simulations and physical experiments show that the three-LCD-panel device appreciably increases the quality of 3D images in comparison with the two-panel scheme.
Full-color autostereoscopic video display system using computer-generated synthetic phase holograms
A full-color auto-stereoscopic video display system has been introduced and developed using only a single phase-only spatial light modulator, a simple projection lens module, and three laser diode sources with the wavelengths of 635nm (red), 532nm (green), and 473nm (blue). Full-color stereoscopic input video frames are separated by each red, green, and blue component with respect to each stereo eye view for a 3D image frame. Each hologram is then optimized by a modified iterative Fresnel transform algorithm method, for the reconstruction of each gray-level quantized stereo image without color dispersion. To solve the color dispersion problem we applied scaling constraints and phase-leveling techniques for each hologram. Then the optimized holograms are synthesized with direction-multiplexed holograms and modulated by a single phase-type spatial light modulator. The modulated signals are Fourier-transformed by an achromatic lens and redirected to each viewer's eye for the reconstruction of the composed full-color auto-stereoscopic 3D display. Experimentally, we demonstrated that the designed computer-generated holograms were able to generate full-color stereoscopic 3D video images without any use of glasses.
Real-time holographic video images with commodity PC hardware
V. Michael Bove Jr., Wendy J. Plesniak, Tyeler Quentmeyer, et al.
The MIT second-generation holographic video system is a real-time electro-holographic display. The system produces a single-color horizontal parallax only (HPO) holographic image. To reconstruct a three-dimensional image, Holovideo uses a computed fringe pattern with an effective resolution of 256K samples wide by 144 lines high by 8 bits per sample. In this paper we first describe the implementation of a new computational subsystem for Holovideo, replacing custom computing hardware with commodity PC graphics chips, and using OpenGL. We also report the implementation of stereogram computing techniques that employ the PC hardware acceleration to generate and update holographic images at rates of up to two frames per second. These innovations shrink Holovideo’s physical footprint to fit on the table-top and mark the fastest rate at which full computation and update have been achieved on this system to date. Finally we present first results of implementing the Reconfigurable Image Plane (RIP) method of computing high-quality holograms on this new system.
Low-loss filter for stereoscopic projection with LCD projectors
Oliver Stefani, Matthias Bues, Roland Blach, et al.
To overcome the disadvantage of the low light output of common LCD projectors when used in combination with polarizing filters, we developed a new combination of half-wave retarder plates and a polarizing filter. The paper points out that the use of modern LCD projectors in combination with the newly developed filter results in less light loss than using standard polarizing filters for stereoscopic projection with either LCD or DLP projectors. We have combined a color-selective half-wave retarder plate (CSR) with an achromatic half-wave retarder plate (AR). The CSR, with its optical axis oriented at 45° to the green light, rotates only the spectrum in the 520 nm-565 nm range by 90°. This means that the red and blue components remain untouched and all the light passing through the first filter will be oriented vertically. The AR is then oriented such that all the light after the CSR is rotated to the desired +/-45° polarization state. We added a third, highly transmissive polarizing filter with its orientation parallel to the transmitted light in order to cut off any noise which is not perfectly polarized.
Depth Mapping
Three-dimensional scene reconstruction using multiview images and depth camera
Gi-Mun Um, Kang Yeon Kim, ChungHyun Ahn, et al.
This paper presents a novel multi-depth-map fusion approach for 3D scene reconstruction. Traditional stereo matching techniques that estimate disparities between two images often produce inaccurate depth maps because of occlusions and homogeneous areas. On the other hand, the depth map obtained from a depth camera is globally accurate but noisy, and it covers only a limited depth range. In order to balance the pros and cons of these two methods, we propose a depth map fusion method that fuses the multiple depth maps from stereo matching and the depth camera. Using a 3-view camera system that includes a depth camera for the center view, we first obtain 3-view images and a depth map from the center-view depth camera. Then we calculate camera parameters by camera calibration. Using the camera parameters, we rectify the left- and right-view images with respect to the center-view image so as to satisfy the well-known epipolar constraint. Using the center-view image as a reference, we obtain two depth maps by stereo matching between the center-left image pair and the center-right image pair. After preprocessing each depth map, we pick an appropriate depth value for each pixel from the processed depth maps based on the depth reliability. Simulation results obtained by the proposed method showed improvements in some background regions.
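A minimal sketch of the per-pixel selection step described above; the reliability measure and threshold are assumptions for illustration, since the paper defines its own preprocessing and reliability criteria:

    import numpy as np

    def fuse_depth(stereo_depth, camera_depth, stereo_confidence, threshold=0.5):
        """All inputs are HxW arrays; keeps the stereo-matching depth where its
        confidence is high and falls back to the depth-camera value elsewhere."""
        return np.where(stereo_confidence > threshold, stereo_depth, camera_depth)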
Smoothing region boundaries in variable depth mapping for real-time stereoscopic images
We believe the need for stereoscopic image generation methods that allow simple, high quality content creation continues to be a key problem limiting the widespread up-take of 3D displays. We present new algorithms for creating real time stereoscopic images that provide increased control to content creators over the mapping of depth from scene to displayed image. Previously we described a Three Region, variable depth mapping, algorithm for stereoscopic image generation. This allows different regions within a scene to be represented by different ranges of perceived depth in the final image. An unresolved issue was that this approach can create a visible discontinuity for smooth objects crossing region boundaries. In this paper we describe two new Multi-Region algorithms to address this problem: boundary smoothing using additional sub-regions and scaling scene geometry to smoothly vary depth mapping. We present real time implementations of the Three-Region and the new Multi-Region algorithms for OpenGL to demonstrate the visual appearance of the results. We discuss the applicability and performance of each approach for rendering real time stereoscopic images and propose a simple modification to the standard graphics pipeline to better support these algorithms.
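The sketch below is an illustrative assumption, not the authors' published algorithm: it shows the basic idea of variable depth mapping, where each scene-depth region is mapped onto its own displayed-depth range, and the abrupt change of slope at a region boundary is the kind of transition the smoothing methods address.

    def map_depth(z, regions):
        """regions: list of (scene_near, scene_far, disp_near, disp_far) tuples;
        each scene-depth interval is mapped linearly onto its displayed range."""
        for s0, s1, d0, d1 in regions:
            if s0 <= z <= s1:
                t = (z - s0) / (s1 - s0)
                return d0 + t * (d1 - d0)
        return None   # outside every region

    # Example: the near region receives most of the displayed depth budget,
    # so the mapping changes slope abruptly at the boundary z = 2.0.
    regions = [(0.5, 2.0, 0.0, 0.8), (2.0, 20.0, 0.8, 1.0)]
    print(map_depth(1.0, regions), map_depth(10.0, regions))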
Volumetric 3D Displays
Laser-induced image technology (yesterday, today, and tomorrow)
Methods and systems of laser-induced image technology and ways of developing them are discussed. The methods depend on the kind of laser-induced etch points (marks) used for image creation. Today, the marks usually used for this purpose are laser-induced damages resulting from breakdown phenomena. Corresponding systems comprise the following subsystems: A) a laser system for generating laser-induced damages inside a transparent material such that the exterior light scattered from them has low fluctuations; B) a computer graphics system for transforming an image into an arrangement of points at which breakdowns should be produced; C) systems for controlling the characteristics of laser-induced images, including the number of gray shades, image resolution, color deviation, and so on; D) systems for controlling the parameters of the laser radiation, including the direction of the laser beam. Methods and systems for increasing the quality of such images are discussed. However, there are also other marks that appear as a result of photoionization and other physical phenomena accompanying the interaction of powerful laser radiation with a transparent material. The use of these marks opens new opportunities for the creation of laser-induced images with special characteristics; these opportunities are also a subject of our discussion.
Spatial 3D infrastructure: display-independent software framework, high-speed rendering electronics, and several new displays
Won-Suk Chun, Joshua Napoli, Oliver S. Cossairt, et al.
We present a software and hardware foundation to enable the rapid adoption of 3-D displays. Different 3-D displays - such as multiplanar, multiview, and electroholographic displays - naturally require different rendering methods. The adoption of these displays in the marketplace will be accelerated by a common software framework. The authors designed the SpatialGL API, a new rendering framework that unifies these display methods under one interface. SpatialGL enables complementary visualization assets to coexist through a uniform infrastructure. Also, SpatialGL supports legacy interfaces such as the OpenGL API. The authors’ first implementation of SpatialGL uses multiview and multislice rendering algorithms to exploit the performance of modern graphics processing units (GPUs) to enable real-time visualization of 3-D graphics from medical imaging, oil & gas exploration, and homeland security. At the time of writing, SpatialGL runs on COTS workstations (both Windows and Linux) and on Actuality’s high-performance embedded computational engine that couples an NVIDIA GeForce 6800 Ultra GPU, an AMD Athlon 64 processor, and a proprietary, high-speed, programmable volumetric frame buffer that interfaces to a 1024 x 768 x 3 digital projector. Progress is illustrated using an off-the-shelf multiview display, Actuality’s multiplanar Perspecta Spatial 3D System, and an experimental multiview display. The experimental display is a quasi-holographic view-sequential system that generates aerial imagery measuring 30 mm x 25 mm x 25 mm, providing 198 horizontal views.
Optical system which projects small volumetric images to very large size
Volumetric displays, in which an image that physically occupies space is built up from many 2D cross sections, have many attractive features, including smooth parallax in all directions, coincident focus and fixation points, and wide viewing angles. However, existing volumetric displays require an image-forming apparatus that occupies the same volume as the image itself. Large volumetric displays therefore tend to be very expensive and bulky, and may not be practical at all beyond a certain size. DTI has demonstrated optics that can project miniature (about 1 cubic inch) 3D volume-filling images into a volume of arbitrary size, limited only by the dimensions of a screen-like (i.e., of large area and relatively thin) optical assembly. This greatly reduces the size requirements for the image-forming device and its mechanical or optical scanning mechanisms, while producing very large images that can occupy space ranging from the area in front of the display to infinity behind it. The viewing area can be of about the same lateral dimensions as the screen-like assembly, without violating etendue conservation. The optics also simplify the challenges associated with electronic holography, since a very small electronic hologram can be employed as the image source.
Exploring interaction with 3D volumetric displays
Tovi Grossman, Daniel Wigdor, Ravin Balakrishnan
Viewing imagery on volumetric displays, which generate true volumetric 3D images by actually illuminating points in 3D space, is akin to viewing physical objects in the real world. These displays typically have a 360° field of view, and the user does not have to wear hardware such as shutter glasses or head-trackers. As such, they are a promising alternative to traditional display systems for viewing in 3D. Although these displays are now commercially available (e.g., www.actuality-systems.com), current applications tend to use them as a non-interactive output-only display device, much like one would use a printer. In order to fully leverage the unique features of these displays, however, it would be desirable if one could directly interact with and manipulate the 3D data being displayed. We investigate interaction techniques for volumetric display interfaces, through the development of an interactive 3D geometric model building application. While this application area itself presents many interesting challenges, our focus is on the interaction techniques that are likely generalizable to interactive applications for other domains. We explore a very direct style of interaction where the user interacts with the virtual data using direct finger manipulations on and around the enclosure surrounding the displayed 3D volumetric image.
Integral 3D Displays
Three-dimensional electro-floating display system based on integral imaging technique
A new three-dimensional (3D) display system which combines two different display techniques is proposed. One of the techniques is integral imaging. The integral imaging system consists of a lens array and a 2D display device, and the 3D image of the system is integrated by the lens array from the elemental images. The other technique is image floating, which uses a large convex lens or a concave mirror to present the image of a real object to the observer. An electro-floating display system, which does not use a real object, needs a volumetric 3D display part, because the floating optics alone cannot create a 3D image but only carry the image closer to the observer. The integral imaging system can be adopted in the electro-floating display system because the integrated image has the characteristics of a volumetric image within the viewing angle. Moreover, the many methods for enhancing the viewing angle of integral imaging systems can be applied directly to the proposed system. The optimum value of the focal length of the floating lens is related to the central depth plane and the viewing angle. The proposed system can be successfully applied to many 3D applications such as 3D TV.
Projection-type integral 3D imaging using multifacet flat mirrors
Sungyong Jung, Sergei A. Shestak, Kyunghoon Cha, et al.
Integral 3D imaging is attracting much attention as one of the viable candidates for natural 3D display. Compared with other stereoscopic techniques, it provides more freedom of viewing and a sense of naturalness with reduced eye fatigue. Recently, several approaches to projection integral imaging have been proposed because of its merits for large-size implementation. However, the use of a concave or convex lens array can cause problems of high cost and spherical aberration. To overcome these problems, we propose a novel scheme using multi-facet flat mirrors and demonstrate its feasibility. Instead of spherical mirrors on a flat surface, multi-facet flat mirrors, each tangent to a curved surface, function like elemental mirror components. Light reflected from the different facets of each mirror provides different viewing perspectives. By using electronic capture and display devices, the proposed method makes it possible to record and reconstruct a 3D scene in real time. In the experiment, only horizontal parallax is provided because a one-dimensional surface can be made more easily, but extension to two dimensions or to computer generation of the elemental images is also possible. Some detailed discussion of the design parameters will be given in the presentation.
Autostereoscopic liquid crystal display using mosaic color pixel arrangement
We have developed several prototypes of a one-dimensional integral imaging (1-D II) autostereoscopic display. Generally, II is one of the most promising methods for realizing an autostereoscopic display. However, the lens or barrier pitch is wide and obtrusive because this method requires many parallaxes. In this case, a slanted lens or barrier is undesirable because the pattern is asymmetrical. Based on an examination of the display resolution of the autostereoscopic display, we adopted an LCD with a mosaic color filter arrangement and a vertical lenticular sheet. We changed the color filter to the mosaic arrangement for two types of LCD: an LCD of 20.8-inch diagonal size with QUXGA resolution (3200 x 2400 pixels) and an LCD of 15.4-inch diagonal size with WUXGA resolution (1920 x 1200 pixels). The typical specifications of the autostereoscopic display prototypes were 32 parallaxes with 300 horizontal resolution for the 20.8-inch size and 18 parallaxes with the same resolution for the 15.4-inch size. We confirmed that these prototypes showed good appearance and stereoscopic display properties owing to the symmetrical lens pattern.
Long viewing distance autostereoscopic display
Hongen Liao, Makoto Iwahara, Yoichi Katayama, et al.
Most reported studies have focused on improving the viewing resolution of integral photography (IP) images and widening the viewing angle. To the best of our knowledge, there has been no report of producing an IP image with a depth of several meters for viewing with the naked eye. We developed a three-dimensional (3-D) display technique for distant viewing of a 3-D image without the need for special glasses. The photo-based integral photography (IP) method enables precise 3-D images to be displayed at long viewing distances without any influence from deviated or distorted lenses in a lens array. We calculate the elemental images from a referential viewing area for each lens and project the corresponding result images through each lens. We succeeded in creating an image display that appears three-dimensional even when viewed from a distance, with an image depth of 5.7 m or more in front of the display and 3.5 m or more behind the display. To the best of our knowledge, the long-distance IP display presented in this paper is technically unique, as it is the first report of generating an image with such a long viewing distance.
Telemanipulator and Telepresence Technologies
Effect of reduced stereoscopic camera separation on ring placement with a surgical telerobot
Stephen R. Ellis, Jonathan M. Fishman, Christopher J. Hasser, et al.
A custom, stereoscopic video camera was built to study the impact of decreased camera separation on a stereoscopically viewed, visual-manual task resembling some aspects of surgery. The camera’s field of view was matched to that of a stereoscopic laparoscope by adjusting focal length and viewing distance so that the viewer could see equivalent image content at a plane orthogonal to their view. This plane contained the point at which the left and right viewing axes converged. This geometry only exactly matches the images from both the laparoscope and the stereo-camera at this point. This condition was considered a useful approximation for a match between the two image sources. Twelve naive subjects and one of the experimenters were first trained in a ring placement task using the stereo-laparoscope and subsequently switched to the stereo-camera. It was used with differing camera separations ranging from 100% of the laparoscope’s separation to a biocular view corresponding to no separation. The results suggest that camera separation may be reduced 20-35% without appreciably degrading user performance. Even a 50% reduction in separation shows stereoscopically supported performance much better than the biocular condition. The results suggest that existing laparoscopes which use 5 mm camera separation may well be significantly miniaturized without causing substantial performance degradation.
Networked telepresence system using web browsers and omni-directional video streams
Tomoya Ishikawa, Kazumasa Yamazawa, Tomokazu Sato, et al.
In this paper, we describe a new telepresence system which enables a user to look around a virtualized real world easily in network environments. The proposed system includes omni-directional video viewers that run in web browsers and allows the user to look around the omni-directional video contents on the web browser. The omni-directional video viewer is implemented as an ActiveX program, so the user can install the viewer automatically simply by opening the web site which contains the omni-directional video contents. The system allows many users at different sites to look around the scene, just like an interactive TV, using a multicast protocol without increasing the network traffic. This paper describes the implemented system and experiments using live and stored video streams. In the experiment with stored video streams, the system uses an omni-directional multi-camera system for video capture, so we can look around high-resolution, high-quality video content. In the experiment with live video streams, a car-mounted omni-directional camera acquires omni-directional video streams of the car's surroundings while driving in an outdoor environment. The acquired video streams are transferred to the remote site through the wireless and wired network using the multicast protocol, and we can view the live video content freely in an arbitrary direction. In both experiments, we implemented a view-dependent presentation with a head-mounted display (HMD) and a gyro sensor to provide a richer sense of presence.
Fire training in a virtual-reality environment
Eckhard Freund, Jurgen Rossmann, Arno Bucken
Although fire is very common in our daily environment - as a source of energy at home or as a tool in industry - most people cannot estimate the danger of a conflagration. It is therefore important to train people in combating fire. Besides training with propane simulators or with real fires and real extinguishers, fire training can be performed in virtual reality, which offers a pollution-free and fast way of training. In this paper we describe how to enhance a virtual-reality environment with real-time fire simulation and visualisation in order to establish a realistic emergency-training system. The presented approach supports extinguishing the virtual fire, including recordable performance data as needed in teletraining environments. We show how to create realistic impressions of fire using advanced particle simulation and how to use the advantages of particles to trigger states in a modified cellular automaton used for the simulation of fire behaviour. Using particle systems that interact with cellular automata, it is possible to simulate a developing, spreading fire and its reaction to different extinguishing agents like water, CO2, or oxygen. The methods proposed in this paper have been implemented and successfully tested on Cosimir, a commercial robot and VR simulation system.
Stereoscopic Display Applications
Stereoscopy in orthopaedics
Stereoscopy was used in medicine as long ago as 1898, but has not gained widespread acceptance apart from a peak in the 1930s. It retains a use in orthopaedics in the form of Radiostereogrammetrical Analysis (RSA), though this is now done by computer software without using stereopsis. By combining computer-assisted stereoscopic displays with both conventional plain films and reconstructed volumetric axial data, we are reassessing the use of stereoscopy in orthopaedics. Applications include use in developing nations or rural settings, erect patients where axial imaging cannot be used, and complex deformity and trauma reconstruction. Extension into orthopaedic endoscopic systems and teaching aids (e.g. operative videos) are further possibilities. The benefits of stereoscopic vision, namely increased perceived resolution and depth perception, can help orthopaedic surgeons achieve more accurate diagnosis and better pre-operative planning. Limitations of currently available stereoscopic displays that need to be addressed prior to widespread acceptance are: availability of hardware and software, loss of resolution, use of glasses, and image "ghosting". Journal publication, the traditional mode of information dissemination in orthopaedics, is also viewed as a hindrance to the acceptance of stereoscopy - it does not deliver the full impact of stereoscopy, so "hands-on" demonstrations are needed.
Systems I
Passive method of eliminating accommodation/convergence disparity in stereoscopic head-mounted displays
The difference between accommodation and convergence distance experienced when viewing stereoscopic displays has long been recognized as a source of visual discomfort. It is especially problematic in head-mounted virtual reality and enhanced-reality displays, where images must often be displayed across a large depth range or superimposed on real objects. DTI has demonstrated a novel method of creating stereoscopic images in which the focus and fixation distances are closely matched for all parts of the scene, from close distances to infinity. The method is passive in the sense that it does not rely on eye tracking, moving parts, variable-focus optics, vibrating optics, or feedback loops. The method uses a rapidly changing illumination pattern in combination with a high-speed microdisplay to create cones of light that converge at different distances to form the voxels of a high-resolution, space-filling image. A bench-model display was built and a series of visual tests was performed in order to demonstrate the concept and investigate both its capabilities and limitations. Results proved conclusively that real optical images were being formed and that observers had to change their focus to read text or see objects at different distances.
Reusable methodology based on filters in order to define relevant tangible parts for a TUI
Fabrice Depaulis, Nadine Couture, Jeremy Legardeur, et al.
Modern CAD systems offer many powerful functions to handle parts and assemble them. However, these functions often mask problems that only appear at the final production stage (for example, positioning difficulties between two parts before fixing). The ESKUA project aims to solve this issue by providing a tangible way to test an assembly task as early as possible in the design process. In this Tangible User Interface (TUI) based system, each CAD part is associated with a real-world object called an interactor. Each action performed with these interactors is captured by a camera and then visualized in the CAD software. From a usability point of view, it is very important to provide an appropriate interactor family. This paper deals with a design methodology for such a set. First, we show how an object can be characterized in the assembly context with respect to a theoretical definition of the assembly task. Then, we detail how our methodology gathers together parts that share the same value for a given assembly criterion, and how it builds interactors from this analysis as abstractions of each subset's properties. Finally, we validate the proposed approach with an experimental study that derives an interactor set for assembling mechanical parts.
WebVR: an interactive web browser for virtual environments
The pervasive nature of web-based content has led to the development of applications and user interfaces that port across a broad range of operating systems and databases, while providing intuitive access to static and time-varying information. However, the integration of this vast resource into virtual environments has remained elusive. In this paper we present a 3D web browser (WebVR) that allows users to search for arbitrary information on the Internet and to seamlessly augment the results into virtual environments. WebVR provides access to the standard data input and query mechanisms while supporting active texture skins of web content that can be mapped onto arbitrary surfaces within the environment. Once mapped, the corresponding texture functions as a fully integrated web browser that responds to traditional events such as the selection of links or text input. As a result, any surface within the environment can be turned into a web-enabled resource that provides access to user-definable data. In order to leverage the continuous advancement of browser technology and to support both static and streamed content, WebVR uses ActiveX controls to extract the desired texture skin from industry-strength browsers, providing a unique mechanism for data fusion and extensibility.
A global-timestamp-based approach to construct a real-time distributed tiled display system
Tiled display systems have emerged as a means to visualize complex scientific data sets while reducing the need to subsample potentially critical information. This paper presents a global-timestamp-based approach for the development and control of real-time distributed tiled display systems. Two different techniques are presented that enable the development of multi-tile configurations in combination with distributed render clusters. A single-display-multiple-renderer approach is presented that fuses visuals generated by multiple render nodes into one composite image, which can be assigned to one specific display tile. This approach is subsequently extended to a multiple-display-multiple-renderer approach that facilitates the creation of scalable display systems consisting of multiple display tiles and render clusters. This paper investigates challenges that have to be addressed by these systems and describes a proof-of-concept system based on a high-level object-oriented real-time programming scheme called TMO.
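The core idea of global-timestamp-driven presentation can be sketched in a few lines of Python (a toy illustration, not the TMO scheme itself; it assumes the node clocks are already synchronized, e.g. via NTP or PTP). Each tile buffers timestamped frames and presents a frame only once the shared global time has reached its presentation timestamp, so all tiles swap in lockstep.

import heapq
import time

class TileDisplay:
    def __init__(self):
        self._queue = []                      # (presentation_time, frame)

    def submit(self, presentation_time, frame):
        heapq.heappush(self._queue, (presentation_time, frame))

    def tick(self, global_now):
        """Present every frame whose global presentation time has been reached."""
        shown = []
        while self._queue and self._queue[0][0] <= global_now:
            _, frame = heapq.heappop(self._queue)
            shown.append(frame)               # stand-in for the actual swap-buffer call
        return shown

tile = TileDisplay()
t0 = time.monotonic()
for i in range(3):
    tile.submit(t0 + 0.1 * i, f"frame-{i}")  # the renderer assigns global presentation times
time.sleep(0.25)
print(tile.tick(time.monotonic()))            # all frames whose timestamps have passed are shown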
Mixed Realities
Projection-based augmented reality with automated shape scanning
Yoshihiro Yasumuro, Masataka Imura, Yoshitsugu Manabe, et al.
We propose a new framework for interactive Augmented Reality (AR) and Mixed Reality (MR) representation using both visible and invisible projection onto physical target objects. The projection-based approach to constructing AR/MR uses physical objects such as walls, books, plaster ornaments and anything else that computer-generated content can be optically projected onto; projection thus makes it possible to use real objects as displays. We mainly focus on capturing and utilizing the 3D shape of the object surface, which allows the AR/MR system to maintain visual consistency when merging the physical and rendered objects. The 3D shape data of the object can be used to compensate for the distortion caused by the difference between the positions of the projectors and the viewer. The other advantage is the capability to generate proper visual occlusion between physical and virtual objects so that they appear to coexist in front of the viewer. In this study we employ near-infrared pattern projection for triangulation, so that scanning and updating the geometry data of the object is performed automatically as a background process; the AR/MR representation can thus be updated in parallel as the physical geometry changes.
Localization of wearable users using invisible retro-reflective markers and an IR camera
Yusuke Nakazato, Masayuki Kanbara, Naokazu Yokoya
This paper describes a localization method for wearable computer users. To realize applications of wearable computers, such as a navigation system, the position of the user is required for location-based services. Many localization methods for indoor environments have been proposed; one class of methods estimates the user's position using IR beacons or visual markers. However, these methods have problems concerning power supply and/or undesirable visual effects. In order to avoid these problems, we propose a new localization method based on an IR camera and invisible markers consisting of translucent retro-reflectors. In the proposed method, the camera captures the reflection of IR LEDs that are flashed on and off synchronously, so that the marker regions can be extracted stably from the captured images.
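The benefit of flashing the IR LEDs on and off is that the retro-reflective markers appear only in the "on" frames, so a simple difference image isolates them. The following Python sketch illustrates that idea (an assumption about the pipeline, not the authors' exact code; the threshold is an illustrative value).

import numpy as np

def extract_marker_mask(frame_led_on, frame_led_off, threshold=40):
    """Both inputs are 8-bit grayscale IR images of the same scene."""
    diff = frame_led_on.astype(np.int16) - frame_led_off.astype(np.int16)
    return (diff > threshold).astype(np.uint8)        # 1 where a retro-reflector responds

# Synthetic example: a bright 10x10 marker visible only when the LEDs are on.
off = np.full((120, 160), 30, dtype=np.uint8)
on = off.copy()
on[50:60, 70:80] = 220
mask = extract_marker_mask(on, off)
print("marker pixels:", int(mask.sum()))              # -> 100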
Toward natural fiducials for augmented reality
Augmented Reality (AR) requires a mapping between the camera(s) and the world, so that virtual objects can be correctly registered. Current AR applications either use pre-prepared fiducial markers or specialist equipment or impose significant constraints on lighting and background. Each of these approaches has significant drawbacks. Fiducial markers are susceptible to loss or damage, can be awkward to work with and may require significant effort to prepare an area for Augmented interaction. Use of such markers may also present an imposition to non-augmented observers, especially in environments such as museums or historical landmarks. Specialist equipment is expensive and not universally available. Lighting and background constraints are often impractical for real-world applications. This paper presents initial results in using the palm of the hand as a pseudo-fiducial marker in a natural real-world environment, through colour, feature and edge analysis. The eventual aim of this research is to enable fiducial marker cards to be dispensed with entirely in some situations in order to allow more natural interaction in Augmented environments. Examples of this would be allowing users to "hold" virtual 3D objects in the palm of their hand or use gestures to interact with virtual objects.
3D reconstruction of outdoor environments from omnidirectional range and color images
Toshihiro Asai, Masayuki Kanbara, Naokazu Yokoya
This paper describes a 3D modeling method for wide-area outdoor environments based on integrating omnidirectional range and color images. In the proposed method, outdoor scenes are efficiently digitized by an omnidirectional laser rangefinder, which obtains 3D shape with high accuracy, and an omnidirectional multi-camera system (OMS), which captures high-resolution color images. Multiple range images are registered by minimizing the distances between corresponding points in the different range images. In order to register multiple range images stably, points on the planar portions detected from the range data are used in the registration process. The position and orientation acquired by RTK-GPS and a gyroscope are used as initial values for the simultaneous registration. The 3D model obtained by registering the range data is texture-mapped with textures selected from the omnidirectional images, taking into account texture resolution and occlusions in the model. In experiments, we have carried out 3D modeling of our campus with the proposed method.
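Registration by "minimizing the distances between corresponding points" is the classic ICP idea; the following Python sketch shows one point-to-point ICP iteration using a nearest-neighbour search and an SVD-based rigid fit. It is a toy version under simplifying assumptions: it omits the planar-region selection and the RTK-GPS/gyroscope initialization described above.

import numpy as np
from scipy.spatial import cKDTree

def icp_step(source, target):
    """Return (R, t) rigidly aligning `source` toward `target` via nearest neighbours."""
    idx = cKDTree(target).query(source)[1]
    matched = target[idx]
    src_c, tgt_c = source.mean(0), matched.mean(0)
    H = (source - src_c).T @ (matched - tgt_c)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                      # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = tgt_c - R @ src_c
    return R, t

rng = np.random.default_rng(1)
target = rng.normal(size=(500, 3))
source = target + np.array([0.2, -0.1, 0.05])     # translated copy of the target cloud
for _ in range(5):
    R, t = icp_step(source, target)
    source = source @ R.T + t
print("max residual after 5 iterations:", float(np.abs(source - target).max()))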
Systems II
Large-format 3D interaction table
Jonny Gustafsson, Christoffer Lindfors, Lars Mattsson, et al.
The first prototype of the Interaction Table was presented at the Electronic Imaging conference in 2004. In this paper we describe the progress made with the second prototype. The Interaction Table uses an autostereoscopic display, in reflection mode, to show a computer-generated three-dimensional image to a small number of people. The display is mounted as a table top so that the users can discuss freely while interacting with the image. Target applications include, e.g., product development, planning, surveying, military command, education, and medicine. Recently we have worked to increase the size fourfold, to 60 x 80 cm. Furthermore, the field of view for each viewing position has been extended up to 120 degrees by the use of a specially made holographic optical element. The second prototype can be driven by a number of PCs connected by an Ethernet network and/or by a single SGI Onyx computer with a number of separate graphics channels. For both the PC setup and the SGI system, specially designed software written in Java or commercially available CAD visualization software can be used.
Stereoscopic stimuli are not used in absolute distance evaluation to proximal objects in multicue virtual environment
Damien Paille, Andras Kemeny, Alain Berthoz
Many authors report that binocular vision plays an important role in the evaluation of the distance to scene objects. Furthermore, it has been observed that a narrow (<20°) angular observation field, i.e. field of view, causes underestimation of distances to objects in natural scenes. In a series of experiments we studied whether distances were underestimated for larger fields of view (60°, 90° and 120°) and whether binocular vision could correct distance estimation. We also studied distance estimation in virtual environments under the same observation conditions, as it is known that estimation may be poorer because of missing or biased visual stimuli. Observers had to estimate proximal distances in real and virtual (large field-of-view head-mounted display and power wall) scenes of a car interior. We found a strong underestimation of distances when observing proximal objects (≤50 cm) in a reduced field of view for both real and virtual scenes, and the more the field of view is reduced, the more observers underestimate distances. Furthermore, underestimation is stronger in virtual environments than in real ones for the same objects. We also compared distance estimations between monocular and binocular observation conditions and found no significant differences for any field of view. Our results show that binocular vision does not allow better distance estimation than monocular vision. These results suggest an unexpectedly weak effect of binocular vision on the estimation of distances to proximal objects in multi-cue environments.
ShadowLight: an immersive environment for rapid prototyping and design
ShadowLight is a virtual reality application that provides an immersive environment for multipurpose design and evaluation. Unlike traditional design tools that provide a built-in set of manipulators keyed to a particular set of design tasks, or evaluative systems that provide limited manipulation capabilities, ShadowLight offers a loosely defined environment that is capable of supporting the unique needs of both audiences. ShadowLight provides an atmosphere that is flexible enough to support rapid prototyping and design tasks, while at the same time permitting a richness of extensibility that allows scientific and industrial tasks to be performed using the same environment. ShadowLight defines only a basic interface that is extended through the development of plugins. The collection of plugins that is loaded at any given moment defines the capabilities available, and hence what the application “becomes” to its user. ShadowLight specifically seeks to address the current state-of-the-art in which several different environments must be employed for design, evaluation, and exploration tasks. In their place, ShadowLight attempts to provide a single environment that can be selectively extended to address the unique needs of a particular application field.
Systems III
Quantitative comparison of interaction with shutter glasses and autostereoscopic displays
In this paper we describe experimental measurements and a comparison of human interaction with three different types of stereo computer displays. We compare traditional shutter-glasses-based viewing with three-dimensional (3D) autostereoscopic viewing on displays such as the Sharp LL-151-3D display and the StereoGraphics SG 202 display. The method of interaction is a sphere-shaped "cyberprop" containing an Ascension Flock-of-Birds tracker that allows a user to manipulate objects by imparting the motion of the sphere to the virtual object. The tracking data is processed with OpenGL to manipulate objects in virtual 3D space, from which we synthesize two or more images as seen by virtual cameras observing them. We concentrate on the quantitative measurement and analysis of human performance for interactive object selection and manipulation tasks using standardized and scalable configurations of 3D block objects. The experiments use a series of progressively more complex block configurations that are rendered in stereo on the various 3D displays. In general, performing the tasks using shutter glasses required less time than using the autostereoscopic displays. While both male and female subjects performed almost equally fast with shutter glasses, male subjects performed better with the LL-151-3D display, while female subjects performed better with the SG 202 display. Interestingly, users generally had a slightly higher efficiency in completing a task set using the two autostereoscopic displays than with the shutter glasses, although the differences among the displays were relatively small for all users. There was a preference for shutter glasses over autostereoscopic displays in the ease of performing tasks, and the glasses were slightly preferred for overall image quality and stereo image quality. However, there was little difference in display preference regarding physical comfort and overall preference. We present some possible explanations of these results and point out the importance of the autostereoscopic "sweet spot" in relation to the user's head and body position.
Experiments in interactive panoramic cinema
Scott S. Fisher, Steve Anderson, Susana Ruiz, et al.
For most of the past 100 years, cinema has been the premier medium for defining and expressing relations to the visible world. However, cinematic spectacles delivered in darkened theaters are predicated on a denial of both the body and the physical surroundings of the spectators watching them. To overcome these deficiencies, filmmakers have historically turned to narrative, seducing audiences with compelling stories and providing realistic characters with whom to identify. This paper describes several research projects in interactive panoramic cinema that attempt to sidestep the narrative preoccupations of conventional cinema and instead are based on notions of space, movement and embodied spectatorship rather than traditional storytelling. Example projects include interactive works developed with the use of a unique 360-degree camera and editing system, as well as the development of panoramic imagery for a large projection environment with 14 screens on 3 adjacent walls in a 5-4-5 configuration, with observations and findings from an experiment projecting panoramic video on 12 of the 14 screens in a 4-4-4, 270-degree configuration.
Import and visualization of clinical medical imagery into multiuser VR environments
Andreas H. Mehrle, Wolfgang Freysinger, Ron Kikinis, et al.
The graphical representation of three-dimensional data obtained from tomographic imaging has been a central problem since this technology became available. Neither the representation as a set of two-dimensional slices nor the 2D projection of three-dimensional models yields satisfactory results. In this paper a way is outlined which permits the investigation of volumetric clinical data obtained from standard CT, MR, PET, SPECT or experimental very-high-resolution CT scanners in a three-dimensional environment within a few work steps. Volumetric datasets are converted into surface data (segmentation process) using the 3D-Slicer software tool, saved as .vtk files and exported as a collection of primitives in a common file format (.iv, .pfb). Subsequently these files can be displayed and manipulated in the CAVE virtual reality center. The CAVE is a walkable multiuser virtual room consisting of several walls on which stereoscopic images are projected by rear-projection beamers. Adequate tracking of the head position and separate image calculation for each eye yield a vivid impression for one or several users. With the use of a separately tracked 6D joystick, manipulations such as rotation, translation, zooming, decomposition or highlighting can be performed intuitively. The use of CAVE technology opens new possibilities, especially in surgical training ("hands-on effect") and as an educational tool (availability of pathological data). Unlike competing technologies, the CAVE permits a walk-through into the virtual scene but preserves enough physical perception to allow interaction between multiple users, e.g. gestures and movements. Training in a virtual environment may, on the one hand, considerably improve students' learning of complex anatomical findings and, on the other hand, allow unaccustomed views, such as those through a microscope or endoscope, to be rehearsed in advance. The availability of low-cost PC-based CAVE-like systems and the rapidly decreasing price of high-performance video beamers make the CAVE an affordable alternative to conventional surgical training techniques, without the limitations involved in handling cadavers.
Virtual Reality Works: Demonstration and Panel Discussion
Collaborative virtual environments art exhibition
This panel presentation will exhibit artwork developed in CAVEs and discuss how art methodologies enhance the science of VR through collaboration, interaction and aesthetics. Artists and scientists work alongside one another to expand scientific research and artistic expression and are motivated by exhibiting collaborative virtual environments. Looking towards the arts, such as painting and sculpture, computer graphics captures a visual tradition. Virtual reality expands this tradition to not only what we face, but to what surrounds us and even what responds to our body and its gestures. Art making that once was isolated to the static frame and an optimal point of view is now out and about, in fully immersive mode within CAVEs. Art knowledge is a guide to how the aesthetics of 2D and 3D worlds affect, transform, and influence the social, intellectual and physical condition of the human body through attention to psychology, spiritual thinking, education, and cognition. The psychological interacts with the physical in the virtual in such a way that each facilitates, enhances and extends the other, culminating in a 'go together' world. Attention to sharing art experience across high-speed networks introduces a dimension of liveliness and aliveness when we 'become virtual' in real time with others.
Poster Session
Stereoscopic player and stereoscopic multiplexer: a computer-based system for stereoscopic video playback and recording
Computer-based solutions are well suited for capturing and playing stereoscopic video content at high quality, because the constraints of TV standards need not be obeyed. We developed an application called Stereoscopic Player to play stereoscopic videos. Based on Microsoft's DirectShow, it can handle all major video sources, codecs, stereoscopic layouts and viewing methods. Usability was an important goal in the application design; several features simplify handling of stereoscopic videos. To free the user from choosing the proper settings, we introduced the concept of stereoscopic information files and an Internet-based service for automatic configuration. The Stereoscopic Multiplexer is a solution to capture stereoscopic content within applications compatible with WDM capture drivers. It takes frames from two 'real' capture devices, synchronizes them and passes the resulting stereo pairs to the application. If the cameras support numerical control of camera parameters, Stereoscopic Multiplexer also synchronizes these parameters. Supporting resolutions up to high definition and a wide range of devices, Stereoscopic Multiplexer and Stereoscopic Player are universal, affordable solutions for stereoscopic video recording and playback.
Human Factors
Thin-type natural three-dimensional display with 72 directional images
High-density generation of directional images can provide natural 3D images. Directional images are orthographic projections of a 3D scene into specific directions. A number of directional images projected into different horizontal directions are simultaneously displayed into the corresponding horizontal directions with nearly parallel rays. When the number of directional images becomes large enough and the display angle pitch becomes small enough, rays from a 3D scene are virtually reconstructed. A slanted lenticular sheet technique is used to construct a thin-type natural 3D display which can generate high-density directional images. A slanted lenticular sheet is attached to a high-resolution LCD panel to construct a 2D array of 3D pixels. One 3D pixel consists of 3M x N color sub-pixels to generate M x N rays proceeding in different horizontal directions. The lenticular sheet is slanted so as to differentiate all horizontal distances from the same-color sub-pixels to the axis of a lenticle in each 3D pixel. An LCD panel with a resolution of 3,840 x 2,400 is used to construct 320 x 400 3D pixels, and each 3D pixel emits rays into 72 different horizontal directions with a horizontal angle pitch of 0.38° by setting M = 12 and N = 6. The slant angle was determined by considering the directivity of the directional images.
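To illustrate how a slant can differentiate the horizontal offsets of the same-colour sub-pixels within an M x N block, the following Python sketch assigns each sub-pixel to one of M*N = 72 directions under an assumed geometry (lenticular pitch of M pixels and a slant of 1/N pixel per row). This is a generic illustration of the principle, not necessarily the exact mapping of the panel described above.

import numpy as np

M, N = 12, 6                 # pixels per lenticle horizontally, rows per 3D pixel
N_DIRECTIONS = M * N         # 72

def direction_index(x, y, channel):
    """Direction (0..71) served by colour sub-pixel `channel` of LCD pixel (x, y)."""
    offset = (x + channel / 3.0 + y / N) % M      # horizontal offset from the lenticle axis, in pixels
    return int(offset / M * N_DIRECTIONS) % N_DIRECTIONS

# All 72 directions are hit exactly once by the red sub-pixels of one 12 x 6 block:
dirs = {direction_index(x, y, 0) for x in range(M) for y in range(N)}
print(len(dirs))             # -> 72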
Poster Session
Three-dimensional visualization of human fundus from a sequence of angiograms
We explore the feasibility of reconstructing some three-dimensional (3D) surface information of the human fundus present in a sequence of fluorescein angiograms. The angiograms are taken during the same examination with an uncalibrated camera. The camera is still and we assume that the natural head/eye micro movement is large enough to create the necessary view change for the stereo effect. We test different approaches to calculate the fundamental matrix and the disparity map. A careful medical analysis of the reconstructed 3D information indicates that it represents the 3D distribution of the fluorescein within the eye fundus rather than the 3D retina surface itself because the latter is mainly a translucent medium. Qualitative evaluation is presented and compared with the 3D information perceived with a stereoscope. This preliminary study indicates that our approach could provide a simple way to extract 3D fluorescein information without the use of a complex stereo image acquisition setup.
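The abstract does not specify which feature detector or matching strategy was used; as a hedged sketch of the kind of two-view pipeline it describes, the following Python functions match features between two angiogram frames with OpenCV, estimate the fundamental matrix robustly, and compute a dense disparity map from a rectified pair. With two consecutive frames loaded as grayscale images, the estimated F and inlier correspondences could then be passed to cv2.stereoRectifyUncalibrated before the dense matching step.

import cv2
import numpy as np

def fundamental_from_pair(img1, img2):
    """Estimate F and inlier correspondences between two grayscale angiogram frames."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    pts1 = np.float32([k1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([k2[m.trainIdx].pt for m in matches])
    F, inliers = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.99)
    keep = inliers.ravel() == 1
    return F, pts1[keep], pts2[keep]

def dense_disparity(rect1, rect2):
    """Dense disparity on a rectified pair (semi-global block matching)."""
    sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=7)
    return sgbm.compute(rect1, rect2).astype(np.float32) / 16.0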
Analysis of the viewing parameters for curved lens array system based on integral imaging
Integral imaging (integral photography) is a three-dimensional display technique first proposed by Lippmann in 1908. Recently, integral imaging has attracted much attention as an autostereoscopic three-dimensional display technique because of its many advantages. However, the limited viewing angle is the primary disadvantage of integral imaging. To overcome this limitation, several methods have been proposed; among them, a method that uses a curved lens array has been reported recently. This method widens the viewing angle considerably compared with the conventional method. Generally, in integral imaging each elemental lens has its corresponding area, the elemental image region, on the display panel. To prevent image flipping, any elemental image that exceeds its corresponding area is discarded; therefore the number of elemental images is limited. In the curved lens array system, however, each elemental image does not exceed its corresponding area. This is due to the curved structure, and this characteristic widens the viewing angle. In this paper, we examine the proposed integral imaging system using a curved lens array and analyze the representative viewing parameters (viewing angle, image depth, image size, etc.) for the curved lens array system. The viewing region, in which the three-dimensional image can be displayed with a wide viewing angle, is closely related to the image depth and the corresponding viewing angle of the curved lens array system.
Block-wise MAP disparity estimation for intermediate view reconstruction
A dense disparity map is required in the application of intermediate view reconstruction from stereoscopic images. A popular approach to obtaining a dense disparity map is maximum a-posteriori (MAP) disparity estimation. The MAP approach requires statistical models for both a likelihood term and an a-priori term; normally, a Gaussian model is used. In this contribution, block-wise MAP disparity estimation using different statistical models is compared in terms of the Peak Signal-to-Noise Ratio (PSNR) of disparity-compensation errors and the number of corresponding matches. It was found that, among the Cauchy, Laplacian, and Gaussian models, the Laplacian model is the best for the likelihood term while the Cauchy model is the best for the a-priori term. Experimental results show that a reconstruction algorithm with MAP disparity estimation using these models can improve the image quality of the intermediate views reconstructed from stereoscopic image pairs.
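The following Python sketch shows a toy block-wise MAP disparity search that follows the model choices reported above: a Laplacian negative log-likelihood on the matching error and a Cauchy negative log-prior on the disparity difference to the neighbouring block. The block size, scale parameters and the greedy left-to-right scan are illustrative assumptions, not the authors' exact algorithm.

import numpy as np

def block_map_disparity(left, right, block=8, max_disp=32, b=4.0, gamma=2.0):
    """left, right: grayscale images (H, W); returns a per-block disparity map."""
    h, w = left.shape
    disp = np.zeros((h // block, w // block))
    for by in range(h // block):
        prev_d = 0.0
        for bx in range(w // block):
            y0, x0 = by * block, bx * block
            ref = left[y0:y0 + block, x0:x0 + block].astype(np.float32)
            best_d, best_cost = 0, np.inf
            for d in range(0, min(max_disp, x0) + 1):
                cand = right[y0:y0 + block, x0 - d:x0 - d + block].astype(np.float32)
                likelihood = np.abs(ref - cand).sum() / b          # Laplacian: -log p ~ |e| / b
                prior = np.log1p(((d - prev_d) / gamma) ** 2)      # Cauchy: -log p ~ log(1 + (dd/gamma)^2)
                if likelihood + prior < best_cost:
                    best_cost, best_d = likelihood + prior, d
            disp[by, bx] = prev_d = best_d
    return disp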
A new configuration of LCD-polarized stereoscopic projection system without light loss
Seung-Cheol Kim, Dong-Kyu Kim, Dae-Heum Kim, et al.
In this paper, a new LCD polarized stereoscopic projection method with improved light efficiency is suggested. In the proposed system, the two external polarizers are removed from the conventional LCD polarized stereoscopic projection system by effectively taking into account the inherent polarization properties of the LCD projectors, so that the light efficiency of the proposed system can be dramatically improved. From experimental results with Type-1 LCD projectors (NEC MT 1060R), it is found that the proposed system shows zero light loss in the polarization process, and the resultant stereoscopic image projected from this system is 213%, 75% and 300% brighter than those projected from the conventional Type-1 LCD projector-based, Type-2 LCD projector-based and Type-3 projector-based systems, respectively.
Stereoscopic painting with varying levels of detail
Efstathios Stavrakis, Margrit Gelautz
We present an algorithm for automatically generating stereoscopic paintings with varying levels of detail, and we describe the interactive system built around the algorithm that enables users to select the level of detail of the painting. In this context of interactivity we have modified our stereo painting algorithm, presented in previous work, in order to explore the idea of user-driven artistic level-of-detail selection and display. In particular, a stereo painting is composed of two canvases, one for each eye. These canvases contain multiple refining layers of brush strokes that compose the final painting. In past research, the underlying coarser layers were obscured and functioned only as the basis for progressively building the finer painting layers. In contrast, our interactive stereo viewing system enables the user to selectively toggle the visibility of finer strokes to reveal coarser representations of the artwork.
Coding of full-parallax multiview images
Torsten Palfner, Erika Muller
Autostereoscopic displays support vertical and horizontal head movements in front of the screen. Although the number of views is limited in the vertical and horizontal directions, the amount of data that has to be stored or transmitted for these multi-view images is huge compared to a single image. Therefore, compression algorithms have to be used to remove the data redundancy. In this paper, we propose a multi-level 4D DWT to transform multi-view images. This novel approach is able to concentrate the energy of a multi-view image much better than any two-dimensional or three-dimensional transform suggested so far, so much higher compression ratios can be reached. In our paper, we further focus on progressive coding of disparity maps. This is necessary because the estimated disparity map is only optimal for the target bit rate. In contrast to other approaches, a high-resolution depth map can be reconstructed at the end of the decoding process.
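A separable 4D DWT simply applies the same 1D analysis step along each of the four axes of the multi-view data set (vertical view index, horizontal view index, image rows, image columns). The following Python sketch shows one decomposition level with Haar filters for brevity; the abstract does not state which wavelet the authors use.

import numpy as np

def haar_1d(x, axis):
    """One Haar analysis level along `axis`; returns the lowpass and highpass halves."""
    x = np.moveaxis(x, axis, 0)
    lo = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    hi = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return np.moveaxis(lo, 0, axis), np.moveaxis(hi, 0, axis)

def dwt4d_level(data):
    """Apply the 1D step along all four axes, yielding 16 sub-bands."""
    bands = [data]
    for axis in range(4):
        bands = [half for band in bands for half in haar_1d(band, axis)]
    return bands                                      # bands[0] is the LLLL approximation

multiview = np.random.rand(4, 4, 64, 64)              # 4 x 4 views of 64 x 64 pixels
subbands = dwt4d_level(multiview)
print(len(subbands), subbands[0].shape)               # -> 16 (2, 2, 32, 32)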
Physical modeling of a microlens array setup for use in computer generated IP
Spyros S. Athineos, Nicholas P. Sgouros, Panagiotis G. Papageorgas, et al.
One of the most promising techniques for visualizing three-dimensional objects is Integral Photography (IP). Two common methods used in synthetic IP generation involve the development of simplified ray-tracing algorithms for elementary 3D objects or the realization of pinhole arrays. We present a technique utilizing POV-Ray's ray-tracing capabilities to generate synthetic, high-quality and photorealistic integral images by accurately modeling an actual microlens array along with the necessary optics. Our work constitutes a straightforward approach for translating a computer-generated 3D model to an IP image and a robust method to develop modules that can be easily integrated in existing ray tracers. The proposed technique simulates the procedure of single-stage IP capture for producing a real orthoscopic IP image. Full control is provided over geometry selection, size and refractive index of the elementary microlenses. Specifically, our efforts have been focused on the development of arrays with different geometries (square or hexagonal) in order to demonstrate the parameterization capabilities of the proposed IP setup. Moreover, detailed benchmarking is provided over a variety of sizes and geometries of microlens arrays.
Stereoscopic display which shows 3D natural scenes without contradiction of accommodation and convergence
The contradiction between convergence and accommodation of our eyes often causes eyestrain and sickness of the viewer in conventional stereoscopic display systems. Though several methods exist to solve this problem, they are expensive and are not expected to be commercially available in the near future. The authors propose a novel system that realizes electronic motion images without VR sickness, consisting of cylindrical lenses and electronic image displays with high-frequency striped patterns. When we see images with high-frequency patterns through cylindrical lenses, we perceive a change of blur and focal depth in proportion to the inclination angle of the high-frequency stripes. Thus this system can control the accommodation status of our eyes continuously with inexpensive devices. For natural scenes, the system uses a filter that blurs the high-frequency components contained in the scene except for those needed to induce the desired accommodation of our eyes. Thus this system can also induce proper accommodation of our eyes for most natural scenes containing various high-frequency components.
An innovative beamsplitter-based stereoscopic/3D display design
James L. Fergason, Scott D. Robinson, Charles W. McLaughlin, et al.
A novel stereoscopic/3D desktop monitor has been developed that combines the output of two active matrix LCDs (AMLCDs) into a stereo image through the use of a unique beamsplitter design. This approach, called the StereoMirror, creates a stereo/3D monitor that retains the full resolution, response time and chromaticity of the component displays. The resultant flicker-free image, when viewed with passive polarizing glasses, provides an unprecedented level of viewing comfort in stereo. The monitor is also bright enough to use in normal office lighting. The display has excellent optical isolation of the two stereo channels and a wide viewing angle suitable for multi-viewer use. This paper describes the architecture of the system and the principle of conservation of polarization that results in the full-definition stereo image. Optical performance results are also described. Practical considerations are discussed, including system interface requirements, conversion between stereo/3D and monoscopic viewing, and comparison to other stereo display approaches. The higher level of performance provided by the StereoMirror allows stereo viewing to become viable in new imaging markets as well as permitting more effective use of stereo in existing markets. These applications are discussed.
McLiflet: multiple cameras for light field live with thousands of lenslets
We have proposed a 3D live video system named LIFLET which stands for Light Field Live with Thousands of Lenslets. It is a computer graphics system based on the optical system of integral photography. It captures a dynamic 3D scene with a camera through an array of lenslets and synthesizes arbitrary views of the scene in real time. Though synthetic views are highly photo-realistic, their quality is limited by the configuration of the optical system and the number of pixels of the camera. This limitation has not been well discussed in our prior works. The contributions of this paper are as follows. First, we introduce a theoretical analysis based on geometrical optics for formulating the upper limit of spatial frequency captured by the system. Second, we propose a system which uses a combination of an array of lenslets and multiple cameras based on that theoretical analysis. We call it McLiflet since it is a multiple-camera version of LIFLET. The proposed system significantly improves the quality of synthetic views compared with the prior version which uses only one camera. This result confirms our theoretical analysis.
Human Factors
Accommodative load for stereoscopic displays
Masako Omori, Shin'ya Ishihara, Satoshi Hasegawa, et al.
In the present study, we examined the visual accommodation of subjects who were gazing fixedly at 3D images on two different displays: a cathode ray tube (CRT) viewed with special glasses and a liquid crystal display (LCD) viewed without special glasses. The subjects were three people, two aged 20 years and one aged 36 years, all with normal vision. Visual function was tested using a custom-made apparatus (Nidek AR-1100). The instrument objectively measured accommodative changes of the right eye under binocular, natural viewing conditions. The target shown to subjects moved away slowly and disappeared at a distance of about 3 m from the eye. The results suggested that it was easy and comfortable to focus on both the LCD and the CRT. When the subjects viewed the progressively receding target, their accommodation was about 0.8 D at the presumed furthest points, a level at which the ciliary muscle is relaxed. The accommodative power differed by about 1.5 D from the near to the far point. Thus, the ciliary muscle is repeatedly strained and relaxed while the subject views the moving target. In the present study, the subjects' accommodative amplitude changed as the target moved from the near to the far point.
Poster Session
Reduction of the distortion due to non-ideal lens alignment in lenticular 3D displays
A lenticular display system provides 3D images to a viewer without wearing glasses. For N-view lenticular display, N view images are N:1 sub-sampled and multiplexed to generate a multi-view image, Then, the generated image is allocated to the LCD pixel array. Since the lenticular sheet may not be exquisitely placed on an LCD panel without alignment error, and the rays from a viewer’s eye to lenticules on the LCD panel are not parallel, any view image observed from a multi-view image inevitably produces undesirable distortion. In this paper, we propose a novel method to alleviate the display distortion of each view image in the lenticular display. In this method, we first derive the relationship between pixel values on the LCD pixel array and the image to be observed at each viewing zone in terms of hardware parameters and viewer’s eye position. Based on this relationship, we analyze the distortion between the observed and original view images. Finally, we derive the compensation algorithm to minimize the distortion and generate high quality a 3D image. To verify the proposed scheme, we examine the displayed results from several 3D images of synthetic and real scenes. The experimental results show that the proposed scheme significantly reduces distortions and improves the image quality in the lenticular display.