Proceedings Volume 3639

Stereoscopic Displays and Virtual Reality Systems VI


View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 24 May 1999
Contents: 11 Sessions, 51 Papers, 0 Presentations
Conference: Electronic Imaging '99
Volume Number: 3639

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Human Factors in Stereoscopic Displays
  • Autostereoscopic Displays
  • New Developments
  • Depth and Disparity Processing
  • Special Session: Digital Stereoscopic and 3D Video--Communication and Entertainment for the Future
  • Computer-Based Stereoscopic Imaging
  • Stereoscopic Acquisition Systems
  • New Developments
  • Software Techniques and Architectures
  • Interfaces
  • Systems and Applications
  • Displays
Human Factors in Stereoscopic Displays
Stereo image quality: effects of spatio-temporal resolution
Lew B. Stelmach, Wa James Tam, Daniel V. Meegan
We explored the response of the human visual system to mixed-resolution stereo video sequences in which one eye's view was spatially or temporally low-pass filtered. It was expected that perceived quality, stereo depth, and perceived sharpness of sequences would be relatively unaffected by low-pass filtering, compared to the case where both eyes viewed a filtered image. Subjects viewed two 10-second stereo video sequences in which the right-eye frames were filtered horizontally (H) and vertically (V) at 1/2H, 1/2V, 1/4H, 1/4V, 1/2H1/2V, 1/4H1/2V, and 1/4H1/4V resolution. Temporal filtering was implemented for a subset of these conditions at 1/2 temporal resolution, or with dropped-and-repeated frames. Subjects rated the overall quality, sharpness, and overall sensation of depth. It was found that spatial filtering produced acceptable results: the overall sensation of depth was unaffected by low-pass filtering, while ratings of quality and of sharpness were biased towards the eye with the greater spatial resolution. By comparison, temporal filtering produced unacceptable results: field averaging and dropped-and-repeated frame conditions yielded images with poor quality and sharpness, even though perceived depth was relatively unaffected. We conclude that spatial filtering of one channel of a stereo video stream may be an effective means of reducing transmission bandwidth.
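The bandwidth-saving idea above (spatially low-pass filtering only one eye's view) can be sketched in a few lines of NumPy. The block-averaging filter and the 1/2-resolution factor below are illustrative stand-ins for the paper's filters, not the authors' exact implementation:

```python
import numpy as np

def lowpass_one_eye(left, right, factor=2):
    """Return the stereo pair with the right-eye view spatially
    low-pass filtered: block-average at reduced resolution, then
    upsample by pixel repetition. A crude stand-in for the paper's
    1/2H1/2V-style filtering, not the authors' exact filter."""
    h, w = right.shape
    crop = right[:h - h % factor, :w - w % factor]
    small = crop.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    blurred = np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)
    return left, blurred

left = np.arange(16.0).reshape(4, 4)
right = np.arange(16.0).reshape(4, 4)
l, r = lowpass_one_eye(left, right, factor=2)   # r is half-resolution in H and V
```

The left view passes through untouched, which is what keeps perceived depth and (per the paper's result) much of the perceived sharpness intact.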
Is monocular degradation visible in fused stereo images?
Daniel V. Meegan, Lew B. Stelmach, Wa James Tam
For efficient transmission of stereoscopic images over bandwidth-limited channels, human factors specialists have recognized that savings can be achieved by degrading one monocular component of a stereo pair and maintaining the other at the desired quality. The desired quality can be preserved as long as binocular vision assigns greater weight to the non-degraded component. The present study sought to determine if such over-weighting occurred when the monocular degradation included blocking artifacts common to DCT-based compression at low bit-rates. Stereo images with asymmetric amounts of degradation in the left and right components were matched to symmetric images on a metric of blocking artifact visibility. Underweighting of the higher-quality component was indicated because matches did not require that the degree of improvement in one component be offset by equivalent degradation of the other. These results suggest that blocking artifacts should not be present if monocular degradation is to be a successful means of bandwidth savings for stereo image transmission. There was also evidence that the type of weighting can depend upon which eye is shown the higher-quality component, suggesting that monocular degradation should not be applied to only one eye.
Kinder, gentler stereo
Mel Siegel, Yoshikazu Tobinaga, Takeo Akiya
Not only binocular perspective disparity, but also many secondary binocular and monocular sensory phenomena, contribute to the human sensation of depth. Binocular perspective disparity is notable as the strongest depth perception factor. However, means for creating it artificially from flat image pairs are notorious for inducing physical and mental stresses, e.g., 'virtual reality sickness'. Aiming to deliver a less stressful 'kinder gentler stereo (KGS)', we systematically examine the secondary phenomena and their synergistic combination with each other and with binocular perspective disparity. By KGS we mean a stereo capture, rendering, and display paradigm without cue conflicts, without eyewear, without viewing zones, with negligible 'lock-in' time to perceive the image in depth, and with a normal appearance for stereo-deficient viewers. To achieve KGS we employ optical and digital image processing steps that introduce distortions contrary to strict 'geometrical correctness' of binocular perspective but which nevertheless result in increased stereoscopic viewing comfort. We particularly exploit the lower limits of interocular separation, showing that unexpectedly small disparities stimulate accurate and pleasant depth sensations. Under these circumstances crosstalk is perceived as depth-of-focus rather than as ghosting. This suggests the possibility of radically new approaches to stereoview multiplexing that enable zoneless autostereoscopic display.
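The 'lower limits of interocular separation' exploited here follow from simple stereo display geometry: on-screen parallax scales linearly with the eye (or camera) separation. A minimal sketch of that standard relation, not a formula quoted from the paper:

```python
def screen_parallax(eye_sep_mm, view_dist_mm, obj_dist_mm):
    """On-screen parallax (positive = behind the screen) that places
    an object at obj_dist from the viewer, for a given eye or camera
    separation. Standard stereo display geometry, not a formula
    quoted from the paper."""
    return eye_sep_mm * (obj_dist_mm - view_dist_mm) / obj_dist_mm

full = screen_parallax(65.0, 1000.0, 2000.0)   # typical interocular separation
half = screen_parallax(32.5, 1000.0, 2000.0)   # reduced separation
```

Halving the separation halves every on-screen parallax for the same depth layout, which is the sense in which small separations can still deliver accurate depth ordering with far less stress (and less visible crosstalk).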
Evaluation of stereoscopic display with visual function and interview
The influence of a binocular stereoscopic (3D) television display on the human eye was compared with that of a 2D display, using visual function tests and interviews. A 40-inch double lenticular display was used for the 2D/3D comparison experiments. Subjects observed the display for 30 minutes at a distance of 1.0 m, viewing a combination of 2D and 3D material. The participants were twelve young adults. The main visual functions measured were visual acuity, refraction, phoria, near vision point, accommodation, etc. The interview consisted of 17 questions. Testing was performed just before watching, just after watching, and forty-five minutes after watching. Changes in visual function are characterized as prolongation of the near vision point, decrease of accommodation, and increase in phoria. Interview results for 3D viewing show much more visual fatigue than the 2D results. The conclusions are: 1) changes in visual function are larger and visual fatigue is more intense when viewing 3D images; 2) the evaluation method combining visual function tests and interviews proved very satisfactory for analyzing the influence of a stereoscopic display on the human eye.
Development of an autostereoscopic monitor and 2D-to-3D conversion for medical and surgical uses: requirements, clinical trials, and degree of acceptance
Melvin E. Levinson M.D., Goro Hamagishi, Haruhiko Murata
Previous attempts at popularizing stereoscopic devices for surgical use have been only minimally successful. In this paper, we point out what we perceive as past errors and misdirected designs. Although the perfect viewing medium has yet to be identified, certain basic principles and needs are summarized in order to enhance and promote acceptance of stereoscopic methods for surgical procedures, especially in the minimally invasive area. In addition, we present a newly developed autostereoscopic screen and accompanying 2D and 3D converter for medical/surgical use. A summary of the clinical testing performed and the degree of acceptance is also presented. Particular design requirements are unique to the surgical environment and these parameters are presented. The operator acceptance of the device and the value added requirements for stereoscopic endoscopic viewing are discussed.
Evaluation of stereoscopic video cameras synchronized with the movement of an operator's head on the teleoperation of the actual backhoe shovel
Masahiko Minamoto, Katsuya Matsunaga
Operator performance while using a remote-controlled backhoe shovel is described for three different stereoscopic viewing conditions: direct view, fixed stereoscopic cameras connected to a helmet-mounted display (HMD), and rotating stereo cameras slaved to the head orientation of a freely moving stereo HMD. Results showed that the head-slaved system provided the best performance.
Comparison of operation efficiency for the insert task when using stereoscopic images with additional lines, stereoscopic images, and a manipulator with force feedback
Katsuya Matsunaga, Kazunori Shidoji, Kenjiro Matsubara
It has been reported that operation efficiency for teleoperation using stereoscopic video images is lower than when using the naked eye in real environments. Here, the authors tried to improve the human-machine interface of this particular system to achieve higher operation efficiency for stereoscopic video images by adding other information. An experiment was carried out under the following four conditions: when the insert task was performed by subjects using conventional stereoscopic video images; when centering lines for the cylindrical objects and holes were added to the conventional stereoscopic video images; when force feedback was provided through the system manipulator as one object touched another; and when both the additional centering lines and force feedback were provided. The subject's task was to insert a cylindrical object into a round hole. Completion time was measured from the starting signal to the time when the object was inserted into the hole. Completion time when additional lines were given was shorter than when force feedback was provided and when no additional information was provided. It was concluded that additional visual information contributed more to the recognition of the space than did additional information about surface phenomena.
Analysis of eyepoint locations and accuracy of rendered depth in binocular head-mounted displays
Laurent Vaissie, Jannick P. Rolland, Grace M. Bochenek
Accuracy of rendered depth in virtual environments depends on the correct specification of the eyepoints from which a stereoscopic pair of images is rendered. Rendered depth errors should be minimized for any virtual environment, but minimizing them is critical if perception is the object of study in such environments, or if augmented reality environments are created where virtual objects must be registered with their real counterparts. Based on fundamental optical principles, the center of the entrance pupil is the eyepoint location that minimizes rendered depth errors over the entire field of view if eyetracking is enabled. Because binocular head-mounted displays (HMDs) typically have no eyetracking capability, the change in eyepoint location associated with eye vergence in HMDs is not accounted for. To predict the types and magnitudes of rendered depth errors that result, we conducted a theoretical investigation of rendered depth errors linked to natural eye movements in virtual environments for three possible eyepoint locations: the center of the entrance pupil, the nodal point, and the center of rotation of the eye. Results show that, while the center of rotation yields minimal rendered depth errors at the gaze point, it also yields rendered angular errors around the gaze point not previously reported.
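The rendered depth errors analyzed here arise when the eyepoint separation assumed at render time differs from the effective separation of the viewer's actual eyepoints. A hedged sketch using standard triangulation geometry; the numbers are illustrative, not the paper's HMD parameters:

```python
def perceived_depth(disparity_mm, eye_sep_mm, view_dist_mm):
    """Depth (distance from the viewer) of a point displayed with a
    given on-screen disparity, by triangulation. A sketch of the
    geometry underlying the paper's analysis, not its HMD-specific
    optical model."""
    return eye_sep_mm * view_dist_mm / (eye_sep_mm - disparity_mm)

# Disparity chosen so the point appears at 800 mm for 65 mm eyepoints...
d = 65.0 * (1.0 - 1000.0 / 800.0)
# ...but viewed with eyepoints effectively 60 mm apart, it lands nearer:
actual = perceived_depth(d, 60.0, 1000.0)
```

A few millimetres of eyepoint mislocation thus translate directly into a systematic depth error, which is why the choice among entrance pupil, nodal point, and center of rotation matters.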
Autostereoscopic Displays
Advanced autostereoscopic display for G-7 pilot project
Tomohiko Hattori, Takeo Ishigaki, Kazuhiro Shimamoto, et al.
An advanced autostereoscopic display is described that permits the observation of a stereo pair by several persons simultaneously, without special glasses or any kind of head-tracking device worn by the viewers. The system is composed of a right-eye system, a left-eye system, and a sophisticated head-tracking system. In each eye system, a transmissive color liquid crystal imaging plate is used with a special backlight unit. The backlight unit consists of a monochrome 2D display and a large-format convex lens, and distributes light only to the correct eye of each viewer. The right-eye perspective system is combined with the left-eye perspective system by a half mirror in order to function as a time-parallel stereoscopic system. The viewer's IR image is taken through and focused by the large-format convex lens and fed back to the backlight as a modulated binary half-face image. The autostereoscopic display employs this through-the-lens (TTL) method for accurate head tracking. The system was operated as a stereoscopic TV phone between the Department of Telemedicine at Duke University and the Department of Radiology at Nagoya University School of Medicine, using a high-speed digital line of the GIBN. Applications are also described in this paper.
Micropolarizer-based multiple-viewer autostereoscopic display
Stephen A. Benton, Thomas E. Slowe, Adam B. Kropp, et al.
Autostereoscopic displays effectively 'steer' different image-bearing bundles of light rays to the two eyes of the observer(s). Typically, each observer has to find an imaginary point or line in space upon which to place her nose, or an active system tracks a single observer, aiming the imaginary point or line toward her nose via some sort of face tracking scheme. This paper describes a system of the second type that is specifically adapted to accommodate several arbitrarily-located viewers while maintaining good optical isolation of a stereoscopic pair of images, and while registering them so that the consonance of accommodation and convergence occurs at the front surface of the display for maximum comfort during interaction.
Image preparation for 3D LCD
Cees van Berkel
The simplicity and inherent robustness of the Philips 3D-LCD, both in manufacturing and usage, make it highly suitable for a cost-effective, mass-market autostereoscopic display. For successful adoption in a wide range of applications, efficient 3D image preparation is very important. A generic expression for the relation between LCD pixels and the multiple perspective views is derived that can be used in the image preparation for different 3D-LCD systems. This paper then describes two approaches to 3D image preparation. One is an intuitive graphical user interface; the second is at the source-code level, as an extension to the existing OpenGL 3D graphics API. Using the latter we examine the computational overhead of the 3D image preparation process.
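The 'generic expression for the relation between LCD pixels and the multiple perspective views' is commonly written as a per-subpixel view assignment for a slanted lenticular sheet. The sketch below uses the widely cited form of van Berkel's mapping; the slant (tan α = 1/6) and zero offset are illustrative assumptions, not values from this paper:

```python
def view_number(k, l, n_views, slant_tan=1.0/6.0, k_off=0):
    """View assigned to LCD subpixel column k on row l for a slanted
    lenticular 3D-LCD. This is the widely cited form of van Berkel's
    pixel-to-view mapping; slant and offset values here are
    illustrative assumptions. Fractional results are normally rounded
    or interpolated between adjacent views."""
    return (k + k_off - 3 * l * slant_tan) % n_views

row0 = [view_number(k, 0, 7) for k in range(7)]  # views cycle along a row
row1 = [view_number(k, 1, 7) for k in range(7)]  # the slant shifts the next row
```

The interleaving loop in an image-preparation pipeline simply copies, for each subpixel, the corresponding sample from perspective view `view_number(k, l, n_views)`.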
Viewing-point detection system using specific image processing for eye-position tracking autostereoscopic display
Hiroshi Imai, Susumu Tsujikawa, Masao Imai
We have developed a viewing-point detection system using specific image processing for an eye-position tracking autostereoscopic display. The system consists of a CCD camera with a coaxial IR illuminator and an image processing unit. The image processing unit executes two specific processes. The first process enhances the viewer's pupils in the camera image to obtain stable pupil extraction. The second process predicts the viewing point in the event the viewer blinks. In applying the system to an eye-position tracking autostereoscopic display, it was confirmed that the system offers stable and continuous tracking performance.
Design and perception testing of a novel 3D autostereoscopic holographic display system
Grace M. Bochenek, Thomas J. Meitzler, Paul L. Muench, et al.
US Army Tank-Automotive Command researchers are in the early stages of developing an autostereoscopic, 3D holographic visual display system. The present system uses holographic optics, low- and high-resolution projectors, and computer workstation graphics to achieve real-time 3D user interactivity. This system is being used to conduct 3D visual perception studies for the purpose of understanding the effects of 3D on military target visual detection, and as an alternative technique to CAD model visualization. The authors describe the present system configuration and operation, some of the technical limitations encountered during system development, and the results of a human perception test that compared subject response times, hit rates, and miss rates of visual detection when subjects used conventional 2D methods versus the 3D holographic image produced by the holographic display system. The results of this study revealed that the 3D HOE system increased the accuracy of perception of moving vehicles. This research has provided some insights into which technology will be best for presenting 3D simulated objects to subjects or designers in the laboratory.
Multiperspective look-around autostereoscopic projection display using an ICFLCD
DTI has demonstrated a laboratory model of a multiple-zone autostereoscopic display with look-around capability and a 21-inch diagonal screen. The display exploits the extremely fast address rates and liquid crystal response speeds associated with ICFLCDs to generate eight to twenty-four images during the 1/60th of a second that a single image is usually displayed. These images consist of the three color components of eight different perspective views of a scene. Optics are used to magnify the images and project them onto a special lens-and-diffuser screen, which in turn directs light from each image into a different viewing zone in front of the display. Since the perspective views are time multiplexed, each possesses the full resolution of the ICFLCD. An advanced method of generating gray scale using the digital ICFLCD in combination with a time-varying light source was also demonstrated. For the most part, off-the-shelf components were used to construct the key elements of the system, including the projection screen and the precision magnifying optics, promising easy commercialization and low production cost. The display is being developed into a prototype, to be followed by production of 1024 X 768 or 1280 X 1024 desktop rear-projection displays for workstation applications.
Stereoscopic display using a 1.2-m diameter stretchable membrane mirror
Stuart McKay, Steven Mason, Leslie S. Mair, et al.
A glasses-free stereoscopic display has been developed in which a large-diameter concave Stretchable Membrane Mirror (SMM) is used both as a viewing screen and as an optical element. SMMs offer considerable advantages over traditional imaging optics in terms of reduced weight and cost, and are revolutionary in their ability to vary their radius of curvature to give a wide range of mirror f-numbers. This is achieved by controlling the magnitude of an applied pressure difference acting over an edge-clamped metallized polyester membrane, which forms the basis of an SMM. A stereoscopic display has been developed in which a 1.2-m diameter SMM is used. Stereo pairs are projected at the surface of the mirror and viewed through a pair of virtual viewing windows. Such a configuration minimizes light loss, giving a very bright image off the specularly reflecting surface of the SMM. The image can be formed in front of, on, or behind the plane of the SMM, making both real and very large virtual images possible. Several formats, ranging from simple stereo photographs to a live stereo video feed in a telepresence display, have been viewed using this system.
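The claim that varying the radius of curvature gives a wide range of mirror f-numbers follows from elementary mirror optics: a concave mirror's focal length is half its radius of curvature. A small illustrative computation (the radii chosen are assumptions, not measured SMM values):

```python
def mirror_f_number(radius_of_curvature_m, diameter_m):
    """f-number of a concave mirror: focal length (half the radius
    of curvature) divided by the aperture diameter. The membrane's
    pressure-to-curvature behaviour itself is not modelled here."""
    return (radius_of_curvature_m / 2.0) / diameter_m

slow = mirror_f_number(6.0, 1.2)   # flatter membrane
fast = mirror_f_number(3.0, 1.2)   # more strongly curved membrane
```

Halving the radius of curvature halves the f-number for a fixed 1.2-m aperture, which is the flexibility the pressure control provides.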
Non-glasses-type stereoscopic display system based on polarization
Jung-Young Son, Vadim V. Smirnov, You Seek Chun, et al.
The problems related to a dichroic-type polarization filter plate, used as a spatial image separator for a non-glasses-type stereoscopic display device based on a liquid crystal display panel, are discussed. The filter plate consists of many parallel line filters. Each line filter directs light of one polarization only to its corresponding pixel lines in the display panel. The filter plate is cemented to the display panel and back-illuminated by a halogen lamp through two cross-polarized polarizers placed side by side. Two Fresnel lenses are used: one before the filter plate to collimate the illuminating beam, and one after the liquid crystal display panel to form images of the two polarizers in front of the panel as viewing zones.
Real-time 3D display with acousto-optical deflectors
Jung-Young Son, Vadim V. Smirnov, L. N. Asnis, et al.
A laser volumetric display system based on the sequential scanning of 2D images on a rotating diffusing screen is introduced. The system can generate volume images having dimensions of 120 X 120 X 100 mm3 with 250,000 resolvable volume pixels. The images are displayed at a frame rate of 15 to 25 Hz. The distance between the projection objective and the screen is 3 m. The volumetric images generated by the system are clear and sharp.
New Developments
Digital stereoscopic imaging
The convergence of inexpensive digital cameras and cheap hardware for displaying stereoscopic images has created the right conditions for the proliferation of stereoscopic imaging applications. One application, which is of growing importance to museums and cultural institutions, consists of capturing and displaying 3D images of objects at multiple orientations. In this paper, we present our stereoscopic imaging system and methodology for semi-automatically capturing multiple orientation stereo views of objects in a studio setting, and demonstrate the superiority of using a high resolution, high fidelity digital color camera for stereoscopic object photography. We show the superior performance achieved with the IBM TDI-Pro 3000 digital camera developed at IBM Research. We examine various choices related to the camera parameters and image capture geometry, and suggest a range of optimum values that work well in practice. We also examine the effect of scene composition and background selection on the quality of the stereoscopic image display. We demonstrate our technique with turntable views of objects from the IBM Corporate Archive.
Stereoscopic viewer using a volume holographic memory
We present a stereoscopic vision system in which stereoscopic image pairs are recorded in a volume hologram. If the two stereoscopic images and a reference beam are of the same wavelength, a stationary interference pattern is formed in the volume hologram. When the reference beam satisfying the Bragg matching condition illuminates the hologram for reconstruction, the stereoscopic images are projected onto the left and right display planes for stereoscopic viewing. We present experimental results of stereoscopic pair recording and readout with a 45-degree-cut Fe:LiNbO3 crystal.
Depth and Disparity Processing
New stereo matching algorithm
Yasser Abd-Elbasset Ahmed, Hossam Afifi, Gerardo Rubino
This paper presents a new algorithm for stereo matching. The main idea is to decompose the original problem into independent, hierarchical, and more elementary problems that can be solved faster without any complicated mathematics, using BBD. To achieve that, we use a new image feature called the 'continuity feature' instead of classical ones. This feature can be extracted from any kind of image by a simple process and without using a searching technique. A new matching technique is proposed to match the continuity feature. The new algorithm resolves the main disadvantages of feature-based stereo matching algorithms.
Enhancement of viewer comfort in stereoscopic viewing: parallax adjustment
One of the major deficiencies of stereoscopic visualization, viewer discomfort, can be caused by the non-robustness of human perception or by excessive 3D cues in the viewed images. In order to minimize this discomfort, the amount of parallax within each stereo pair needs to be reduced. Similarly to the case of 'continuous look-around', parallax adjustment requires the knowledge of images from virtual cameras. In the case of parallel geometry, the virtual cameras are located on the line between the true cameras. Since in a general scenario no constraint should be posed on the complexity of the viewed scene, 3D modeling techniques cannot be used. We evaluate the usefulness of parallax adjustment using two view reconstruction methods based on disparity-compensated linear interpolation: a quadtree method with block splitting adapted to object boundaries, and a pixel-based method. For all but the most complex stereoscopic images tested, both algorithms performed very well, especially the pixel-based approach. In terms of the overall usefulness of parallax adjustment, initial tests have shown a very favorable viewer response; the perceived depth was judged to vary smoothly from zero through natural 3D to exaggerated 3D. The adjustment was convincing although not completely free of distortions.
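Disparity-compensated linear interpolation, the basis of both view reconstruction methods evaluated here, can be sketched on a single rectified scanline. This minimal version omits the quadtree/pixel-based refinements and all occlusion handling; the hole it leaves in the output shows why such handling is needed:

```python
import numpy as np

def interpolate_view(left, right, disp, a):
    """Disparity-compensated linear interpolation of a virtual view
    at position a in [0, 1] between the left (a=0) and right (a=1)
    images of one rectified scanline. disp[x] maps left pixel x to
    right pixel x + disp[x]. Occlusion handling is omitted, so
    unmapped output pixels stay 0 (holes)."""
    out = np.zeros(len(left))
    for x in range(len(left)):
        xr = x + disp[x]
        xv = int(x + a * disp[x] + 0.5)  # round half up: position in virtual view
        if 0 <= xr < len(right) and 0 <= xv < len(out):
            out[xv] = (1 - a) * left[x] + a * right[xr]
    return out

left  = [10.0, 20.0, 30.0, 40.0]
right = [0.0, 10.0, 20.0, 30.0]   # same scene shifted right by one pixel
disp  = [1, 1, 1, 1]
mid = interpolate_view(left, right, disp, 0.5)
```

Moving `a` between 0 and 1 moves the virtual camera along the baseline, which is exactly how parallax can be reduced smoothly from natural 3D down to zero.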
Disparity estimation hardware for real-time stereoscopic applications
George E. Karastergios, Dimitris S. Kalivas, George D. Papadopoulos, et al.
In this paper we present the implementation of a hardware component that is used to calculate the disparity field of two stereoscopic video signals. This component is a major part of an extensive 3D videoconferencing system that was implemented under the European ACTS/PANORAMA project. In combination with a head tracker and an interpolator, intermediate views can be synthesized so that the viewer has a motion-parallax perception.
Stereo display of nested 3D volume data using automatic tunnelling
Roger J. Hubbold, David J. Hancock
We describe a new technique for visualizing complex, nested features in multivariate volume data sets, such as those commonly found in medical imaging applications. Our work focuses on radiation therapy planning, where the problem is to locate 'hot' and 'cold' spots in a radiation dose field, inside a target tumor and surrounding organs. It is essential to visualize these different features simultaneously in order to understand their spatial relationships. To guarantee that certain key features inside a volume are visible, we dynamically create a series of circular tunnels through the enclosing volumes. As the viewpoint is rotated, the tunnel orientations remain aligned with the viewing direction. This guarantees visibility, while ensuring that a minimal amount of the enclosing volumes is removed, so retaining important contextual spatial cues. However, the changing tunnel orientations do not accord with our normal, everyday experience, leading to problems of interpretation. When viewing monoscopic images, users reported a variety of effects, such as difficulty in perceiving correct depths, as well as features which seem to swim independently during rotation. In this paper we report visualizing these volumes on a high-quality, head-tracked, autostereoscopic display. Subjects in our test demonstrated a clear preference for stereoscopic viewing as a way to resolve these ambiguities.
Error-tolerant interpolation of intermediate views for real-time applications
Matthias Lueck, Hartmut Schroeder
Many applications in 3D imaging demand the calculation of intermediate views. In this paper a system is proposed that consists of a predictive disparity estimation module and an interpolation module based on non-linear filter techniques. The predictive disparity estimation algorithm has low computational cost and gives smooth and dense estimation results. A general problem of block-based estimation algorithms is object edges, because occlusion occurs in these regions and the disparity maps are inaccurate there. A postprocessing algorithm of low complexity avoids blocking artifacts, which are very annoying in regions of disparity discontinuities. The interpolation is based on rank-order filters called weighted median filters. By an appropriate choice of the weighted median filter root structures, main picture elements are interpolated correctly even if faulty disparity vectors occur. With this technique it is possible to avoid interpolation artifacts resulting from inaccuracies of disparity maps estimated by block-based algorithms. In this paper the development of weighted median filters for these applications and the adaptation of the filters to the synthesized viewpoint are presented.
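The weighted median operator at the heart of the interpolation module is simple to state: each sample is replicated according to its weight before an ordinary median is taken. A minimal sketch; the weights below are illustrative, not the paper's filter masks:

```python
def weighted_median(values, weights):
    """Weighted median: each sample is conceptually replicated by its
    integer weight before an ordinary median is taken. The weights
    used in the example are illustrative, not the paper's masks."""
    expanded = []
    for v, w in zip(values, weights):
        expanded.extend([v] * w)
    expanded.sort()
    n = len(expanded)
    if n % 2:
        return expanded[n // 2]
    return 0.5 * (expanded[n // 2 - 1] + expanded[n // 2])

plain    = weighted_median([5, 100, 7], [1, 1, 1])  # ordinary median
weighted = weighted_median([5, 100, 7], [1, 3, 1])  # heavy centre sample wins
```

This robustness to a few bad samples is what lets the filter interpolate main picture elements correctly even when some disparity vectors are faulty.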
Multipass stereo matching algorithm using high-curvature points on image profiles
Yuan-Chih Peng, Sheng-Jyh Wang
In this paper, we propose a new algorithm to establish correspondence for stereo images. This algorithm applies two passes of feature-based matching to establish a coarse disparity map first. Then, by carefully matching the intensity information, a dense disparity map is generated. In this algorithm, instead of the commonly used 'edge' points, the high-curvature points of image profiles are chosen as the feature points to be matched. These high-curvature points can be easily extracted from the images by checking the second derivatives of the intensity profiles. These high-curvature features faithfully capture the major characteristics of the profile shape and can thus avoid some ambiguities in feature matching. A dissimilarity measure, closely related to the profile shape, is then defined using these feature points. To reduce the ambiguity in local matching, the dynamic programming technique is used to achieve a globally optimal correspondence. After the feature matching, an intensity-based approach is used to establish a dense disparity map. Both the sum-of-squared-difference method and the dynamic programming method are used. By carefully checking the consistency between intensity continuity and disparity continuity, a fairly accurate disparity map can be efficiently generated even if the images are lacking in texture.
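Extracting high-curvature feature points 'by checking the second derivatives of the intensity profiles' can be illustrated on a 1-D profile. The threshold below is an assumption for the example, not a value from the paper:

```python
import numpy as np

def high_curvature_points(profile, thresh=5.0):
    """Indices where the magnitude of the discrete second derivative
    of a 1-D intensity profile exceeds a threshold. The threshold is
    an assumption for this example, not a value from the paper."""
    p = np.asarray(profile, dtype=float)
    d2 = np.zeros_like(p)
    d2[1:-1] = p[2:] - 2 * p[1:-1] + p[:-2]   # discrete second derivative
    return np.flatnonzero(np.abs(d2) > thresh)

# A linear ramp has zero curvature; the corner where it levels off is found.
pts = high_curvature_points([0, 10, 20, 30, 30, 30, 30])
```

Unlike a first-derivative edge detector, this responds at corners of the profile shape rather than along whole ramps, which is what reduces matching ambiguity.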
Special Session: Digital Stereoscopic and 3D Video--Communication and Entertainment for the Future
Stereoscopic and 3D visual communications for the future
Ralf Buschmann
This paper motivates the development of stereoscopic and 3D visual communication systems for the future, and it presents final research results from the European ACTS AC092 PANORAMA project.
Stereo/multiview video encoding using the MPEG family of standards
Compression of stereoscopic and multiview video data is important, because the bandwidth necessary for storage and transmission increases linearly with the number of camera channels. This paper gives an overview of techniques that ISO's Moving Pictures Experts Group has defined in the MPEG-2 and MPEG-4 standards, or that can be applied in the context of these standards. A good tradeoff between exploitation of spatial and temporal redundancies can be obtained by application of hybrid coding techniques, which combine motion-compensated prediction along the temporal axis with 2D DCT transform coding within each image frame. The MPEG-2 multiview profile extends hybrid coding towards exploitation of inter-view-channel redundancies by implicitly defining disparity-compensated prediction. The main feature of the new MPEG-4 multimedia standard with respect to video compression is the possibility to encode objects with arbitrary shape separately. As one component of a segmented object's shape, it shall be possible to encode a dense disparity map, which can be accurate enough to allow generation of alternative views by projection. This way, a very high stereo/multiview compression ratio can be achieved. While the main application area of the MPEG-2 multiview profile shall be in stereoscopic TV, it is expected that multiview aspects of MPEG-4 will play a major role in interactive applications, e.g. navigation through virtual 3D worlds with embedded natural video objects.
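Disparity-compensated prediction, as enabled by the MPEG-2 multiview profile, is essentially motion-compensated prediction with the block search carried out between view channels. A toy 1-D block-matching sketch; real encoders use 2-D macroblocks and rate-distortion optimized decisions:

```python
import numpy as np

def best_disparity(left_row, right_row, x, block=3, max_d=3):
    """Disparity minimising the sum of absolute differences (SAD)
    between a block around x in the left row and shifted candidate
    blocks in the right row. A toy 1-D version of disparity-
    compensated prediction; not an MPEG reference implementation."""
    half = block // 2
    ref = left_row[x - half: x + half + 1]
    best_d, best_sad = 0, float('inf')
    for d in range(-max_d, max_d + 1):
        start = x + d - half
        if start < 0:
            continue                      # candidate block off the left edge
        cand = right_row[start: start + block]
        if len(cand) != block:
            continue                      # candidate block off the right edge
        sad = float(np.abs(ref - cand).sum())
        if sad < best_sad:
            best_d, best_sad = d, sad
    return best_d

left  = np.array([0, 0, 9, 9, 9, 0, 0, 0], dtype=float)
right = np.array([0, 0, 0, 0, 9, 9, 9, 0], dtype=float)  # pattern shifted by +2
d = best_disparity(left, right, 3)
```

The encoder then transmits only the disparity vector and the (here zero) prediction residual, which is where the inter-view bandwidth saving comes from.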
Architecture for digital 3D broadcasting
Philip V. Harman
Recent discussions with a number of leading cable television broadcasters have indicated a willingness to include 3D capabilities in their rollout of digital television services. Such a service would represent a landmark in the evolution of 3D. While this will provide tremendous stimulation to numerous fledgling 3D industries throughout the world, the development of a digital 3D broadcast architecture that would meet the stringent requirements of the cable companies will not be a simple exercise. During discussion, the cable companies proposed the following specification for a digital cable 3D service: 1) The 3D service must be totally 2D compatible; 2) The fact that 3D is being transmitted should not be detectable by the 2D viewer and should not affect the 2D service in any way; 3) No additional bandwidth will be required for the 3D service; 4) The 3D service must be totally compatible with all existing and future 2D systems and equipment; 5) The 3D system must cater for existing stereoscopic display systems and future 'multiple view' displays; 6) An unlimited supply of high-quality, low-cost 3D material must be available. Additionally, should the 3D service prove to be economically viable: 7) The service must be capable of being upgraded to accept stereoscopic video images. In case 7, 2D compatibility would not be required and a number of the other restrictions would be relaxed. An architecture that meets these requirements is described.
Perceptual basis of stereoscopic video
Lew B. Stelmach, Wa James Tam, Daniel V. Meegan
We reviewed studies of viewers' reactions to stereoscopic image sequences. The dimensions considered were perceived image quality, sharpness, depth and naturalness. Stereoscopic displays produced a reliable and consistent increase in the perceived depth of image sequences. By comparison, improvements on other dimensions were not as robust. The key conclusion is that viewers' responses to stereoscopic image sequences vary along a number of independent dimensions. Overall preference for stereoscopic images will occur only if the enhanced depth perceived in a stereoscopic image sequence is not accompanied by distortions created by excessive disparity, ghosting/crosstalk, or conflicts between monoscopic and stereoscopic depth information.
Real-time synthesis of digital multiple-viewpoint stereoscopic images
Emile A. Hendriks, Andre Redert
In this paper we address the real-time synthesis of virtual views for multi-viewpoint stereoscopic systems. For the viewpoint-dependent generation of virtual views, two approaches can be adopted: a dynamic 3D-model reconstruction of the recorded scene from which the desired virtual views are derived, or a disparity-compensated interpolation strategy. The latter approach is the most feasible for real-time systems. After a brief review of possible hardware choices we describe, as an example of the disparity-compensated strategy, a dynamic-programming-based disparity estimation algorithm and a disparity-compensated interpolation algorithm. For efficient implementation an alternative disparity representation format is presented. Both algorithms are specifically designed for hardware implementation to meet real-time constraints. Hardware architectures and implementation considerations are given. Simulations show high-quality results for typical teleconferencing scenes. A similar version of the interpolation algorithm has been realized and successfully demonstrated in the PANORAMA project.
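The core of a disparity-compensated interpolation strategy can be sketched on a single scanline. This is a minimal forward-warping sketch under invented data, not the PANORAMA algorithm itself; occlusion handling and the paper's disparity representation are omitted:

```python
import numpy as np

def interpolate_view(left, right, disparity, alpha):
    """Disparity-compensated interpolation of a virtual view on one
    scanline. alpha=0 reproduces the left view, alpha=1 the right view.
    disparity[x] is the horizontal shift from a left-view pixel at x
    to its match in the right view."""
    w = left.shape[0]
    virtual = np.zeros(w)
    weight = np.zeros(w)
    for x in range(w):
        # the feature at left-view position x appears at x - alpha*d(x)
        # in the virtual view; blend the two matched intensities
        xv = int(round(x - alpha * disparity[x]))
        if 0 <= xv < w:
            xr = min(max(x - int(disparity[x]), 0), w - 1)
            virtual[xv] += (1.0 - alpha) * left[x] + alpha * right[xr]
            weight[xv] += 1.0
    filled = weight > 0  # unfilled positions are occlusion holes
    virtual[filled] /= weight[filled]
    return virtual, filled

left = np.arange(16.0)
right = np.roll(left, -2)     # constant disparity of 2 pixels
d = np.full(16, 2.0)
mid, ok = interpolate_view(left, right, d, 0.0)
print(np.allclose(mid, left))  # True: alpha=0 recovers the left view
```

Sliding alpha between 0 and 1 yields the continuum of intermediate viewpoints; a hardware version would replace the Python loop with a pipelined per-pixel warp.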
Digital signal processing for the analysis and coding of stereoscopic and 3D video
Michael G. Strintzis, Sotiris Malassiotis, Ioannis Kompatsiaris
The main problem in stereo vision is the reconstruction of the 3D surface of objects in the scene from a pair of stereo images. This is performed by establishing correspondence between homologous points in each stereo image. Once correspondence is established it is straightforward to compute the coordinates of the corresponding 3D point. Although important image analysis tasks like scene segmentation and 3D motion estimation may be performed using a monocular sequence, one may achieve better results by exploiting depth information from a stereo image sequence. Especially if depth estimation is combined with 3D motion estimation, a considerable improvement in the accuracy of the estimates is expected. In this paper we present algorithms for the analysis of stereoscopic image sequences. Apart from the usefulness of stereoscopic imaging in computer vision, its application in advanced telecommunications is also of prime importance. Since the bandwidth required to transmit both stereoscopic image streams is large, efficient coding techniques should be employed to reduce the data rate. In this paper we present an object-based approach that lacks the disadvantages of traditional block-based techniques. Also, the ability of this algorithm to describe a scene in a structural way, in contrast to traditional waveform-based coding techniques, opens new areas of applications.
Computer-Based Stereoscopic Imaging
Converting existing applications to support high-quality stereoscopy
Robert A. Akka
With the recent standardization of OpenGL-based windowed stereoscopy support on the PC platform, numerous developers of professional CAD/CAM/CAE software are now interested in adding stereoscopy support to their products. StereoGraphics Corporation is currently helping software developers achieve this stereoscopy support.
Interfacing shuttering-type stereoscopic hardware with Windows/NT workstations
Lenny Lipton, Jeff Halnon
StereoGraphics Corporation has over a decade of experience providing stereoscopic display products for use with UNIX workstations and has become the leading vendor in that marketplace. To achieve this kind of marketplace acceptance, the company had to go to great lengths to solve a myriad of interface issues that arose because of a lack of standardization. More recently, the company developed and marketed a PC product that operates under DOS and Win95. This product specifically solves the problem of using shuttering eyewear for a flicker-free result when used in conjunction with non-stereo-ready video boards, i.e., those that do not operate at a high field rate. With the growing acceptance of WinNT workstations in applications such as mechanical CAD, a design effort was undertaken to create a new family of products that would provide ease of interface with this new workstation infrastructure. This paper describes the development of StereoGraphics products designed specifically for the WinNT workstation. The interface task is greatly simplified for machines employing video boards which include quad buffering, operate at a high field rate, and use the VESA Standard Connector for Stereoscopic Display Hardware. The new products include: an infra-red emitter compatible with CrystalEyes, low-cost wired shuttering eyewear of high optical quality, and an emitter that will work with non-stereo-ready video boards. In addition, the company's polarization modulator, the ZScreen, which uses passive eyewear, also interfaces with the new NT infrastructure.
PC-based stereoscopic video walkthrough
Andrew J. Woods, Douglas Offszanka, Greg Martin
This paper describes a computer program which allows a user to semi-interactively navigate through a pre-recorded environment. The experience is achieved using a set of stereoscopic video sequences which have been recorded while walking around the various pathways of the chosen environment. In the completed system, the stereoscopic video sequences are played back in a sequence such that the operator is given the illusion of being able to continuously and semi-interactively navigate through the environment in stereoscopic 3D. At appropriate decision points the operator is given the option of choosing which direction to continue moving - thereby providing a level of interactivity. This paper discusses the combination of two recent advances in computer technology to transfer an existing video-disk based stereoscopic video walkthrough to run entirely on a PC. The increased computation power of PCs in recent years has allowed reasonably high-quality video playback to be performed on the desktop PC with no additional hardware. Also, Liquid Crystal Shutter (LCS) glasses systems are now widely available, allowing high-quality stereoscopic images to be viewed easily on a PC. The demonstration system we have implemented interfaces with a large range of LCS glasses and allows the exploration of stereoscopic video walkthrough technology.
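A walkthrough of this kind is naturally modeled as a directed graph whose edges are pre-recorded stereo clips and whose nodes are the decision points. The node and clip names below are hypothetical, chosen only to illustrate the data structure; they do not come from the paper:

```python
# decision point -> {user choice: (clip to play, next decision point)}
walkthrough = {
    "entrance": {"forward": ("clip_entrance_hall.avi", "hall")},
    "hall": {
        "left":  ("clip_hall_west.avi", "west_wing"),
        "right": ("clip_hall_east.avi", "east_wing"),
    },
    "west_wing": {"back": ("clip_west_hall.avi", "hall")},
    "east_wing": {"back": ("clip_east_hall.avi", "hall")},
}

def walk(start, choices):
    """Return the final node and the clip sequence produced by a list
    of user choices made at successive decision points."""
    node, played = start, []
    for choice in choices:
        clip, node = walkthrough[node][choice]
        played.append(clip)
    return node, played

node, clips = walk("entrance", ["forward", "left", "back"])
print(node, clips)
```

Playback then reduces to streaming each clip in `clips` and pausing at each node to collect the next choice, which is why the approach ports cleanly from video-disk hardware to a software video player.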
Stereoscopic Acquisition Systems
New acquisition system of arbitrary ray space
Toshiaki Fujii, Tadahiko Kimoto, Masayuki Tanimoto
Conventional ray-space acquisition systems required very precise mechanisms to control the small movement of cameras or objects. Most of them adopted a camera with a gantry or a turntable. Although these are well suited to acquiring the ray space of small objects, they are not suitable for ray-space acquisition of very large structures, such as a building or a tower. This paper proposes a new ray-space acquisition system which consists of a camera and a 3D position and orientation sensor. It is not only a compact, easy-to-handle system, but is also, in principle, free from limitations of size or shape. It can obtain any ray-space data as long as the camera is located within the coverage of the 3D sensor. This paper describes our system and its specifications. Experimental results are also presented.
New stereoscopic system
Yasser Abd-Elbasset Ahmed, Hossam Afifi
In this paper we present a new design for a 3D system implemented using a special 3D Ring Lens instead of traditional lenses. The new system has the ability to capture 3D information in a way that simplifies the steps necessary for 3D reconstruction. The new system captures two images for every point in the object field, with a fixed geometric constraint between the corresponding points. It is independent of any intrinsic or extrinsic parameters, unlike the epipolar geometry of traditional stereo systems. A description of the main features of the new lens and the new system is given. The new system is suitable for many stereo applications such as disparity and depth-map estimation, 3D reconstruction, and robotics applications such as vehicle navigation and object tracking.
New Developments
Morphing in stereo animation
James Arthur Davis, David F. McAllister
There are several techniques that can be used to produce morphs of 3D objects. The traditional solution is to apply 3D algorithms that transform the shape and attributes of one object into those of another. The problems in 3D morphing include avoiding self-intersections during the morph, specification of corresponding regions in the source and target objects, and the imposition of geometric constraints on the objects. At first glance, the application of well-understood 2D morphing techniques to stereo images would seem to be a reasonable and much simpler alternative to the production of 3D models and the application of 3D morphing to those models. While it is true that in certain cases the application of 2D linear morphing techniques to stereo images produces effective morphs, the use of this technique places very strict geometric constraints on the objects being morphed. When linear 2D morphing techniques are applied to stereo images, where the parallax encoded in the images is of utmost importance, they linearly interpolate points between the source and target images, which interpolates the parallax as well. We examine the ramifications of this limitation and discuss the geometric constraints under which stereo morphing is useful.
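The parallax-interpolation effect is easy to demonstrate numerically. In this sketch (invented coordinates, not from the paper), a feature with 10 px of parallax in the source stereo pair morphs toward a target with 2 px; the halfway frame exhibits exactly the linear average, whether or not that corresponds to a geometrically plausible intermediate object:

```python
import numpy as np

def linear_morph_point(p_src, p_tgt, t):
    """Linear 2D morph of one corresponding feature point: the position
    is interpolated, so any parallax encoded in x is interpolated too."""
    return (1.0 - t) * np.asarray(p_src, float) + t * np.asarray(p_tgt, float)

# (x, y) of the same feature in the left/right views of source and target
src_L, src_R = np.array([100.0, 50.0]), np.array([110.0, 50.0])  # 10 px parallax
tgt_L, tgt_R = np.array([200.0, 80.0]), np.array([202.0, 80.0])  #  2 px parallax
t = 0.5
parallax = (linear_morph_point(src_R, tgt_R, t)[0]
            - linear_morph_point(src_L, tgt_L, t)[0])
print(parallax)  # 6.0, the linear average of 10 and 2
```

Because perceived depth is a nonlinear function of parallax and viewing geometry, this linear interpolation is what imposes the strict geometric constraints the abstract refers to.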
Software Techniques and Architectures
Dialogic generation system of realistic multimedia contents corresponding to dynamic intentions
Masataka Masuda, Yo Murao, Hajime Enomoto
To increase the utility of a reality system, a new software architecture corresponding to multiple users' intentions has been developed in our extensible Well system. The multiple users' intentions are varied and changeable during the process of generating realistic multimedia contents. The structure of intentions is clarified. The system is based on a model-driven method: four kinds of generic model are employed hierarchically. These models are defined by logical specifications. The specifications can be related to the object network. A generic object network of intentions is defined based on the structure of intentions. The software architecture that translates intentions into actual services is designed and implemented from these specifications. The actualization is realized dynamically in dialogue processes between clients and servers along the generic object network. The dialogue processes offer dynamic services ranging from the intention level to the implementation level; the system integrates multiple media and generates realistic multimedia contents efficiently. The software structures bringing this result are constructed systematically: intentions, dialogues, models, and transactional processes of services are all expressed hierarchically.
Transparently supporting a wide range of VR and stereoscopic display devices
Dave Pape, Daniel J. Sandin, Thomas A. DeFanti
This paper describes an architecture for virtual reality software which transparently supports a number of physical display systems and stereoscopic methods. Accurate, viewer-centered perspective projections are calculated, and graphics display options are set automatically, independent of application code. The design is intended to allow greater portability of applications between different VR devices.
Interfaces
Haptic Workbench: a multisensory virtual environment
Duncan R. Stevenson, Kevin A. Smith, John P. McLaughlin, et al.
The Haptic Workbench combines stereo images, co-located force feedback and 3D audio to produce a small-scale hands-in virtual environment system. This paper presents the Haptic Workbench, the HCI issues that arose, and its deployment in prototype industrial applications. The problems associated with combining global graphic and local haptic rendering in an efficient and generalized manner are described. The benefits and the difficulties associated with this class of virtual environment system are discussed, the experience gained in applying it to industrial applications is described, and conclusions are drawn about the appropriate use of co-located multi-sensory technologies in Virtual Environments.
Head tracking for viewpoint control in stereographic displays
Roger A. Browse, James C. Rodger, Sarah Pakowski, et al.
Future computer interfaces will likely use 3D displays with stereographic viewing to take advantage of the increased information inherent in 3D. The appropriate roles of devices to manipulate 3D displays, including the mouse, joystick and head tracking, remain unresolved. Our research centers on the use of head tracking for the control of perspective. For monoscopic viewing, we previously found that viewers can control displays effectively with head movements. They learn rapidly to use head movements, even when scene adjustments amplify or reverse natural perspective changes, and this ability persists over time. With stereo viewing, if head movements do not produce the expected change in perspective, the viewer may be confused, reducing the effectiveness of head tracking. We tested these conjectures in the experiment reported here, establishing the extent to which the flexibility found under monoscopic viewing extends to stereo. As in previous experiments, the viewer makes head movements to see a target sphere through a ring positioned in virtual space between the viewer and the target. We used a variety of ring sizes and positions to measure the speed and directness of movement under four conditions that varied the scene location in depth, plus the extent and direction of perspective change. These combinations permit us to evaluate the effects of direction and extent of scene adjustment on viewers' ability to use head movements to alter virtual viewpoint. While we found no difference for reversed adjustments under monoscopic viewing, these conditions appear more difficult in stereo viewing. Furthermore, viewers perform better when perspective changes are amplified.
Physical presence: palettes in virtual spaces
George C. Williams, Haakon Faste, Ian E. McDowall, et al.
We have built a hand-held palette for touch-based interaction in virtual reality. This palette incorporates a high-resolution digitizing touch screen for input. It is see-through, and therefore does not occlude objects displayed behind it. These properties make it suitable for direct manipulation techniques in a range of virtual reality display systems. We implemented several interaction techniques based on this palette for an interactive scientific visualization task. These techniques, the tool's design, and its limitations are discussed in this paper.
Systems and Applications
Tracking systems and the value of inertial technology
Frido Kuijper, Andre T. Smits, Hans Jense
This paper intends to add to the literature on 3D position and orientation tracking systems by describing TNO's experience with the InterSense tracking system, which uses a combination of inertial and ultrasound technology. From the results of a performance evaluation study and our practical experience with this system in military applications, the value of the system and its underlying inertial-technology-based hybrid concept is determined. The performance figures addressed in the study include noise and registration error characteristics. Orientation and position tracking performance results are provided for the InterSense system. The figures are compared with those for the Polhemus FASTRACK system. The hybrid tracking system concept as introduced by InterSense is of great value to virtual environment applications. The filter algorithms included in the InterSense tracking system to combine the two different sensor types result in a system that is both fast and noise-free. The system is very well suited for most immersive applications. Registration error, however, is rather large, making the system inadequate for augmented reality applications in its current implementation. TNO feels that the concept will evolve into an inertial-technology-based system combined with high-accuracy auxiliary trackers that will meet the requirements of augmented reality applications.
Virtual environment for training in microsurgery
Kevin N. Montgomery, Michael Stephanides, Joel Brown, et al.
Microsurgery is a well-established medical field, and involves repair of approximately 1-mm vessels and nerves under an operating microscope in order to reattach severed fingers or transfer tissue for reconstruction. Initial skills in microvascular surgery are usually developed in the animal lab and subsequently in the operating room. Development of these skills typically requires about 6 months of animal-based training before additional learning takes place in the operating room.
Virtual world for helping teens practice assertiveness skills
Kenneth Nemire, Joshua Beil, Ronald W. Swan
Smoking is on the rise among adolescents. This pilot project combined the well-documented benefits of Life Skills Training (LST) with the unique multisensory, 3D qualities of virtual environment (VE) technology to address some of the disadvantages of traditional prevention programs by engaging teens better, presenting information more persuasively, and making prevention programs continuously available in computer labs. In an eight-week pilot study, 45 seventh-grade students were randomly assigned to LST, VE, or non-intervention control groups. The VE system included goggles, synthesized speech, head and hand trackers, a hand-held controller, and speech recognition. Questionnaires measured participants' smoking knowledge and behavior, participants' reports on the usability of the VE system, and reports of simulator sickness symptoms. Structured interviews with randomly selected participants from each group revealed more detailed information. Data indicated the VE group retained more information and had more positive experiences learning about dangers of smoking and assertiveness skills than did the LST group. Usability data showed ease of use and learning of the VE system, with no significant symptoms of simulator sickness. These data indicated that this VE application is a promising tool for keeping teens healthy.
Development of a virtual laboratory for the study of complex human behavior
Jeff B. Pelz, Mary M. Hayhoe, Dana H. Ballard, et al.
The study of human perception has evolved from examining simple tasks executed in reduced laboratory conditions to the examination of complex, real-world behaviors. Virtual environments represent the next evolutionary step by allowing full stimulus control and repeatability for human subjects, and a testbed for evaluating models of human behavior.
Displays
Properties and applications of spherical panoramic virtual displays
Spherical panoramic virtual displays are a new environment for presenting high-resolution visual information to an observer. The virtual image is seen with both eyes. The new environment provides a wide field-of-view image, typically 180 degrees in the horizontal and vertical directions, forming a collimated image over a half dome. Users stand in front of a dome window and see a collimated image filling most of their visual field. The spherical panoramic virtual display consists of an optical system and a unique projector system. The system relies on the Schmidt principle for a spherical mirror, used in reverse, to form the image. The optical system has a very high degree of symmetry. At the center of the system, the image is free of all aberrations. Away from the center, the aberrations are a function of the user's position, the size of the display system, and the apparent focal distance. Example calculations of the aberrations are presented. The projection system has three properties: it projects light in one direction, is substantially transparent, and is spherical in nature. An example of a scanning projector is discussed. Examples of potential applications are presented.
LOOKAROUND: a spherical VR environment
Su-Shing Chen
BE Systems is developing a unique technology of augmented 360-degree situational visualization for displaying fused multimedia in a virtual hemisphere. It is a software tool exploiting spherical visualization, spherical memory, and situational visualization at any angle in the visual sphere or hemisphere. Applications range from entertainment and medicine to the military and education.
Composing virtual environment using images of digital camera
Haike Guan, Shin Aoki, Koichi Ejiri
We present a new method of building a virtual environment by using a sphere for reference. Images taken with a digital camera or video camera are projected onto the sphere. The relative orientation of adjacent images is determined by a linear transform, and the directions of all the images relative to the sphere are determined by multiplying the linear transform matrices. Images are dynamically composed and projected back at a selected viewing direction without using an environment map. We have derived the mathematical formula of the transform matrix. To improve the back-projection speed, we also make environment maps by projecting images from the sphere to the equator plane, setting the projection center at the southern or northern pole of the sphere. The 3D scene can be projected to two circles with almost uniform pixel density and without singularities at the poles. Distortion of the lens causes large accumulated registration error. We developed a method for calibrating and correcting the distortion using matched corresponding points of adjacent images. A pyramid-based image matching method is also developed to reduce accumulated registration error.
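The first step of such a mosaic, mapping a pixel to a direction on the reference sphere, can be sketched as follows. This is a generic pinhole-camera sketch under assumed parameters (focal length in pixels, a rotation matrix per image), not the paper's derived transform:

```python
import numpy as np

def pixel_to_sphere(u, v, f, R):
    """Map pixel (u, v), in coordinates centered on the principal
    point, of a camera with focal length f (in pixels) and rotation
    matrix R, to a unit direction on the reference sphere."""
    ray = np.array([u, v, f], dtype=float)  # ray in camera coordinates
    ray /= np.linalg.norm(ray)              # normalize to the unit sphere
    return R @ ray                          # rotate into sphere coordinates

# with the identity rotation, the principal point maps to the optical axis
d = pixel_to_sphere(0.0, 0.0, 500.0, np.eye(3))
print(d)  # [0. 0. 1.]
```

Composing the per-image rotations by matrix multiplication, as the abstract describes, chains each image's directions back to a common sphere frame, which is also where accumulated registration error from lens distortion becomes visible.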
Thin wide-field-of-view HMD with free-form-surface prism and applications
Shoichi Yamazaki, Kazutaka Inoguchi, Yoshihiro Saito, et al.
An HMD optical system composed of a 'free-form-surface prism' (FFS prism) was presented by Canon Inc. at the 1996 SPIE conference. This prism consists of aspherical surfaces without rotational symmetry. It enabled a compact HMD with a 180,000-pixel display that has a 34-degree horizontal FOV and a prism thickness of less than 15 mm. We have developed a new see-through 3D HMD with high resolution and a wide field of view (FOV) by improving this FFS prism technique. The new HMD, with a 51-degree horizontal FOV and a large viewing eyebox, shows a clear full-color image with 920,000 pixels. In spite of the wide FOV, the new FFS prism is very thin, at 17.9 mm. In this paper, we report this new HMD and 'the AR2 hockey system' as an example of its application.
Dynamic focusing in head-mounted displays
Jannick P. Rolland, Myron W. Krueger, Alexei A. Goon
In stereoscopic virtual environment systems, vergence eye movements are required, but the absence of the need to accommodate is not consistent with real-world vision. Ideally, virtual objects would be displayed at the appropriate distances from the viewer, and natural, concordant accommodation and vergence would be required. Based on optical principles and human vision, we investigate the feasibility of a novel display based on multiple depth-plane arrays to provide these cues. We then briefly discuss some design approaches to focusing at multiple depth planes.
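The selection logic for a multiple-depth-plane display can be illustrated with accommodative demand in diopters (the reciprocal of distance in meters). The set of plane distances below is hypothetical, chosen only for the example; the paper's actual design parameters are not reproduced here:

```python
def nearest_depth_plane(distance_m, planes_diopters=(3.0, 2.0, 1.0, 0.5, 0.1)):
    """Choose the fixed depth plane whose accommodative demand (in
    diopters) is closest to that of a virtual object at distance_m
    meters. Demand in diopters is 1 / distance in meters."""
    demand = 1.0 / distance_m
    return min(planes_diopters, key=lambda p: abs(p - demand))

print(nearest_depth_plane(0.5))  # object at 0.5 m: 2.0 D demand -> 2.0 D plane
print(nearest_depth_plane(4.0))  # object at 4.0 m: 0.25 D demand -> 0.1 D plane
```

Rendering each virtual object on the plane nearest its intended distance keeps accommodation and vergence approximately concordant, since the eye's depth of focus tolerates small residual diopter errors.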