Proceedings Volume 9011

Stereoscopic Displays and Applications XXV


View the digital version of this volume at the SPIE Digital Library.

Volume Details

Date Published: 19 March 2014
Contents: 19 Sessions, 67 Papers, 0 Presentations
Conference: IS&T/SPIE Electronic Imaging 2014
Volume Number: 9011

Table of Contents

All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 9011
  • Stereoscopic Applications I
  • Autostereoscopic Displays I
  • Subjective Quality of 3D Systems
  • Stereoscopic Applications II
  • Depth Map Capture and Processing
  • 3D Display Systems
  • Human Factors I
  • 3D Developments
  • Stereoscopic Panoramas and 3D Imaging
  • Human Factors II
  • Digital Imaging for Autostereoscopy
  • Autostereoscopic Displays II
  • Optical Elements in 3D Systems
  • Interactive Paper Session: 3D Display Engineering
  • Interactive Paper Session: Stereoscopic Rendering and Standards
  • Interactive Paper Session: Depth Maps and View Synthesis
  • Interactive Paper Session: Stereoscopic Human Factors
  • Interactive Paper Session: Stereoscopic Perception
Front Matter: Volume 9011
Front Matter: Volume 9011
This PDF file contains the front matter associated with SPIE Proceedings Volume 9011, including the Title Page, Copyright information, Table of Contents, Introduction, and Conference Committee listing.
Stereoscopic Applications I
The impact of stereo 3D sports TV broadcasts on user's depth perception and spatial presence experience
K. Weigelt, J. Wiemeyer
This work examines the impact of content and presentation parameters in 2D versus 3D on depth perception and spatial presence, and provides guidelines for stereoscopic content development for 3D sports TV broadcasts and related subjects. Taking depth perception and spatial presence experience into consideration, a preliminary study with 8 participants (sports: soccer and boxing) and a main study with 31 participants (sports: soccer and BMX-Miniramp) were performed. The dimension (2D vs. 3D) and camera position (near vs. far) were manipulated for soccer and boxing. In addition, for soccer the field of view (small vs. large) was examined, and for BMX-Miniramp the direction of motion (horizontal vs. depth) was considered. Subjective assessments, behavioural tests and qualitative interviews were implemented. The results confirm a strong effect of 3D on both depth perception and spatial presence experience, as well as selective influences of camera distance and field of view. The results can improve understanding of the perception and experience of 3D TV as a medium. Finally, recommendations are derived on how best to use various sports as content for 3D TV broadcasts.
Autostereoscopic Displays I
A novel stereoscopic display technique with improved spatial and temporal properties
Paul V. Johnson, Joohwan Kim, Martin S. Banks
Common stereoscopic 3D (S3D) displays utilize either spatial or temporal interlacing to send different images to each eye. Temporal interlacing sends content to the left and right eyes alternatingly in time, and is prone to artifacts such as flicker, unsmooth motion, and depth distortion. Spatial interlacing sends even pixel rows to one eye and odd rows to the other eye, and has a lower effective spatial resolution than temporal interlacing unless the viewing distance is large. We propose a spatiotemporal hybrid protocol that interlaces the left- and right-eye views spatially, but the rows corresponding to each eye alternate every frame. We performed psychophysical experiments to compare this novel stereoscopic display protocol to existing methods in terms of spatial and temporal properties. Using a haploscope to simulate the three protocols, we determined perceptual thresholds for flicker, motion artifacts, and depth distortion, and we measured the effective spatial resolution. Spatial resolution is improved, flicker and motion artifacts are reduced, and depth distortion is eliminated. These results suggest that the hybrid protocol maintains the benefits of spatial and temporal interlacing while eliminating the artifacts, thus creating a more realistic viewing experience.
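As a minimal sketch of the hybrid protocol just described (not the authors' implementation), the following Python fragment assigns alternating pixel rows to each eye and flips the assignment every frame; the array shapes and function name are illustrative assumptions.

```python
import numpy as np

def hybrid_interlace(left, right, frame_index):
    """Spatiotemporal hybrid interlacing (sketch): rows are split between
    the eyes as in spatial interlacing, but the row-to-eye assignment
    flips on every frame. left/right are H x W x 3 image arrays."""
    out = np.empty_like(left)
    if frame_index % 2 == 0:
        out[0::2] = left[0::2]    # even rows -> left-eye content
        out[1::2] = right[1::2]   # odd rows  -> right-eye content
    else:
        out[0::2] = right[0::2]   # assignments swap on odd frames
        out[1::2] = left[1::2]
    return out
```

Over two consecutive frames every row carries content for both eyes, which is why the protocol can recover the spatial resolution that plain spatial interlacing sacrifices.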
Frameless multiview display modules employing flat-panel displays for a large-screen autostereoscopic display
A large-screen autostereoscopic display enables life-size realistic communication. In this study, we propose the tiling of frameless multi-view display modules employing flat-panel displays. A flat-panel multi-view display and an imaging system with a magnification greater than one are combined to construct a multi-view display module with a frameless screen. The module screen consists of a lens and a vertical diffuser to generate viewpoints in the observation space and to increase the vertical viewing zone. When the modules are tiled, the screen lens should be appropriately shifted to produce a common viewing area for all modules. We designed and constructed the multi-view display modules, which have a screen size of 27.3 in. and a resolution of 320 × 200. The module depth was 1.5 m and the number of viewpoints was 144. The viewpoints were generated with a horizontal interval of 16 mm at a distance of 5.1 m from the screen. Four modules were constructed and aligned in the vertical direction to demonstrate a middle-size screen system. The tiled screen had a screen size of 62.4 in. (589 mm × 1,472 mm). The prototype system can display almost human-size objects.
Vertical parallax added tabletop-type 360-degree three-dimensional display
Yasuhiro Takaki, Junya Nakamura
The generation of full-parallax and 360-degree three-dimensional (3D) images on a tabletop screen is proposed. The proposed system comprises a small array of high-speed projectors and a rotating screen. All projectors are located at different heights from the screen. The lens shift technique is used to superimpose all images generated by the projectors onto the rotating screen. Because the rotating screen has an off-axis lens function, the image of the projection lens generates a viewpoint in the space, and the screen rotation generates a number of viewpoints on a circle around the rotating screen. Because all projectors are located at different heights, different projectors generate the viewpoints at different heights. Therefore, multiple viewpoints are aligned in the vertical direction to provide the vertical parallax. The proposed technique was experimentally verified. Three DMD projectors were used to generate three viewpoints in the vertical direction. The heights of the viewpoints were 720, 764, and 821 mm. Each projector generated 900 viewpoints on a circle. The diameter of the rotating screen was 300 mm. The frame rate was 24.7 Hz. The generation of 360-degree 3D images with the horizontal and vertical parallaxes was verified.
A variable-collimation display system
Robert Batchko, Sam Robinson, Jack Schmidt, et al.
Two important human depth cues are accommodation and vergence. Normally, the eyes accommodate and converge or diverge in tandem; changes in viewing distance cause the eyes to simultaneously adjust both focus and orientation. However, ambiguity between accommodation and vergence cues is a well-known limitation in many stereoscopic display technologies. This limitation also arises in state-of-the-art full-flight simulator displays. In current full-flight simulators, the out-the-window (OTW) display (i.e., the front cockpit window display) employs a fixed collimated display technology which allows the pilot and copilot to perceive the OTW training scene without angular errors or distortions; however, accommodation and vergence cues are limited to fixed ranges (e.g., ~ 20 m). While this approach works well for long-range, the ambiguity of depth cues at shorter range hinders the pilot’s ability to gauge distances in critical maneuvers such as vertical take-off and landing (VTOL). This is the first in a series of papers on a novel, variable-collimation display (VCD) technology that is being developed under NAVY SBIR Topic N121-041 funding. The proposed VCD will integrate with rotary-wing and vertical take-off and landing simulators and provide accurate accommodation and vergence cues for distances ranging from approximately 3 m outside the chin window to ~ 20 m. A display that offers dynamic accommodation and vergence could improve pilot safety and training, and impact other applications presently limited by lack of these depth cues.
Subjective Quality of 3D Systems
Subjective evaluation of a 3D videoconferencing system
Hadi Rizek, Kjell Brunnström, Kun Wang, et al.
A shortcoming of traditional videoconferencing systems is that they present the user with a flat, two-dimensional image of the remote participants. Recent advances in autostereoscopic display technology now make it possible to develop videoconferencing systems supporting true binocular depth perception. In this paper, we present a subjective evaluation of a prototype multiview autostereoscopic videoconferencing system and suggest a number of possible improvements based on the results. Whereas methods for subjective evaluation of traditional 2D videoconferencing systems are well established, the introduction of 3D requires an extension of the test procedures to assess the quality of depth perception. For this purpose, two depth-based test tasks were designed, and experiments were conducted with test subjects comparing the 3D system to a conventional 2D videoconferencing system. The outcome of the experiments shows that the perception of depth is significantly improved in the 3D system, but that the overall quality of experience is higher in the 2D system.
Subjective quality assessment for stereoscopic video: case study on robust watermarking
R. Bensaied, M. Mitrea, A. Chammem, et al.
This paper investigates three key issues related to full-reference subjective quality evaluation tests for stereoscopic video, namely, the number of quality levels on the grading scale, the number of observers in the evaluation panel, and inter-gender variability. It is theoretically demonstrated that the scores assigned by the observers on a continuous grading scale can be a posteriori mapped to any discrete grading scale, with controlled statistical accuracy. The experiments, performed in laboratory conditions, consider image quality, depth perception and visual comfort. The original content (i.e. the full reference) is represented by the 3DLive corpus, composed of 2 hours and 11 minutes of HD 3DTV content. The modified content (i.e. the content to be evaluated) is obtained by watermarking this corpus with four methods. A panel of 60 observers (32 males and 28 females) was established, from which randomly selected sub-panels of 30 and 15 observers were subsequently extracted. In order to simulate a continuous scale, the subjective evaluation was carried out on 100 quality levels, which are a posteriori mapped to discrete scales of q quality levels, with q between 2 and 9. The statistical investigation focused on the Mean Opinion Score and considered three types of statistical inference: outlier detection, confidence limits, and paired t-tests.
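The a posteriori mapping from the continuous scale to a q-level discrete scale is not spelled out in the abstract; one plausible reading is uniform binning of the [0, 100] scores, sketched below with illustrative names.

```python
import numpy as np

def map_to_discrete(scores, q):
    """Map scores from a continuous [0, 100] scale onto a discrete
    q-level scale (1..q) by uniform binning -- a sketch, not
    necessarily the exact mapping used in the paper."""
    scores = np.asarray(scores, dtype=float)
    levels = np.ceil(scores / 100.0 * q).astype(int)
    return np.clip(levels, 1, q)

# Example: 100-level scores remapped to a 5-level scale.
print(map_to_discrete([3, 42, 67, 99], q=5))  # -> [1 3 4 5]
```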
Measuring perceived depth in natural images and study of its relation with monocular and binocular depth cues
The perception of depth in images and video sequences is based on different depth cues. Studies have considered depth perception thresholds as a function of viewing distance (Cutting and Vishton, 1995), as well as the combination of different monocular depth cues, their quantitative relation with binocular depth cues, and their different possible types of interactions (Landy, 1995). But these studies only consider artificial stimuli, and none of them attempts to quantify the contributions of monocular and binocular depth cues relative to each other in the specific context of natural images. This study targets this particular application case. We evaluate the strength of different depth cues relative to each other using a carefully designed image database that covers as many different combinations as possible of monocular (linear perspective, texture gradient, relative size and defocus blur) and binocular depth cues. The 200 images were evaluated in two distinct subjective experiments, assessing separately perceived depth and the different monocular depth cues. The methodology and the definition of the different scales are detailed. The image database (DC3Dimg) is also released for the scientific community.
Subjective evaluation of two stereoscopic imaging systems exploiting visual attention to improve 3D quality of experience
Philippe Hanhart, Touradj Ebrahimi
Crosstalk and vergence-accommodation rivalry negatively impact the quality of experience (QoE) provided by stereoscopic displays. However, exploiting visual attention and adapting the 3D rendering process on the fly can reduce these drawbacks. In this paper, we propose and evaluate two different approaches that exploit visual attention to improve 3D QoE on stereoscopic displays: an offline system, which uses a saliency map to predict gaze position, and an online system, which uses a remote eye-tracking system to measure real-time gaze positions. The gaze points were used in conjunction with the disparity map to extract the disparity of the object of interest. Horizontal image translation was performed to bring the fixated object onto the screen plane. The user preference between the standard 3D mode and the two proposed systems was evaluated through a subjective evaluation. Results show that exploiting visual attention significantly improves image quality and visual comfort, with a slight advantage for real-time gaze determination. Depth quality is also improved, but the difference is not significant.
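A minimal sketch of the gaze-driven horizontal image translation described above, assuming a per-pixel disparity map and a gaze position in pixel coordinates (all names are illustrative; real border handling would crop rather than wrap):

```python
import numpy as np

def gaze_adaptive_hit(right_view, disparity_map, gaze_xy, window=15):
    """Shift the right view so the object under the gaze point gets zero
    disparity, i.e. is brought onto the screen plane (sketch)."""
    x, y = gaze_xy
    h, w = disparity_map.shape
    # Median disparity in a small window around the gaze point is more
    # robust than the single-pixel value.
    ys = slice(max(0, y - window), min(h, y + window + 1))
    xs = slice(max(0, x - window), min(w, x + window + 1))
    d = int(np.median(disparity_map[ys, xs]))
    # Translating the right view by -d cancels the disparity at fixation;
    # np.roll wraps at the borders, which a real system would crop.
    return np.roll(right_view, -d, axis=1)
```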
Subjective quality and depth assessment in stereoscopic viewing of volume-rendered medical images
Johanna Rousson, Jeanne Couturou, Arnout Vetsuypens, et al.
No study to date has explored the relationship between perceived image quality (IQ) and perceived depth (DP) in stereoscopic medical images. However, this is crucial to design objective quality metrics suitable for stereoscopic medical images. This study examined this relationship using volume-rendered stereoscopic medical images for both dual- and single-view distortions. The reference image was modified to simulate common alterations occurring during the image acquisition stage or at the display side: added white Gaussian noise, Gaussian filtering, and changes in luminance, brightness and contrast. We followed a double-stimulus five-point quality scale methodology to conduct subjective tests with eight non-expert human observers. The results suggested that DP was very robust to luminance, contrast and brightness alterations and insensitive to noise distortions up to a standard deviation of σ=20 and crosstalk rates of 7%. In contrast, IQ seemed sensitive to all distortions. Finally, for both DP and IQ, the Friedman test indicated that the quality scores for dual-view distortions were significantly worse than scores for single-view distortions for multiple blur levels and crosstalk impairments. No differences were found for most levels of brightness, contrast and noise distortions. Thus, DP and IQ did not react equivalently to identical impairments, and both depended on whether dual- or single-view distortions were applied.
Stereoscopic Applications II
Interlopers 3D: experiences designing a stereoscopic game
James Weaver, Nicolas S. Holliman
Background: In recent years 3D-enabled televisions, VR headsets and computer displays have become more readily available in the home. This presents an opportunity for game designers to explore new stereoscopic game mechanics and techniques that have previously been unavailable in monocular gaming. Aims: To investigate the visual cues that are present in binocular and monocular vision, identifying which are relevant when gaming using a stereoscopic display, and to implement a game whose mechanics are so reliant on binocular cues that the game becomes impossible, or at least very difficult, to play in non-stereoscopic mode. Method: A stereoscopic 3D game was developed whose objective was to shoot down advancing enemies (the Interlopers) before they reached their destination. Scoring highly required players to make accurate depth judgments and target the closest enemies first. A group of twenty participants played both a basic and an advanced version of the game in both monoscopic 2D and stereoscopic 3D. Results: In both the basic and the advanced game, participants achieved higher scores when playing in stereoscopic 3D. The advanced game showed that disrupting the depth-from-motion cue made the game more difficult in monoscopic 2D. Results also show a certain amount of learning taking place, meaning that players were able to score higher and finish the game faster over the course of the experiment. Conclusions: Although the game was not impossible to play in monoscopic 2D, participants' results show that it put them at a significant disadvantage compared to playing in stereoscopic 3D.
Architecture for high performance stereoscopic game rendering on Android
Julien Flack, Hugh Sanderson, Sampath Shetty
Stereoscopic gaming is a popular source of content for consumer 3D display systems. There has been a significant shift in the gaming industry towards casual games for mobile devices running on the Android™ Operating System and driven by ARM™ and other low power processors. Such systems are now being integrated directly into the next generation of 3D TVs potentially removing the requirement for an external games console. Although native stereo support has been integrated into some high profile titles on established platforms like Windows PC and PS3 there is a lack of GPU independent 3D support for the emerging Android platform. We describe a framework for enabling stereoscopic 3D gaming on Android for applications on mobile devices, set top boxes and TVs. A core component of the architecture is a 3D game driver, which is integrated into the Android OpenGL™ ES graphics stack to convert existing 2D graphics applications into stereoscopic 3D in real-time. The architecture includes a method of analyzing 2D games and using rule based Artificial Intelligence (AI) to position separate objects in 3D space. We describe an innovative stereo 3D rendering technique to separate the views in the depth domain and render directly into the display buffer. The advantages of the stereo renderer are demonstrated by characterizing the performance in comparison to more traditional render techniques, including depth based image rendering, both in terms of frame rates and impact on battery consumption.
Comprehensive evaluation of latest 2D/3D monitors and comparison to a custom-built 3D mirror-based display in laparoscopic surgery
Dirk Wilhelm, Silvano Reiser, Nils Kohn, et al.
Though theoretically superior, 3D video systems have not yet achieved a breakthrough in laparoscopic surgery. Furthermore, visual alterations such as eye strain, diplopia and blur have been associated with the use of stereoscopic systems. Advancements in display and endoscope technology motivated a re-evaluation of such findings. A randomized study on 48 test subjects was conducted to investigate whether surgeons can benefit from using the most current 3D visualization systems. Three different 3D systems, a glasses-based 3D monitor, an autostereoscopic display and a mirror-based, theoretically ideal 3D display, were compared to a state-of-the-art 2D HD system. The test subjects were split into a novice group and an expert surgeon group with high experience in laparoscopic procedures. Each of them had to conduct a directly comparable laparoscopic suturing task. Multiple performance parameters such as task completion time and the precision of stitching were measured and compared. Electromagnetic tracking provided information on the instruments' path length, movement velocity and economy. The NASA task load index was used to assess mental workload. Subjective ratings were added to assess usability, comfort and image quality of each display. Almost all performance parameters were superior for the glasses-based 3D display compared to the 2D and the autostereoscopic displays, but were often significantly exceeded by the mirror-based 3D display. Subjects performed the task on average 20% faster and with higher precision. Workload parameters did not show significant differences. Experienced and non-experienced laparoscopists profited equally from 3D. The 3D mirror system gave clear evidence of additional potential for 3D visualization systems with higher resolution and motion parallax presentation.
A stereoscopic system for viewing the temporal evolution of brain activity clusters in response to linguistic stimuli
Angus Forbes, Javier Villegas, Kyle R. Almryde, et al.
In this paper, we present a novel application, 3D+Time Brain View, for the stereoscopic visualization of functional Magnetic Resonance Imaging (fMRI) data gathered from participants exposed to unfamiliar spoken languages. An analysis technique based on Independent Component Analysis (ICA) is used to identify statistically significant clusters of brain activity and their changes over time during different testing sessions. That is, our system illustrates the temporal evolution of participants' brain activity as they are introduced to a foreign language through displaying these clusters as they change over time. The raw fMRI data is presented as a stereoscopic pair in an immersive environment utilizing passive stereo rendering. The clusters are presented using a ray casting technique for volume rendering. Our system incorporates the temporal information and the results of the ICA into the stereoscopic 3D rendering, making it easier for domain experts to explore and analyze the data.
Depth Map Capture and Processing
Fusion of Kinect depth data with trifocal disparity estimation for near real-time high quality depth maps generation
Guillaume Boisson, Paul Kerbiriou, Valter Drazic, et al.
Generating depth maps along with video streams is valuable for cinema and television production. Thanks to improvements in depth acquisition systems, the challenge of fusing depth sensing and disparity estimation is widely investigated in computer vision. This paper presents a new framework for generating depth maps from a rig made of a professional camera with two satellite cameras and a Kinect device. A new disparity-based calibration method is proposed so that registered Kinect depth samples become perfectly consistent with disparities estimated between rectified views. Also, a new hierarchical fusion approach is proposed for combining on-the-fly depth sensing and disparity estimation in order to circumvent their respective weaknesses. Depth is determined by minimizing a global energy criterion that takes into account the matching reliability and the consistency with the Kinect input. The depth maps thus generated are relevant both in uniform and textured areas, without holes due to occlusions or structured-light shadows. Our GPU implementation reaches 20 fps for generating quarter-pel accurate HD720p depth maps along with the main view, which is close to real-time performance for video applications. The estimated depth is of high quality and suitable for 3D reconstruction or virtual view synthesis.
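The abstract does not reproduce the global energy criterion; a schematic form consistent with its description (a matching-reliability data term plus a consistency term tying the estimate to the registered Kinect depth) could be written as follows, with all symbols illustrative:

```latex
E(d) = \sum_{p} C_{\mathrm{match}}\bigl(p, d(p)\bigr)
     + \lambda \sum_{p} w_{K}(p)\,\bigl|\,d(p) - d_{K}(p)\,\bigr|
```

Here $d$ is the disparity field, $C_{\mathrm{match}}$ a matching cost weighted by its reliability, $d_{K}$ the disparity registered from the Kinect samples, $w_{K}$ a per-pixel confidence, and $\lambda$ a balancing weight.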
Depth map post-processing for depth-image-based rendering: a user study
Matej Nezveda, Nicole Brosch, Florian Seitner, et al.
We analyse the impact of depth map post-processing techniques on the visual quality of stereo pairs that contain a novel view. To this end, we conduct a user study in which we address (1) the effects of depth map post-processing on the quality of stereo pairs that contain a novel view and (2) the question of whether objective quality metrics are suitable for evaluating them. We generate depth maps of six stereo image pairs and apply six different post-processing techniques. The unprocessed and the post-processed depth maps are used to generate novel views. The original left views and the novel views form the stereo pairs that are evaluated in a paired comparison study. The obtained results are compared with the results delivered by the objective quality metrics. We show that post-processing depth maps significantly enhances the perceived quality of stereo pairs that include a novel view. We further observe that the correlation between subjective and objective quality is weak.
Local disparity remapping to enhance depth quality of stereoscopic 3D images using stereoacuity function
Hosik Sohn, Yong Ju Jung, Yong Man Ro
This paper proposes a simple but effective method for local disparity remapping that is capable of enhancing the depth quality of stereoscopic 3D images. In order to identify and scale imperceptible disparity differences in a scene, the proposed approach decomposes the disparity map into two disparity layers: a coarse disparity layer that contains the information of the global depth structures, and a detail disparity layer that contains the details of the depth structures. The proposed method then adaptively manipulates the detail disparity layer in depth and image space under the guidance of a stereoacuity function, which describes the minimum perceivable disparity difference for a given disparity magnitude. In this way, relative depths between objects (or regions) can be effectively emphasized while providing spatial adaptability in the image space. Experimental results showed that the proposed method was capable of improving both the depth quality and the overall viewing quality of stereoscopic 3D images.
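A minimal sketch of the two-layer decomposition, assuming an edge-preserving smoother for the coarse layer and a constant gain standing in for the stereoacuity-guided adaptive scaling (which is not reproduced here):

```python
import numpy as np
from scipy.ndimage import median_filter

def remap_disparity(disp, gain=1.5, kernel=31):
    """Two-layer local disparity remapping (sketch): the coarse layer
    carries the global depth structure; the residual detail layer is
    amplified and added back."""
    coarse = median_filter(disp, size=kernel)  # global depth structure
    detail = disp - coarse                     # fine depth details
    return coarse + gain * detail
```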
Efficient quality enhancement of disparity maps based on alpha matting
Nicole Brosch, Matej Nezveda, Margrit Gelautz, et al.
We propose an efficient disparity map enhancement method that improves the alignment of disparity edges and color edges even in the presence of mixed pixels and provides alpha values for pixels at disparity edges as a byproduct. In contrast to previous publications, the proposed method addresses mixed pixels at disparity edges and does not introduce mixed disparities that can lead to object deformations in synthesized views. The proposed algorithm computes transparencies by performing alpha matting per disparity-layer. These alpha values indicate the degree of affiliation to a disparity-layer and can hence be used as an indicator for a disparity reassignment that aligns disparity edges with color edges and accounts for mixed pixels. We demonstrate the capabilities of the proposed method on various images and corresponding disparity maps, including images that contain fuzzy object borders (e.g., fur). Furthermore, the proposed method is qualitatively and quantitatively evaluated using disparity ground truth and compared to previously published disparity post-processing methods.
3D Display Systems
Description of a 3D display with motion parallax and direct interaction
J. Tu, M. F. Flynn
We present a description of a time-sequential stereoscopic display that separates the images using a segmented polarization switch and passive eyewear. Additionally, integrated tracking cameras and an SDK on the host PC allow us to implement motion parallax in real time.
LCD masks for spatial augmented reality
Quinn Y. J. Smithwick, Daniel Reetz, Lanny Smoot
One aim of Spatial Augmented Reality is to visually integrate synthetic objects into real-world spaces amongst physical objects, viewable by many observers without 3D glasses, head-mounted displays or mobile screens. In common implementations, using beam-combiners, scrim projection, or transparent self-emissive displays, the synthetic object’s and real-world scene’s light combine additively. As a result, synthetic objects appear low-contrast and semitransparent against well-lit backgrounds, and do not cast shadows. These limitations prevent synthetic objects from appearing solid and visually integrated into the real-world space. We use a transparent LCD panel as a programmable dynamic mask. The LCD panel displaying the synthetic object’s silhouette mask is colocated with the object’s color image, both staying aligned for all points-of-view. The mask blocks the background providing occlusion, presents a black level for high-contrast images, blocks scene illumination thus casting true shadows, and prevents blow-by in projection scrim arrangements. We have several implementations of SAR with LCD masks: 1) beam-combiner with an LCD mask, 2) scrim projection with an LCD mask, and 3) transparent OLED display with an LCD mask. Large format (80” diagonal) and dual layer volumetric variations are also implemented.
Transparent stereoscopic display and application
Nicola Ranieri, Hagen Seifert, Markus Gross
Augmented reality has become important to our society as it can enrich the actual world with virtual information. Transparent screens offer one possibility to overlay rendered scenes with the environment, acting both as display and window. In this work, we review existing transparent back-projection screens for the use with active and passive stereo. Advantages and limitations are described and, based on these insights, a passive stereoscopic system using an anisotropic back-projection foil is proposed. To increase realism, we adapt rendered content to the viewer's position using a Kinect tracking system, which adds motion parallax to the binocular cues. A technique well known in control engineering is used to decrease latency and increase frequency of the tracker. Our transparent stereoscopic display prototype provides immersive viewing experience and is suitable for many augmented reality applications.
A hand-held immaterial volumetric display
We have created an ultralight, movable, “immaterial” fogscreen. It is based on the fogscreen mid-air imaging technology. The hand-held unit is roughly the size and weight of an ordinary toaster. If the screen is tracked, it can be swept in the air to create mid-air slices of volumetric objects, or to show augmented reality (AR) content on top of real objects. Interfacing devices and methodologies, such as hand and gesture trackers, camera-based trackers and object recognition, can make the screen interactive. The user can easily interact with any physical object or virtual information, as the screen is permeable. Any real objects can be seen through the screen, instead of e.g., through a video-based augmented reality screen. It creates a mixed reality setup where both the real world object and the augmented reality content can be viewed and interacted with simultaneously. The hand-held mid-air screen can be used e.g., as a novel collaborating or classroom tool for individual students or small groups.
Human Factors I
Perceived crosstalk assessment on patterned retarder 3D display
CONTEXT: Nowadays, almost all stereoscopic displays suffer from crosstalk, which is one of the most dominant factors degrading image quality and visual comfort on 3D display devices. To deal with such problems, it is worthwhile to quantify the amount of perceived crosstalk. OBJECTIVE: Crosstalk measurements are usually based on certain test patterns, but scene content effects are ignored. To evaluate the perceived crosstalk level for various scenes, a subjective test may give a more correct evaluation; however, it is a time-consuming approach and is unsuitable for real-time applications. Therefore, an objective metric that can reliably predict the perceived crosstalk is needed. A correct objective assessment of crosstalk for different scene contents would benefit the development of crosstalk minimization and cancellation algorithms, which could be used to bring a good quality of experience to viewers. METHOD: A patterned retarder 3D display is used to present 3D images in our experiment. By considering the mechanism of this kind of device, an appropriate simulation of crosstalk is realized through image processing techniques that assign different crosstalk values between image pairs. It can be seen from the literature that the structure of a scene has a significant impact on the perceived crosstalk, so we first extract the differences in structural information between original and distorted image pairs with the Structural SIMilarity (SSIM) algorithm, which directly evaluates structural changes between two complex-structured signals. The structural changes of the left and right views are computed separately and combined into an overall distortion map. Under 3D viewing conditions, because of the added value of depth, the crosstalk of pop-out objects may be more perceptible. To model this effect, the depth map of a stereo pair is generated and the depth information is filtered by the distortion map. Moreover, human attention is an important factor in crosstalk assessment, because when viewing 3D content, perceptually salient regions are highly likely to be a major contributor to the quality of experience. To take this into account, perceptually significant regions are extracted, and a spatial pooling technique is used to combine the structural distortion map, depth map and visual saliency map to predict the perceived crosstalk more precisely. To verify the performance of the proposed crosstalk assessment metric, subjective experiments were conducted with 24 participants viewing and rating 60 stimuli (5 scenes × 4 crosstalk levels × 3 camera distances). After outlier removal and statistical processing, the correlation with the subjective test is examined using the Pearson and Spearman rank-order correlation coefficients. Furthermore, the proposed method is also compared with two traditional 2D metrics, PSNR and SSIM. The objective score is mapped to the subjective scale using a nonlinear fitting function to directly evaluate the performance of the metric. RESULTS: The evaluation results demonstrate that the proposed metric is highly correlated with the subjective scores compared with existing approaches. With a Pearson coefficient of 90.3%, the proposed metric is promising for objective evaluation of perceived crosstalk. NOVELTY: The main goal of our paper is to introduce an objective metric for stereo crosstalk assessment. The novelty contributions are twofold. First, an appropriate simulation of crosstalk considering the characteristics of a patterned retarder 3D display is developed. Second, an objective crosstalk metric based on a visual attention model is introduced.
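A minimal sketch of the pooling stage described in METHOD, combining per-view SSIM distortion maps with depth and saliency weights (the exact weighting and pooling in the paper may differ; all names are illustrative):

```python
import numpy as np

def crosstalk_score(ssim_map_l, ssim_map_r, depth_map, saliency_map):
    """Spatial pooling for perceived-crosstalk prediction (sketch):
    structural distortion is weighted by depth (pop-out regions are
    more affected) and by visual saliency, then averaged."""
    distortion = 1.0 - 0.5 * (ssim_map_l + ssim_map_r)  # structural change
    weight = depth_map * saliency_map                   # perceptual weighting
    return float((distortion * weight).sum() / (weight.sum() + 1e-8))
```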
Subjective evaluation of an active crosstalk reduction system for mobile autostereoscopic displays
Alexandre Chappuis, Martin Rerabek, Philippe Hanhart, et al.
The Quality of Experience (QoE) provided by autostereoscopic 3D displays strongly depends on the user position. For an optimal image quality, the observer should be located at one of the relevant positions, called sweet spots, where artifacts reducing the QoE, such as crosstalk, are minimum. In this paper, we propose and evaluate a complete active crosstalk reduction system running on an HTC EVO 3D smartphone. To determine the crosstalk level at each position, a full display characterization was performed. Based on the user position and crosstalk profile, the system first helps the user to find the sweet spot using visual feedback. If the user moves away from the sweet spot, then the active crosstalk compensation is performed and reverse stereo phenomenon is corrected. The user preference between standard 2D and 3D modes, and the proposed system was evaluated through a subjective quality assessment. Results show that in terms of depth perception, the proposed system clearly outperforms the 3D and 2D modes. In terms of image quality, 2D mode was found to be best, but the proposed system outperforms 3D mode.
Study of blur discrimination for 3D stereo viewing
Blur is an important attribute in the study and modeling of the human visual system. Blur discrimination has been studied extensively using 2D test patterns. In this study, we present the details of subjective tests performed to measure blur discrimination thresholds using stereoscopic 3D test patterns. Specifically, the effect of disparity on the blur discrimination thresholds is studied on a passive stereoscopic 3D display. The blur discrimination thresholds are measured using stereoscopic 3D test patterns with positive, negative and zero disparity values, at multiple reference blur levels. A disparity value of zero represents the 2D viewing case, where both eyes observe the same image. The subjective test results indicate that the blur discrimination thresholds remain constant as the disparity value is varied. This further indicates that binocular disparity does not affect blur discrimination thresholds, and that the models developed for 2D blur discrimination thresholds can be extended to stereoscopic 3D. We present a fit of the Weber model to the 3D blur discrimination thresholds measured in the subjective experiments.
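The abstract does not state the exact form of the fitted Weber model; a common formulation for blur discrimination thresholds, with illustrative symbols, is:

```latex
\Delta\sigma(\sigma_{r}) = w\,(\sigma_{r} + \sigma_{0})
```

where $\sigma_{r}$ is the reference blur level, $\Delta\sigma$ the discrimination threshold, $w$ the Weber fraction, and $\sigma_{0}$ an intrinsic-blur offset. The finding above implies the fitted parameters are essentially independent of disparity.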
The effect of stereoscopic acquisition parameters on both distortion and comfort
The purpose of our experiments was to investigate the effect of interaxial camera separation on the perceived shape and viewing comfort of 3D images. Horizontal Image Translation (HIT) and interaxial distance were altered together. Following Banks et al. (2009), our stimuli were simple stereoscopic hinges, and we measured the perceived angle as a function of camera separation. We compared the predictions based on ray tracing with the perceived 3D shape obtained psychophysically. 40 participants were asked to judge the angles of 250 hinges at different camera separations (interaxial and HIT linked, at 20-100 mm; angle range: 50°-130°). Comfort data were obtained using a five-point Likert scale. Stimuli were presented in orthoscopic conditions, with screen and observer field of view (FOV) matched at 45°. Our main results are: (1) At the 60 mm camera separation, observers perceived a right angle correctly, but at other camera separations right angles were perceived as larger than 90° (camera separations > 60 mm) or smaller than 90° (camera separations < 60 mm). (2) The observed perceptual deviations from a right angle were smaller than predicted from disparity information (ray tracing model) alone. (3) We found an interaction between comfort and camera separation: only at the 60 mm camera separation (i.e., at typical human eye separation) do we find a significant negative correlation between angle and comfort. At all other camera separations, the disparity (angle) has no systematic effect on comfort. This research sets out to provide a foundation for tolerance limits on comfort and on the perceptual distortions brought about by various virtual camera separations.
3D Developments
Fully automatic 2D to 3D conversion with aid of high-level image features
Vikram Appia, Umit Batur
With the recent advent of 3D display technology, there is an increasing need for conversion of existing 2D content into rendered 3D views. We propose a fully automatic 2D-to-3D conversion algorithm that assigns relative depth values to the various objects in a given 2D image/scene and generates two different views (a stereo pair) using a Depth Image Based Rendering (DIBR) algorithm for 3D displays. The algorithm described in this paper creates a scene model for each image based on certain low-level features, such as texture, gradient and pixel location, and estimates a pseudo depth map. Since the capture environment is unknown, using low-level features alone creates inaccuracies in the depth map, and using such a flawed depth map for 3D rendering results in various artifacts, causing an unpleasant viewing experience. The proposed algorithm therefore also uses certain high-level image features to overcome these imperfections and generates an enhanced depth map for an improved viewing experience. Finally, we show several 3D results generated with our algorithm in the results section.
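A minimal sketch of a pseudo depth map built from low-level cues of the kind named above (pixel location and texture gradient); the blend weights are illustrative assumptions, and the paper's high-level refinement is omitted:

```python
import numpy as np

def pseudo_depth(gray):
    """Pseudo depth from low-level cues (sketch): a vertical-position
    prior (lower image regions tend to be nearer) blended with a local
    gradient/texture cue. gray is a 2D grayscale image array."""
    h, w = gray.shape
    # Nearness increases toward the bottom of the image.
    position_cue = np.tile(np.linspace(0.0, 1.0, h)[:, None], (1, w))
    gy, gx = np.gradient(gray.astype(float))
    texture_cue = np.hypot(gx, gy)
    texture_cue /= texture_cue.max() + 1e-8
    return 0.7 * position_cue + 0.3 * texture_cue
```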
Stereoscopy for visual simulation of materials of complex appearance
Fernando da Graça, Alexis Paljic, Dominique Lafon-Pham, et al.
The present work studies the role of stereoscopy in the perceived surface appearance of computer-generated complex materials. The objective is to investigate if, and how, the additional information conveyed by binocular vision affects the observer's judgment when evaluating flake density in an effect-paint simulation. We set up a heuristic flake model with a Voronoi modelization of flakes. The model was implemented in our rendering engine using global illumination and ray tracing, with an off-axis frustum method for the calculation of stereo images. We conducted a user study based on a flake density discrimination task to determine perception thresholds (JNDs). Results show that stereoscopy slightly improves density perception. We propose an analysis methodology based on granulometry, which allows the results to be discussed on the basis of scales of observation.
A multilayer display augmented by alternating layers of lenticular sheets
Hironobu Gotoda
A multilayer display is an autostereoscopic display constructed by stacking multiple layers of LC (liquid crystal) panels on top of a light source. It is capable of delivering smooth, continuous, and position-dependent images to viewers within a prescribed viewing zone. However, the images thus delivered may contain artifacts that are inconsistent with real 3D scenes. For example, objects occluding one another may fuse together or become obscured in the delivered images. To reduce such artifacts, it is often necessary to narrow the viewing zone. Using a directional rather than a uniform light source is one way to mitigate this problem. In this work, we present another solution: an integrated architecture of multilayer and lenticular displays, in which multiple LC panels are sandwiched between pairs of lenticular sheets. By associating a pair of lenticular sheets with an LC panel, each pixel in the panel is transformed into a view-dependent pixel, which is visible only from a particular viewing direction. Since all pixels in the integrated architecture are view-dependent, the display is partitioned into several sub-displays, each of which corresponds to a narrow viewing zone. This partitioning of the display reduces the possibility that the artifacts are noticeable in the delivered images. We show several simulation results confirming that the proposed extension of the multilayer display can deliver more plausible images than a conventional multilayer display.
Stereoscopic Panoramas and 3D Imaging
Automatic detection of artifacts in converted S3D video
Alexander Bokov, Dmitriy Vatolin, Anton Zachesov, et al.
In this paper we present algorithms for automatically detecting issues specific to converted S3D content. When a depth-image-based rendering approach produces a stereoscopic image, the quality of the result depends on both the depth maps and the warping algorithms. The most common problem with converted S3D video is edge-sharpness mismatch. This artifact may appear owing to depth-map blurriness at semitransparent edges: after warping, the object boundary becomes sharper in one view and blurrier in the other, yielding binocular rivalry. To detect this problem we estimate the disparity map, extract boundaries with noticeable differences, and analyze edge-sharpness correspondence between views. We pay additional attention to cases involving a complex background and large occlusions. Another problem is detection of scenes that lack depth volume: we present algorithms for detecting flat scenes and scenes with flat foreground objects. To identify these problems we analyze the features of the RGB image as well as uniform areas in the depth map. Testing of our algorithms involved examining 10 Blu-ray 3D releases with converted S3D content, including Clash of the Titans, The Avengers, and The Chronicles of Narnia: The Voyage of the Dawn Treader. The algorithms we present enable improved automatic quality assessment during the production stage.
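A minimal sketch of the edge-sharpness correspondence check described above, assuming grayscale views, a per-pixel disparity map, and a precomputed boundary mask (names and details are illustrative):

```python
import numpy as np

def edge_sharpness_ratio(left, right, disparity, edge_mask):
    """Compare gradient magnitude on matched boundary pixels (sketch);
    ratios far from 1 signal the sharp-in-one-view, blurry-in-the-other
    artifact that causes binocular rivalry."""
    gl = np.hypot(*np.gradient(left.astype(float)))
    gr = np.hypot(*np.gradient(right.astype(float)))
    ys, xs = np.nonzero(edge_mask)
    # Warp edge coordinates into the right view using the disparity map.
    xr = np.clip(xs - disparity[ys, xs].astype(int), 0, right.shape[1] - 1)
    return (gl[ys, xs] + 1e-8) / (gr[ys, xr] + 1e-8)
```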
Integration of multiple view plus depth data for free viewpoint 3D display
Kazuyoshi Suzuki, Yuko Yoshida, Tetsuya Kawamoto, et al.
This paper proposes a method for constructing a reasonably scaled end-to-end free-viewpoint video system that captures multiple view-plus-depth data, reconstructs three-dimensional polygon models of objects, and displays them in virtual 3D CG spaces. The system consists of a desktop PC and four Kinect sensors. First, multiple view-plus-depth data at four viewpoints are captured by the Kinect sensors simultaneously. Then, the captured data are integrated into point cloud data using the camera parameters. The obtained point cloud data are sampled into volume data consisting of voxels. Since the volume data generated from point cloud data are sparse, they are densified using a global optimization algorithm. The final step is to reconstruct surfaces on the dense volume data by the discrete marching cubes method. Since the accuracy of the depth maps affects the quality of the 3D polygon model, a simple inpainting method for improving depth maps is also presented.
Human Factors II
Disparity modifications and the emotional effects of stereoscopic images
Takashi Kawai, Daiki Atsuta, Yuya Tomiyama, et al.
This paper describes a study that focuses on disparity changes in emotional scenes of stereoscopic (3D) images. An examination of the effects on pleasantness and arousal was carried out by adding binocular disparity to 2D images that evoke specific emotions, and by applying disparity modification based on a disparity analysis of famous 3D movies. The results of the experiment show, for pleasantness, a significant difference only for the main effect of the emotions. For arousal, on the other hand, the evaluation values tended to increase in the order of the 2D condition, the 3D condition, and the 3D condition with disparity modification, for happiness, surprise, and fear. This suggests that binocular disparity and its modification can affect arousal.
Improving perception of binocular stereo motion on 3D display devices
Petr Kellnhofer, Tobias Ritschel, Karol Myszkowski, et al.
This paper investigates the presentation of moving stereo images on different display devices. We address three important issues. First, we propose temporal compensation for the Pulfrich effect when using anaglyph glasses. Second, we describe how content-adaptive capture protocols can reduce false motion-in-depth sensation on time-multiplexing based displays. Third, we conclude with recommendations on how to improve rendering of synthetic stereo animations.
Measurement of perceived stereoscopic sensation through disparity metrics and compositions
Satoshi Toyosawa, Takashi Kawai
The literature uses disparity as a principal measure for evaluating discomfort, various artifacts, and movie production styles associated with stereoscopy; yet the statistics used to represent an image or frame often differ. The current study examines 20 disparity statistics to find metrics that best represent subjective stereoscopic sensation. Additionally, the effect of the disparity distribution pattern within an image is considered: the patterns are categorised as either single-peak or multiple-peak from the shape of the disparity histogram. In the experiment, 14 stereoscopic images were presented to 15 subjects. Each subject evaluated the perceived sense of distance and volume (3D space) on a 7-point Likert scale. The results show that the statistics that correlated significantly with subjective sensation differed by disparity composition; hence, the metrics should be chosen accordingly. For the sense of distance, the maximum, the range, and the difference between the 95th and 5th percentiles were found to be appropriate metrics under the single-peak condition, while the minimum, contrast, and 5th percentile were representative under the multiple-peak condition. Similarly, for the sense of volume, the range was found to be appropriate under the single-peak condition, but no metric was found under the multiple-peak condition. The discrepancy is presumably due to different observation styles for differently composed images. We believe that the current study provides optimal disparity metrics for measuring stereoscopic sensation.
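For illustration, several of the per-image statistics named above can be computed directly from a disparity map; this sketch covers the metrics the study found representative (max, range and p95 − p5 for single-peak histograms; min and p5 for multiple-peak ones), with "contrast" omitted since its definition is not given in the abstract:

```python
import numpy as np

def disparity_statistics(disp):
    """Per-image disparity statistics (sketch)."""
    p5, p95 = np.percentile(disp, [5, 95])
    return {
        "min": float(disp.min()),
        "max": float(disp.max()),
        "range": float(disp.max() - disp.min()),
        "p5": float(p5),
        "p95_minus_p5": float(p95 - p5),
    }
```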
Stereo and motion cues effect on depth perception of volumetric data
Isaac Cho, Zachary Wartell, Wenwen Dou, et al.
Displays supporting stereoscopy and head-coupled motion parallax can enhance human perception of data containing 3D surfaces and 3D networks, but less so for volumetric data. Volumetric data is characterized by a heavy presence of transparency, occlusion and highly ambiguous spatial structure. There are many different rendering and visualization algorithms and interactive techniques that enhance perception of volume data, and these techniques' effectiveness has been evaluated. However, how VR display technologies affect perception of volume data is less well studied. Therefore, we conducted two formal experiments on how various display conditions affect a participant's depth perception accuracy for a volumetric dataset. Our results show effects of VR displays on human depth-perception accuracy for volumetric data. We discuss the implications of these findings for designing volumetric data visualization tools that use VR displays. In addition, we compare our results to previous work on 3D networks and discuss possible reasons for, and implications of, the different results.
Digital Imaging for Autostereoscopy
Compression for full-parallax light field displays
Danillo B. Graziosi, Zahir Y. Alpaslan, Hussein S. El-Ghoroury
Full-parallax light field displays utilize a large volume of data and demand efficient real-time compression algorithms to be viable. Many compression techniques have been proposed; however, such solutions are impractical in bandwidth, processing or power requirements for a real-time implementation. Our method exploits the spatio-angular redundancy in a full-parallax light field to compress the light field image while reducing the total computational load with minimal perceptual degradation. Objective analysis shows that, depending on content, bandwidth reductions of two to four orders of magnitude are possible. Subjective analysis shows that the compression technique produces images with acceptable quality, and that the system can successfully reproduce the 3D light field, providing natural binocular and full motion parallax.
Joint estimation of high resolution images and depth maps from light field cameras
Kazuki Ohashi, Keita Takahashi, Toshiaki Fujii
Light field cameras are attracting much attention as tools for acquiring 3D information of a scene through a single camera. The main drawback of typical lenslet-based light field cameras is their limited resolution. This limitation comes from the structure, in which a microlens array is inserted between the sensor and the main lens. The microlens array projects the 4D light field onto a single 2D image sensor at the sacrifice of resolution; the angular resolution and the positional resolution trade off against each other under the fixed resolution of the image sensor. This fundamental trade-off remains after the raw light field image is converted to a set of sub-aperture images. The purpose of our study is to estimate a higher-resolution image from low-resolution sub-aperture images using a framework of super-resolution reconstruction. In this reconstruction, the sub-aperture images should be registered as accurately as possible; this registration is equivalent to depth estimation. Therefore, we propose a method in which super-resolution and depth refinement are performed alternately. Most of our method is implemented by image processing operations. We present several experimental results using a Lytro camera, in which we increased the resolution of a sub-aperture image threefold both horizontally and vertically. Our method produces clearer images than the original sub-aperture images and than the case without depth refinement.
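The alternation between super-resolution and depth refinement can be summarized as below; the two callables stand in for the paper's concrete reconstruction and registration steps, which the abstract does not detail:

```python
def alternating_sr(subapertures, estimate_depth, super_resolve, iters=3):
    """Alternating optimization (sketch): registration of sub-aperture
    images is equivalent to depth estimation, so depth refinement and
    super-resolution reconstruction are repeated in turn."""
    depth = estimate_depth(subapertures, None)  # initial registration
    high_res = None
    for _ in range(iters):
        high_res = super_resolve(subapertures, depth)   # SR given registration
        depth = estimate_depth(subapertures, high_res)  # refine registration
    return high_res, depth
```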
Enhancing multi-view autostereoscopic displays by viewing distance control (VDC)
Silvio Jurk, Bernd Duckstein, Sylvain Renault, et al.
Conventional multi-view displays spatially interlace various views of a 3D scene and form appropriate viewing channels. However, they only support sufficient stereo quality within a limited range around the nominal viewing distance (NVD). If this distance is maintained, two slightly divergent views are projected to the person's eyes, both covering the entire screen. With increasing deviation from the NVD, the stereo image quality decreases. As a major drawback in usability, this distance has so far been fixed by the manufacturer. We propose a software-based solution that corrects false view assignments depending on the distance of the viewer. Our novel approach enables continuous view adaptation based on the calculation of intermediate views and a column-by-column rendering method. The algorithm controls each individual subpixel and generates a new interleaving pattern from selected views. In addition, we use color-coded test content to verify its efficacy. This novel technology helps shift the physically determined NVD to a user-defined distance, thereby supporting stereopsis. The new viewing positions can fall in front of or behind the NVD of the original setup. Our algorithm can be applied to all multi-view autostereoscopic displays, independent of the slant or the periodicity of the optical element. In general, the viewing distance can be corrected by a factor of more than 2.5. By creating a continuous viewing area, the visualized 3D content is suitable even for persons with widely divergent interocular distances, adults and children alike, without any deficiency in spatial perception.
Autostereoscopic Displays II
Vision-based calibration of parallax barrier displays
Nicola Ranieri, Markus Gross
Static and dynamic parallax barrier displays became very popular over the past years. Especially for single viewer applications like tablets, phones and other hand-held devices, parallax barriers provide a convenient solution to render stereoscopic content. In our work we present a computer vision based calibration approach to relate image layer and barrier layer of parallax barrier displays with unknown display geometry for static or dynamic viewer positions using homographies. We provide the math and methods to compose the required homographies on the fly and present a way to compute the barrier without the need of any iteration. Our GPU implementation is stable and general and can be used to reduce latency and increase refresh rate of existing and upcoming barrier methods.
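A minimal sketch of how homographies relating the two layers might be composed when both are observed by a single camera; the matrix names are illustrative, not the paper's notation:

```python
import numpy as np

def compose_barrier_homography(H_cam_to_image, H_cam_to_barrier):
    """Image-layer to barrier-layer mapping (sketch): compose
    barrier <- camera <- image."""
    return H_cam_to_barrier @ np.linalg.inv(H_cam_to_image)

def map_point(H, xy):
    """Apply a 3x3 homography to a 2D point via homogeneous coordinates."""
    p = H @ np.array([xy[0], xy[1], 1.0])
    return p[:2] / p[2]
```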
Time-division multiplexing parallax barrier based on primary colors
The 4-view parallax barrier is considered a practical way to solve the viewing zone issue of the conventional 2-view parallax barrier. To realize a flicker-free 4-view system that provides full display resolution to each view, quadruple time-division multiplexing with a refresh rate of 240 Hz is necessary. Since 240 Hz displays are not yet easily available, extra effort is needed to reduce flicker when running at a lower refresh rate. In our previous work, we realized a prototype with reduced flicker at 120 Hz by introducing a 1-pixel aperture and incorporating anaglyph into the quadruple time-division multiplexing, although either stripe noise or crosstalk noise stood out. In this paper, we introduce a new type of time-division multiplexing parallax barrier based on primary colors, where the barrier pattern is laid out as "red-green-blue-black (RGBK)". Unlike other existing methods, changing the order of the element pixels in the barrier pattern makes a difference in this system. Among the possible alignments, "RGBK" is considered to show less crosstalk, while "RBGK" may show less stripe noise. We carried out a psychophysical experiment and found positive results as expected, showing that this new type of time-division multiplexing barrier produces more balanced images, with stripe noise and crosstalk simultaneously controlled at a relatively low level.
Multi-user autostereoscopic display based on direction-controlled illumination using a slanted cylindrical lens array
Daisuke Miyazaki, Yui Hashimoto, Takahiro Toyota, et al.
This research aims to develop an autostereoscopic display that satisfies the conditions required for practical use, such as high resolution and a large image size comparable to ordinary television displays, arbitrary viewing position, availability to multiple viewers, suppression of nonuniform luminance distribution, and a compact system configuration. In the proposed system, an image display unit is illuminated by a direction-controlled illumination unit, which consists of a spatially modulated parallel light source and a steering optical system. The steering optical system is constructed from a slanted cylindrical lens array and vertical diffusers. The direction-controlled illumination unit can control the output position and horizontal angle of the vertically diffused light. The light from the image display unit is controlled to form a narrow exit pupil, so a viewer can see the image only when an eye is located at the exit pupil. Autostereoscopic viewing is achieved by alternately switching the position of the exit pupil between the viewer's two eyes while alternately displaying parallax images. An experimental system was constructed to verify the proposed method; it consists of an LCD projector and Fresnel lenses for the direction-controlled illumination unit, and a 32-inch full-HD LCD for image display.
Optical Elements in 3D Systems
Accommodation response measurements for integral 3D image
H. Hiura, T. Mishina, J. Arai, et al.
We measured accommodation responses under integral photography (IP), binocular stereoscopic, and real-object display conditions, each under both binocular and monocular viewing. The equipment consisted of an optometric device and a 3D display. We developed the 3D display for IP and binocular stereoscopic images; it comprises a high-resolution liquid crystal display (LCD) and a high-density lens array. The LCD has a resolution of 468 dpi and a diagonal size of 4.8 inches. The high-density lens array comprises 106 x 69 micro lenses, arranged in a honeycomb pattern, each with a focal length of 3 mm and a diameter of 1 mm. The 3D display was positioned 60 cm from the observer under the IP and binocular stereoscopic display conditions. The target was presented at eight depth positions relative to the 3D display: 15, 10, and 5 cm in front of the display, on the display panel, and 5, 10, 15, and 30 cm behind the display. Under the real-object display condition, the target was displayed on the 3D display panel and the display itself was placed at the eight positions. The results suggest that the IP image induced more natural accommodation responses than the binocular stereoscopic image. The accommodation responses to the IP image were weaker than those to a real object; however, they showed a similar tendency to those of the real object under both viewing conditions. Therefore, IP can induce accommodation to the depth positions of 3D images.
Optimized design of directional backlight system for time-multiplexed autostereoscopic display based on VHOE
Yong Seok Hwang, Byeong Mok Kim, Eun Soo Kim
In this paper, we propose a novel collimated backlight system for VHOE-based time-multiplexed autostereoscopic display. We investigate the parameters of the light guide plate (LGP) and light sources that determine the output beam, such as uniform intensity distribution, uniform angular distribution, the degree of collimation, and control of the output angle, and we propose a novel combination of light source and specially designed LGP.
Analysis of multiple recording methods for full resolution multi-view autostereoscopic 3D display system incorporating VHOE
In this paper, we propose a multiple-recording process of photopolymer for a full-color multi-view autostereoscopic 3D display system based on a VHOE (volume holographic optical element). To overcome problems of conventional glasses-free 3D displays such as low resolution and a limited viewing zone, we designed the multiple-recording conditions of the VHOE for multi-view display. We verify that the VHOE can be fabricated optically by angle-multiplexed recording of pre-designed multiple viewing zones, recorded uniformly through an optimized exposure-time scheduling scheme. A VHOE-based backlight system for 4-view stereoscopic display is implemented, in which the output beams from the LGP (light guide plate), acting as reference beams, are sequentially synchronized with the respective stereo images displayed on the LCD panel.
Interactive Paper Session: 3D Display Engineering
Practical resolution requirements of measurement instruments for precise characterization of autostereoscopic 3D displays
Pierre Boher, Thierry Leroux, Véronique Collomb-Patton, et al.
Different ways to evaluate the optical performance of autostereoscopic 3D displays are reviewed. Special attention is paid to crosstalk measurements, which can be performed by measuring either the precise angular emission at one or a few locations on the display surface, or the emission of the full display surface from specific locations in front of the display. Using measurements made in both ways with different instruments on different autostereoscopic displays, we show that measurement instruments need to match the resolution of the human eye to obtain reliable results in either case. Practical requirements in terms of angular resolution for viewing-angle measurement instruments, and in terms of spatial resolution for imaging instruments, are derived and verified on practical examples.
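For reference, a commonly used black-level-corrected definition of stereoscopic crosstalk can be computed as below. This is a standard formulation rather than the paper's specific metric; what the paper addresses is the resolution the instruments need in order to measure these luminances reliably.

```python
def crosstalk_percent(lum_leak, lum_signal, lum_black):
    """Black-level-corrected stereoscopic crosstalk in percent:
    luminance leaking into an eye/view when it should see black,
    relative to the luminance it receives when it should see white."""
    return 100.0 * (lum_leak - lum_black) / (lum_signal - lum_black)

# Example (illustrative values, cd/m^2): left view measured while the
# right view shows white and the left view shows black.
print(crosstalk_percent(lum_leak=6.2, lum_signal=120.0, lum_black=0.4))
```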
Stereoscopic model for depth-fused 3D (DFD) display
H. Yamamoto, H. Sonobe, A. Tsunakawa, et al.
This paper proposes a stereoscopic model for DFD displays that explains both continuous depth modulation and protruding depth perception. The model comprises four steps: preparation of DFD images, geometrical calculation of the viewed images, a human visual function for detecting intensity changes, and stereoscopic depth perception. Two types of displayed image pairs are prepared: the former for conventional DFD, where the fused image is located between the layered images, and the latter for protruding DFD, where the fused image is located closer than the foreground image or further than the background image. The images viewed at both eye positions are simulated geometrically with a computer-vision optics model. To detect intensity changes, we apply a Laplacian operation to a Gaussian-blurred image. Stereoscopic depths are then calculated by matching the zero-crossing positions of the Laplacian-filtered images. The model is shown to explain both conventional and protruding DFDs.
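The edge-detection and matching steps of such a model can be sketched as follows, assuming a single scan line and simple nearest-neighbour matching of zero crossings; the authors' exact matching procedure is not given in the abstract.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def zero_crossings(row):
    """Indices where a 1-D signal changes sign (zero crossings)."""
    s = np.sign(row)
    return np.where(s[:-1] * s[1:] < 0)[0]

def stereo_depth_from_log(left, right, sigma=2.0, row=0):
    """Match zero crossings of Laplacian-of-Gaussian filtered rows of
    a stereo pair and return their horizontal offsets (disparities).
    Nearest-neighbour matching; a sketch of the model's third and
    fourth steps, not the authors' exact procedure."""
    zl = zero_crossings(gaussian_laplace(left.astype(float), sigma)[row])
    zr = zero_crossings(gaussian_laplace(right.astype(float), sigma)[row])
    if len(zl) == 0 or len(zr) == 0:
        return np.array([])
    nearest = zr[np.argmin(np.abs(zr[None, :] - zl[:, None]), axis=1)]
    return zl - nearest  # disparity per matched zero crossing
```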
Parallax multi-viewer autostereoscopic three-dimensional display
It is widely held that, in the long run, three-dimensional (3D) displays should supply stereo to multiple viewers who are free to move and wear no viewing aids. Over the last few decades, great efforts have been made toward autostereoscopic (AS) display for multiple viewers. Spatial multiplexing was first employed to accommodate multiple viewers simultaneously in stereoscopic planar displays; however, the resolution of each view decreases as the number of viewers increases. The recent development of high-speed liquid crystal displays (LCDs) capable of 240 Hz frame rates makes multi-viewer display via time multiplexing feasible while improving image quality at the same time. In this paper, we propose a display adjustment algorithm that enables high-quality autostereoscopic display for multiple viewers. The proposed method relies on a spatio-temporal parallax barrier to channel the desired stereo pair to each viewer according to their location. We conduct simulations that demonstrate the effectiveness of the proposed method.
Floating volumetric display using an imaging element that consists of a 90° prism sheet and a linear Fresnel lens
Yuki Maeda, Daisuke Miyazaki, Takaaki Mukai, et al.
We propose a floating volumetric display system using a novel imaging element with a large aperture that can be made easily at low cost. Rays diffused in the horizontal direction, as seen by an observer, are formed by a 90° prism sheet, shaped as an array of 90° V-grooves, through two total internal reflections. Rays diffused in the vertical direction are formed by a linear Fresnel lens. An image formed by the proposed imaging element is not distorted in the horizontal direction because the horizontal rays converge by retroreflection. The proposed imaging element can be produced more easily than a conventional distortion-free imaging element and can display a larger floating image. A floating three-dimensional image was displayed by a volumetric display system based on optical scanning of an inclined image plane: the position of a two-dimensional real image formed by the proposed element was moved by an optical scanner at a rate faster than the persistence of vision, and the stack of moved images created the floating three-dimensional volume image.
Interactive Paper Session: Stereoscopic Rendering and Standards
A rendering approach for stereoscopic web pages
Jianlong Zhang, Wenmin Wang, Ronggang Wang, et al.
Web technology provides a relatively easy way to generate content through which we perceive the world, and with the development of stereoscopic display technology, stereoscopic devices are becoming much more popular. The combination of web technology and stereoscopic display technology can bring a revolutionary visual effect. Stereoscopic 3D (S3D) web pages, in which text, images, and video may have different depths, can be displayed on stereoscopic display devices. This paper presents an approach to rendering two-view S3D web pages containing text, images, and widgets: first, an algorithm for displaying stereoscopic elements such as text and widgets using a 2D graphics library; second, a method for rendering the stereoscopic web page within the current framework of the browser; and third, a workaround for a problem that arises in that method.
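One building block any such renderer needs is the horizontal screen parallax for an element at a given depth. The sketch below uses simple pinhole geometry with illustrative pixel-unit defaults for eye separation and viewing distance; none of this is taken from the paper.

```python
def screen_parallax(depth_px, eye_sep_px, viewer_dist_px):
    """Horizontal screen parallax for an element placed depth_px
    behind (positive) or in front of (negative) the screen plane,
    from similar triangles: p = e * d / (D + d)."""
    return eye_sep_px * depth_px / (viewer_dist_px + depth_px)

def left_right_offsets(x, depth_px, eye_sep_px=200, viewer_dist_px=2000):
    """x position of the element in the left and right views:
    behind-screen elements shift left in the left view and right in
    the right view (uncrossed disparity)."""
    p = screen_parallax(depth_px, eye_sep_px, viewer_dist_px)
    return x - p / 2, x + p / 2
```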
The rendering context for stereoscopic 3D web
Qinshui Chen, Wenmin Wang, Ronggang Wang
3D technologies on the Web have been studied for many years, but they are basically monoscopic 3D. With stereoscopic technology gradually maturing, we are researching how to integrate binocular 3D technology into the Web, creating a stereoscopic 3D browser that will provide users with a brand-new experience of human-computer interaction. In this paper, we propose a novel approach that applies stereoscopy to the CSS3 3D Transforms. Under our model, each element can create or participate in a stereoscopic 3D rendering context, in which 3D transforms such as scaling, translation, and rotation can be applied and perceived in a truly 3D space. We first discuss the underlying principles of stereoscopy and then discuss how these principles can be applied to the Web. A stereoscopic 3D browser with backward compatibility was also created for demonstration purposes. We take advantage of the open-source WebKit project, integrating 3D display capability into the rendering engine of the web browser. For each 3D web page, our browser creates two slightly different images, representing the left-eye and right-eye views, which are combined on the 3D display to generate the illusion of depth. As a result, elements can be manipulated in a truly 3D space.
The design and implementation of stereoscopic 3D scalable vector graphics based on WebKit
Zhongxin Liu, Wenmin Wang, Ronggang Wang
Scalable Vector Graphics (SVG), a language based on the eXtensible Markup Language (XML), is used to describe basic shapes embedded in web pages, such as circles and rectangles. However, it can only depict 2D shapes; as a consequence, web pages using classical SVG can only display 2D shapes on a screen. With the ongoing development of stereoscopic 3D (S3D) technology, binocular 3D devices have come into wide use. Under these circumstances, we extend the widely used web rendering engine WebKit to support the description and display of S3D web pages, which requires extending SVG. In this paper, we describe how to design and implement SVG shapes with a stereoscopic 3D mode. Two attributes, representing depth and thickness, are added to support S3D shapes. The elimination of hidden lines and hidden surfaces, an important step in this project, is described as well. We also discuss the modification of WebKit needed to generate both the left view and the right view at the same time. As the results show, in contrast to the 2D shapes produced by the Google Chrome web browser, the shapes obtained from our modified browser are in S3D mode: with the impression of depth and thickness, they appear as real 3D objects standing out from the screen rather than simple curves and lines.
Interactive Paper Session: Depth Maps and View Synthesis
Discontinuity preserving depth estimation using distance transform
Woo-Seok Jang, Yo-Sung Ho
Image interpolation at arbitrary view positions has become quite important due to the development of three-dimensional multi-view display devices. Accurate depth information is required for natural image generation. Over the past several decades, a variety of stereo-image-based depth estimation methods have been developed to obtain high-quality depth data. However, obtaining accurate depth information remains problematic due to the difficulty of correspondence matching in occluded regions. In particular, color values are unreliable around discontinuous depth edges, which leads to ineffective correspondence matching. We therefore propose a discontinuity-preserving depth estimation method to solve this problem. The distance transform (DT) calculates the distance to the closest edge for each pixel of the input image. By controlling the color weighting term using the DT values of the stereo images, we achieve better correspondence matching in discontinuous regions. Experimental results indicate that the proposed method outperforms other methods; visual comparison demonstrates that the proposed stereo-image-based depth estimation improves the quality of the depth map in discontinuous edge regions.
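A minimal sketch of the distance-transform ingredient, assuming gradient-magnitude edges and one plausible form for a DT-controlled colour weight; the abstract does not give the exact weighting function, so both the edge detector and the weight are assumptions.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt, sobel

def edge_distance(gray, thresh=30.0):
    """Distance from each pixel to the nearest intensity edge
    (gradient magnitude above thresh); zero on edges, growing away."""
    g = gray.astype(float)
    mag = np.hypot(sobel(g, 0), sobel(g, 1))
    return distance_transform_edt(mag < thresh)

def dt_color_weight(dt, gamma=10.0):
    """One plausible weighting: down-weight the colour matching term
    near depth discontinuities, where colour values are unreliable;
    the weight approaches 1 far from edges."""
    return 1.0 - np.exp(-dt / gamma)
```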
View synthesis from wide-baseline views using occlusion aware estimation of large disparities
Ahmed S. Elliethy, Hussein A. Aly, Gaurav Sharma
Accurate disparity estimation is a key ingredient when generating a high-fidelity novel view from a set of input views. In this paper, a high-quality disparity estimation method is proposed for view synthesis from multiple input images with large disparities and occlusions. The method optimally selects one out of three image pairs to estimate the disparity map for different regions of the novel view, and the novel view is then formed using this disparity map. We introduce two novel elements: a) an enhanced visibility map that can segment the scene accurately near object boundaries, and b) a backward unilateral and bilateral disparity estimation procedure using the Gabor transform on an expandable search window to handle large disparities. The quality of the interpolated virtual views produced by the proposed method is assessed and compared against two prominent previously reported methods. The proposed method offers a significant improvement both in the visual quality of the interpolated views and in the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) metrics.
Superpixel-based 3D warping using view plus depth data from multiple viewpoints
Tomoyuki Tezuka, Keita Takahashi, Toshiaki Fujii
This paper presents a method of virtual view synthesis using view-plus-depth data from multiple viewpoints. Intuitively, virtual view generation from such data can be achieved by simple 3D warping. However, the 3D points reconstructed from the data are isolated, i.e. not connected with each other, so the images generated by existing methods contain many annoying holes due to occlusions and the limited sampling density. To tackle this problem, we propose a two-step algorithm. In the first step, the view-plus-depth data from each viewpoint are 3D-warped to the virtual viewpoint. In this process, we determine which neighboring pixels should be connected and which should be kept isolated, using the depth differences between neighboring pixels and a SLIC-based superpixel segmentation that considers both color and depth information. Pixel pairs that have small depth differences or reside in the same superpixel are connected, and the polygons enclosed by the connected pixels are inpainted, which greatly reduces the holes. This warping is performed individually for each source viewpoint, resulting in several images at the virtual viewpoint warped from different viewpoints. In the second step, we merge these warped images to obtain the final result. Thanks to the data provided from different viewpoints, the final result has less noise and fewer holes than a result produced from a single viewpoint. Experimental results using publicly available view-plus-depth data validate our method.
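The connect-or-isolate decision of the first step can be sketched as follows for horizontal neighbours, using skimage's SLIC implementation; the depth threshold and SLIC parameters are illustrative, and the paper's segmentation also incorporates depth.

```python
import numpy as np
from skimage.segmentation import slic

def connectivity_mask(depth, rgb, depth_tau=0.05, n_segments=800):
    """Decide which horizontally neighbouring pixels stay connected
    when warping: small depth difference OR same SLIC superpixel.
    Returns an (H, W-1) boolean mask; True means the connecting
    polygon between the two pixels should be drawn (inpainted)."""
    labels = slic(rgb, n_segments=n_segments, compactness=10)
    close = np.abs(np.diff(depth, axis=1)) < depth_tau
    same_sp = labels[:, :-1] == labels[:, 1:]
    return close | same_sp
```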
Stereoscopic augmented reality with pseudo-realistic global illumination effects
Recently, augmented reality has become very popular, appearing in our daily lives in gaming, guidance systems, and mobile phone applications. However, inserting objects so that their appearance seems natural is still an issue, especially in an unknown environment. This paper presents a framework that demonstrates the capabilities of the Kinect for convincing augmented reality in an unknown environment. Rather than pre-computing a reconstruction of the scene, as most previous methods do, we propose a dynamic capture of the scene that adapts to live changes in the environment. Our approach, based on the update of an environment map, can also detect the positions of the light sources. Combining information from the environment map, the light sources, and the camera tracking, we can display virtual objects on stereoscopic devices with global illumination effects, such as diffuse and mirror reflections, refractions, and shadows, in real time.
Development of free-viewpoint image synthesis system using time varying projection and spacetime stereo
Tatsuro Mori, Keita Takahashi, Toshiaki Fujii
The goal of our research is to develop a real-time free-viewpoint image synthesis system for dynamic scenes using multi-view video cameras. To this end, depth estimation that is efficient and suitable for dynamic scenes is indispensable. A promising solution is view-dependent depth estimation, where per-pixel depth maps are estimated directly for the target views to be synthesized. Such view-dependent methods were successfully adopted in previous works, but their depth estimation quality was limited, especially for textureless objects, resulting in low-quality virtual views. This limitation comes from the fact that their depth estimation relied only on passive approaches such as traditional stereo triangulation. To tackle this problem, we use active methods in addition to passive stereo triangulation. Inspired by the success of recent commercial depth cameras, we developed a customized active illumination using a DLP projector. The projector casts spatially incoherent patterns onto the scene and makes textureless regions identifiable from the cameras, so that stereo triangulation among the multi-view cameras is greatly improved. Moreover, by making the illumination time-varying, we further stabilize depth estimation through spatiotemporal matching across the multi-view cameras, following the concept of the spacetime stereo method, and we remove the artificial patterns from the synthesized virtual views by averaging successive frames. Our system, consisting of 16 video cameras synchronized with the DLP projector, runs in real time (about 10 fps) thanks to our GPGPU implementation.
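As a sketch of the spatiotemporal matching idea, the per-pixel cost for a candidate disparity can aggregate squared differences over a small window in both space and time. The SSD cost, window sizes, and array layout are illustrative, not the authors' exact cost; the sketch also assumes the window stays inside the image.

```python
import numpy as np

def spacetime_ssd(left_stack, right_stack, x, y, d, win=3, frames=3):
    """Sum-of-squared-differences matching cost over a spatiotemporal
    window (win x win pixels, `frames` consecutive frames) for a
    candidate disparity d. Stacks are (T, H, W) grayscale arrays."""
    r = win // 2
    a = left_stack[:frames, y - r:y + r + 1, x - r:x + r + 1].astype(float)
    b = right_stack[:frames, y - r:y + r + 1,
                    x - d - r:x - d + r + 1].astype(float)
    return np.sum((a - b) ** 2)
```

Time-varying patterns make this 3D window far more discriminative in textureless regions than a purely spatial one, which is the essence of the spacetime stereo concept the abstract invokes.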
General stereoscopic distortion rectification due to arbitrary viewer motion in binocular stereoscopic display
Background: In binocular stereoscopic display, stereoscopic distortions due to viewer motion, such as depth distortion, shear distortion, and rotation distortion, result in misperception of the stereo content and dramatically reduce visual comfort. Perceived depth distortion has been thoroughly addressed in the past, and shear distortion has been investigated in the context of multi-view displays that accommodate motion parallax; however, the impact of rotation distortion has barely been studied, and no technique is available to address stereoscopic distortions due to general viewer motion. Objective: To preserve an undistorted 3D percept of a fixed viewpoint irrespective of viewing position. Method: We propose a unified system and method that rectifies stereoscopic distortion due to general affine viewer motion and delivers a fixed, undistorted perspective of the 3D scene irrespective of viewer motion. The system assumes eye tracking of the viewer and adjusts the display location of the stereo pair pixel-wise based on the tracked eye locations. Results: For demonstration purposes, we implemented our method to control perceived depth in a binocular stereoscopic display using red-cyan anaglyph 3D. The user first perceives the designed perspective of the 3D scene at the reference position, then moves to six different positions at various distances and angles relative to the screen. At all positions, users reported perceiving much more consistent stereo content with the adjusted displays while experiencing improved visual comfort. Novelty: We address stereoscopic distortions with the goal of maintaining a fixed perspective of the stereo scene, and propose a unified solution that simultaneously rectifies the stereoscopic distortions resulting from arbitrary viewer motion.
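The pixel-wise adjustment can be sketched as a re-projection: each 3D point the designer intends the viewer to perceive is projected through the tracked eye position onto the screen plane, once per eye. The screen-centred coordinate convention below (z = 0 at the screen, z > 0 towards the viewer) is an assumption for illustration.

```python
import numpy as np

def reproject_to_screen(eye, point):
    """Intersect the ray from the tracked eye position through the
    intended 3D point with the screen plane z = 0; returns the (x, y)
    where that point must now be drawn for this eye.

    eye, point: length-3 arrays in screen-centred coordinates."""
    t = eye[2] / (eye[2] - point[2])  # ray parameter at z = 0
    return eye[:2] + t * (point[:2] - eye[:2])

# Drawing every scene point this way for the left and right eye keeps
# the perceived geometry fixed as the viewer moves.
```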
Wide-field-of-view image pickup system for multiview volumetric 3D displays using multiple RGB-D cameras
A real-time, wide-field-of-view image pickup system for coarse integral volumetric imaging (CIVI) is realized. The system applies CIVI display to live-action video generated by real-time 3D reconstruction. By using multiple RGB-D cameras viewing from different directions, a complete surface of the objects and a wide field of view can be shown on our CIVI displays. A prototype system was constructed, which works as follows. First, image features and depth data are used for fast and accurate calibration. Second, 3D point cloud data are obtained by each RGB-D camera and converted into a common coordinate system. Third, multi-view images are constructed by perspective transformation from different viewpoints. Finally, the image for each viewpoint is divided according to the depth of each pixel to produce the volumetric view. Experiments show a better result than using a single RGB-D camera, and the whole system works in real time.
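The second step, converting each camera's depth image into the shared coordinate system, might look like the following pinhole-model sketch; the intrinsics (fx, fy, cx, cy) and extrinsics (R, t) are assumed to come from the calibration step and are not specified in the abstract.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Depth image (H, W) in metres -> (N, 3) camera-frame points
    using the pinhole camera model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

def to_world(points_cam, R, t):
    """Transform an (N, 3) point cloud from a camera frame into the
    shared world frame using that camera's extrinsics."""
    return points_cam @ R.T + t
```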
Joint upsampling and noise reduction for real-time depth map enhancement
Kazuki Matsumoto, Chiyoung Song, Francois de Sorbier, et al.
We present an efficient system that upsamples depth maps captured by the Microsoft Kinect while jointly reducing the effect of noise. The upsampling is carried out by detecting and exploiting the piecewise locally planar structure of the downsampled depth map, based on the corresponding high-resolution RGB image. Noise is reduced by simultaneously accumulating the downsampled data. By exploiting the massively parallel computing capability of modern commodity GPUs, the system maintains a high frame rate. The upsampled depth map is observed to be very close to the original depth map both visually and numerically.
Interactive Paper Session: Stereoscopic Human Factors
Stereoscopic visual fatigue assessment and modeling
Danli Wang, Tingting Wang, Yue Gong
Evaluation of stereoscopic visual fatigue is one focus of user experience research. It is measured by either subjective or objective methods; objective measures are preferred for their ability to quantify the degree of visual fatigue without being affected by individual variation. However, little research has been conducted on the integration of objective indicators, or on the sensitivity of each objective indicator in reflecting subjective fatigue. This paper proposes a simple and effective method to evaluate visual fatigue more objectively. The stereoscopic viewing process is divided into a series of sessions, after each of which viewers rate their visual fatigue with subjective scores (SS) on a five-grade scale, followed by tests of the punctum maximum accommodation (PMA) and visual reaction time (VRT). Throughout the viewing process, eye movements are recorded by an infrared camera, and the pupil size (PS) and the percentage of eyelid closure over the pupil over time (PERCLOS) are extracted from the processed videos. Based on this method, an experiment with 14 subjects was conducted to assess the visual fatigue induced by 3D images on a polarized 3D display. The experiment consisted of 10 sessions (5 min per session), each containing the same 75 images displayed in random order. The results show that PMA, VRT, and PERCLOS are the most sensitive indicators of subjective visual fatigue, and a predictive model is finally derived by stepwise multiple regression.
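A minimal sketch of the final modeling step, fitting subjective scores from the three most sensitive indicators by ordinary least squares; the paper uses stepwise multiple regression, which additionally performs the predictor selection that is hard-coded here.

```python
import numpy as np
import statsmodels.api as sm

def fit_fatigue_model(ss, pma, vrt, perclos):
    """Linear model SS ~ PMA + VRT + PERCLOS; inputs are 1-D arrays of
    per-session measurements. Inspect .params, .rsquared, .summary()
    on the returned results object."""
    X = sm.add_constant(np.column_stack([pma, vrt, perclos]))
    return sm.OLS(ss, X).fit()
```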
Visual discomfort under various brightness conditions using eye movements in watching stereoscopic 3D video
Visual discomfort when watching stereoscopic 3D content is caused by various factors; brightness change, in particular, is known to be one of the major factors related to visual discomfort. However, most previous research on visual discomfort dealt with binocular disparity in relation to the accommodation-vergence linkage. In this paper, we analyze visual discomfort caused by brightness change using eye movements and a subjective test. Eye movements are computed from the pupil motion detected in near-infrared eye images. We measure eye blinking and pupil size while subjects watch stereoscopic 3D videos with global and local brightness variations. The results show that viewers felt more visual discomfort under local than under global changes of brightness in a scene.
On the comparison of visual discomfort generated by S3D and 2D content based on eye-tracking features
The transition of TV systems from 2D to 3D is the next expected step in the telecommunication world. Some work has already been done to achieve this technically, but the interaction of the third dimension with human viewers is not yet well understood. It has previously been found that any increased load on the visual system, such as prolonged TV watching, computer work, or video gaming, can create visual fatigue. Watching S3D can cause visual fatigue of a different nature, since all S3D technologies create the illusion of the third dimension based on the characteristics of binocular vision. In this work we evaluate and compare the visual fatigue produced by watching 2D and S3D content, showing the difference in the accumulation of visual fatigue and its assessment for the two types of content. To perform this comparison, eye-tracking experiments were conducted using six commercially available movies. Healthy naive participants took part in the test and provided subjective evaluations. It was found that watching stereo 3D content induces a stronger feeling of visual fatigue than conventional 2D, and that the nature of the video has an important effect on its increase. The visual characteristics obtained by eye tracking were investigated with regard to their relation to visual fatigue.
Perception and annoyance of crosstalk in stereoscopic 3D projector systems
Kun Wang, Börje Andrén, Mahir Hussain, et al.
Crosstalk is a major perceptual problem in 3D display systems, showing itself mostly as ghosting. In this work we investigated how much perceived crosstalk is acceptable to end-users for movie-type content played on 3D projection systems. Two types of 3D projection systems (one using active shutter glasses, the other using passive polarized glasses) were compared in the experiment. The study included an objective measurement of crosstalk in the 3D projection systems and a subjective assessment of the users' experience of the visible distortions. The results show that 10% can be considered the crosstalk threshold for end-users not to be annoyed (MOS < 3.5) by the distortions, and thus acceptable; the distortions start to be perceived at about 3% crosstalk. The study found a linear relationship between perceived crosstalk and the amount of crosstalk, and the perceived crosstalk also varies largely depending on the video content.
Interactive Paper Session: Stereoscopic Perception
Eliciting steady-state visual evoked potentials by means of stereoscopic displays
Enrico Calore, Davide Gadia, Daniele Marini
Brain-Computer Interfaces (BCIs) provide users with communication and control capabilities by analyzing their brain activity. One technique for implementing BCIs, recently used also in Virtual Reality (VR) environments, is based on the detection of Steady-State Visual Evoked Potentials (SSVEPs). Exploiting the SSVEP response, BCIs can be implemented by showing targets flickering at different frequencies and detecting which one is gazed at by analyzing the observer's electroencephalographic (EEG) signals. In this work, we evaluate the use of stereoscopic displays for the presentation of SSVEP-eliciting stimuli, comparing the effectiveness of monoscopic and stereoscopic stimuli. Moreover, we propose a novel method to elicit SSVEP responses that exploits the capability of stereoscopic displays to present dichoptic stimuli. We created an experimental scene to present flickering stimuli on an active stereoscopic display, obtaining reliable control of the targets' frequency independently for the two stereo views. Using an EEG acquisition device, we analyzed the SSVEP responses of a group of subjects. The preliminary results give evidence that stereoscopic displays are valid devices for the presentation of SSVEP stimuli. Moreover, using different flickering frequencies for the two views of a single stimulus proved to elicit non-linear interactions between the stimulation frequencies, clearly visible in the EEG signal. This suggests interesting applications for SSVEP-based BCIs in VR environments, able to overcome some limitations imposed by the refresh frequency of standard displays, as well as the use of commodity stereoscopic displays to implement binocular rivalry experiments.
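A basic frequency-tagging detector of the kind used in SSVEP BCIs can be sketched as follows; the windowing, harmonic count, and winner-take-all classification rule are illustrative rather than the authors' pipeline. For dichoptic stimuli flickering at f1 and f2, the non-linear interactions the abstract reports would appear as extra power at intermodulation frequencies such as f1 + f2 and |f1 - f2|.

```python
import numpy as np

def ssvep_power(eeg, fs, target_hz, harmonics=2):
    """Spectral power of one EEG channel at the stimulation frequency
    and its harmonics; eeg is a 1-D array, fs the sampling rate in Hz."""
    spectrum = np.abs(np.fft.rfft(eeg * np.hanning(len(eeg)))) ** 2
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / fs)
    return sum(spectrum[np.argmin(np.abs(freqs - k * target_hz))]
               for k in range(1, harmonics + 1))

def classify_gaze(eeg, fs, target_freqs):
    """Pick the flicker frequency (i.e. target) with the highest
    tagged power."""
    return max(target_freqs, key=lambda f: ssvep_power(eeg, fs, f))
```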
A new multimodal interactive way of subjective scoring of 3D video quality of experience
Taewan Kim, Kwanghyun Lee, Sanghoon Lee, et al.
People who watch today's 3D visual programs, such as 3D cinema, 3D TV, and 3D games, experience wide and dynamically varying ranges of 3D visual immersion and 3D quality of experience (QoE). It is necessary to deploy reliable methodologies that measure each viewer's subjective experience. We propose a new methodology that we call Multimodal Interactive Continuous Scoring of Quality (MICSQ). MICSQ is composed of a device interaction process between the 3D display and a separate device (PC, tablet, etc.) used as an assessment tool, and a human interaction process between the subject(s) and the device. The scoring process is multimodal, using aural and tactile cues to help engage and focus the subjects on their tasks. Moreover, the wireless device interaction process makes it possible for multiple subjects to assess 3D QoE simultaneously in a large space such as a movie theater, and at different visual angles and distances.
Effect of local crosstalk on depth perception
Hiroshi Watanabe, Hiroyasu Ujike, John Penczek, et al.
Interocular crosstalk has a significant undesirable effect on the quality of 3D displays that utilize horizontal disparity. This study investigates observer sensitivity when judging the depth order of two horizontally aligned dots on a 3D display, and assesses 3D display uniformity by obtaining this index at various locations on the display. The visual stimulus is a pair of horizontally separated dots with nine steps of horizontal disparity, presented at five screen locations. An observer wearing polarized glasses sits 57 cm from the display, observes it through a slit, and judges the depth order of the two dots. Each of the 20 observers responds 16 times per disparate dot pair, and we calculate the rate at which the dot on the right is judged to be nearer, over the 16 trials, for each display, screen location, and disparity. We then plot this rate as a function of the left-right dot disparity and fit a psychometric function to the plot. The slope of the curve at a response probability of 50% is used to gauge the sensitivity of depth-order judgment. The results show that the variation in depth sensitivity across the display surface depends on the variation of interocular crosstalk across the display, and thus reflects the display's uniformity.
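The sensitivity index described, the slope of the fitted psychometric function at the 50% point, can be computed as in this sketch, assuming a logistic form (for which the slope at the 50% point is k/4); the paper does not specify which psychometric function was fitted.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    """Psychometric function: probability of a 'right dot nearer'
    response as a function of signed disparity x."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

def depth_sensitivity(disparities, p_right_nearer):
    """Fit the psychometric curve to (disparity, response-rate) data
    and return its slope at the 50% point, which equals k/4 for the
    logistic; larger values indicate finer depth-order sensitivity."""
    (x0, k), _ = curve_fit(logistic, disparities, p_right_nearer,
                           p0=(0.0, 1.0))
    return k / 4.0
```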