This PDF file contains the front matter associated with SPIE Proceedings Volume 11897 including the Title Page, Copyright information, and Table of Contents.
Underwater polarization dehazing imaging has attracted considerable interest because of its potential applications in related fields, and progress has been made by introducing deep learning into polarization dehazing imaging. In this work, underwater active polarization dehazing imaging based on a deep learning model is studied. A modified All-in-One Dehazing Network (AOD-Net) model with three input channels is designed under the TensorFlow framework. Polarization images of three different polarization components form the training set for the convolutional neural network (CNN). This light-weight CNN is designed to achieve underwater dehazed imaging of different targets at different turbidity levels. Experimental results indicate that prediction and estimation using the modified AOD-Net are more accurate than those of the traditional dehazing model.
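The three polarization components used as network inputs suggest the standard Stokes-vector preprocessing common in polarization dehazing. As a minimal sketch (the 0°/45°/90° analyzer angles and function names are our assumption, not taken from the abstract), the channels can be combined as:

```python
import numpy as np

def stokes_from_polarization(i0, i45, i90):
    """Linear Stokes parameters and degree of linear polarization (DoLP)
    from intensity images taken behind a polarizer at 0, 45 and 90 degrees."""
    s0 = i0 + i90            # total intensity
    s1 = i0 - i90            # horizontal/vertical preference
    s2 = 2.0 * i45 - s0      # +45/-45 preference
    dolp = np.sqrt(s1**2 + s2**2) / np.maximum(s0, 1e-12)
    return s0, s1, s2, dolp

# Fully horizontally polarized light: I(0)=1, I(45)=0.5, I(90)=0
s0, s1, s2, dolp = stokes_from_polarization(
    np.array([1.0]), np.array([0.5]), np.array([0.0]))
```

The DoLP map is what typically separates backscattered haze from target light in such pipelines.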
Light scattering is inevitable in optical imaging, and random speckles form when light passes through scattering media because of multiple scattering. Recent research shows that super-resolution imaging can be achieved by exploiting the scattering effect. In this work, effective focusing of a laser beam through scattering media is realized by modulating the wavefront of the incident light with a spatial light modulator driven by a feedback optimization algorithm (an area-by-area modulation algorithm). The experimental results indicate that the maximum light-intensity enhancement factor (132 in this work) and the focusing spot size (1/10 of that of traditional lens focusing) depend on the total number of modulation units on the spatial light modulator and the phase accuracy of each modulation unit. In particular, focusing a vector beam through scattering media provides a new way to control the maximum light-intensity enhancement factor and the focusing spot size.
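The area-by-area feedback scheme can be simulated with a random transmission vector standing in for the scattering medium: each SLM unit's phase is stepped through a discrete set of levels and the value that maximizes the focus intensity is kept. This toy model (64 units, 16 phase levels, a Gaussian random medium — all our assumptions) illustrates why enhancement depends on the number of units and the phase accuracy:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64                                              # number of SLM modulation units
t = rng.normal(size=N) + 1j * rng.normal(size=N)    # one row of an unknown transmission matrix
phases = np.zeros(N)
levels = np.linspace(0, 2 * np.pi, 16, endpoint=False)  # phase accuracy: 16 levels

def focus_intensity(ph):
    """Intensity at the target focus for a given SLM phase pattern."""
    return np.abs(np.sum(t * np.exp(1j * ph))) ** 2

i_before = focus_intensity(phases)
for k in range(N):                                  # area-by-area: one unit at a time
    best = max(levels, key=lambda p: focus_intensity(
        np.concatenate([phases[:k], [p], phases[k + 1:]])))
    phases[k] = best
i_after = focus_intensity(phases)
enhancement = i_after / i_before
```

Because the current phase (0, which is among the levels) is always a candidate, each step can only increase the focus intensity.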
Image deblurring and inpainting are classical image processing problems, and the results achieved for high-resolution images are often unsatisfactory. In recent years, convolutional sparse coding (CSC) has received increasing attention and has been introduced into image processing tasks such as blind deblurring. However, no existing work addresses images degraded by both blur and missing pixels. In this work, we propose a novel CSC framework for simultaneous image deblurring and inpainting. First, we learn a dictionary instead of applying a given one, for better image representation. Second, we regularize images with the learned dictionary and the ℓ1 norm. In addition, we apply anisotropic total variation to enhance image edges. We solve the resulting optimization with an alternating direction method of multipliers (ADMM) formulation in the Fourier domain for the dictionary. We demonstrate the proposed training scheme for simultaneous image deblurring and inpainting, achieving state-of-the-art results.
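The ℓ1-regularized subproblems inside such an ADMM solver reduce to soft-thresholding. A minimal sketch on the simplest instance (ℓ1 denoising, where the known closed-form answer lets us check the iteration — the problem choice and parameter values are ours, not the paper's):

```python
import numpy as np

def soft(v, thresh):
    """Soft-thresholding: the proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)

def admm_l1_denoise(y, lam, rho=1.0, iters=200):
    """Solve min_x 0.5||x - y||^2 + lam*||x||_1 by ADMM with the split x = z."""
    x = np.zeros_like(y); z = np.zeros_like(y); u = np.zeros_like(y)
    for _ in range(iters):
        x = (y + rho * (z - u)) / (1.0 + rho)   # quadratic subproblem, closed form
        z = soft(x + u, lam / rho)              # l1 subproblem via shrinkage
        u = u + x - z                           # dual ascent
    return x

y = np.array([3.0, -0.5, 1.2, -2.0])
x_admm = admm_l1_denoise(y, lam=1.0)
x_closed = soft(y, 1.0)   # known closed-form solution of this toy problem
```

In the full CSC model the x-update is instead solved in the Fourier domain, but the shrinkage and dual-ascent steps keep this shape.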
The monocular camera is widely used in robots and unmanned vehicle systems because it is low cost and easy to calibrate. However, the lack of depth information from a monocular camera hinders positioning and determining the real size of obstacles in an unmanned vehicle system. To solve this problem, we propose a collaborative structure that accurately acquires the positions of static or dynamic obstacles based on partial observations from multiple monocular cameras. We then propose a reinforcement learning based obstacle avoidance algorithm for unmanned vehicles in an unknown environment. Specifically, we discuss the influence of obstacles' moving orientations on the performance of adaptive obstacle avoidance. Simulation results verify the feasibility of the proposed algorithm.
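Combining partial observations from several monocular cameras to localize an obstacle is, at its core, ray triangulation. A minimal two-camera sketch (midpoint method; the camera poses and the obstacle position are made-up test values, not from the paper):

```python
import numpy as np

def triangulate_midpoint(o1, d1, o2, d2):
    """Closest-point (midpoint) triangulation of two viewing rays
    o_i + t_i * d_i from two cameras; directions need not be unit length."""
    d1 = d1 / np.linalg.norm(d1); d2 = d2 / np.linalg.norm(d2)
    # Solve for t1, t2 minimizing ||(o1 + t1 d1) - (o2 + t2 d2)||^2
    A = np.stack([d1, -d2], axis=1)                  # 3x2 system
    t, *_ = np.linalg.lstsq(A, o2 - o1, rcond=None)
    p1 = o1 + t[0] * d1
    p2 = o2 + t[1] * d2
    return 0.5 * (p1 + p2)

obstacle = np.array([2.0, 1.0, 5.0])
o1 = np.array([0.0, 0.0, 0.0]); o2 = np.array([4.0, 0.0, 0.0])
p = triangulate_midpoint(o1, obstacle - o1, o2, obstacle - o2)
```

With noisy real detections the two rays do not intersect, and the midpoint gives the least-squares obstacle position.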
In recent years, augmented reality (AR) technology has gained great attention. One critical component of AR devices is the see-through optical combiner. Several approaches have been proposed, using refractive, reflective, diffractive, or holographic optics, or a combination of them. Meta-surfaces can realize more demanding optical elements owing to their superb ability to control the amplitude, phase, polarization, and other wavefront parameters at subwavelength scale. We therefore propose a waveguide coupler based on a polarization-independent doublet meta-surface, which extends the field of view (FOV) to 50° at a wavelength of 638 nm. Compared with diffraction-grating-based optical combiner architectures, our doublet meta-surfaces alleviate the FOV limit imposed by the refractive index of the waveguide and the uneven brightness of AR display devices. The polarization-independent doublet meta-surfaces presented in this paper provide a large phase gradient to achieve a large deflection angle. The meta-surfaces consist of silicon nanofins of different side lengths but the same height, arranged on a rectangular lattice on a glass substrate with a refractive index of 1.764. Each unit structure has a different electromagnetic response to plane electromagnetic waves at different incident angles. By designing the shape, size, and other parameters of each silicon nanofin, the required phase and amplitude modulation can be achieved. Here we consider only one-dimensional eye-box expansion at a single wavelength; the design approach can be generalized to two-dimensional eye-box expansion. Moreover, full-color display can be achieved by coordinating the in-coupling and out-coupling devices in the waveguides.
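The link between the meta-surface phase gradient and the deflection angle is the generalized Snell's law: a linear phase ramp with one full 2π per period Λ deflects normally incident light by sin θ = λ/(nΛ). A small sketch (the 1500 nm period is an illustrative value of ours; the 638 nm wavelength and 1.764 index are from the abstract):

```python
import math

def deflection_angle_deg(wavelength_nm, period_nm, n_out=1.0):
    """Generalized Snell's law at normal incidence: a metasurface phase ramp
    of one 2*pi per `period_nm` deflects light by sin(theta) = lambda/(n*period)."""
    s = wavelength_nm / (n_out * period_nm)
    if abs(s) > 1.0:
        raise ValueError("evanescent: no propagating diffracted order")
    return math.degrees(math.asin(s))

# 638 nm light on a 1500 nm phase period
theta_air = deflection_angle_deg(638.0, 1500.0)            # deflection in air
theta_guide = deflection_angle_deg(638.0, 1500.0, 1.764)   # inside the glass waveguide
```

A shorter period (larger phase gradient) gives a larger deflection angle, which is why the large "phase mutation rate" of the doublet matters for FOV.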
Spectral imaging captures both spatial and spectral data of a scene, providing an efficient technique for analysis and identification. To improve the efficiency of data acquisition, compressive sensing (CS) methods have been introduced into spectral imaging systems. In this work, we propose a novel macropixel segmentation method to realize effective, non-mechanical single-pixel multispectral imaging. A series of macropixel-based patterns is designed to modulate the data cube of the target object; a spatial light modulator (SLM) and a multispectral filter array are used to generate these patterns. A CS algorithm recovers the data cube from the 1-D signal acquired by a single-pixel detector. In the experimental set-up, the binary patterns are aligned with the subareas of the macropixel filter array. Without mechanical or dispersive components, the proposed method holds great potential for the miniaturization and integration of spectral imaging devices.
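The single-pixel principle is easiest to see in the fully sampled, orthogonal-pattern case: each pattern yields one bucket-detector value, and the scene is recovered by the inverse transform. This sketch uses Sylvester Hadamard patterns on a 1-D toy scene (our simplification — the paper's CS setting would instead keep a subset of measurements and run a sparse-recovery algorithm):

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

n = 16                                   # number of pixels in the (flattened) scene
H = hadamard(n)
scene = np.arange(n, dtype=float)        # toy 1-D "image"
measurements = H @ scene                 # one single-pixel value per displayed pattern
recovered = (H @ measurements) / n       # H @ H = n * I for this construction
```

Replacing the full pattern set with a random subset plus an ℓ1 solver turns this into the compressive acquisition described above.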
Perovskites are widely used in solar cell manufacturing due to their extraordinary photoelectric characteristics, and the crystal quality of perovskite plays an important role in photoelectric conversion. Although conventional crystal quality detection methods, such as scanning electron microscopy (SEM) and atomic force microscopy (AFM), offer high spatial resolution, they are usually time-consuming, expensive, and sometimes unavoidably damage the samples. Hyperspectral imaging (HSI) technology has been used to monitor material growth in recent years because of its high spectral resolution, non-invasiveness, and fast detection speed. Micro-hyperspectral imaging (MHSI) technology combines HSI with microscopy, making it suitable for micro- and nanoscale material analysis. In this work, we developed an MHSI system and acquired 3D data of perovskite mono-crystals in transmission mode at room temperature. The perovskite mono-crystals were prepared by a one-step solution self-assembly method. The experimental results show that the characteristic absorption wavelength of perovskite is directly related to the thickness of the mono-crystals: as the thickness increases, the absorption wavelength red-shifts. The thickness dependence was also verified by white-light interferometry. The composition ratio of the perovskite mono-crystals shows a clear dependence on the absorbance below 540 nm: the higher the proportion of Br atoms, the weaker the light absorption, as further verified by energy-dispersive spectroscopy (EDS). In conclusion, MHSI technology can effectively monitor and analyze the preparation and quality of micro- and nanoscale materials and structures, and shows wide application prospects in materials science and medicine.
Fluorescent molecules play an important role in many fields owing to their high sensitivity and ease of use. The traditional method for detecting fluorescent molecules is laser scanning confocal microscopy, but it suffers from light pollution and can detect only a single class of fluorescent molecules at a time. Recently, microscopic hyperspectral imaging (MHSI) technology has been applied to the detection of fluorescent molecules because of its high spectral resolution and non-destructive detection. However, the low spatial resolution of MHSI makes high-precision molecular research difficult, so an image processing method to improve it is needed. In this work, a two-step data processing method is proposed to enhance the automatic classification of fluorescent molecules. We used a microscopic hyperspectral system to image mixed samples of five kinds of fluorescent molecules in transmission mode. The first step exploits the difference in the unit slope of the spectral curve (in the wavelength range 410 nm to 550 nm) between fluorescent molecules and the background, and proposes an image segmentation method based on the point of minimum light transmission. The second step calculates the relative absorbance of each voxel with respect to the nearest background voxel found by the segmentation, and takes the absorbance as the final classification feature. Compared with the traditional transmittance feature on six machine learning classification models, the new feature improves the average classification accuracy by 2.2% and reduces the time per classification by approximately one third. In conclusion, the proposed two-step data processing method is suitable for classifying multiple kinds of fluorescent molecules, offers high efficiency and accuracy, and is expected to find wide use in biology, medicine, materials, and other fields.
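The relative-absorbance feature in the second step follows the usual Beer-Lambert convention: transmittance relative to a background voxel, then A = -log10(T). A minimal sketch (the clipping floor is our numerical guard, not part of the paper):

```python
import numpy as np

def relative_absorbance(i_sample, i_background):
    """Absorbance of each voxel relative to its nearest background voxel:
    A = -log10(T), with transmittance T = I_sample / I_background."""
    t = np.clip(i_sample / i_background, 1e-6, None)  # guard against log(0)
    return -np.log10(t)

# A voxel transmitting half the background light, and a pure background voxel
a = relative_absorbance(np.array([50.0, 100.0]), np.array([100.0, 100.0]))
```

Background voxels map to zero absorbance, so the feature is automatically normalized against illumination non-uniformity.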
Medical CT image amplification and reconstruction system based on deep learning. Shuwang Chen, Yun Wang, Meng Wang; Institute of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang. The SRGAN (Super-Resolution Generative Adversarial Network) algorithm is used in the system to perform super-resolution reconstruction of medical CT images. Medical CT images processed by this method better reflect the details of the image, which aids doctors' observation and correct diagnosis.
Fourier ptychographic microscopy (FPM) is a recently developed computational imaging technique that bypasses the limit of the optical space-bandwidth product to achieve imaging with both high resolution and a wide field of view. The original FPM setup is limited to imaging thin samples under angle-varied illumination. A diffuser-scanning FPM method, which scans a diffuser with an unknown profile between the object and the objective lens to modulate the complex exit wavefront of the object, has been implemented to address the resulting 3D refocusing problem. However, since the profile of the diffuser is unknown, a large number of images is required for the method to converge, and a mechanical translation stage is needed, which may introduce systematic position errors. In this paper, we propose a new FPM method with wavefront modulation imposed by a spatial light modulator (SLM). The SLM is placed between the object and the objective lens to produce a specific pattern profile that modulates the complex exit wavefront of the object, and a Fourier ptychographic phase retrieval process recovers this wavefront. With the SLM, no 2D motorized translation stage is needed to move the imposed pattern profile, so translation stage errors are eliminated. Moreover, since the imposed pattern profile is known, fewer images are required for the algorithm to converge. Simulations and experiments validate the effectiveness of the proposed method: compared with the unknown-diffuser method, ours achieves a more accurate reconstruction with faster data acquisition and higher convergence speed.
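The workhorse of the phase retrieval step is the amplitude-replacement projection applied in every sub-iteration: keep the current phase estimate, substitute the measured amplitude. A minimal sketch of just that projection (the arrays here are random stand-ins, not real data):

```python
import numpy as np

rng = np.random.default_rng(1)
estimate = rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8))  # current complex field guess
measured_amp = np.abs(rng.normal(size=(8, 8)))                     # sqrt of a measured intensity

# Core projection of each Fourier-ptychographic sub-iteration:
# keep the estimated phase, replace the amplitude by the measured one.
updated = measured_amp * np.exp(1j * np.angle(estimate))
```

In the full algorithm this projected field is transformed back and written into the corresponding sub-aperture of the object spectrum before moving to the next SLM pattern.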
Image reconstruction of an object viewed through a scattering medium has attracted much interest due to its potential applications in related fields. Recently, deep learning techniques have been introduced into computational imaging through scattering media with good results. In this work, a modified U-Net model with dense blocks is designed under the PyTorch framework, with MobileNet as the backbone. The network is trained with a mean squared error (MSE) loss function. Through depthwise separable convolutions, the model extracts image features and classifies and restores the information of every pixel of the speckle field, so that the image can be reconstructed from the speckle field. The experimental results show that this network generalizes well for image reconstruction and improves the ability of information acquisition.
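The efficiency of the MobileNet-style backbone comes from depthwise separable convolution: a per-channel k×k filter followed by a 1×1 pointwise mix, instead of one full k×k×Cin×Cout filter bank. A quick parameter-count comparison (bias terms ignored; the 3×3, 64→128 layer is an illustrative choice of ours):

```python
def conv_params(k, c_in, c_out):
    """Weights of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def separable_params(k, c_in, c_out):
    """Depthwise k x k filter per input channel + 1x1 pointwise mixing."""
    return k * k * c_in + c_in * c_out

standard = conv_params(3, 64, 128)        # 3*3*64*128
separable = separable_params(3, 64, 128)  # 3*3*64 + 64*128
ratio = standard / separable
```

For a 3×3 kernel the separable form needs roughly 8-9x fewer weights, which is what makes the speckle-reconstruction network light enough to train quickly.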
In this paper, the temperature of a butane flame is measured by thin-filament pyrometry (TFP). The luminous length of the butane flame is about 80 mm. A high-temperature-resistant tungsten-rhenium wire is placed in the flame; wire diameters of 0.1 mm, 0.25 mm, and 0.5 mm are used. A scientific sCMOS camera, calibrated in an emissivity calibration experiment, measures the radiance of the tungsten-rhenium filament, and Planck's blackbody radiation law is used to calculate the temperature of the filament and thus the temperature of the butane flame. The results show that the highest butane flame temperature measured by the filament pyrometer is 1122 K. A standard armored K-type thermocouple is used to verify the experimental accuracy, and the error between the calculated value and the standard value is less than 5%. The butane combustion experiments show that this method can be applied to similar temperature measurements.
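Converting a calibrated radiance into a temperature amounts to inverting Planck's law at the camera's effective wavelength. A round-trip sketch (the 650 nm band is an illustrative choice of ours; in practice the filament emissivity calibration enters as a multiplicative correction on the measured radiance):

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
KB = 1.380649e-23    # Boltzmann constant, J/K

def planck_radiance(wl, temp):
    """Blackbody spectral radiance at wavelength wl (m) and temperature temp (K)."""
    return (2.0 * H * C**2 / wl**5) / math.expm1(H * C / (wl * KB * temp))

def temperature_from_radiance(wl, radiance):
    """Invert Planck's law for temperature, as in filament pyrometry."""
    return (H * C / (wl * KB)) / math.log1p(2.0 * H * C**2 / (wl**5 * radiance))

wl = 650e-9                                # a visible camera band, metres
L = planck_radiance(wl, 1122.0)            # simulate the measured radiance
T = temperature_from_radiance(wl, L)       # round trip back to temperature
```

`expm1`/`log1p` keep the inversion numerically stable, and the round trip recovers the 1122 K flame temperature reported above.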
In this paper, we address rain streak removal from a single image. To efficiently detect and remove the annoying rain streaks, we propose a global single-directional gradient prior with the L0 norm to model the rain streaks. To preserve the rich information of the background, we learn a convolutional sparse coding (CSC) representation of the background. Furthermore, we develop an alternating direction method of multipliers (ADMM) to solve the multi-variable optimization problem. Experiments on synthesized and real-world images show that the proposed method outperforms state-of-the-art methods in terms of rain streak removal and background preservation.
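Inside an ADMM solver, an L0-norm term is handled by hard thresholding (its proximal operator), which keeps only entries whose squared magnitude exceeds 2λ. A small sketch on a made-up vector of single-directional gradients (our toy values):

```python
import numpy as np

def hard_threshold(v, lam):
    """Proximal operator of lam*||.||_0: keep entries with v**2 > 2*lam."""
    out = v.copy()
    out[v**2 < 2.0 * lam] = 0.0
    return out

# Rain streaks produce large single-directional gradients; the small
# gradients of background texture are zeroed by the L0 shrinkage.
g = np.array([0.05, -0.1, 3.0, -2.5, 0.2])
rain_support = hard_threshold(g, lam=0.5)
```

Unlike the soft threshold used for ℓ1 priors, the surviving entries are not shrunk, which suits the all-or-nothing nature of streak detection.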
Infrared (IR) small target detection in a single frame is challenging because of the lack of texture and color information and the interference of background clutter. Exploiting the two-dimensional Gaussian-like shape of IR small targets, we characterize two properties from the perspective of local gradient and directional curvature (LGDC). Specifically, the local gradients in the four quadrants and the curvatures along four directions should be distributed in a regular way in the target region. An LGDC map is therefore computed from the input IR image, greatly improving the contrast between target and background. By this means, we can extract the IR small target with a simple threshold related to the mean and standard deviation of the LGDC map. Experiments on real IR images verify that the proposed method achieves satisfactory performance in terms of local contrast enhancement and background clutter suppression.
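The final extraction step is a mean-plus-k-sigma threshold on the enhanced map. A sketch of that step alone, on a synthetic frame with one planted Gaussian-like blob standing in for the LGDC map (the image size, blob amplitude, and k = 5 are our test values):

```python
import numpy as np

rng = np.random.default_rng(2)
img = rng.normal(0.0, 1.0, size=(64, 64))   # background clutter (stand-in for an LGDC map)
img[30:33, 40:43] += 20.0                   # a small bright "target"

k = 5.0
thr = img.mean() + k * img.std()            # threshold from mean and standard deviation
ys, xs = np.nonzero(img > thr)              # detected target pixels
```

Because the LGDC map suppresses clutter before this step, a single global threshold is enough to isolate the target region.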
Stereo matching is one of the key techniques in stereo vision. Because existing adaptive-window stereo matching algorithms extract features poorly in low-texture regions, a weighted dynamic adaptive-window stereo matching algorithm based on pixel gradient values is proposed. First, the Sobel operator is used to calculate the gradient value of each pixel, and phase information is introduced; based on these two quantities, pixels are divided into strong, medium, and weak texture regions, and different thresholds are assigned to the pixels of each region. The image is then converted from the RGB color space to the HSV color space, and the matching window is dynamically generated according to the region threshold and the color threshold. A HAD cost function is established and the traditional Census algorithm is improved. The disparity map is obtained by nonlinear fusion, refined by subpixel detection, and smoothed with a median filter, finally yielding a high-precision disparity map. Experimental results show that the proposed algorithm is effective, has high matching accuracy, and is robust to optical distortion and at edge regions.
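The texture classification starts from per-pixel Sobel gradients. A self-contained sketch of valid-mode 3×3 Sobel filtering (loop-based for clarity; the step-edge test image is ours):

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def sobel_gradients(img):
    """Valid-mode 3x3 Sobel gradients; returns (gx, gy) of shape (H-2, W-2)."""
    H, W = img.shape
    gx = np.zeros((H - 2, W - 2)); gy = np.zeros_like(gx)
    for i in range(H - 2):
        for j in range(W - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(patch * SOBEL_X)
            gy[i, j] = np.sum(patch * SOBEL_Y)
    return gx, gy

# A vertical step edge: strong horizontal gradient, zero vertical gradient.
img = np.zeros((5, 6)); img[:, 3:] = 1.0
gx, gy = sobel_gradients(img)
```

The gradient magnitude computed from (gx, gy), compared against per-region thresholds, is what assigns each pixel to the strong, medium, or weak texture class.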
Compared with images taken in air, underwater images suffer from the adverse effects of complex water conditions, such as absorption, attenuation, scattering, various noises, and uneven lighting. These effects severely reduce the quality of underwater imaging and cause, among other problems, image color distortion. To fully understand underwater scenes, it is important to acquire images with accurate colors, especially for marine rescue, underwater target recognition, and marine ecological research. To address this problem, this paper proposes an underwater imaging technique based on a light field camera array. This technique not only performs accurate color correction on the captured images but also expands the dynamic range of underwater images, thereby producing enhanced, high-quality, true-color underwater images. The experimental system contains a 3×3 camera array that simultaneously captures images of underwater targets with different exposure times. Because the attenuation of light in water varies with wavelength, during color restoration we measure the underwater spectral power distribution relative to that in air, calculate the tristimulus change according to colorimetric principles, and convert it from the CIEXYZ color space to the CIE L*a*b* color space; the colors of the underwater images captured by the camera array are then compensated. In the image fusion step, images with different exposures are used to recover the response function of the imaging process, and multiple images are fused into a single high-dynamic-range radiance map; high-quality tone mapping then displays these high-contrast images on devices with a limited dynamic range.
Experimental results show that our light-field color correction and dynamic range expansion methods have clear advantages over single cameras in underwater color correction and in imaging and detecting underwater moving targets, providing advanced technology and equipment for underwater in-situ applications.
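The XYZ-to-L*a*b* conversion used in the color compensation step follows the standard CIE 1976 formulas. A minimal sketch (we assume a D65 reference white on a 0-100 scale; the paper does not state its illuminant):

```python
import numpy as np

# D65 reference white, 2-degree observer (assumed here)
WHITE = np.array([95.047, 100.0, 108.883])

def xyz_to_lab(xyz):
    """CIE 1976 XYZ -> L*a*b* conversion (xyz on a 0..100 scale)."""
    r = np.asarray(xyz, dtype=float) / WHITE
    eps = (6.0 / 29.0) ** 3
    # cube root above the threshold, linear segment below it
    f = np.where(r > eps, np.cbrt(r), r / (3 * (6.0 / 29.0) ** 2) + 4.0 / 29.0)
    L = 116.0 * f[1] - 16.0
    a = 500.0 * (f[0] - f[1])
    b = 200.0 * (f[1] - f[2])
    return np.array([L, a, b])

lab_white = xyz_to_lab(WHITE)   # the reference white itself
```

Working in L*a*b* makes the wavelength-dependent attenuation corrections approximately perceptually uniform, which is why the compensation is applied there rather than in XYZ.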
The development trends for optical imaging systems today are light weight, large zoom range, and low power consumption. A traditional zoom imaging system is composed of glass lenses, a mechanical control module, and an image sensor; it zooms by mechanically adjusting the distances between the lens groups. The shortcomings of traditional zoom imaging systems are their complex structure, inconvenient operation, and discrete zoom steps. Combining focus-tunable lenses with glass lenses makes a zoom optical imaging system lighter and better able to capture image detail. In this paper, we propose a variable-magnification imaging system composed of a liquid crystal lens array combined with glass lenses. The system consists of an optical lens group, a liquid crystal lens array, and a CMOS detector. The optical lens group is a telephoto lens composed of traditional glass lenses, and the liquid crystal lens is a device whose focal length is controlled electronically. The imaging principle of the system is as follows: the object is first imaged by the optical lens group, the liquid crystal lens array then re-images this intermediate image, and finally the image is captured by the CMOS detector. Zoom imaging is achieved by adjusting the voltage applied to the liquid crystal lens array to tune its focal length and then mechanically translating the CMOS detector.
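The two-stage imaging principle can be worked through with the thin-lens equation: the glass group forms an intermediate image, and retuning the liquid crystal lens's focal length changes the second-stage (and hence total) magnification. All focal lengths and distances below are made-up illustrative numbers, not the paper's design values:

```python
def image_distance(f, u):
    """Thin-lens equation 1/f = 1/u + 1/v with the usual Gaussian sign
    convention (object distance u and real image distance v positive)."""
    return 1.0 / (1.0 / f - 1.0 / u)

def magnification(f, u):
    v = image_distance(f, u)
    return -v / u

# Fixed glass objective (f = 50 mm) forms an intermediate image; a tunable
# liquid-crystal lens re-images it, and tuning f2 changes total magnification.
u1 = 1000.0                                 # object distance, mm
m1 = magnification(50.0, u1)                # first-stage magnification
u2 = 30.0                                   # relay object distance to the LC lens
m_total_a = m1 * magnification(20.0, u2)    # LC lens tuned to f2 = 20 mm
m_total_b = m1 * magnification(25.0, u2)    # LC lens retuned to f2 = 25 mm
```

Because the second image distance also moves when f2 changes, the CMOS detector must be translated to stay at the new image plane, exactly as the abstract describes.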
Waste classification based on deep neural networks faces a shortage of datasets, since collecting and labeling waste samples is expensive and time-consuming. We propose an improved ResNet-18 model based on Model-Agnostic Meta-Learning (MAML) to improve classification accuracy on a few-shot waste classification dataset. The feature extraction part of the improved model comprises a convolution layer and four residual blocks; the classification part comprises a max-pooling layer and three fully connected layers. Moreover, GroupNorm is adopted to reduce the impact of normalizing different feature distributions on the classification accuracy. Initialized with parameters from MAML training on the Mini-ImageNet dataset, the model improves accuracy with only one training iteration on few waste samples. The experiments verify the effectiveness of our model on the Mini-ImageNet dataset and a few-shot waste classification dataset.
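GroupNorm normalizes each sample's channels in groups, independently of batch statistics, which is why it behaves well in the tiny-batch few-shot regime. A numpy sketch of the operation (the learned affine scale/shift is omitted; shapes and group count are our test values):

```python
import numpy as np

def group_norm(x, num_groups, eps=1e-5):
    """GroupNorm over an (N, C, H, W) tensor: normalize each sample's
    channel groups to zero mean / unit variance (no learned affine here)."""
    n, c, h, w = x.shape
    g = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    g = (g - mean) / np.sqrt(var + eps)
    return g.reshape(n, c, h, w)

rng = np.random.default_rng(3)
x = rng.normal(5.0, 2.0, size=(2, 8, 4, 4))
y = group_norm(x, num_groups=4)
```

Unlike BatchNorm, the statistics never mix information across samples, so a single adaptation step on a handful of waste images sees consistent normalization.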
Improving accuracy while maintaining real-time performance in object tracking is a major challenge in computer vision. In this paper, an improved Similarity-Perception Siamese (SP-Siam) network tracking algorithm based on SiamFC is proposed. The algorithm introduces a squeeze-and-excitation (SE) block and a residual network for the similarity map of the Siamese network, adaptively recalibrating the channel responses of the similarity map between the target and the search inputs by explicitly modeling the interdependence between channels. This study also verifies the network performance on the Object Tracking Benchmark (OTB) tracking datasets. The experimental results show that the squeeze-and-excitation block on the similarity map brings a significant performance improvement to the existing Siamese network at slight additional computational cost.
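The SE block itself is small: a global average pool (squeeze), a two-layer bottleneck with a sigmoid (excitation), and a per-channel rescaling. A numpy sketch (random weights and a reduction ratio of 2 are our placeholders; a trained block would learn w1 and w2):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_block(x, w1, w2):
    """Squeeze-and-Excitation on an (N, C, H, W) feature map:
    squeeze = global average pool, excitation = two FC layers + sigmoid,
    then rescale every channel by its weight."""
    s = x.mean(axis=(2, 3))                     # (N, C) squeeze
    e = sigmoid(np.maximum(s @ w1, 0.0) @ w2)   # (N, C) excitation, ReLU bottleneck
    return x * e[:, :, None, None]              # channel-wise recalibration

rng = np.random.default_rng(4)
n, c, r = 2, 8, 2                               # r = bottleneck reduction ratio
x = rng.normal(size=(n, c, 5, 5))
w1 = rng.normal(size=(c, c // r)); w2 = rng.normal(size=(c // r, c))
y = se_block(x, w1, w2)
```

The extra cost is only two small matrix products per frame, which matches the abstract's claim of a slight computational overhead.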
With the rapid development of computer vision, archaeology, medicine, reverse engineering, and other fields, optical 3D measurement, one of their crucial technologies, is used ever more widely. In actual measurement, because of the limited range of the measuring equipment and occlusion of the measured object, it is difficult to obtain the complete shape of an object from a single measurement; multiple measurements from different perspectives are required, and the point cloud data obtained from each perspective must be registered together. To register and stitch two point clouds with a relatively low overlap rate, this paper proposes a method based on curvature features and a direction-vector threshold. In the registration step, the curvature features of the point cloud data are used to achieve accurate matching, and a k-d tree nearest-neighbor search accelerates the search for matching points. To further reduce the registration error, wrong point pairs are eliminated with the direction-vector threshold method, and OpenMP multi-threaded parallel computation of the direction vectors improves efficiency and speed. Subsequently, the rotation matrix R and translation vector t between the two point clouds are obtained by singular value decomposition, and the resulting transformation matrix is used to perform the rigid-body transformation between the point clouds. Experimental results show that the proposed algorithm effectively improves the registration accuracy and time efficiency for point cloud data with a low initial overlap rate.
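Once matched point pairs survive the direction-vector filter, the SVD step for R and t is the classical Kabsch/Arun solution. A sketch that recovers a known rigid transform from synthetic correspondences (the test rotation and translation are made-up values):

```python
import numpy as np

def rigid_align(src, dst):
    """Least-squares rigid transform (R, t) with dst ~ R @ src + t,
    via SVD of the cross-covariance (Kabsch / Arun's method)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)               # cross-covariance of centred clouds
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cd - R @ cs
    return R, t

rng = np.random.default_rng(5)
src = rng.normal(size=(20, 3))
theta = 0.7
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([1.0, -2.0, 0.5])
dst = src @ R_true.T + t_true
R_est, t_est = rigid_align(src, dst)
```

With noisy or partially wrong pairs this least-squares solution is exactly why the outlier elimination step beforehand matters so much at low overlap.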
The coded aperture snapshot spectral imager (CASSI) is a promising approach to acquiring hyperspectral images. One of the latest CASSI designs is a dual-camera configuration that adds a grayscale camera capturing the same scene. In this paper, an improved method based on the two-step iterative shrinkage/thresholding algorithm (TwIST) is proposed to exploit more efficiently the structural information of the objects contained in the grayscale camera's images. The information from the auxiliary camera and the CASSI detector is used to construct an estimated 3D hyperspectral data cube. TwIST with TV regularization is then used to reconstruct the residual image from the residual data. The final reconstructed hyperspectral image is the sum of the estimated image and the reconstructed residual image, which ensures that the result is structurally closer to the original image. Simulation results show that our method improves the image quality of the reconstructed hyperspectral images and uses less run time than the original method for all the data we have tried. The peak signal-to-noise ratio (PSNR) is increased by 8.99 dB, the structural similarity (SSIM) is increased by 0.0757, and the spectral angle mapper (SAM) is reduced by 0.1987.
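The core TwIST update can be sketched on a toy problem. The sketch below minimizes a generic least-squares term plus an l1 penalty with a dense matrix A; the paper's actual solver uses the CASSI forward model and TV regularization instead of l1, so this only illustrates the two-step iteration shape:

```python
import numpy as np

def soft(x, tau):
    """Soft-thresholding, the shrinkage operator for the l1 penalty."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def twist(y, A, lam=0.1, alpha=1.0, beta=1.0, n_iter=200):
    """Two-step iterative shrinkage/thresholding for
    min_x 0.5*||y - A x||^2 + lam*||x||_1 on a toy dense problem.
    With alpha = beta = 1 this reduces to plain IST; TwIST tunes
    alpha and beta from spectral bounds of A^T A for faster convergence."""
    L = np.linalg.norm(A, 2) ** 2            # step size from ||A||^2
    x_prev = np.zeros(A.shape[1])
    x = soft(A.T @ y / L, lam / L)           # initial IST step
    for _ in range(n_iter):
        grad_step = x + A.T @ (y - A @ x) / L
        x = ((1 - alpha) * x_prev
             + (alpha - beta) * x
             + beta * soft(grad_step, lam / L))
        x_prev = x
    return x
```

In the dual-camera scheme, such a solver is run on the residual measurements, and the recovered residual cube is added back to the estimated cube to form the final reconstruction.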
To address the quality degradation of underwater images caused by absorption and scattering in the water body, this paper proposes an underwater image enhancement method that combines computational imaging and deep learning. The method achieves good results in removing image blur and scattering noise, and can effectively enhance target images in turbid water, extending the range of applications for underwater imaging.
Compared with traditional 2D imaging, omnidirectional imaging can provide users with a 360°×180° immersive visual experience, which also makes the objective quality assessment of omnidirectional images more challenging. In this work, a blind omnidirectional image quality assessment (IQA) method based on a spherical triangle mesh representation and a multi-channel residual graph convolution network (denoted Multi-RES-GCN) is proposed. The method comprises two stages: generation and optimization of the omnidirectional image's spherical triangle mesh, and quality prediction based on Multi-RES-GCN. In the first stage, the spherical representation of the omnidirectional image (the spherical image) is used, and a new scheme for spherical triangle mesh generation and optimization is proposed, which reasonably samples pixels on the spherical image and optimizes the sampled points to generate more accurate triangular meshes. In the second stage, the spherical image is divided into six view regions, the triangle mesh nodes are assigned to view regions according to their positions, and the nodes are then fed to the quality predictor. The quality predictor consists of the Multi-RES-GCN and an estimator: the Multi-RES-GCN models the nodes and the dependency relationships between them, while the estimator regresses the features extracted by the Multi-RES-GCN to the weight and quality score of each view region. The final quality score of the omnidirectional image is predicted as the weighted sum of these per-region scores. Experimental results demonstrate that the proposed method outperforms other state-of-the-art IQA metrics on two omnidirectional IQA databases.
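The per-region assignment and the final weighted aggregation can be sketched as follows. The cube-face partition by dominant axis is one plausible reading of the paper's six view regions, and the names below are illustrative, not the authors' API:

```python
import numpy as np

def view_region(p):
    """Assign a unit direction vector to one of six cube-face view
    regions (+x, -x, +y, -y, +z, -z) by its dominant axis -- an assumed
    reading of the paper's six-region partition of the sphere."""
    axis = int(np.argmax(np.abs(p)))
    return 2 * axis + (0 if p[axis] >= 0 else 1)

def fuse_view_scores(scores, weights):
    """Final omnidirectional score as the weighted sum of per-region
    quality scores, with the regressed weights normalized to sum to 1."""
    w = np.asarray(weights, dtype=float)
    return float(np.dot(w / w.sum(), scores))
```

In the paper, both the per-region scores and their weights come from the estimator head on top of the Multi-RES-GCN features; only the final weighted summation is shown here.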
Variable head poses and low-quality eye images in natural scenes can lead to low gaze estimation accuracy. In this paper, we propose a multi-feature fusion gaze estimation model based on an attention mechanism. First, face and eye feature extractors based on a group convolution channel and spatial attention mechanism (GCCSAM) are designed to use channel and spatial information to adaptively select and enhance important features in the face image and the two eye images, while suppressing information irrelevant to gaze estimation. We then design two feature fusion networks to fuse the features of the face, the two eyes, and the pupil center position, thus avoiding the effects of two-eye asymmetry and inaccurate head pose estimation. The average angular error of the proposed method is 4.1° on MPIIGaze and 5.2° on EyeDiap. Compared with current mainstream methods, our method effectively improves the accuracy and robustness of gaze estimation in natural scenes.
In-line X-ray phase contrast imaging (IL-PCI) is a promising technology for clinical diagnosis because of its great advantage in distinguishing low-contrast tissues and its simple implementation. To recover phase projections from phase contrast measurements, conventional phase retrieval methods were developed under assumptions such as homogeneous material or weak attenuation, and thus suffer from limited generalizability, practicability, and feasibility. Deep learning-based methods have been proposed for phase retrieval with great success; however, the practical physical model of phase contrast imaging has not been fully considered, including the non-ideal effects of the finite size of the X-ray micro focal spot, the finite pixel size of the detector, and system noise. In this paper, a convolutional network based on a generative adversarial network is proposed to retrieve phase projections while fully accounting for these non-ideal effects in IL-PCI. The network is composed of a generator, from which the phase projections are retrieved, and a discriminator, in which the difference between the generator's output and the reference phase projection is processed and backpropagated to the network's input. Phase contrast measurements of a microsphere phantom were simulated and retrieved by both the conventional methods and the proposed network. Results show the superiority of the proposed network in spatial resolution and noise suppression compared with the conventional method.
In recent years, deep convolutional neural networks (CNNs) have achieved great success in single image super-resolution (SISR). However, existing CNN-based SISR methods struggle to achieve ideal performance because of the limited information contained in a single low-resolution (LR) image. Moreover, when the scale factor is large, SISR methods find it difficult to learn and reconstruct the unknown information, resulting in poor performance. To address these issues, we propose MFSRResNet, a deep residual learning super-resolution framework that takes multi-frame LR images as input. MFSRResNet is based on the SRResNet architecture; the main modifications are the number of input frames and the number of feature maps in the convolutional layers. We use five LR frames as input rather than a single LR frame, creating the multi-frame LR images by randomly downsampling an HR image while ensuring sub-pixel shifts among them. The multi-frame input increases the amount of information available at the input end and thus substantially improves the reconstruction results. Experiments show that MFSRResNet integrates the information from the different LR images well and obtains better reconstruction results, demonstrating state-of-the-art performance on the benchmark datasets in terms of peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). Averaged over the two benchmark datasets Set5 and Set14, the improvement in PSNR/SSIM over the current state-of-the-art SISR method RCAN is 2.67 dB/0.0495 (×3), 2.27 dB/0.05498 (×4), and 1.56 dB/0.0504 (×8).
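The multi-frame input generation, downsampling one HR image several times with sub-pixel grid shifts, can be sketched as below. This is a simplified version using nearest-neighbour sampling; the paper does not specify its exact downsampling kernel:

```python
import numpy as np

def make_lr_frames(hr, scale=4, n_frames=5, rng=None):
    """Create n_frames LR images from one HR image by shifting the
    sampling grid by a random sub-(LR)-pixel offset before decimating,
    so each frame carries slightly different information about the scene."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = hr.shape[:2]
    frames = []
    for _ in range(n_frames):
        dy, dx = rng.integers(0, scale, size=2)   # shift within one LR pixel
        frames.append(hr[dy::scale, dx::scale][: h // scale, : w // scale])
    return np.stack(frames)
```

Stacking the resulting frames along the channel axis gives the five-frame input that MFSRResNet consumes in place of SRResNet's single LR image.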
Conventional optical imaging systems require complex optical designs to eliminate multiple aberrations simultaneously and achieve global optimization. Imaging through scattering media, by contrast, can be achieved in a single-shot, non-invasive way by the speckle autocorrelation method based on the optical memory effect. By adding a scattering medium to the imaging system, multiple aberrations can be eliminated simultaneously. As an example, in a simple optical imaging system with spatially incoherent illumination, a ground glass plate is placed between the lens and the camera as the scattering medium; spherical aberration, coma, and chromatic aberration are then eliminated at the same time. Scattering media can therefore be used as a tool to optimize optical imaging systems.
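The first step of the speckle autocorrelation method, computing the autocorrelation of the camera image, can be sketched via the Wiener-Khinchin theorem (a generic illustration; the subsequent phase-retrieval inversion is not shown):

```python
import numpy as np

def autocorrelation(img):
    """Circular autocorrelation of an image via FFT (Wiener-Khinchin).
    Within the memory effect range, the speckle autocorrelation
    approximates the hidden object's autocorrelation, which a
    phase-retrieval algorithm then inverts to recover the object."""
    x = img - img.mean()                     # remove the DC pedestal
    F = np.fft.fft2(x)
    ac = np.fft.ifft2(np.abs(F) ** 2).real   # power spectrum -> autocorr
    return np.fft.fftshift(ac)               # put zero lag at the center
```

The sharp zero-lag peak of the speckle autocorrelation is what makes this single-shot recovery possible regardless of the aberrations introduced before the scattering medium.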
As a novel asynchronous imaging sensor, the event camera features low power consumption, low temporal latency, and high dynamic range, but also abundant noise. In real applications, it is essential to suppress the noise in the output event sequences before further analysis. However, the event camera uses an address-event representation (AER), which requires new denoising techniques rather than conventional frame-based image denoising methods. In this paper, we propose two learning-based methods for denoising event-based sensor measurements: a convolutional denoising auto-encoder (ConvDAE) and a sequence-fragment recurrent neural network (SeqRNN). The former converts the event sequence into 2D images before denoising, which makes it compatible with existing deep denoisers and high-level vision tasks. The latter exploits the recurrent neural network's strength in handling time series to realize online denoising while keeping the events' original AER representation. Experiments on real data demonstrate the effectiveness and flexibility of the proposed methods.
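The event-to-image conversion the ConvDAE branch relies on can be sketched as a simple accumulation of AER events into a frame. The polarity-signed count encoding below is an assumption; the paper's exact encoding is not specified here:

```python
import numpy as np

def events_to_frame(events, height, width, t0, t1):
    """Accumulate AER events (t, x, y, polarity in {-1, +1}) whose
    timestamps fall in [t0, t1) into a 2D frame, making the event
    stream consumable by conventional frame-based denoisers."""
    frame = np.zeros((height, width), dtype=np.int32)
    for t, x, y, p in events:
        if t0 <= t < t1:
            frame[y, x] += p                 # signed count per pixel
    return frame
```

Slicing the stream into consecutive [t0, t1) windows yields a sequence of such frames; SeqRNN instead consumes the raw (t, x, y, p) tuples directly.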
This paper presents an effective method, based on path tracing, for calculating the global illumination of scenes with a voxel representation of geometry. The method aims to obtain a fast and reliable estimate of the global illumination luminance in scenes produced by three-dimensional real-world scanning, and can be ported to the GPU to further increase efficiency. The article reviews the main works on the visualization of volumetric geometry. An octree was chosen as the primary in-memory representation of the geometry, which accelerates ray tracing through the scene and reduces the amount of memory required. Optimizations of the ray tracing algorithm for the octree and voxel matrix representations are described and applied. A method for estimating the calculation accuracy by evaluating the mean square error over the entire image is presented. The method was tested and compared on two in-memory representations of the geometry, an octree and a three-dimensional matrix, and against the original path tracing method, revealing a twofold acceleration of the calculation. Optimizations to further improve the efficiency of the method are proposed.
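The accuracy estimate mentioned above, the mean square error evaluated over the entire image, amounts to the following (a trivial sketch; the reference image in the paper is a converged render):

```python
import numpy as np

def mse_estimate(rendered, reference):
    """Mean square error over the whole image, used to judge how far
    the current path-traced estimate is from a converged reference."""
    diff = rendered.astype(float) - reference.astype(float)
    return float(np.mean(diff ** 2))
```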
This work evaluates deep learning-based myocardial infarction (MI) quantification using the Segment cardiovascular magnetic resonance (CMR) software. Segment CMR incorporates the expectation-maximization, weighted intensity, a priori information (EWA) algorithm to generate the infarct scar volume, infarct scar percentage, and microvascular obstruction percentage. Here, the segmentation algorithm of Segment CMR is updated with semantic segmentation using U-net to achieve and evaluate fully automated, deep learning-based MI quantification. Direct observation of graphs and the numbers of infarcted and contoured myocardium are the two options used to estimate the relationship between the deep learning-based MI quantification and medical expert-based results.
The extent to which an arbitrarily selected L2 regularization hyperparameter value affects the outcome of semantic segmentation with deep learning is demonstrated. The demonstrations rely on training U-net on small LGE-MRI datasets using arbitrarily selected L2 regularization values. The remaining hyperparameters are manually adjusted or tuned only when 10% of all epochs are reached before the training validation accuracy reaches 90%. The semantic segmentation outcomes are objectively and subjectively evaluated against the manual ground truth segmentation.
One of the most important problems in modern Augmented and Mixed Reality applications today is the analysis of surrounding objects in order to trace the variability of the observed environment, whose behavior is usually unpredictable. A solution to this problem could be widely applied in many mobile applications as a tool for interacting with surrounding objects in the environment. Augmented and Mixed Reality are both extremely promising directions for further research on interaction with the real environment. In this paper, we propose a method for creating geometric data from a series of points in order to reconstruct the surface of real objects, which allows us to ensure interaction with virtual objects. The proposed method comprises three steps: point detection, clustering of the points into multiple groups depending on their location, and the creation of geometric data such as lines and surfaces.
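The middle step, grouping detected points by location, can be sketched with a simple voxel-grid hash (a stand-in illustration; the paper does not name a specific clustering algorithm):

```python
import numpy as np

def cluster_points(points, cell=0.5):
    """Group 3D points into clusters by quantizing their coordinates
    to a voxel grid of the given cell size: points falling in the
    same cell are treated as one location-based group."""
    clusters = {}
    for p in points:
        key = tuple(np.floor(np.asarray(p, dtype=float) / cell).astype(int))
        clusters.setdefault(key, []).append(p)
    return list(clusters.values())
```

Each resulting group of points would then be fed to the third step, fitting lines and surfaces, to build the geometry that virtual objects interact with.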