Proceedings Volume 12138

Optics, Photonics and Digital Technologies for Imaging Applications VII



Volume Details

Date Published: 17 May 2022
Contents: 8 Sessions, 36 Papers, 24 Presentations
Conference: SPIE Photonics Europe 2022
Volume Number: 12138

Table of Contents


All links to SPIE Proceedings will open in the SPIE Digital Library.
  • Front Matter: Volume 12138
  • Learning-based Solutions
  • Image Analysis
  • Image Acquisition and Computational Imaging
  • Applications
  • Standardization of Plenoptic Coding and Media Security Frameworks
  • Displays and Projections
  • Poster Session
Front Matter: Volume 12138
Front Matter: Volume 12138
This PDF file contains the front matter associated with SPIE Proceedings Volume 12138 including the Title Page, Copyright information, Table of Contents, and Committee Page.
Learning-based Solutions
Noise robust focal distance detection in laser material processing using CNNs and Gaussian processes
Sepehr Elahi, Can Polat, Omid Safarzadeh, et al.
In this work, we investigate the effects of noise on real-time focal distance control for laser material processing. We generate images of a sample at different focal distances using Fourier optics, then design, train, and test a deep learning model to detect the focal distances from the simulated images under varying standard deviations of added noise. We simulate both input noise, such as noise due to surface roughness, and output noise, such as detection camera noise, by adding zero-mean Gaussian noise to the source wave and to the simulated image, respectively, for different focal distances. We then train a convolutional neural network combined with a Gaussian process classifier to predict the focus distances of noisy images together with confidence ratings for the predictions.
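The output-noise model described in the abstract (zero-mean Gaussian noise added to the simulated detector image) can be sketched in a few lines. This is a minimal illustration with assumed image size and noise level, not the authors' simulation code:

```python
import numpy as np

def add_detection_noise(image, sigma, seed=None):
    """Simulate camera (output) noise by adding zero-mean Gaussian noise.

    `sigma` is the noise standard deviation relative to the image's
    [0, 1] dynamic range; values are clipped back into that range.
    """
    rng = np.random.default_rng(seed)
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)

# Example: a clean mid-gray frame corrupted with sigma = 0.05
clean = np.full((64, 64), 0.5)
noisy = add_detection_noise(clean, 0.05, seed=0)
```

The input-noise case in the paper works analogously, with the perturbation applied to the complex source wave before propagation rather than to the detected image.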
Machine learning-based high-precision and real-time focus detection for laser material processing systems
Can Polat, Gizem Nuran Yapici, Sepehr Elahi, et al.
This work explores real-time, high-precision focus finding for ultrafast laser material processing on different types of materials. Focus detection is essential in laser machining because an unfocused beam cannot machine the material and, at worst, has a destructive effect. Here, we compare CNN and non-CNN-based approaches to focus detection, ultimately proposing a robust CNN model that achieves high performance even when trained on only a portion of the dataset. We use an ordinary lens (11 mm focal length, 0.25 NA) and a CMOS camera. Our robust CNN model achieved a focus prediction accuracy of 95% when identifying focus distances in {-150, -140, ..., 0, ..., 150} µm, where each step is about 7% of the Rayleigh length, and a high processing speed of over 1000 Hz on a CPU.
Sargassum detection and path estimation using neural networks
Sargassum has affected the Mexican Caribbean coasts in atypical amounts since 2015, causing economic and ecological problems. Removal once it reaches the coast is complex: it is not easily separated from the sand, its extraction damages dune vegetation, and the heavy transport involved compacts the sand and further deteriorates the coastline. It is therefore important to detect sargassum mats and estimate their paths in order to optimize collection efforts while the mats are still in the water. Systems that rely on satellite images to determine areas and possible paths of sargassum have improved, but these methods do not solve the problems near the coastline, where the big mats observed in the deep sea break up into small mats that often do not show up in satellite images. Moreover, nearshore sargassum dynamics are characterized by finer temporal scales. This paper focuses on cameras located near the coast of the Puerto Morelos reef lagoon that record images of both the beach and the near-coastal sea. First, we apply time-based preprocessing techniques that allow us to discriminate the moving sargassum mats from the static sea bottom; then, using classic image processing techniques and neural networks, we detect, trace, and estimate the path of each mat towards its place of arrival on the beach. We compared classic algorithms with neural networks; among the algorithms tested are k-means and random forest for segmentation and dense optical flow to follow and estimate the path. This new methodology allows real-time monitoring of the behavior of sargassum close to shore without complex technical support.
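The segmentation step can be illustrated with a minimal intensity-based k-means on a toy frame. The frame, the two-cluster setup, and the quantile initialization are assumptions for this sketch, not the authors' exact pipeline:

```python
import numpy as np

def kmeans_1d(values, k=2, iters=10):
    """Minimal k-means on pixel intensities (Lloyd's algorithm)."""
    # Spread initial centers over the intensity range via quantiles
    centers = np.quantile(values, np.linspace(0, 1, k))
    for _ in range(iters):
        # Assign each pixel to its nearest center
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Recompute each non-empty cluster's center
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    return labels, centers

# Toy frame: dark "sea" background with one brighter mat-like patch
frame = np.full((32, 32), 0.1)
frame[10:20, 10:20] = 0.8
labels, centers = kmeans_1d(frame.ravel())
mat_mask = (labels == np.argmax(centers)).reshape(frame.shape)
```

On real footage the cluster features would be richer (color, the temporal statistics mentioned above) rather than a single intensity channel.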
Neuron segmentation in epifluorescence microscopy imaging with deep learning
Epifluorescence microscopy imaging is a technique used by neuroscientists to observe hundreds of neurons at the same time in living tissue, with single-cell resolution and at low cost. Recording, identifying, and tracking neurons and their activity in those observations is a crucial step for research. However, manual identification of neurons is a laborious task that is also prone to errors. For this reason, automated applications that process the recordings to identify functional neurons are required. Several proposals have emerged; they can be classified into four kinds of approaches: 1) matrix factorization, 2) clustering, 3) dictionary learning, and 4) deep learning. Unfortunately, they have proved inadequate to solve this problem. In fact, it remains an open problem for two major reasons: 1) a lack of duly labeled datasets, and 2) existing approaches do not consider the temporal dimension, or consider only a tiny fraction of it; integrating all the frames into a single image is very common but inefficient because temporal dynamics are disregarded. We propose an application for automatic segmentation of neurons with a deep learning approach, considering the temporal dimension through recurrent neural networks and using a dataset labeled by neuroscientists. Additional aspects considered in our proposal include motion correction and validation to ensure that segmentations correspond to truly functional neurons. Furthermore, we compare this application with a previous proposal that uses sophisticated digital image processing techniques on the same dataset.
Multimodal super-resolution reconstruction based on encoder-decoder network
Traditional fusion algorithms suffer from poor fusion quality: infrared images lack texture information, while visible-light images cannot be captured with sufficient brightness in bad weather conditions, yielding images with poor signal-to-noise ratios and significant superimposed read noise. To address this, a deep learning method for infrared-visible image fusion based on an encoder-decoder architecture is proposed. The image fusion problem is transformed into the issue of maintaining the structure and intensity ratio of the infrared-visible image pair, and a corresponding loss function is designed to expand the weight difference between thermal targets and background. In addition, single-image super-resolution reconstruction based on a regression network is introduced to tackle the fact that the traditional network mapping function is not suitable for natural scenes. Forward generation and reverse regression models are considered to reduce the irrelevant function mapping space and approach the authentic scene data through double mapping constraints. Compared with other state-of-the-art approaches, our experimental results achieve appealing performance in visual effects and objective assessments. In addition, we can stably provide high-resolution reconstruction results consistent with human visual observation in different scenes, while avoiding the trade-off between spatial resolution and thermal radiation information typical of conventional fusion imaging.
Synthetic apertures for array ptychography imaging via deep learning
Fourier ptychography is a phase recovery technique that uses the synthetic aperture concept to recover high-resolution sample images. It has made great breakthroughs in microscopic fields such as the imaging of biological cells. However, Fourier ptychography is still restricted in many macroscopic fields of remote detection, such as sea, land, and air, due to its non-active imaging. In this paper, a fast Fourier ptychography technique based on deep learning is proposed. Firstly, instead of the macro-scanning used previously, a 3 × 3 array camera is used to quickly acquire part of the spectrum of the object to be measured. Secondly, the network is trained using large-aperture imaging results under non-laser irradiation as the ground truth. Finally, nine low-resolution images are used to obtain high-resolution results. Compared with other advanced methods, the results obtained in this paper have satisfactory resolution and eliminate most of the speckle influence caused by laser irradiation.
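The synthetic-aperture idea behind Fourier ptychography can be shown with a toy numpy experiment: nine pupil-plane tiles (a stand-in for the 3 × 3 camera array) jointly cover more of the object's spectrum than any single tile, so the stitched reconstruction is closer to the object. Grid geometry and tile sizes here are arbitrary assumptions; real systems also recover phase, which this intensity-only sketch skips:

```python
import numpy as np

rng = np.random.default_rng(0)
obj = rng.random((96, 96))                 # toy object
F = np.fft.fftshift(np.fft.fft2(obj))      # its spectrum, DC centered

def tile_mask(shape, cy, cx, half):
    """Boolean mask selecting one square pupil tile in the spectrum."""
    m = np.zeros(shape, dtype=bool)
    m[cy - half:cy + half, cx - half:cx + half] = True
    return m

c, step, half = 48, 16, 8                  # 3 x 3 grid of contiguous tiles
synth = np.zeros(F.shape, dtype=bool)
for dy in (-step, 0, step):
    for dx in (-step, 0, step):
        synth |= tile_mask(F.shape, c + dy, c + dx, half)

center_only = tile_mask(F.shape, c, c, half)

def reconstruct(mask):
    """Inverse-transform the spectrum restricted to the given aperture."""
    return np.fft.ifft2(np.fft.ifftshift(F * mask)).real

err_1 = np.linalg.norm(obj - reconstruct(center_only))  # one camera
err_9 = np.linalg.norm(obj - reconstruct(synth))        # synthetic aperture
```

Keeping nine tiles of spectrum instead of one strictly reduces the reconstruction error, which is the resolution gain the synthetic aperture provides.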
Infrared image super-resolution pseudo-color reconstruction based on dual-path propagation
Imaging systems with different imaging sensors are widely used in the surveillance, military, and medical fields. Infrared imaging sensors are widely used because they are less affected by the environment and can fully capture the radiation information of objects, but they are insensitive to brightness changes in the field of view and lose color information. Visible-light imaging sensors can capture rich texture and color information but lose scene information under bad weather conditions. Pseudo-coloring of an infrared image with a visible image can synthesize a new image carrying the complementary information of both sources. This paper proposes a pseudo-color deep learning method for infrared and visible images based on a dual-path propagation codec structure. Firstly, a residual channel attention module is introduced to extract features at different scales, which retains more meaningful information and enhances important information. Secondly, an improved fusion strategy based on visual saliency is used to pseudo-color the feature maps. Finally, the pseudo-color results are recovered by a reconstruction network. Compared with other advanced methods, our experimental results achieve satisfactory visual effects and objective evaluation performance.
Image Analysis
Effective laser pest control with modulated UV-A light trapping for mushroom fungus gnats
Sumesh Nair, Chia-Wei Hsu, Yvonne Y. Hu, et al.
Fungus gnats (Sciaridae) are notorious pests in mushroom plantations. They are generally dealt with using relatively inefficient physical and chemical means, such as sticky traps and pesticides, respectively. Here, we propose an integrated pest control system composed of a UV-A LED source at 365 nm, a galvo-mirror pair, and a 445 nm high-power laser diode. We used the 365 nm UV-A LED as an innovative light trap, since fungus gnats show maximum attraction to 365 nm according to previous studies. We also modulated the UV-A LED at various frequencies, and the response of the gnats to each frequency was observed. The entire module was controlled using an NVIDIA Nano microcontroller and a DAC card. Our experiments show that the laser module, with the UV-A LED modulated at 40 Hz, could achieve a 50% elimination rate in about two minutes in a test container of 10 × 10 × 10 cm³. Without the light trap, random laser scanning achieved a 69% elimination rate in a single scan taking approximately 4 minutes, indicating the highly improved performance of the UV-A modulated light trap. In conclusion, we have proposed an affordable and easily replicable laser-based pest control system with a UV-A light trap modulated at 40 Hz for rapidly and precisely eliminating fungus gnats in mushroom plantations.
Optical coherence tomography versus microscopy for the study of Aloe Vera leaves
The aim of this study is to compare the advantages and limitations of two optical methods, namely optical coherence tomography (OCT) and microscopy, for minute investigation of the structure of Aloe Vera leaves. Microscopy has the advantage of higher resolution but the disadvantage that the object under investigation is completely damaged (as the leaf must be peeled). By contrast, an advantage of OCT is that it is non-invasive, with the potential added benefit of on-site measurements (if the system is portable). Depending on the OCT method used, different resolution values are achievable. In principle, time domain (TD) OCT can achieve lateral resolutions similar to microscopy, but the method is slow for depth investigations. Spectrometer-based and swept source (SS) OCT trade lateral resolution for acquisition speed: to acquire A-scans with a sufficient axial range, low numerical aperture interface optics is used, which exhibits lower transversal resolution. The main limitation of spectrometer-based and swept source OCT is therefore the achievable lateral resolution, which might not be good enough to reveal the detailed structure of noteworthy parts of leaves, for example their stomata. The present study experimentally compares Aloe Vera data obtained using an optical microscope at different magnifications and an in-house SS-OCT system with a 1310 nm center wavelength. To gather additional information, an analysis of the normalized A-scan OCT images was also performed. This reveals additional parts of the leaf structure, while still falling short of what can be obtained with conventional microscopy.
Integration of augmented reality and image processing in plasma dynamic analysis: digital concepts and structural system design
Haider Al-Juboori, Tom McCormack
Augmented reality (AR) belongs to the family of environment-enhancing technologies, vastly utilized in modern computer vision, that has only lately started to be applied in high-resolution and ultrafast imaging. With current technical advances, AR is well positioned to provide computer-guided assistance for a wide variety of emission imaging applications, such as the plasma dynamics detection in visible wavelengths presented in this paper. The main investigations in this work are based on ultrafast time-resolved visible measurements from colliding laser-produced plasma (CLPP) experiments, supported by techniques of digital image processing, image analysis, and augmented reality to characterize and track the response and unique behaviour of the colliding plasmas, as well as stagnation layer features that give a good indication of the degree of plume interpenetration in all presented experiments. Concretely, the performance of CLPP studies is strongly dependent on the choice of experimental conditions. The work describes the core design and demonstrates the imaging performance and analysis for colliding plasmas and the stagnation layer created from a homogeneous target material, i.e., aluminum (Al, Z = 13) or silicon (Si, Z = 14). The outcomes and design concepts of AR presented in this paper can serve as a milestone toward remarkable improvements in the understanding of plasma dynamics, especially with images captured on the nanosecond time scale. Additionally, the study provides a considerable amount of detailed data related to the geometrical analysis of the interaction zone, which extends the understanding of the behavior of particular species within colliding laser-produced plasmas.
COVID-19 detection from lung ultrasound images
Early-stage detection of Coronavirus Disease 2019 (COVID-19) is crucial for patient medical attention. Since the lungs are the most affected organs, monitoring them constantly is an effective way to observe the evolution of the disease. The most common technique for lung imaging and evaluation is computed tomography (CT). However, its cost and effects on human health have made lung ultrasound (LUS) a good alternative: LUS does not expose the patient to radiation and minimizes the risk of contamination. Also, there is evidence of a relation between different artifacts in LUS and lung diseases originating from the pleura, whose abnormalities are related to most acute respiratory disorders. However, LUS often requires expert clinical interpretation, which may increase diagnosis time or decrease diagnosis performance. This paper describes and compares machine learning classification methods, namely Naive Bayes (NB), Support Vector Machine (SVM), K-Nearest Neighbor (K-NN), and Random Forest (RF), over several LUS images. They classify lung images from COVID-19, pneumonia, and healthy patients, using image features previously extracted from the Gray Level Co-occurrence Matrix (GLCM) and histogram statistics. Furthermore, this paper compares the above classic methods with different convolutional neural networks (CNNs) that classify the images to identify these lung diseases.
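The GLCM texture features feeding the classic classifiers can be sketched in a few lines of numpy. This toy version handles only a horizontal (0, 1) pixel offset and one Haralick feature (contrast); the paper's feature set, gray-level quantization, and offsets are not specified here and would differ:

```python
import numpy as np

def glcm_horizontal(img, levels):
    """Normalized gray-level co-occurrence matrix for a (dy, dx) = (0, 1)
    offset; `img` must hold integer gray levels in [0, levels)."""
    g = np.zeros((levels, levels))
    a, b = img[:, :-1].ravel(), img[:, 1:].ravel()
    np.add.at(g, (a, b), 1)          # count co-occurring level pairs
    return g / g.sum()

def glcm_contrast(p):
    """Haralick contrast: sum of p(i, j) * (i - j)^2."""
    i, j = np.indices(p.shape)
    return np.sum(p * (i - j) ** 2)

# A flat patch has zero contrast; a checkerboard maximizes it
flat = np.zeros((8, 8), dtype=int)
checker = np.indices((8, 8)).sum(axis=0) % 2
```

Vectors of such features (contrast, homogeneity, energy, plus histogram statistics) are then passed to NB, SVM, K-NN, or RF classifiers.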
Image Acquisition and Computational Imaging
Optical spatial differentiation with ultrathin freestanding subwavelength gratings
We investigate spatial differentiation of optical beams using guided mode resonances in suspended dielectric one-dimensional photonic crystals. Various SiN grating structures are characterized under various incidence, polarization, and beam size illuminations. We first observe first- and second-order spatial differentiation in transmission of Gaussian beams impinging at oblique and normal incidence, respectively, on gratings designed to be resonant for either TE- or TM-polarized incident light. Polarization-independent first-order spatial differentiation is then demonstrated with a specifically designed, doubly resonant, one-dimensional, symmetric grating structure. Such ultrathin and essentially loss-free nanostructured dielectric films are promising for various optical processing, optomechanics, and sensing applications.
BDIC: boosting the performance of optical microscopy using blind deconvolution and illumination correction
Shuhe Zhang, Tos T. J. M. Berendschot, Jinhua Zhou, et al.
We propose the combination of single-image blind deconvolution and illumination correction (BDIC) to enhance the image quality of a microscopy system. We evaluated the performance of this method by calculating the peak signal-to-noise ratio and structural similarity of both raw and enhanced images with respect to reference images. Both subjective and objective assessments show that BDIC increases image quality, including contrast and signal-to-noise ratio, without losing image resolution or structural information. To demonstrate its applicability, we also applied BDIC to different samples, including plant root tissue and human blood smears.
Mid-infrared speckle reduction technique for hyperspectral imaging
Maroun Hjeij, Luiz Poffo, Bastien Billiot, et al.
Recently, we proposed an active mid-infrared (MIR) hyperspectral imaging system for early detection of plant water stress. A quantum cascade tunable laser powered this stand-off detection system, in which speckle appears in the images due to the coherent nature of the laser radiation. The speckle degrades the spatial resolution of the images. In this article, we evaluate several speckle suppression methods suitable for the mid-infrared region, quantifying the reduction of speckle by comparing spatial contrast. We combined optical techniques and showed their ability to reduce the speckle contrast from its initial value of 0.413 down to 0.11, representing a 73% reduction.
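The metric used here, spatial speckle contrast C = σ/μ, is easy to compute, and averaging N independent speckle realizations lowers it by roughly 1/√N, which is the principle behind many of the optical suppression techniques. The exponential-intensity model below is a standard fully-developed-speckle assumption, not the paper's data:

```python
import numpy as np

def speckle_contrast(img):
    """Spatial speckle contrast C = sigma / mean over a uniform region."""
    img = np.asarray(img, dtype=float)
    return img.std() / img.mean()

rng = np.random.default_rng(0)
# Fully developed speckle intensity is exponentially distributed (C ~ 1)
one = rng.exponential(1.0, size=(256, 256))
# Averaging 16 independent patterns should give C ~ 1/4
avg = rng.exponential(1.0, size=(16, 256, 256)).mean(axis=0)

# The paper's figures: 0.413 -> 0.11 is about a 73% reduction
reduction = 1.0 - 0.11 / 0.413
```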
Processing of the spectral and spatial information in the devices performing image multispectral analysis
Boris S. Gurevich, Kirill V. Zaitchenko
Both spatial and spectral information are important for image analysis. However, their simultaneous processing is hampered by deficiencies of the devices used for spectrum analysis of high-definition images: these devices provide a limited number of selected spectral components as well as low processing rate and reliability due to the presence of mechanically moving parts. These deficiencies can be eliminated by using acousto-optic tunable filters (AOTFs) as selective elements. The functional circuit of devices that process both kinds of information via AOTF-based multispectral image processing is shown and described. The decrease of AOTF diffraction efficiency caused by the deviation of the selected sub-image wavelength from the value defined by the Bragg condition is also described; the corresponding expression and curve have been obtained from Kogelnik's coupled-wave theory. Also, the interconnection between the amounts of spatial and spectral information that can be provided by an AOTF-based multispectral device has been established and analyzed.
Applications
Towards a demonstrator setup for a wide-field-of-view visible to near-infrared camera aiming to characterize the solar radiation reflected by the Earth
Climate change monitoring is still a major challenge, currently addressed primarily with radiometers monitoring the radiative fluxes at the top of the atmosphere. To improve on the current state-of-the-art monitoring instruments, we pursue the development of novel space instrumentation combining a radiometer with two additional imagers. This improves the spatial resolution to a few kilometers, allowing scene identification, while enabling a spectral distinction between the reflected solar radiation (RSR), using a visible to near-infrared (400–1100 nm) camera, and the Earth's emitted thermal radiation, using a thermal infrared (8–14 μm) camera. In this paper, we present a novel camera design optimized for RSR monitoring that targets a compact layout and minimizes the number of aspheric components. More specifically, our optimized imaging design offers a wide field of view (138°) enabling observation of the Earth from limb to limb, a compact volume fitting within 1 CubeSat Unit (1U), a wide spectral range (400–900 nm) to retrieve the RSR with a certainty of more than 95%, a spatial resolution better than 5 km at nadir, and close to diffraction-limited performance. After optimization of the nominal design, possible design alternatives are considered and discussed, enabling a cost-efficient design choice. Subsequently, the mechanical and optical design tolerances are evaluated using a statistical Monte Carlo analysis, indicating a robust and tolerant design that can be manufactured using ultra-precision diamond tooling. Finally, a stray-light analysis was performed to evaluate ghost reflections and the necessity of an anti-reflection coating. Consequently, we conclude that our proposed imaging designs show promising performance optimized for Earth observation, paving the way to improved climate change monitoring.
On-board satellite data processing to achieve smart information collection
Nowadays, it is a reality to launch, operate, and utilize small satellites at an affordable cost. However, the bandwidth constraint is still an important challenge. For instance, multispectral and hyperspectral sensors generate a significant amount of data subject to communication channel impairments, which is addressed mainly by source and channel coding aiming at effective transmission. This paper targets a significant further bandwidth reduction by proposing an on-the-fly analysis technique on the satellite that decides which information is effectively useful for specific target applications before coding and transmission. The challenge is to detect clouds and vessels from red-band, green-band, blue-band, and near-infrared measurements with a sufficient probability of detection while avoiding false alarms; furthermore, the embedded platform constraints must be satisfied. Experiments for typical scenarios of summer and winter days in Stockholm, Sweden, are conducted using data from Mimir's Well, the Saab AI-based data fusion system. Results show that non-relevant content can be identified and discarded: for the cloudy scenarios evaluated, up to 73.1% of image content can be suppressed without compromising the useful information in the image. For the water regions in the scenarios containing vessels, results indicate that a substantial amount of data can be discarded (up to 98.5%) when transmitting only the regions of interest (ROI).
Compact angle diversity receiver concept for visible light positioning
Felix Lichtenegger, Claude Leiner, Christian Sommer, et al.
In this work, we investigate a novel angle diversity receiver concept for visible light positioning. The receiver, consisting of an ultrathin Fresnel lens embedded in an aperture and mounted on top of a CMOS sensor, has been tested and optimized by ray-tracing simulations. This angle-dependent receiver system has the advantages of compact dimensions, a wide field of view, an off-the-shelf sensor, and a relatively high amount of collected light. The origination of the previously calculated Fresnel lens structure is performed by means of grayscale laser lithography. In the presented receiver system, the incoming radiant intensity distribution is converted into an irradiance distribution on the CMOS sensor, where different angles of incidence of incoming light are refracted towards different areas of the sensor. To verify the optical system experimentally, a prototype of the receiver is placed in a goniometer setup to record images under controlled angles of incidence. Irradiance distributions recorded in the experiment are compared to irradiance distributions obtained with a realistic ray-tracing model. By direct comparison between experiment and simulation, we verify the optical functionality of the receiver's optical system and investigate the effect of manufacturing imperfections.
Path following of field-tracked robots based on model predictive control with visual-inertial odometry and identified state-space dynamic model
Model predictive control (MPC) with prediction and control horizons under multivariable constraints can prompt field tracked vehicles to follow a reference path accurately. However, MPC needs either a kinematic model or a classic dynamic model of the vehicle, both of which must be linearized, and hence the computation cost is large; moreover, the parameters of a classic dynamic model are difficult to measure. In this paper, a system identification approach is utilized to estimate a linear state-space dynamic model of a field tracked vehicle on a farm. The dynamic model has been identified with an estimated fit of more than 50%. Using this model, a linear MPC can be adopted, and hence more than two-thirds of the computation can be saved compared with a conventional nonlinear MPC based on a kinematic model. Furthermore, the tracked vehicle adopting the linear MPC with the identified dynamic model achieves superior S-curve and L-shaped path following.
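The identification step can be illustrated with a toy least-squares fit of a discrete-time linear state-space model x[k+1] = A x[k] + B u[k] from logged input/state data. The matrices, dimensions, and noise-free data here are invented for illustration; the paper's identification procedure and fit metric may differ:

```python
import numpy as np

rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1],
                   [0.0, 0.8]])      # assumed "plant" dynamics
B_true = np.array([[0.0],
                   [0.5]])

# Simulate N steps of the system under a persistently exciting input
N = 200
X = np.zeros((N + 1, 2))
U = rng.uniform(-1, 1, size=(N, 1))
for k in range(N):
    X[k + 1] = A_true @ X[k] + B_true @ U[k]

# Stack regressors [x_k, u_k] and solve for [A B] in one least squares
Z = np.hstack([X[:-1], U])
theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)
A_est, B_est = theta[:2].T, theta[2:].T
```

With such a linear model, the MPC optimization becomes a quadratic program, which is the source of the computational saving over nonlinear MPC.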
Multi-incident holography profilometry for low- and high-gradient objects
Moncy S. Idicula, Patryk Mitura, Michal Józwik, et al.
Digital holographic microscopy (DHM) is a non-contact profilometric tool that allows obtaining microscopic object topography from captured holograms. However, the use of DHM is limited when the object under observation has a high gradient or is discontinuous. Multi-angle digital holographic profilometry (MIDHP) is an alternative solution for overcoming this limitation and measuring topographies with discontinuities. The method combines digital holography and multi-angle interferometry: it requires a certain number of holograms that are processed into a longitudinal scanning function (LSF), and the topography of the object is recovered by finding the maxima of the LSF. MIDHP enlarges the measurement range and provides a high axial resolution. This paper investigates MIDHP for measuring surfaces with various (low and high) surface gradients. The calculation of the LSF requires many Fourier transforms (FTs) and is therefore slow. In this paper, we improve the LSF calculation by introducing two algorithms. The first reduces the number of FTs needed by applying the summation in the frequency domain. The second applies a 3D filtering method that improves the quality of the reconstructed shape. The introduced approaches are verified both numerically and experimentally.
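The first speed-up rests on the linearity of the Fourier transform: summing the contributions in the frequency domain and applying a single inverse FFT gives the same result as inverse-transforming each term separately. A minimal sketch with invented spectra (not the paper's LSF data):

```python
import numpy as np

rng = np.random.default_rng(0)
# Twelve stand-in complex spectra, e.g. per-hologram frequency-domain terms
spectra = rng.random((12, 64, 64)) + 1j * rng.random((12, 64, 64))

# Naive route: one inverse FFT per term, then sum (12 transforms)
many_ffts = sum(np.fft.ifft2(s) for s in spectra)

# Frequency-domain summation: sum first, one inverse FFT total
one_fft = np.fft.ifft2(spectra.sum(axis=0))
```

Replacing M inverse transforms with one cuts the dominant O(M·N log N) cost to O(N log N) for that step.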
Standardization of Plenoptic Coding and Media Security Frameworks
JPEG pleno light field: current standard and future directions
Cristian Perra, Saeed Mahmoudpour, Carla Pagliari
This paper reports the status of the JPEG Pleno standard for the light field imaging modality and discusses current activities towards future developments of the standard. Currently, three parts (Part 1: Framework; Part 2: Light field coding, with Amendment 1: Profiles and levels for the JPEG Pleno light field coding system; and Part 3: Conformance testing) have been standardized. Part 4: Reference software is in the final approval stage before standardization. The four-dimensional nature of plenoptic data poses many challenges. The literature presents several subjective and objective assessment models for the evaluation of two-dimensional (image/video) data; as this topic is still under investigation for plenoptic data, one of the JPEG Pleno standardization activities targets the development of a plenoptic image quality assessment standard. Finally, as one of the possible directions within future JPEG Pleno standardization activities, JPEG Pleno is currently exploring state-of-the-art light field coding architectures exploiting learning-based approaches, to assess the potential of these coding methods in terms of compression efficiency.
Definition of common test conditions for the new JPEG pleno holography standard
The JPEG committee started the definition of a new standard for holographic content coding under the JPEG Pleno project (ISO/IEC 21794). As a first standardization effort targeting holographic content, it faced multiple challenges, notably selecting appropriate test content, choosing the type of anchors to be used for comparison with proponents' proposals, and deciding how to evaluate quality for this type of content. This paper describes the development and options of the Common Test Conditions (CTC) defined to evaluate the responses to the call for proposals. Applying standard coding technology developed for image/video compression to holographic content raised several complicated issues related to appropriate anchor selection and the generation of test content. Furthermore, knowledge on how to evaluate compression methodologies for holographic data was very limited until recently. Relevant studies typically use signal fidelity for quality evaluation, computing the SNR or the PSNR between the signal and the corresponding decoded version, in the hologram plane or the object plane, respectively. Although signal fidelity always measures the ability of the compression technology to recreate a similar signal, it is usually not the best metric for perceptual evaluation. This paper describes the methodologies defined in the CTC for the perceptual and objective quality evaluation of holographic data.
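The fidelity metric mentioned above, PSNR between a reference signal and its decoded version, is straightforward to compute. A minimal sketch assuming 8-bit data (the CTC's actual bit depths and planes may differ):

```python
import numpy as np

def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference signal and
    its decoded version; `peak` is the maximum possible signal value."""
    ref = np.asarray(reference, dtype=float)
    dec = np.asarray(decoded, dtype=float)
    mse = np.mean((ref - dec) ** 2)
    if mse == 0:
        return float("inf")          # identical signals
    return 10.0 * np.log10(peak ** 2 / mse)

# A constant error of 16 gray levels gives MSE = 256
ref = np.zeros((4, 4))
dec = ref + 16.0
value = psnr(ref, dec)
```

For holograms the same formula is applied either in the hologram plane or, after numerical reconstruction, in the object plane.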
A standard way for computing numerical reconstructions of digital holograms
Tobias Birnbaum, David Blinder, Raees K. Kizhakkumkara Muhamad, et al.
A multitude of methods and processing steps exist for the numerical reconstruction of digital holograms. Because these are not standardized, most research groups follow their own best practices, making it challenging to compare numerical results across groups. Meanwhile, JPEG Pleno holography seeks to define a new standard for the compression of digital holograms. Numerical reconstructions are an essential tool in this research because they provide access to the holographically encoded 3D scenes. In this paper, we outline the available modules of the numerical reconstruction software developed for the purpose of this standardization. A software package was defined that is able to reconstruct all holograms of the JPEG Pleno digital hologram database and to evaluate several core experiments. This includes Fresnel holograms recorded or generated in Fresnel or (lensless) Fourier geometries, as well as near-field holograms, which require rigorous propagation through the angular spectrum method. Specific design choices are explained and highlighted. We believe that providing information on the current consensus on the numerical reconstruction software package will allow other research groups to replicate the results of JPEG Pleno and improve the comparability of results.
Media security framework inspired by emerging challenges in fake media and NFT
Frederik Temmermans, Deepayan Bhowmik, Fernando Pereira, et al.
Advances in deep neural networks (DNN) and distributed ledger technology (DLT) have had a major influence on media security, authenticity, and privacy. Current deepfake techniques can produce near-realistic media content, which can be used in both well- and ill-intended use cases. At the same time, DLTs are finding their way into industry as fair, transparent, and reliable means of content distribution; in particular, non-fungible tokens (NFTs) are emerging in the digital art market. However, such new developments also introduce new challenges, including the need for robust and reliable metadata, a mechanism to secure the media and associated metadata, and means to verify authenticity and interoperability between various stakeholders. This paper identifies emerging challenges in fake media and NFTs and proposes a novel framework for secure media applications that allows for a structured, systematic, and interoperable solution. The framework relies on an architecture that is modular, flexible, extensible, and scalable: it can be implemented in both lighter and more feature-rich, more complex configurations depending on the underlying application, needed features, and available resources, while enabling products and services in various ecosystems with the desired trust and security capabilities. The framework is inspired by activities and developments within JPEG standardisation related to security, authenticity, and privacy.
Displays and Projections
Accuracy of 3D image manipulation through linear transformation of wide-angle hologram
Tomasz Kozacki, Moncy S. Idicula, Maksymilian Chlipala, et al.
Methods that manipulate the geometry of a 3D image by linear transformations of hologram data are very efficient. Such techniques are also desirable for wide-angle viewing holographic displays, especially since the lack of numerical reconstruction techniques for wide-angle holograms means there is no direct access to the reconstructed image, so methods based on directly manipulating the holographic image cannot be applied. In this paper, we investigate theoretically, numerically, and experimentally image manipulations based on hologram stretching for a wide-angle near-eye holographic display. We show that for a wide-angle display, significant Petzval curvature and astigmatism errors appear in the image reconstruction. Finally, the applicability of the investigated image manipulation methods is demonstrated with hologram reconstruction experiments in a wide-angle holographic display.
Optimal dense and random addressing design of emissive points in a retinal projection device
One of the big challenges of Augmented Reality (AR) is to create ergonomic smart glasses. Our laboratory proposes an unconventional concept of AR glasses based on the self-focusing of multiple beams into the eye. The device comprises a dense electrode network that allows extracting light from a dense waveguide network at the intersection points. Each beam emitted at these points is reflected by a holographic element that directs light towards the eye with a proper angular direction. The image pixels are created on the retina by the interference of various beam distributions. This paper presents a method to optimize the design of waveguides and electrodes to increase the number of pixels in the final image. Our method considers a first waveguide (resp. electrode) as a curve described by a succession of segments with a unique absolute angle crossing a horizontal (resp. vertical) axis. The other waveguides (resp. electrodes) are created by translating this curve so that the minimal distance between two curves equals a fixed value. We use the B-spline mathematical model to approximate the succession of segments, and an iterative method adapted to B-splines calculates the intersections between waveguides and electrodes. An Emissive Point Distribution (EPD) is obtained by selecting random groups of waveguides and electrodes; each EPD forms one pixel on the retina. The Point Spread Function (PSF) of this EPD characterizes the self-focusing efficiency. We calculate the signal-to-noise ratio (SNR) of each EPD to evaluate the quality of the whole self-focusing process, and we compare it to the Airy function. Our new mathematical model improves the number of pixels by a factor of 3.5 for an equivalent SNR in comparison to the model we previously used.
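The self-focusing principle behind the EPD can be illustrated with a toy simulation: point emitters whose phases are conjugated for an on-axis focus are summed coherently on a small retina patch, and a simple peak-over-background ratio characterizes the PSF. All parameters (wavelength, pupil size, emitter count, focal distance) and the SNR definition are illustrative assumptions, not the paper's values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters: 200 emissive points in a 10 mm square pupil,
# focusing at f = 20 mm, wavelength 532 nm.
wavelength = 532e-9
k = 2.0 * np.pi / wavelength
f = 20e-3
n_emitters = 200
pts = rng.uniform(-5e-3, 5e-3, size=(n_emitters, 2))  # emissive points [m]

# Each emitter carries the phase that makes all beams arrive in phase at
# the on-axis focus (the self-focusing condition).
r0 = np.sqrt(f**2 + np.sum(pts**2, axis=1))
phases = -k * r0

# Coherent sum of spherical wavelets on a 100 um retina patch.
x = np.linspace(-50e-6, 50e-6, 201)
X, Y = np.meshgrid(x, x)
field = np.zeros_like(X, dtype=complex)
for (px, py), ph in zip(pts, phases):
    r = np.sqrt(f**2 + (X - px)**2 + (Y - py)**2)
    field += np.exp(1j * (k * r + ph)) / r
psf = np.abs(field) ** 2

# Simple focusing figure of merit: peak over mean background intensity.
snr = psf.max() / np.mean(psf[psf < 0.5 * psf.max()])
```

With random emitter positions the background is speckle-like, so this ratio grows with the number of contributing emissive points, which is the intuition behind maximizing the number of intersection points in the design.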
Composite waveguide holographic display
Waveguide-type displays with volume phase holograms are notable for their small size, large eyebox, and high transmission in both the projected-image and see-through channels. However, as the aperture, field of view, and working spectral range grow, the variation of the hologram replay conditions across its surface increases and limits performance in terms of resolution and diffraction efficiency. To overcome this, we propose to use a composite hologram, i.e. a volume phase grating split into sub-apertures with independently varying parameters such as the fringe tilt, the fringe pattern, the holographic layer thickness, and the modulation depth. This approach drastically increases the number of free variables in the design without introducing any additional optical components. We present the optical design and modelling algorithm for a display with a composite outcoupling hologram. The algorithm relies on simultaneous raytracing through auxiliary optical systems and diffraction efficiency computations with the coupled-wave theory equations. We demonstrate its application on an example of a polychromatic display with an extended field of view. It operates in the spectral range of 480-620 nm and covers a field of 6° × 8° with an 8 mm exit pupil diameter. The output hologram has a diffraction efficiency uniformized over the field and the spectral range, varying from 26.0% to 65.3%, thus providing a throughput of around 50% for the projected image and its optimal overlapping with the see-through scene. At the same time, the image quality is high and uniform, with the PTV angular size of the projected spot better than 1.88′ over the entire image. We compare the results for the initial design using a single classical grating with those for a composite hologram comprising 4 sub-apertures and show the achieved gain in performance: the gain in diffraction efficiency is up to 13.8% and that in aberration is 0.4′. We also demonstrate that the computed parameters are feasible and provide a brief sensitivity analysis for them.
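The coupled-wave diffraction efficiency computations referred to above can be illustrated, for the Bragg-matched lossless transmission case, with Kogelnik's classical formula. This is a textbook special case, not the authors' full algorithm, which also handles detuning across the field and spectrum.

```python
import numpy as np

def kogelnik_transmission_de(delta_n, thickness, wavelength, theta):
    """Bragg-matched diffraction efficiency of a lossless volume phase
    transmission grating, after Kogelnik's coupled-wave theory:
        eta = sin^2(pi * delta_n * d / (lambda * cos(theta)))

    delta_n:    refractive-index modulation depth
    thickness:  holographic layer thickness [m]
    wavelength: replay wavelength in the medium [m]
    theta:      incidence angle inside the medium [rad]
    """
    nu = np.pi * delta_n * thickness / (wavelength * np.cos(theta))
    return np.sin(nu) ** 2
```

The formula shows why the layer thickness and modulation depth are useful per-sub-aperture variables: efficiency peaks (100%) whenever the argument reaches pi/2, so tuning delta_n and d locally lets the designer flatten the efficiency over field and wavelength.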
Poster Session
Gibbs ringing and the Fresnel transform
Gibbs ringing is an artefact that occurs when a discontinuous signal is truncated in the Fourier domain. The phenomenon occurs frequently in optics as apodization, the action of an aperture, which can be interpreted as an idealised low-pass filtering process. Diffraction can be approximately modelled using the Fresnel transform. The spectral method of calculating the Fresnel transform, a workhorse in digital holography and other fields, interprets the Fresnel transform as an all-pass filter. In this paper, we analyse the relationship between these phenomena and propose how to use this interpretation to improve image quality.
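The spectral (transfer-function) method of the Fresnel transform can be sketched as follows; the transfer function has unit modulus everywhere, which is the all-pass interpretation the abstract builds on. This is a generic textbook implementation, not the authors' code, and the sampling parameters in the test are illustrative.

```python
import numpy as np

def fresnel_spectral(field, wavelength, pitch, z):
    """Fresnel propagation via the spectral (transfer-function) method.

    The Fresnel transfer function
        H(fx, fy) = exp(-i * pi * lambda * z * (fx^2 + fy^2))
    has |H| = 1 on the whole sampled band, i.e. the method acts as an
    all-pass filter (the constant phase exp(ikz) is omitted here).
    """
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=pitch)
    fy = np.fft.fftfreq(ny, d=pitch)
    FX, FY = np.meshgrid(fx, fy)
    H = np.exp(-1j * np.pi * wavelength * z * (FX**2 + FY**2))
    return np.fft.ifft2(np.fft.fft2(field) * H)
```

Because |H| = 1, propagating a hard-edged aperture conserves energy exactly while the discontinuity, band-limited by the discrete sampling, produces the ringing discussed in the paper.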
Monocentric cameras design for 3D scenes capturing and projection
For historical reasons, image capturing and projection systems work in a "flat-to-flat" configuration: the image is detected in a camera's focal plane and then projected onto a flat display or a flat screen. Recently, we entered a new era with two major technical levers: curved sensors and 3D/immersive imaging. This combination allows us, on the one hand, to easily capture spherical images and, on the other hand, to view spherical images without any intermediate plane picture. Indeed, the image used in an immersive projection system can be assimilated to a sphere in which the user can move his head in different directions, while a camera based on a curved sensor is able to capture an almost perfect spherical scene. All the basic processes for editing and post-production can thus be done on a spherical data basis. In this work, we consider the design of lenses for capturing and projecting images on spherical surfaces. For reasons of spherical symmetry, it is natural to use monocentric lenses for these purposes. Such a design evolves from a simple ball lens, where the pupil center coincides with the center of symmetry, to a more realistic design with 4 components in 2 groups. We consider a lens with a 12 mm focal length and F/1.77 aperture, covering a field of view up to 90 degrees. It works with an object located 3 m away from the camera, and the spatial resolution reaches 57 lines/mm. The same design can be re-scaled and modified to serve as a projection system working with a curved screen. We consider a spherical screen with a 12 m radius, which corresponds to a planetarium cupola. We analyze the image quality of such a system and show that the image distortion should be re-defined; the corrected value is lower than the conventional one by a factor of 1.4. We also perform an end-to-end image simulation to demonstrate that the projected wide-angle scene is close enough to one observed directly by the human eye.
Global intelligent system for waste disposal objects monitoring using the discrete orthogonal transformations based on neural network remote sensing image processing
Remote sensing of the Earth makes it possible to receive medium- and high-spatial-resolution information from space vehicles and to conduct hyperspectral measurements. This study presents a remote sensing application using time-series Landsat satellite images to monitor solid waste disposal sites (WDS). The article proposes algorithms for working with spatial information, namely the transformation (convolution) of these manifolds into a one-dimensional sample. Recursive quasi-continuous sweeps are used, for which the following conditions are satisfied: 1) preservation of the topological proximity of the original and expanded spaces; 2) preservation of correlations between the elements of the original and transformed spaces. An automated system is proposed for detecting and investigating waste objects based on the concept of fractal sets and convolutional neural networks: the first neural network detects WDS, and the second localizes the waste objects. This technique can become the object of further research on developing a medical-prophylactic expert system at the territorial level to detect and neutralize unauthorized waste disposal sites based on medium- and high-resolution space images. As a result, the proposed method demonstrates good accuracy in detecting solid waste disposal sites on real satellite images.
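The recursive locality-preserving sweeps described above resemble space-filling curves. As a hedged illustration (the paper's exact sweep construction may differ), a Hilbert-curve index satisfies condition 1: flattening a 2-D grid to 1-D so that cells close along the sweep are also close in the plane.

```python
def xy2d(n, x, y):
    """Map grid cell (x, y) on an n x n grid (n a power of two) to its
    index along a Hilbert curve. Consecutive indices always correspond
    to neighbouring grid cells, preserving topological proximity.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the recursion sees a canonical
        # curve orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d
```

Flattening an image along such a sweep, instead of row by row, keeps neighbouring pixels adjacent in the 1-D sample, which also helps preserve the inter-element correlations of condition 2.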
Technique for analyzing the working table on a robotic complex based on the study of point data in a two-dimensional measurement space
Evgenii Semenishchev, Aleksandr Zelensky, Andrey Alepko, et al.
The article discusses a technique for analyzing the working table of a robotic complex based on the study of point data in a two-dimensional measurement space. The resulting data form a two-dimensional data set. Data processing is performed iteratively, using a smoothing and interpolation method based on a multicriteria objective function. The objective function is minimized simultaneously over the standard deviation of the input data from the processed values and over a criterion minimizing the mean square of the differences between the values generated as a result of processing; an adjustment parameter sets the weight of this criterion. Such a filter automatically sets the degree of smoothness of the output function and determines the areas that deviate beyond the measurement-error threshold set by the operator. In the second iteration, the marked areas and the areas next to them are measured with the minimum grid spacing. Areas not marked as locally damaged are analyzed with a reduced step (mismatched points) and reprocessed by the multicriteria method. If new errors are found, the area is rescanned; otherwise, it is accepted as suitable for use. The resulting three-dimensional matrix makes it possible to evaluate areas with large errors (deviations) at standard displacements of the working tool and the working table, and to calculate the compensation parameters for the displacements. As field data, we used the deviations obtained both from a full analysis of the table with the minimum grid step of measurements and a shift in three coordinates of the machine, and from the proposed approach. The proposed approach identified the same areas of deviation as the complete full-measurement cycle. For various conditions, the analysis time was reduced by up to a factor of 10; in some cases the full cycle took more than 12 hours. Tables of field data and examples of calculating predictive values of final analysis cycles are given.
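The multicriteria objective described above (data fidelity plus a smoothness criterion, weighted by an adjustment parameter) can be sketched as a penalized least-squares smoother. The exact objective and the 2-D formulation in the paper may differ; this 1-D version only illustrates the trade-off.

```python
import numpy as np

def multicriteria_smooth(y, lam):
    """Smooth a 1-D profile z by minimizing
        ||y - z||^2 + lam * ||D z||^2,
    where D takes first differences of the output. `lam` plays the role
    of the adjustment parameter: lam = 0 reproduces the data exactly,
    large lam forces a nearly constant (maximally smooth) output.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    D = np.diff(np.eye(n), axis=0)            # (n-1) x n first-difference matrix
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)
```

Points where |y - z| exceeds an operator-set threshold can then be flagged for re-measurement at the minimum grid spacing, mirroring the iteration described in the abstract.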
3D reconstruction for SLAM using multisensor fusion and block-based inpainting
A. Zelensky, V. Voronin, N. Gapon, et al.
Simultaneous localization and mapping (SLAM) systems are useful for camera tracking, and 3-D reconstructions may be desired for many robotic tasks. A common problem is a decrease in the accuracy of trajectory planning caused by incorrect sections of the depth map due to incorrect distance determination to objects. Such defects appear as a result of poor lighting or specular or fine-grained object surfaces. As a consequence, the boundaries of objects (obstacles) appear enlarged, and overlapping objects become impossible to distinguish from one another. In this paper, we propose a multisensor SLAM system capable of recovering a globally consistent 3-D structure. The proposed method takes two main steps. The first step is to fuse images from visible cameras and depth sensors based on the PLIP model (parameterized model of logarithmic image processing), which is close to the human visual system's perception. The second step is image reconstruction: we present an approach based on a modified exemplar block-based inpainting algorithm that uses an autoencoder-learned local image descriptor, with the descriptors learned by a convolutional autoencoder network. A 3-D point cloud is then generated from the reconstructed data. Our system quantitatively outperforms state-of-the-art methods in reconstruction accuracy on a benchmark for evaluating RGB-D SLAM systems.
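The block-matching step at the heart of exemplar-based inpainting can be illustrated with a minimal raw-SSD patch search. The paper replaces raw pixel SSD with autoencoder-learned descriptors, so this is only a structural sketch of the matching step, with an exhaustive search over a single-channel image.

```python
import numpy as np

def best_source_patch(image, mask, target, size=3):
    """Find the fully known source patch most similar (SSD over the
    known pixels) to the partially unknown patch centred at `target`.

    image:  2-D array; mask: bool array, True = known pixel.
    target: (row, col) centre, at least size // 2 away from the border.
    Returns ((row, col) of the best source centre, its SSD).
    """
    h = size // 2
    ty, tx = target
    tpatch = image[ty - h:ty + h + 1, tx - h:tx + h + 1]
    known = mask[ty - h:ty + h + 1, tx - h:tx + h + 1]
    best, best_ssd = None, np.inf
    H, W = image.shape
    for y in range(h, H - h):
        for x in range(h, W - h):
            smask = mask[y - h:y + h + 1, x - h:x + h + 1]
            if not smask.all():            # source patch must be fully known
                continue
            spatch = image[y - h:y + h + 1, x - h:x + h + 1]
            ssd = np.sum((spatch[known] - tpatch[known]) ** 2)
            if ssd < best_ssd:
                best, best_ssd = (y, x), ssd
    return best, best_ssd
```

In the descriptor-based variant, the SSD on raw pixels is replaced by a distance between autoencoder codes of the patches, which is more robust to lighting and texture variation; the hole is then filled by copying the matched source pixels.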
Multi-level deep learning depth and color fusion for action recognition
A. Zelensky, V. Voronin, M. Zhdanova, et al.
The recognition of human actions in video sequences is one of the key problems on the path to the development and deployment of computer vision systems in various spheres of life. Additional sources of information (such as depth sensors and thermal sensors) provide more informative features and thus increase the reliability and stability of recognition. In this research, we focus on how to combine multi-level decomposition of depth and color information to improve state-of-the-art action recognition methods. We present an algorithm combining information from visible cameras and depth sensors based on deep learning and the PLIP model (parameterized model of logarithmic image processing), which is close to the human visual system's perception. Experimental results on the test dataset confirmed the high efficiency of the proposed action recognition method compared to state-of-the-art methods that use only one image modality (visible or depth).
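The PLIP model mentioned in both fusion papers above has standard bounded arithmetic (due to Jourlin and Pinoli). A sketch of a PLIP-weighted two-modality combination follows; the gray-tone convention (range M = 256) and the fusion rule itself are illustrative assumptions, not the authors' pipeline.

```python
M = 256.0  # gray-tone range of the PLIP model (an assumed convention)

def plip_add(g1, g2):
    """PLIP addition: bounded superposition that never exceeds M,
    mimicking saturation in human brightness perception."""
    return g1 + g2 - (g1 * g2) / M

def plip_scalar_mul(c, g):
    """PLIP multiplication of a gray tone g by a real scalar c."""
    return M - M * (1.0 - g / M) ** c

def plip_weighted_fusion(visible, depth, w=0.5):
    """Hypothetical two-modality fusion: a PLIP-weighted combination of
    a visible-camera and a depth-sensor gray tone."""
    return plip_add(plip_scalar_mul(w, visible), plip_scalar_mul(1.0 - w, depth))
```

Unlike ordinary weighted averaging, these operations keep the result inside the valid gray-tone range by construction, which is the property that makes the PLIP framework attractive for perception-aligned fusion.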
Multisensor characterization of WEEE polymers: spectral fingerprints for the recycling industry
Waste from electrical and electronic equipment (WEEE) is a fast-growing, complex waste stream, and plastics represent around 25% of its total. The proper recycling of plastics from WEEE depends on the identification of polymers before they enter the recycling chain. Technologies aiming for this identification must be compatible with conveyor-belt operations and fast data acquisition. We therefore selected three promising sensor types to investigate the potential of optical spectroscopy-based methods for identifying plastic constituents in WEEE. Reflectance information is obtained using hyperspectral imaging (HSI) cameras in the short-wave infrared (SWIR) and mid-wave infrared (MWIR), while Raman point acquisitions (532 nm excitation) are well-suited for specific plastic identification. Integration times varied according to the capabilities of each sensor, never exceeding 2 seconds. We selected 23 polymers commonly found in WEEE (PE, PP, PVC, ABS, PC, PS, PTFE, PMMA), recognising spectral fingerprints for each material according to literature reports. Spectral fingerprint identification was possible for 60% of the samples using SWIR-HSI; however, it failed to produce positive results for black plastics. Additional information from MWIR-HSI was used to identify two black samples (70% identified using SWIR + MWIR). Fingerprint assignment in short-time Raman acquisition (1-2 seconds) was successful for all samples. Combined with the efficient mapping capabilities of HSI at millisecond time scales, further developments promise great potential for fast-paced recycling environments. Furthermore, integrated solutions enable increased accuracy (cross-validation); hence, we recommend a combination of at least two sensors (SWIR + Raman or MWIR + Raman) for recycling activities.
The Venus infrared atmospheric gases linker instrument concept for solar occultation studies of Venus atmosphere composition and structure onboard the Venus Orbiter Mission of the Indian Space Research Organization
In this paper, we describe the concept of the Venus InfraRed Atmospheric Linker (VIRAL) spectrometer for investigating the composition and structure of the planetary atmosphere at the top of and above the cloud layer of Venus onboard the Venus Orbiter Mission announced by the Indian Space Research Organization (ISRO). VIRAL includes two channels, an infrared echelle spectrometer channel and an ultra-high-resolution heterodyne interferometer channel; here, we present the concept of the echelle channel only. The instrument is designed to perform solar occultations, providing an optimal photon yield combined with a superior spectral resolving power exceeding 20,000. The VIRAL echelle spectrometer will cover the wavelength range from 2.3 to 4.3 μm and achieve high vertical resolution (with a footprint of about 1 km at the limb), allowing detailed altitude profiling of the composition and structure of the Venusian upper atmosphere. We present the instrument concept, its preliminary optical design, and the science objectives of the experiment.
IVOLGA: a high-resolution heterodyne near-infrared spectroradiometer for Doppler studies of Venus atmospheric dynamics
Sergei Zenevich, Iskander Sh. Gazizov, Maxim V. Spiridonov, et al.
The unusual dynamics of the Venus atmosphere remains one of the most intriguing puzzles of our sister planet. IVOLGA (Russian for "oriole"; an abbreviation of "Fiber Infrared Heterodyne Analyzer") will be the first spaceborne instrument to measure transmittance spectra of the Venus atmosphere with an ultra-high spectral resolving power (≥1 000 000) in the near-infrared. The instrument's capability to observe the fully resolved shapes of individual lines in the CO2 rovibrational bands at 1.6-2 μm opens an opportunity to retrieve information about the chemical composition, wind velocity, and temperature of the sounded atmospheric layers. Observations in solar occultation mode will provide wind and temperature profiles in the altitude range of 75-140 km with high vertical resolution. Based on commercial telecommunication optical and electronic components, the instrument is a compact, affordable, and reliable unit, planned to be launched onboard the Venus Orbiter Mission of the Indian Space Research Organization.
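A resolving power of at least 1 000 000 translates directly into a velocity scale for Doppler wind retrievals; the following one-line check is generic arithmetic, not a stated mission requirement.

```python
# One spectral resolution element, expressed as a Doppler velocity:
# R = lambda / d_lambda, and v = c * d_lambda / lambda = c / R.
c = 299_792_458.0   # speed of light [m/s]
R = 1_000_000       # spectral resolving power (>= 1e6 for IVOLGA)
dv = c / R          # about 300 m/s per resolution element
```

Since line-centre fitting locates a fully resolved line to a small fraction of a resolution element, wind velocities well below this ~300 m/s scale become retrievable, which is what makes heterodyne resolving powers attractive for atmospheric dynamics.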