
Proceedings Paper

Segmentation of illuminated areas of light using fully-convolutional neural networks and computer vision algorithms for augmented reality systems

Paper Abstract

This topic is relevant because of the rapid development of virtual and augmented reality systems. The problem is to create natural lighting conditions for virtual-world objects placed in real space. To determine light sources and recover their optical parameters, a fully-convolutional neural network (FCNN) is proposed that captures features of the 'behavior of light'. The output of the FCNN is a segmented image of illumination levels and their strength. Because fully-convolutional networks are well suited to image segmentation, the VGG-16 architecture was taken as the encoder; its layers convolve and pool the input image down to 1x1 pixel and classify it into one of the classes that characterize illumination strength. The image dataset was synthesized with 'Lumicept', realistic scene rendering software developed by Integra, which has powerful built-in tools for modeling the behavior of light; light rays and secondary lighting are therefore simulated and visualized correctly, which guarantees proper optical parameters and their classification. Lumicept renders each image together with its multi-color mask, in which each color corresponds to an optical strength. 'Colder' colors mean less intense illumination, while 'hotter' colors correspond to light sources; numerically, the values range from 0 to 500 nits (candela per square meter), where 0 is an area that is not lit at all and 500 is the brightness of a typical room lamp. These images were fed through the CNN to a dense layer, where the network learns the features to recognize, and the output is upsampled to a segmentation image. The upsampling layer is a function that brings a low-resolution image to a higher resolution by duplicating each pixel twice, which is known as the nearest-neighbor approach.
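The nearest-neighbor upsampling described above can be illustrated with a minimal sketch. It assumes a 2D image stored as a list of rows and a scale factor of 2 along each axis; the function name `upsample_nearest` and the sample values are hypothetical and are not taken from the paper, whose actual layer implementation is not given here.

```python
def upsample_nearest(image, factor=2):
    """Nearest-neighbor upsampling of a 2D grid given as a list of rows.

    Illustrative only: the list-of-rows layout and the default factor
    are assumptions, not details from the paper.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    upsampled = []
    for row in image:
        # Repeat each pixel `factor` times along the width.
        wide_row = [pixel for pixel in row for _ in range(factor)]
        # Repeat the widened row `factor` times along the height.
        for _ in range(factor):
            upsampled.append(list(wide_row))
    return upsampled


if __name__ == "__main__":
    low_res = [[0, 1],
               [2, 3]]
    for row in upsample_nearest(low_res):
        print(row)
    # [0, 0, 1, 1]
    # [0, 0, 1, 1]
    # [2, 2, 3, 3]
    # [2, 2, 3, 3]
```

No new values are computed: every output pixel is a copy of its nearest input pixel, so class labels in a segmentation map are preserved during upsampling.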
The FCNN's output can now be used to identify the lit areas of a room, restore brightness parameters, capture shadow features, analyze secondary illumination, and classify each area into one of the brightness levels. This is currently one of the major tasks in augmented reality systems: placing a synthesized object in the real environment so that it matches the optical parameters and lighting of the room. For light source determination, the CNN encoder can also determine the type of illumination, that is, ceiling or wall light sources. The network was trained on 221 training images and 29 validation images with a learning rate of 1E-2 for 200 epochs; after training, the loss was 0.2. The 'intersection over union' (IoU) method was used for testing: it compares the ground-truth area for an input image with the output image pixel by pixel and yields an accuracy score. The mean IoU is 0.7; the first class is classified almost correctly, with 90 percent agreement, and the last class with 30 percent. In future work, the FCNN will be trained on more images and trained to determine the locations of light sources.
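The intersection-over-union test mentioned above can be sketched as follows. This is a minimal illustration, assuming that the ground-truth and predicted masks are equally sized grids of integer class labels; the function names `per_class_iou` and `mean_iou` and the toy masks are hypothetical, and the paper's exact evaluation procedure may differ.

```python
def per_class_iou(truth, predicted, num_classes):
    """Return one IoU value per class (None if a class is absent from both masks).

    Assumes both masks are lists of rows holding integer labels in
    the range 0..num_classes-1 (an assumption for this sketch).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for truth_row, pred_row in zip(truth, predicted):
        for t, p in zip(truth_row, pred_row):
            if t == p:
                # Pixel agrees: it lies in both the intersection and the union.
                intersection[t] += 1
                union[t] += 1
            else:
                # Pixel disagrees: it extends the union of both classes involved.
                union[t] += 1
                union[p] += 1
    return [i / u if u else None for i, u in zip(intersection, union)]


def mean_iou(ious):
    """Average the IoU over the classes that actually occur."""
    present = [value for value in ious if value is not None]
    return sum(present) / len(present)


if __name__ == "__main__":
    truth = [[0, 0],
             [1, 1]]
    predicted = [[0, 1],
                 [1, 1]]
    ious = per_class_iou(truth, predicted, num_classes=2)
    print(ious, mean_iou(ious))
```

Reporting the per-class values alongside the mean shows how a single mean IoU can hide a large gap between well-segmented and poorly segmented classes.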

Paper Details

Date Published: 21 June 2019
PDF: 6 pages
Proc. SPIE 11062, Digital Optical Technologies 2019, 110621N (21 June 2019); doi: 10.1117/12.2526150
Maxim Sorokin, ITMO Univ. (Russian Federation)
Andrey Zhdanov, ITMO Univ. (Russian Federation)
Dmitry Zhdanov, ITMO Univ. (Russian Federation)
Igor S. Potemin, ITMO Univ. (Russian Federation)
Nikolay Bogdanov, ITMO Univ. (Russian Federation)

Published in SPIE Proceedings Vol. 11062:
Digital Optical Technologies 2019
Bernard C. Kress; Peter Schelkens, Editor(s)

© SPIE.