Submissions to the conference should have abstracts of 1,000 words or less.

The field of digital image processing has experienced continuous and significant expansion in recent years. The usefulness of this technology is apparent in many different disciplines, ranging from entertainment to remote sensing. The advances and wide availability of image processing hardware, along with advanced algorithms, have further enhanced the usefulness of image processing. The Applications of Digital Image Processing conference welcomes contributions of new results and novel techniques from this important technology.

Papers are solicited in the broad areas of digital image processing applications, including:

  • Application areas
  • New imaging modalities and their processing
  • Immersive imaging
  • Image and video processing and analysis
  • New standards in image and video applications
  • Security in imaging
  • Imaging requirements and features
  • Imaging systems
  • Compression
  • Human visual system and perceptual imaging
  • Artificial intelligence in imaging
  • Novel and emerging methods in imaging
Conference 12226

Applications of Digital Image Processing XLV

22 - 25 August 2022
  • 1: Compression I
  • 2: Compression II
  • 3: Human Visual System and Perception
  • 4: Imaging Systems
  • 5: New Imaging Standards
  • 6: Imaging Applications
  • 7: Image and Video Processing
  • 8: New Imaging Modalities and Applications
  • Poster Session
  • Panel Discussion on Advanced Video Compression and Applications
Information

POST-DEADLINE ABSTRACT SUBMISSIONS

  • Submissions accepted through 5 July

Call for Papers Flyer
Session 1: Compression I
12226-1
Author(s): Dan Grois, Comcast Corp. (Israel); Alex Giladi, Comcast Corp. (United States)
12226-3
Author(s): Philippe de Lagrange, Gwenaëlle Marquant, InterDigital R&D France (France)
Version 1 of the VVC specification was released in July 2020. VVC is the successor of HEVC, with 40 to 50% better compression, and includes multi-layer profiles from the beginning, with a feature that differentiates it from its predecessors: a single decoder instance decodes all layers. This paper reports on a coding performance evaluation of spatially scalable coding with VVC, using both objective metrics and subjective tests. It shows that dual-layer coding can be on par with or outperform single-layer coding in specific conditions. It also discusses coding and decoding complexity, and compares scalable VVC with LCEVC.
12226-4
Author(s): Ryan Lei, Facebook Inc. (United States)
Recently, 3GPP started an exploration project to evaluate and select next-generation video codec candidates after AVC and HEVC. In this project, a comprehensive set of test conditions is defined to evaluate the submitted codec candidates, including AV1, VVC, and EVC. These test conditions focus on five main usage scenarios for 5G video delivery, including high-latency VOD usages for HD and 4K streaming, and low-latency usages such as screen content coding, real-time communication, and gaming streaming. Based on this set of test conditions, reference encoders of the proposed codec candidates are thoroughly tested to provide the benchmarking data for the final selection. In this paper, we first discuss the detailed test scenarios and test configurations for the 3GPP benchmarking test. Then, the detailed encoding parameters for the AV1 reference encoder that comply with this set of test conditions are introduced and the benchmarking results are presented. Finally, we introduce the encoding settings that can be applied to the AV1 reference encoder without the legacy restrictions defined in the test conditions, and the corresponding compression efficiency improvement that can be achieved in actual production usage.
12226-5
Author(s): Touradj Ebrahimi, Davi Nachtigall Lazzarotto, Michela Testolina, Ecole Polytechnique Fédérale de Lausanne (Switzerland)
DNA is an excellent medium for the storage of information. Not only does it offer a long-term and robust storage mechanism, but it is also eco-friendly and has unparalleled storage capacity. At the same time, the basic elements in DNA storage are quaternary, and therefore there is a need for efficient representation of information in quaternary ACGT elements. Furthermore, biological constraints add additional complexity to how information should be represented in ACGT. In this paper, we propose an efficient solution for the storage of visual information in DNA with practical constraints in mind and assess its performance.
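As a toy illustration of the quaternary representation this abstract refers to (not the authors' actual codec), the sketch below packs bytes into ACGT symbols and checks one common biological constraint, the length of homopolymer runs, which are hard to synthesize and sequence:

```python
# Illustrative sketch only: a naive 2-bits-per-nucleotide mapping,
# plus a check for the homopolymer-run constraint.

NUCLEOTIDES = "ACGT"  # one symbol per 2-bit value: 00->A, 01->C, 10->G, 11->T

def bytes_to_acgt(data: bytes) -> str:
    """Map each byte to four nucleotides, most-significant pair first."""
    out = []
    for b in data:
        for shift in (6, 4, 2, 0):
            out.append(NUCLEOTIDES[(b >> shift) & 0b11])
    return "".join(out)

def acgt_to_bytes(strand: str) -> bytes:
    """Inverse mapping: four nucleotides back to one byte."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        b = 0
        for ch in strand[i:i + 4]:
            b = (b << 2) | NUCLEOTIDES.index(ch)
        out.append(b)
    return bytes(out)

def max_homopolymer_run(strand: str) -> int:
    """Length of the longest run of identical nucleotides."""
    best = run = 1
    for prev, cur in zip(strand, strand[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best
```

A practical codec would additionally constrain GC content and avoid long runs (note that `bytes_to_acgt(b"\x00")` already yields the run "AAAA"), which is exactly the added complexity the abstract mentions.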
12226-6
Author(s): Vibhoothi, Daniel Joseph Ringis, Anil Kokaram, François Pitié, Trinity College Dublin (Ireland); Yeping Su, Neil Birkbeck, Balu Adsumilli, Jessie Lin, Google (United States)
Since the adoption of VP9 by Netflix in 2016, royalty-free coding standards have continued to gain prominence through the activities of the AOMedia consortium's AV1 codec. Our previous work on VP9 and HEVC with standard dynamic range (SDR) content shows that per-clip optimization of the Lagrangian multiplier can lead to significant gains in BD-rate using an off-the-shelf minimiser. In this work we instead treat the RDO parameters independently on a frame-hierarchy basis and test the idea with 4K HDR content. We explore the use of a wider range of canned optimisers for improving convergence criteria to minimise computational load. Preliminary results with AV1 show that, by treating the optimization as a multivariable estimation problem across keyframe types, we can improve BD-rate gains by a factor of 10 (from 0.5% to 5%).
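The BD-rate gains quoted above are conventionally computed by comparing two rate-distortion curves. A simplified, piecewise-linear variant of the Bjøntegaard metric can be sketched as follows (the standard form fits a cubic polynomial through the log-rates; linear interpolation is used here to keep the sketch stdlib-only):

```python
import math

def _interp_log_rate(points, q):
    """Linearly interpolate log10(rate) at quality q; points sorted by quality."""
    for (q0, r0), (q1, r1) in zip(points, points[1:]):
        if q0 <= q <= q1:
            t = (q - q0) / (q1 - q0)
            return (1 - t) * math.log10(r0) + t * math.log10(r1)
    raise ValueError("quality outside the curve's range")

def bd_rate_percent(ref, test_curve, samples=100):
    """Average bitrate difference (%) of test_curve vs ref over the
    overlapping quality interval; negative means the test codec saves rate."""
    ref, test_curve = sorted(ref), sorted(test_curve)
    lo = max(ref[0][0], test_curve[0][0])    # overlap of quality ranges
    hi = min(ref[-1][0], test_curve[-1][0])
    acc = 0.0
    for i in range(samples + 1):
        q = lo + (hi - lo) * i / samples
        acc += _interp_log_rate(test_curve, q) - _interp_log_rate(ref, q)
    avg_log_diff = acc / (samples + 1)
    return (10 ** avg_log_diff - 1) * 100
```

For example, a curve whose rates are exactly half the reference's at every quality point reports a BD-rate of -50%.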
12226-7
Author(s): Yuriy A. Reznik, Karl O. Lillevold, Abhijith Jagannath, Nabajeet Barman, Brightcove, Inc. (United States)
One of the biggest challenges in modern-era streaming is the fragmentation of codec support across receiving devices. For example, modern Apple devices can decode and seamlessly switch between H.264/AVC and HEVC streams. Most new TVs and set-top boxes can also decode HEVC, but they cannot switch between HEVC and H.264/AVC streams. And there are still plenty of older devices/streaming clients that can only receive and decode H.264/AVC streams. With the arrival of next-generation codecs such as AV1 and VVC, the fragmentation of codec support across devices becomes even more complex. This situation raises a question: how can we serve such a population of devices most efficiently, using codecs delivering the best performance in all cases, yet producing the minimum possible number of streams, such that the overall cost of media delivery is minimal? In this paper, we explain how this problem can be formalized and solved at the stage of dynamic generation of encoding profiles for ABR streaming. The proposed solution is a generalization of the context-aware encoding (CAE) class of techniques, considering multiple sets of renditions generated using each codec and codec usage distributions across the population of receiving devices. We also discuss several streaming system-level tools needed to make the proposed solution practically deployable.
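The cost trade-off described here can be illustrated with a toy model; the codec efficiencies, device shares, and storage weight below are invented for the example and are not the paper's data or formulation:

```python
from itertools import combinations

# Hypothetical inputs: per-codec relative bitrate (lower = better
# compression) and the share of devices able to decode each codec set.
EFFICIENCY = {"h264": 1.00, "hevc": 0.65, "av1": 0.55}
SUPPORT = {             # device-population share -> codecs that class decodes
    0.30: {"h264"},
    0.45: {"h264", "hevc"},
    0.25: {"h264", "hevc", "av1"},
}

def delivery_cost(ladder_codecs, storage_weight=0.1):
    """Expected delivery bitrate plus a storage penalty per extra ladder.
    Each device class streams the most efficient codec it supports."""
    cost = 0.0
    for share, decodable in SUPPORT.items():
        usable = decodable & set(ladder_codecs)
        if not usable:
            return float("inf")  # some devices could not play anything
        cost += share * min(EFFICIENCY[c] for c in usable)
    return cost + storage_weight * len(ladder_codecs)

def best_codec_set():
    """Brute-force search over all non-empty codec subsets."""
    codecs = list(EFFICIENCY)
    candidates = [c for n in range(1, len(codecs) + 1)
                  for c in combinations(codecs, n)]
    return min(candidates, key=delivery_cost)
```

With these made-up numbers the search keeps H.264 for reach and adds HEVC, but finds that also encoding AV1 does not pay for its storage given the small AV1-capable share, which is the flavor of conclusion the formalization above enables.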
Session 2: Compression II
12226-8
Author(s): Philippe Bordes, Franck Galpin, Hassane Guermoud, Thierry Dumas, InterDigital, Inc. (France)
JVET has developed a new Enhanced Compression Model (ECM) for testing future video coding algorithms on top of the Versatile Video Coding (VVC) standard. VVC supports reference picture resampling (RPR) to change frame resolution without inserting an intra refresh picture. This feature is well suited to video streaming and low-delay scenarios since it allows graceful bit-rate adaptation, whereas traditional techniques based on stream switching can generate bitrate leaps. In this paper, some adaptations to implement RPR in ECM are discussed, and some modifications of RPR to improve ECM efficiency in the context of super-resolution and low-delay coding are proposed.
12226-9
Author(s): Fabrice Urban, Karam Naser, Franck Galpin, Tangi Poirier, InterDigital, Inc. (France)
For each generation of video coding standard, increasing the block partitioning flexibility has been very effective. In VVC, the latest video standard developed by MPEG, block partitioning using the QuadTree plus Binary Tree and Ternary Tree (QTBTTT) was the tool bringing the highest compression gains. In this paper, we study block partitioning flexibility to improve video compression efficiency beyond VVC. We first study the impact of the flexibility of the existing QTBTTT partitioning on coding performance. We then show the improvement in compression efficiency of future video coding standards obtained by adding a new asymmetric split, named the Asymmetric Binary Tree (ABT).
12226-10
Author(s): Yifan Wang, Zhanxuan Mei, The Univ. of Southern California (United States); Ioannis Katsavounidis, Meta (United States); Chung-Chieh Jay Kuo, The Univ. of Southern California (United States)
Image coding has been studied for more than four decades. Image coding standards such as JPEG and JPEG 2000 have been developed and are widely used today. Furthermore, intra coding schemes of modern video coding standards also provide very effective image coding solutions. Examples include H.264/AVC intra, WebP from VP8 intra, BPG from H.265/HEVC intra, AV1 intra, and H.266/VVC intra. Block transform coding is commonly used in these codecs: images are partitioned into blocks of different sizes, and pixel values in blocks are transformed from the spatial domain to the spectral domain for energy compaction before quantization and entropy coding. Another powerful tool is intra prediction, which reduces pixel correlation using pixel values from neighboring blocks at a low cost. Residuals after intra prediction are still coded by block transform coding. Recently, deep-learning-based compression methods have attracted a lot of attention due to their superior rate-distortion performance. The learning-based image coding paradigm has two characteristics. Traditional image codecs only explore correlation within the same image, while learning-based image codecs can exploit correlation from other images (i.e., inter-image correlation). Traditional image codecs only capture the representation with variable block sizes, while learning-based image codecs can exploit a multi-scale representation based on pooling. In other words, traditional image codecs primarily explore correlation at the block level, while learning-based image codecs can exploit short-, middle-, and long-range correlations using the multi-scale representation. Furthermore, different loss functions can easily be designed in learning-based schemes to fit the human visual system (HVS), and attention can be introduced to learning-based schemes conveniently.
To achieve low-complexity learning-based image coding, we propose a multi-grid multi-block-size vector quantization (MGBVQ) method based on these characteristics. Generally speaking, MGBVQ decomposes pixel correlations into long-, mid-, and short-range correlations. First, it represents and encodes long-range correlations in coarser grids due to their smoothness, leading to a multi-grid (MG) coding architecture. Second, mid- and short-range correlations can be effectively coded by a suite of vector quantizers (VQs). Along this line, we argue for the effectiveness of VQs of very large block sizes and present a convenient way to implement them. To make MGBVQ effective, we develop a set of coding tools, including quad-tree early termination, adaptive codebook selection, and large-block-size vector quantization wrapping. Experimental results show that MGBVQ offers a good tradeoff among three factors: bit rate, coded image quality, and computational complexity. Furthermore, it provides a progressive coded bitstream.
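The vector quantizers at the heart of MGBVQ can be illustrated with a minimal Lloyd (k-means) codebook trainer; this is a generic VQ sketch in pure Python, not the authors' multi-grid implementation:

```python
import random

def quantize(vec, codebook):
    """Return the index of the nearest codeword (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, codebook[i])))

def train_codebook(vectors, k, iters=20, seed=0):
    """Toy Lloyd/k-means training: alternately assign vectors to their
    nearest codeword and move each codeword to its cell's centroid."""
    rng = random.Random(seed)
    codebook = rng.sample(vectors, k)  # initialize from the data
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in vectors:
            buckets[quantize(v, codebook)].append(v)
        for i, bucket in enumerate(buckets):
            if bucket:  # keep the old codeword if its cell is empty
                codebook[i] = tuple(sum(dim) / len(bucket)
                                    for dim in zip(*bucket))
    return codebook
```

An encoder then transmits only codeword indices (log2 k bits per block); the codebook-search cost grows quickly with block size, which is why the paper's large-block-size VQs need the special implementation tricks it describes.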
12226-11
Author(s): Xiteng Liu, Advanced Micro Devices, Inc. (Canada)
We exemplify a new method for high-efficiency sensing, in contrast to compressed sensing (CS). We analyze the weaknesses of CS in depth; to our knowledge, this is the first such analysis in the literature. Based on this insight, high-efficiency sensing remedies the weaknesses of CS with a radically rectified rationale and immensely improved performance. We make a wide spectrum of important innovations in rationale, methodology, transforms, and techniques. Demo software and test data can be downloaded from our website, www.lucidsee.ca .
12226-12
Author(s): Foued Ben Amara, Faouzi Kossentini, Hassene Tmar, Intel Corp. (Canada)
The SVT-AV1 encoder is an open-source AV1 encoder that was co-developed by Intel and Netflix, and that was later adopted by the Alliance for Open Media as the AV1 productization reference encoder. This paper describes the latest algorithmic improvements in the SVT-AV1 encoder when used in latency-constrained transcoding applications. The paper will outline the SVT-AV1 encoder’s ability to achieve great speed-quality tradeoffs in medium-latency (2-5 seconds) and low-latency (less than 1 second) applications, using both constrained variable-bitrate and constant-bitrate rate control algorithms. Simulation results that demonstrate the excellent SVT-AV1 tradeoffs relative to those of x264/AVC and x265/HEVC will be presented.
12226-13
Author(s): Pankaj Topiwala, Wei Dai, FastVDO Inc. (United States)
Session 3: Human Visual System and Perception
12226-14
Author(s): Touradj Ebrahimi, Michela Testolina, Ecole Polytechnique Fédérale de Lausanne (Switzerland)
Assessing the quality of an image is an important task in many imaging applications, including image enhancement, image compression, and image processing. With progress in imaging technologies, the need for high-quality imaging is increasing, and with it the need for assessment methodologies and metrics that can predict whether the quality of an image is indistinguishable from the original. This paper proposes a methodology to assess whether a processed image (e.g., after compression and decompression) is perceptually similar to the uncompressed original.
12226-15
Author(s): Roberto Herrera-Charles, Teodoro Álvarez-Sánchez, Ctr. de Investigación y Desarrollo de Tecnología Digital, Instituto Politécnico Nacional (Mexico); Jesus Antonio Alvarez-Cedillo, Unidad Profesional Interdisciplinaria de Ingeniería y Ciencias Sociales y Administrativas, Instituto Politécnico Nacional (Mexico)
12226-16
Author(s): Yuriy A. Reznik, Nabajeet Barman, Brightcove, Inc. (United States)
In adaptive streaming systems delivering video to players embedded in a web page, the player is often resized during a streaming session depending on user preferences (position and size of the browser window) and the device type. The rendition selection logic in such players is usually limited: it considers only network throughput and, in some advanced systems, the player size. This results in sub-optimal rendition selection, often choosing the rendition closest to the player size, which is then upscaled or downscaled at the player without any consideration of the effect of that decision on perceived video quality. An optimal rendition selection algorithm, however, should take into account the display parameters and perceptual quality models to improve the end-user Quality of Experience, as many studies on subjective image quality assessment have demonstrated the importance of physical parameters such as image resolution, image/display size, and viewing distance on perceived quality. This paper addresses this research gap by presenting an algorithm that selects the optimal rendition from the encoding ladder for web streaming video players, using a perceived picture quality score estimated from the player size, the display parameters, and the characteristics of the encoded video.
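The kind of rendition selection described here can be sketched with a deliberately simple quality model; the ladder and the log-bitrate/scaling-penalty score below are hypothetical stand-ins for the calibrated perceptual model the paper uses:

```python
import math

LADDER = [  # hypothetical encoding ladder: (height, kbps)
    (270, 400), (540, 1200), (720, 2500), (1080, 5000),
]

def perceived_quality(height, kbps, player_height):
    """Toy score: quality grows with log(bitrate) and is penalized
    whenever the video must be up- or down-scaled to the player size."""
    scale = min(height, player_height) / max(height, player_height)
    return math.log(kbps) * scale

def select_rendition(player_height, bandwidth_kbps):
    """Best rendition that fits the bandwidth, by modeled quality."""
    feasible = [(h, r) for h, r in LADDER if r <= bandwidth_kbps]
    if not feasible:
        return min(LADDER, key=lambda x: x[1])  # fall back to lowest rate
    return max(feasible, key=lambda x: perceived_quality(*x, player_height))
```

Even this toy model reproduces the paper's motivating observation: for a 540-pixel-tall player with ample bandwidth it picks the 540p rendition rather than the highest bitrate, because upscaling-free playback scores better than a downscaled 1080p stream.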
12226-17
Author(s): Pankaj Topiwala, Wei Dai, FastVDO Inc. (United States)
12226-18
Author(s): Darren Ramsook, Anil Kokaram, Trinity College Dublin (Ireland); Neil Birkbeck, Yeping Su, Balu Adsumilli, YouTube, Google (United States)
Deep Neural Networks (DNNs) have been increasingly successful in video processing tasks. This success can be attributed to the ongoing investigation into the architectures of these networks and their ability to create intermediate layers that behave similarly to the Human Visual System (HVS). In this paper we seek to extend the use of perceptually relevant losses in training a DNN for video compression artefact removal. We use internal representations of pre-trained networks as the basis of the loss functions: specifically, the LPIPS metric and a perceptual discriminator are responsible for low-level and high-level features, respectively.
12226-19
Author(s): Divya Mishra, Ben-Gurion Univ. of the Negev (Israel); Ron Shmueli, AFEKA Engineering College (Israel); Ofer Hadar, Ben-Gurion Univ. of the Negev (Israel)
Most recent state-of-the-art deep-learning-based methods for image super-resolution assume an ideal degradation kernel (such as bicubic down-sampling) on standard datasets. In practice, these methods perform poorly on real-world images, since real degradations are far more complex than the pre-defined assumed kernels. Motivated by this problem, this paper proposes a progressive method that implicitly defines an image-specific kernel, without any explicit degradation estimation, for blind super-resolution. The benefit of this approach is that it is non-iterative, unlike recent state-of-the-art kernel estimation methods such as Iterative Kernel Correction (IKC), InternalGAN, and correction-filter-based super-resolution methods. Image quality is then assessed both qualitatively (by visual inspection) and quantitatively using no-reference image quality assessment metrics. The proposed method outperforms state-of-the-art models by incorporating domain knowledge from recently introduced unsupervised single-image super-resolution techniques.
Session 4: Imaging Systems
12226-21
Author(s): Jose Alejandro Gonzalez Sarabia, Jose A. González-Fraga, Univ. Autónoma de Baja California (Mexico); Vitaly Kober, Ctr. de Investigación Científica y de Educación Superior de Ensenada B.C. (Mexico)
An important step in the SLAM process is detection and analysis of keypoints found in the environment. Performing a good correspondence of these points allows us to build an optimal point cloud for maximum localization accuracy of the mobile robot and, therefore, to build a precise map of the environment. In this paper, we perform an extensive comparison study of the correspondences made by various combinations of detectors/descriptors and compare the performance of different iterative closest points (ICP) algorithms used in the RGB-D SLAM problem. An adaptive RGB-D SLAM system is proposed, and its performance with the TUM RGB-D dataset is presented and discussed.
12226-22
Author(s): Ahmet Çapci, ASELSAN A.S. (Turkey); Hüseyin Emre Güven, Cognizen Inc. (Turkey); Behçet Ugur Töreyin, Informatics Institute, Istanbul Technical Univ. (Turkey)
Image distortion caused by the structure of thermal sensors is an important problem. Since every detector or pixel in the sensor reacts differently even when fed the same signal, correction is necessary for good imaging. In this study, we propose a deep-learning-based approach for both cooled and uncooled thermal imagers. We created various thermal datasets to train models for temporal noise for both cooled and uncooled thermal imagers and compared the results. Our deep learning model accounts for the entirety of the NUC, BPR, and IOP operations. We also show that optical artifacts and distortions can be eliminated using deep learning, and we demonstrate this with different system architectures that are suitable for embedded systems.
12226-23
Author(s): Touradj Ebrahimi, Yuhang Lu, Ecole Polytechnique Fédérale de Lausanne (Switzerland)
Detectors are popular tools in many imaging applications; examples include face, deepfake, and change detectors. The need for reliable detectors is so important that many benchmarks and grand challenges have been organized to find the best approaches. In this paper, we propose a new framework to assess detectors and demonstrate its feasibility for identifying the best solutions for detection in realistic conditions. In doing so, we show that many solutions that rank at the top of challenges and benchmarks are efficient under the conditions defined by those benchmarks, but often at the cost of reduced efficiency in real-life applications.
12226-24
Author(s): Touradj Ebrahimi, Michela Testolina, Ecole Polytechnique Fédérale de Lausanne (Switzerland)
Noise is an intrinsic part of any sensor and is present to various degrees in any content captured from a real-life environment. In imaging applications, several pre- and post-processing solutions have been proposed to cope with noise in captured images. More recently, learning-based solutions based on artificial intelligence have shown impressive results in image enhancement in general and in image denoising in particular. In this paper we propose an innovative solution for image denoising in the compressed domain. The paper starts by explaining the advantages of such an approach, not only in terms of complexity but also from other points of view. We then describe the proposed solution, compare it to the state of the art, and draw conclusions.
A method of 3D Mueller-matrix layer-by-layer reproduction of the distributions of the parameters of linear and circular birefringence and dichroism of partially depolarizing polycrystalline films of biological fluids is proposed and substantiated. The dynamics of changes in the value of statistical moments of the 1st to 4th orders, characterizing the distributions of the optical anisotropy parameters of a partially depolarizing polycrystalline blood film in various "phase" sections of its volume, have been investigated and analyzed. The parameters most sensitive to prostate cancer were revealed: statistical moments of the 3rd and 4th orders, which characterize the polarization-reproduced distributions of the magnitude of the phase and amplitude anisotropy of polycrystalline blood films of healthy donors and patients with prostate cancer and endometriosis. Excellent accuracy of differentiation of samples of polycrystalline blood films from different groups of patients has been achieved. Comparative studies of the diagnostic efficiency of 2D polarization and Mueller-matrix mapping methods, as well as 3D Mueller-matrix reproduction of the polycrystalline structure of blood films of healthy donors and patients with prostate cancer and endometriosis, have been carried out.
Session 5: New Imaging Standards
12226-26
Author(s): Leonard Rosenthol, Adobe Inc. (United States)
Given the deluge of digital content and rapidly advancing technology, it is challenging for consumers to trust what they see online. Deceptive content, such as deepfakes generated by artificial intelligence or more traditionally manipulated media, can be indistinguishable from the real thing, so establishing the provenance of media is critical to ensuring transparency, understanding, and trust. The C2PA specification provides platforms with an open, standards-based method to define what information is associated with each type of asset (e.g., images, videos, audio, or documents), how that information is presented and stored, and how evidence of tampering can be identified.
12226-27
Author(s): Zhijun Lei, Meta Platforms, Inc. (United States); Kaustubh Patankar, Jeeva Raj Arumugam, Mukund Srinivasan, Ittiam Systems Pvt. Ltd. (India); Ronald Bultje, Two Orioles (United States); Jean-Baptiste Kempf, VideoLAN (France); Ioannis Katsavounidis, David Ronca, Meta Platforms, Inc. (United States)
AV1 is the first generation of royalty-free video coding standard developed by the Alliance for Open Media (AOM). Since its release in 2018, it has gained great adoption in the industry; major service providers such as YouTube and Netflix have started streaming AV1-encoded content. Even though more and more vendors have started to implement hardware AV1 decoders in their products, software decoders with very good performance are still critical to enable AV1 playback on a broader range of devices, especially mobile devices. For this purpose, VideoLAN created dav1d: a portable and highly optimized AV1 software decoder. The decoder implements all AV1 bitstream features. Dataflow is organized to allow the various decoding stages (bitstream parsing, pixel reconstruction, and in-loop postfilters) to be executed directly after each other for the same superblock row, allowing memory to stay in cache for most common frame resolutions. To test the performance of dav1d on real devices, a set of low-end to high-end Android mobile devices was selected for benchmarking. Extensive testing is done using a wide range of video test vectors at various resolutions, bitrates, and framerates. The benchmarking and analysis provide insight into single- and multi-threading performance, the impact of video coding tools, CPU utilization, and battery drain. In this paper, we introduce the architecture and design of the dav1d software decoder, in particular the performance optimizations in both the low-level compute kernels and the high-level threading framework. Then the benchmarking results on a broad range of devices are presented.
12226-28
Author(s): Zhijun Lei, Hsiao-Chiang Chuang, Meta Platforms, Inc. (United States); Andrey Norkin, Agata Opalach, Netflix, Inc. (United States)
AV1 is the first open-source video coding standard developed by the Alliance for Open Media (AOM), which was finalized in 2018. During its standardization process, coding tools were gradually adopted into the specification based on a tradeoff between multiple parameters, such as bitrate, quality, encoding and decoding implementation complexity. A fair comparison of the coding tools supported by this standard can be essential for encoder designers who seek to achieve a good balance among all these factors within their implementations. To this end, this paper compiles a tool-on/off analysis of several prominent coding tools supported by the AV1 specification. The analysis includes the impact of such tools on several objective quality metrics, i.e. PSNR-Y/U/V, SSIM, and VMAF, when using the reference encoder libaom implementation, as well as the corresponding impact on the SW runtime complexity of both the libaom encoder and decoder.
12226-29
Author(s): Thomas Richter, Fraunhofer-Institut für Integrierte Schaltungen IIS (Germany)
At its 94th meeting, SC29WG1 (JPEG) decided to create a third edition of the JPEG XS standard (ISO/IEC 21122) for lightweight, low-latency image coding. While existing coding tools already support the compression of natural content and CFA Bayer pattern data well, this third edition has a strong focus on extending the previous edition with coding tools to improve the performance on screen content data. In this work, the authors will describe some of the candidate tools that have been proposed for the third edition, and will report on their performance and performance improvements on such data.
12226-30
Author(s): Touradj Ebrahimi, Davi Nachtigall Lazzarotto, Ecole Polytechnique Fédérale de Lausanne (Switzerland)
Point cloud representation is a popular modality for coding immersive 3D content. Several solutions and standards have recently been proposed to efficiently compress the large volume of data that point clouds represent, in order to make them feasible for real-life applications. In this paper, we propose an efficient learning-based point cloud compression technique which, in addition to offering superior compression efficiency, also allows for compressed-domain processing and point cloud analysis, which are often required in applications relying on such content.
12226-31
Author(s): Foued Ben Amara, Intel Corp. (Canada); Guendalina Cobianchi, V-Nova Ltd. (United Kingdom); Ioannis Katsavounidis, Meta (United States); Faouzi Kossentini, Intel Corp. (Canada); Stergios Poularakis, V-Nova Ltd. (United Kingdom); Cosmin Stejerean, Meta (United States); Hassene Tmar, Intel Corp. (Canada)
Optimization of the tradeoffs between compression and encoding complexity is key in software video encoding. This paper describes the results of using LCEVC (Low Complexity Enhancement Video Coding) to further improve the quality-cycles tradeoffs of SVT-AV1 in high-latency VOD applications. LCEVC is a codec enhancement standard recently standardized as MPEG-5 Part 2. SVT-AV1 is an open-source AV1 encoder, co-developed by Intel and Netflix, and later adopted by the Alliance for Open Media as the AV1 productization reference encoder. The paper evaluates LCEVC gains in quality-cycles tradeoffs employing the same methodology used in the SPIE 2021 paper "Towards much better SVT-AV1 quality-cycles tradeoffs for VOD applications". The LCEVC results are compared to those of the underlying SVT-AV1, x264, and x265 encoders.
Session 6: Imaging Applications
12226-32
Author(s): Pankaj Topiwala, Wei Dai, FastVDO Inc. (United States)
12226-33
Author(s): Lioz Noy, Itay Barnea, Simcha Mirsky, Dotan Kambar, Mattan Levi, Natan Tzvi Shaked, Tel Aviv Univ. (Israel)
Intracytoplasmic sperm injection (ICSI) is the most common practice for in vitro fertilization (IVF) treatments. In ICSI, a single sperm is selected and injected into an oocyte. The quality of the sperm and specifically its DNA fragmentation index (DFI) have significant effects on the fertilization success rate. In our research, we use computer vision and deep learning methods to predict DFI scoring for a single sperm cell. Each cell in the dataset was acquired using multiple white light microscopy techniques combined with state-of-the-art interferometry. In our results, we see a strong correlation between the stained images and our score prediction which can be used in the ICSI process.
12226-34
Author(s): Touradj Ebrahimi, Ecole Polytechnique Fédérale de Lausanne (Switzerland)
Telepresence has multiple applications in broadcasting, such as tele-transportation of presenters and remote participants to the studio. In this paper, we present an AI-based solution that allows remote participants at home to be inserted into a virtual set in a broadcasting studio and to interact with others in real time. The end-to-end system is described along with its key components, and the results of the trials and evaluations are reported.
12226-35
Author(s): Jihoon Park, Junho Choi, Changmo Jeong, Hoonil Jeong, Dongju Jang, Chan M. Lim, Sangwoo Park, Dongkyu Kim, Junhyun Jo, Yong Yi Lee, SOS Lab. Co., Ltd. (Korea, Republic of)
12226-36
Author(s): Sophia Rosney, Ciarán Donegan, Hugh Denman, Meegan Gower, Wissam Jassim, Sankalp Panghal, Donal Scannell, Anil Kokaram, Trinity College Dublin (Ireland)
In this paper we present an exploration of the components of a semi-automated, high quality broadcasting system which combines both semantic level game analysis and human event selection to address the shortcomings of current technology. By deploying wide angle lenses we reduce the number of physical cameras required to capture the event. We can then simulate multiple dynamic views from each single fixed camera. These views can then be selected at the discretion of the director for broadcast.
12226-37
Author(s): Jose A. González-Fraga, Univ. Autónoma de Baja California (Mexico); Vitaly Kober, Ctr. de Investigación Científica y de Educación Superior de Ensenada B.C. (Mexico); Everardo Gutiérrez López, Jose Alejandro Gonzalez Sarabia, Univ. Autónoma de Baja California (Mexico)
In this paper, a method for automatic detection of breast pathologies using a deep convolutional neural network and a class activation map is proposed. The neural network is pretrained on regions of interest, with the output layer modified to two output classes. The tuned neural network is then used to localize pathologies on full-size mammograms. The proposed method is compared with different CNN models on two public datasets: the Mammographic Image Analysis Society (MIAS) dataset and the Curated Breast Imaging Subset of DDSM (CBIS-DDSM).
Session 7: Image and Video Processing
12226-38
Author(s): Andreas Kah, Maurice Klein, Hochschule RheinMain (Germany); Christoph Burgmair, Markus Rasokat, Joyn GmbH (Germany); Wolfgang Ruppel, Matthias Narroschke, Hochschule RheinMain (Germany)
To generate a bit rate ladder for video streaming services that minimizes the bit rate and maximizes the subjective quality for a given transmission rate, certain fundamental VMAF constraints need to be fulfilled. The key problem is to find the optimum bit rate ladder in a content-dependent, multidimensional solution space of bit rates, spatial resolutions, and VMAF values with minimum encoding effort. This paper presents an efficient algorithm that solves this problem using a neural network. Experiments reveal that the algorithm needs to encode a video signal only around 40 times to generate a bit rate ladder of 9 representations.
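The selection step at the heart of bit rate ladder construction can be sketched with plain data. This is only an illustration of the general idea (the paper's neural network replaces exhaustive encoding, and the candidate measurements and function names below are hypothetical): for each target bit rate rung, pick the candidate encode with the highest predicted VMAF that fits the rung's bit budget.

```python
# Hypothetical illustration of bit rate ladder construction from candidate
# (bitrate, resolution, VMAF) measurements. Not the paper's algorithm.

def build_ladder(candidates, rungs):
    """candidates: list of (bitrate_kbps, height, vmaf) measurements.
    rungs: target bit rates; returns one (rung, height, vmaf) pick per rung."""
    ladder = []
    for rung in rungs:
        # keep candidates that fit within the rung's bit budget
        feasible = [c for c in candidates if c[0] <= rung]
        if not feasible:
            continue
        best = max(feasible, key=lambda c: c[2])  # highest VMAF wins
        ladder.append((rung, best[1], best[2]))
    return ladder

candidates = [
    (800, 540, 72.0), (800, 720, 68.0),    # at low rates, lower res scores higher
    (3000, 720, 88.0), (3000, 1080, 85.0),
    (6000, 1080, 95.0), (6000, 720, 90.0),
]
ladder = build_ladder(candidates, rungs=[1000, 3500, 7000])
```

The interesting property, visible in the toy data, is that the best resolution per rung is content- and rate-dependent, which is exactly why the search space is multidimensional.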
12226-39
Author(s): Roberto Herrera-Charles, Jose C. Nuñez-Perez, Jesus O. Sandoval-Solis, Ctr. de Investigación y Desarrollo de Tecnología Digital, Instituto Politécnico Nacional (Mexico)
This article considers a nonlinear dynamical system capable of generating spatial attractors. The main activity is the realization of a spatial chaotic attractor on Xilinx FPGA boards, with a focus on the implementation of a secure communication system. The first contribution is the successful synchronization of two chaotic attractor systems, implemented in VHDL, in a master-slave topology. The second is the FPGA realization of a secure communication system based on a spatial chaotic attractor: grayscale and RGB images are encrypted with a chaos-derived broadcast key in the transmitter, the encrypted image is sent through the state variables, and the image is reconstructed in the receiver. Results on the Xilinx PYNQ-Z1 match those obtained in Python. Statistical analyses of the encrypted and received images show that the implemented system is very effective: the entropy test reveals a high degree of randomness in the encrypted images, and the correlation coefficient between the original and encrypted images is close to zero. Finally, the transmission system completely recovers the original grayscale and RGB images without loss of information.
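The encrypt-transmit-recover loop described above can be illustrated in miniature. The sketch below substitutes a logistic map for the paper's spatial chaotic attractor and runs in software rather than VHDL, so it is only an analogy: a chaotic keystream is XORed with the pixel bytes, and XOR with the same keystream recovers the image exactly.

```python
import numpy as np

# Sketch of chaos-based image encryption. The logistic map stands in for
# the paper's FPGA-realized spatial attractor; parameters are illustrative.

def logistic_keystream(n, x0=0.3141, r=3.99):
    ks = np.empty(n, dtype=np.uint8)
    x = x0
    for i in range(n):
        x = r * x * (1.0 - x)          # chaotic iteration
        ks[i] = int(x * 256) % 256     # quantize each state to one byte
    return ks

def xor_cipher(img, key):
    return img ^ key.reshape(img.shape)  # XOR is its own inverse

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
key = logistic_keystream(img.size)
enc = xor_cipher(img, key)             # "transmitted" ciphertext
dec = xor_cipher(enc, key)             # receiver recovers the image exactly
```

Lossless recovery, as claimed in the abstract, follows from XOR being an involution; the entropy and correlation tests mentioned above would be run on `enc`.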
12226-40
Author(s): Varoun Hanooman, Anil Kokaram, Trinity College Dublin (Ireland); Yeping Su, Neil Birkbeck, Balu Adsumilli, Google (United States)
A major problem for streaming of user-generated content is the impact of degradations such as noise, exposure/lighting problems, camera shake, and missing frames. Preprocessors are commonly used to reduce the impact of noise on transcoding [2, 3, 4]. In a transcoder pipeline it is likely that the target bitrate and other parameters of the transcoder itself also interact with the input noise in a non-linear fashion. This relationship was explored by Segall et al. [5], who showed that the quantiser has an impact as well. As we showed in our previous work, the transcoder itself can act as a picture quality improvement process in some scenarios [4]. In this paper we apply a methodology for optimising the impact of a preprocessor in a transcoding pipeline, across various preprocessors (3D Wiener filter, wavelet) [6, 7] and transcoders. The idea is to measure the performance of the combined preprocessor/transcoder in a simulated transcoder pipeline.
12226-41
Author(s): Gonzalo Urcid, Instituto Nacional de Astrofísica, Óptica y Electrónica (Mexico); Rocio Morales, Univ. Popular Autónoma del Estado de Puebla A.C. (Mexico); José-Angel Nieves-Vázquez, Instituto Tecnológico Superior de San Andrés Tuxtla (Mexico)
Lattice-algebra-based associative memories can store any number k of associated vector pairs (x, y), where x is a real n-dimensional vector and y is a real m-dimensional vector. By adding redundant patterns to enlarge the original set of associations, we enhance the retrieval capability in the presence of random or structured noise for the canonical min-W and max-M lattice associative memories, as well as for dendritic lattice associative memories. Redundant patterns consist of masked or sampled noisy versions of each exemplar. The proposed redundancy technique is applied to grayscale and color images to measure the retrieval performance of lattice associative memories. Illustrative examples show the increase in recall capability.
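The canonical min-W memory mentioned above has a compact definition that a short sketch can illustrate (autoassociative case; the paper's redundant noisy copies and the max-M dual are omitted): the memory is W[i, j] = min over patterns k of (x_i^k - x_j^k), and recall is the max-plus product (W ⊞ x)_i = max_j (W[i, j] + x_j), which recalls every stored pattern perfectly.

```python
import numpy as np

# Minimal sketch of a canonical min-W lattice autoassociative memory.
# Stored (noise-free) patterns are always recalled exactly; robustness to
# noise is what the paper's redundancy technique addresses.

def build_min_memory(X):
    """X: (num_patterns, n) array of stored patterns."""
    # diff[k, i, j] = X[k, i] - X[k, j]; take the min over patterns k
    diff = X[:, :, None] - X[:, None, :]
    return diff.min(axis=0)

def recall(W, x):
    return (W + x[None, :]).max(axis=1)   # max-plus (lattice) product

X = np.array([[3.0, 7.0, 1.0],
              [5.0, 2.0, 8.0]])
W = build_min_memory(X)
out = recall(W, X[0])                     # recovers X[0] exactly
```

Note the diagonal of W is always zero, which is what forces perfect recall of uncorrupted exemplars.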
12226-42
Author(s): Olexander V. Dubolazov, Alexander Ushenko, Chernivtsi National Univ. Y. Fedkovich (Ukraine)
This paper presents the analytical development and experimental testing of a Stokes-polarimetry method using a reference laser wave. The results of layer-by-layer measurements of the coordinate distributions of the polarization ellipticity of laser radiation transformed by polycrystalline films of liquor are presented. Within the framework of statistical and cross-correlation approaches, the values and ranges of variation of the 1st- to 4th-order statistical and correlation moments, which characterize the distributions of the polarization ellipticity of laser radiation transformed by liquor films in different phase sections, have been determined. The forensic efficiency of the method of 3D polarization mapping of polycrystalline networks of liquor in determining the antiquity of the onset of death (AOD) is considered. The possibility of high-precision (12-20 min) determination of AOD within 48 hours has been demonstrated.
12226-43
Author(s): Jesus Antonio Alvarez-Cedillo, Teodoro Álvarez-Sánchez, Raul Junior Sandoval-Gomez, Instituto Politécnico Nacional (Mexico)
Affective computing studies how machines recognize, analyze, model, and represent human emotions. Detecting facial expressions remains an open problem that has been addressed in multiple works, and many solutions with different approaches have been proposed in the literature. To detect human emotion, each class must be correctly classified, which requires high-performance hardware and operation optimized with parallel algorithms. Under this parallelism approach, the algorithms must respond intelligently and in real time. In this article, a parallel algorithm for real-time expression detection was developed using Haar training, interpreting facial expressions to create an intelligent real-time human-computer interface from video and images captured by a camera. For a machine to recognize, model, and express human emotions, it needs a robust set of processes, because a machine lacks the context to interpret verbal language and body language alone. The perception of the physiological, visual, and vocal characteristics that denote emotions is natural and inherent in human beings, but not in machines. Facial movements are interpreted as action units, and to verify that the detected emotion is correct, we use Robert Plutchik's classification.
Session 8: New Imaging Modalities and Applications
12226-44
Author(s): Lukas Jütte, Institut für Transport- und Automatisierungstechnik, Leibniz Univ. Hannover (Germany); Alexander Poschke, Institut für Integrierte Produktion Hannover gGmbH (Germany); Ludger Overmeyer, Institut für Transport- und Automatisierungstechnik, Leibniz Univ. Hannover (Germany)
The temporally and spatially accurate display of information in augmented reality (AR) systems is essential for immersion and operational reliability when using the technology. We developed an assistant system using a head-mounted display (HMD) to hide visual restrictions on forklifts. We propose a method to evaluate the accuracy and latency of AR systems using HMD. For measuring accuracy, we compare the deviation between real and virtual markers. For latency measurement, we count the frame difference between real and virtual events. We present the influence of different system parameters and dynamics on latency and overlay accuracy.
12226-45
Author(s): Mary Guindy, Holografika Kft. (Hungary); Vamsi Kiran Adhikarla, Pázmány Péter Catholic Univ. (Hungary); Peter Andras Kara, Budapest Univ. of Technology and Economics (Hungary); Tibor Balogh, Holografika Kft. (Hungary); Aniko Simon, Sigma Technology (Hungary)
Due to the recent technological advancements of light field visualization and its increasing relevance in research, the need for light field image databases has risen significantly. Among the applications of such databases, high dynamic range light field image reconstruction has gained notable attention in the past years. In this paper, we discuss our work on creating the "CLASSROOM" light field image dataset, depicting a classroom scene. The content is rendered in horizontal-only parallax and full parallax as well. The scene contains a high variety of light distribution, particularly involving under-exposed and over-exposed regions, which are essential to HDR image applications.
12226-46
Author(s): Marta Lange, Univ. of Latvia (Latvia); Szabolcs Bozsányi, Norbert Kiss, Nora Noemi Varga, Semmelweis Univ. (Hungary); Emilija Vija Plorina, Ilze Lihacova, Alexey Lihachev, Univ. of Latvia (Latvia)
Every day, about 9,500 people in the USA are diagnosed with skin cancer. After surgical removal of the cancer, the patient's dynamic observation protocol includes regular check-ups of the lesion site to evaluate the healing of the scar and to make sure that no residual cancerous cells are left. Unfortunately, a great number of patients suffer from recurring cancer. In this study, patients were screened under the supervision of an experienced dermatologist with a non-invasive, portable multispectral LED device acquiring images at illumination wavelengths ranging from 405 to 950 nm. The images were analyzed to evaluate any suspicious pigmentation and to identify possible signs of cancer recurrence using autofluorescence features of the tissue.
We propose a novel method to improve the axial resolution of photoacoustic microscopy. The method integrates the U-Net semantic segmentation model with a k-Wave-based simulation platform for photoacoustic microscopy. First, the dataset (B-scans and their corresponding ground truth images) required for deep learning is generated with the simulation platform. The dataset is randomly divided into training and test sets at a ratio of 7:1. During training, the B-scans serve as input to the U-Net-based convolutional neural network, while the ground truth images are the desired output. A two-fold increase in axial resolution was measured by imaging carbon nanoparticles, and the three-dimensional spatial resolution is more uniform.
12226-49
Author(s): Debesh Choudhury, Infosensys Research and Engineering (India); Sujoy Chakraborty, Stockton Univ. (United States)
We propose a technique to protect and preserve a private key or a passcode in an encrypted two-dimensional graphical image. The plaintext private key or passcode is converted into an encrypted QR code and embedded into a real-life color image with a steganographic scheme. The private key or passcode is recovered from the stego color image by first extracting the encrypted QR code from the color image, followed by decryption of the QR code. Experimental results are presented that demonstrate the feasibility of the scheme.
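The abstract does not name its specific steganographic scheme, so the sketch below uses plain least-significant-bit (LSB) embedding, a common baseline, purely as an illustration of the embed/extract round trip: payload bits replace the LSB of the cover image's blue channel, changing each affected byte by at most 1.

```python
import numpy as np

# Illustrative LSB steganography round trip. This is a stand-in scheme,
# not the paper's method; channel choice and bit layout are assumptions.

def embed(cover, bits):
    stego = cover.copy()
    flat = stego[..., 2].reshape(-1).copy()          # blue channel bytes
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits  # overwrite LSBs
    stego[..., 2] = flat.reshape(cover.shape[:2])
    return stego

def extract(stego, n):
    return stego[..., 2].reshape(-1)[:n] & 1         # read LSBs back

rng = np.random.default_rng(1)
cover = rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)
payload = rng.integers(0, 2, size=100, dtype=np.uint8)  # e.g. QR code bits
stego = embed(cover, payload)
recovered = extract(stego, payload.size)
```

In the paper's pipeline the payload would be the encrypted QR code, and a decryption step would follow extraction.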
12226-50
Author(s): Touradj Ebrahimi, Changsheng Gao, Ecole Polytechnique Fédérale de Lausanne (Switzerland); Kambiz Homayounfar, RayShaper SA (Switzerland)
Non-fungible tokens (NFTs) are becoming very popular in a large number of applications ranging from copyright protection to monetization of both physical and digital assets. It is, however, a fact that NFTs suffer from a large number of security issues that create a lack of trust in solutions based on them. In this paper, we overview some of the most critical security challenges for media assets in the form of visual content and propose solutions to overcome them.
Poster Session
Conference attendees are invited to view a collection of posters within the topics of Nanoscience + Engineering, Organic Photonics + Electronics, and Optical Engineering + Applications. Enjoy light refreshments, ask questions, and network with colleagues in your field. Authors of poster papers will be present to answer questions concerning their papers. Attendees are required to wear their conference registration badges to the poster session.

Poster authors, visit Poster Presentation Guidelines for set-up instructions.
12226-51
Author(s): Alexey Ruchay, Federal Research Ctr. of Biological Systems and Agrotechnologies (Russian Federation); Ramin Chelabiev, Chelyabinsk State Univ. (Russian Federation)
The point cloud obtained from a depth map suffers from noise contamination and contains outliers. Noise and holes can greatly affect the accuracy of 3-D reconstruction; therefore, noise-reduction and hole-filling enhancement algorithms serve as a pre-processing step. In this paper, we propose a point cloud filtering algorithm to improve the quality of 3-D reconstruction. The performance of the proposed algorithm is compared with that of common point cloud filtering algorithms in terms of the accuracy of 3-D surface reconstruction and processing time.
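One of the "common point cloud filtering algorithms" such work is typically benchmarked against is statistical outlier removal, sketched below (the paper's own algorithm is not described here; parameters `k` and `std_ratio` are illustrative): points whose mean distance to their k nearest neighbours exceeds the global mean plus a multiple of the standard deviation are discarded.

```python
import numpy as np

# Sketch of statistical outlier removal, a common point cloud
# denoising baseline (not the paper's proposed algorithm).

def remove_outliers(points, k=4, std_ratio=2.0):
    # brute-force pairwise distances (real code would use a KD-tree)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    d.sort(axis=1)
    mean_knn = d[:, 1:k + 1].mean(axis=1)   # skip self-distance in column 0
    thresh = mean_knn.mean() + std_ratio * mean_knn.std()
    return points[mean_knn <= thresh]

rng = np.random.default_rng(2)
cloud = rng.normal(0.0, 0.01, size=(200, 3))       # dense cluster
cloud = np.vstack([cloud, [[5.0, 5.0, 5.0]]])      # one far outlier
clean = remove_outliers(cloud)                     # outlier is removed
```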
12226-52
Author(s): Alexey Ruchay, Federal Research Ctr. of Biological Systems and Agrotechnologies (Russian Federation); Ramin Chelabiev, Chelyabinsk State Univ. (Russian Federation)
Point clouds are today widely used in numerous application fields, including agriculture. Considerable effort is still needed to compress and standardize point clouds at a sufficient ratio and make them suitable for real-time, portable applications such as livestock monitoring. In recent years, various families of point cloud compression and standardization methods have been proposed. In this paper, we propose a new fast algorithm for point cloud compression and standardization, based on a voxel representation of the point cloud and a slice method. The accuracy and speed of the proposed algorithm on real data are compared with those of state-of-the-art algorithms.
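The voxel representation underlying such compression schemes can be sketched in a few lines (this is generic voxel downsampling, not the paper's algorithm, and the slice method is omitted): points are quantized to a voxel grid and each occupied voxel keeps one representative, its centroid, which reduces the cloud's size.

```python
import numpy as np

# Generic voxel-grid downsampling sketch; voxel size is illustrative.

def voxel_downsample(points, voxel=1.0):
    keys = np.floor(points / voxel).astype(int)        # voxel index per point
    _, inv = np.unique(keys, axis=0, return_inverse=True)
    out = np.zeros((inv.max() + 1, 3))
    cnt = np.zeros(inv.max() + 1)
    np.add.at(out, inv, points)                        # sum points per voxel
    np.add.at(cnt, inv, 1.0)
    return out / cnt[:, None]                          # centroid per voxel

pts = np.array([[0.1, 0.1, 0.1],
                [0.3, 0.1, 0.1],   # same voxel as the first point
                [1.5, 0.0, 0.0]])
down = voxel_downsample(pts, voxel=1.0)   # 3 points -> 2 representatives
```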
12226-54
Author(s): Carlos Alexander O. Quero, Daniel D. Durini, José de Jesús R. Magdaleno, Jose Martinez-Carranza, Rubén O. Ramos-García, Instituto Nacional de Astrofísica, Óptica y Electrónica (Mexico)
This work presents a method for generating 3D video at about 8 frames per second with 32 × 32 pixel resolution using single-pixel imaging (SPI) with active illumination in the near-infrared (NIR) part of the spectrum, combined with RADAR, for outdoor applications in GPS-denied, low-visibility, or hazardous weather scenarios. The proposed multi-spectral solution is based on SPI at a wavelength of 1550 nm and millimeter-wave RADAR in the 80 GHz band. Using the shape-from-shading (SFS) method and deep learning, we estimate surface shape from the light reflected by the scene or photographed object, which in combination with the RADAR information yields 3D images. To reach a near-real-time frame rate, we use a GPU architecture and parallelize the 3D reconstruction algorithms to generate the frame sequence. To evaluate our vision system, we defined a dynamic test scenario with low illumination, scattering, and irregular surfaces.
12226-55
Author(s): Sergei Voronin, Artyom Makovetskii, Chelyabinsk State Univ. (Russian Federation); Vitaly Kober, Ctr. de Investigación Científica y de Educación Superior de Ensenada B.C. (Mexico); Aleksei Voronin, Chelyabinsk State Univ. (Russian Federation)
Point cloud registration seeks the optimal rigid transformation that aligns two input point clouds to a common coordinate system. The disadvantage of classical ICP variants is their dependence on the initial placement of the point clouds; coarse registration algorithms are therefore used to find a suitable initial alignment of the two clouds. In this paper, we propose an algorithm that extracts the common parts of non-congruent point clouds and computes their coarse alignment. Computer simulation results are provided to illustrate the performance of the proposed method.
12226-56
Author(s): Artyom Makovetskii, Sergei Voronin, Chelyabinsk State Univ. (Russian Federation); Vitaly Kober, Ctr. de Investigación Científica y de Educación Superior de Ensenada B.C. (Mexico); Aleksei Voronin, Chelyabinsk State Univ. (Russian Federation)
The purpose of point cloud registration is to find a rigid geometric transformation that aligns two point clouds. The registration problem can be affected by noise and partiality. The Iterative Closest Point (ICP) algorithm is a common method for solving the registration problem, and the probability of obtaining an acceptable transformation is a criterion for comparing different types of ICP algorithms. In this paper, we propose an ICP-type registration algorithm that uses a new type of error metric functional, exploiting the thin geometrical characteristics of the point cloud. Computer simulation results are provided to illustrate the performance of the proposed method.
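The inner step of classical point-to-point ICP, against which such variants are compared, has a closed-form solution worth sketching: given correspondences, the least-squares rotation and translation come from the Kabsch (SVD) solution. This illustrates only the classical baseline, not the paper's new error functional, and full ICP would re-estimate correspondences and iterate.

```python
import numpy as np

# One step of classical point-to-point ICP with known correspondences:
# the Kabsch/SVD solution for the optimal rigid transform.

def best_rigid_transform(src, dst):
    """Least-squares R, t such that dst ~= R @ src + t."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)          # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t

rng = np.random.default_rng(3)
src = rng.normal(size=(50, 3))
theta = 0.4
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -1.0, 2.0])
dst = src @ R_true.T + t_true
R, t = best_rigid_transform(src, dst)   # recovers R_true, t_true exactly
```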
12226-57
Author(s): Sergei Voronin, Chelyabinsk State Univ. (Russian Federation); Vitaly Kober, Ctr. de Investigación Científica y de Educación Superior de Ensenada B.C. (Mexico); Artyom Makovetskii, Aleksei Voronin, Dmitrii Zhernov, Chelyabinsk State Univ. (Russian Federation)
Point clouds are an important type of geometric data structure, and many applications require high-level point cloud processing. Instead of defining geometric elements such as corners and edges, state-of-the-art algorithms use semantic matching. Modern deep neural networks are designed to process point clouds directly, without going through an intermediate regular representation. The Deep Closest Point (DCP) network is a neural network that implements the ICP algorithm and utilizes the point-to-point functional for error metric minimization. In this paper, we propose a modified variant of DCP based on other types of ICP error minimization functionals. Computer simulation results are provided to illustrate the performance of the proposed algorithm.
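The distinction between ICP error functionals, which is what the proposed DCP variant changes, can be shown on a single correspondence. Below are two standard choices as a sketch (the paper does not specify which alternative functionals it adopts): point-to-point distance penalizes the full residual, while point-to-plane penalizes only its component along the destination normal.

```python
import numpy as np

# Two standard ICP error functionals, for illustration only.

def point_to_point_error(src, dst):
    return float(np.sum(np.linalg.norm(src - dst, axis=1) ** 2))

def point_to_plane_error(src, dst, normals):
    # only the residual component along each destination normal counts
    return float(np.sum(np.einsum('ij,ij->i', src - dst, normals) ** 2))

src = np.array([[1.0, 0.5, 0.1]])
dst = np.array([[1.0, 0.0, 0.0]])
nrm = np.array([[0.0, 0.0, 1.0]])       # destination surface normal along z
e_pp = point_to_point_error(src, dst)   # 0.5^2 + 0.1^2 = 0.26
e_pl = point_to_plane_error(src, dst, nrm)  # only 0.1^2 = 0.01
```

The tangential offset of 0.5 is "free" under point-to-plane, which is why the choice of functional changes how registration converges.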
12226-58
Author(s): Jose Manuel Valencia-Moreno, Jose Alejandro Gonzalez Sarabia, Jose A. González-Fraga, Everardo Gutiérrez López, Univ. Autónoma de Baja California (Mexico)
Breast cancer is the most common cancer and one of the main causes of death in women. Early diagnosis of breast cancer is essential to ensure a high chance of survival for the affected women. Computer-aided detection (CAD) systems based on convolutional neural networks (CNN) can assist in the classification of abnormalities such as masses and calcifications. In this paper, several convolutional network models for the automatic classification of pathology in mammograms are analyzed, and different preprocessing and tuning techniques, such as data augmentation, hyperparameter tuning, and fine-tuning, are used to train the models. Finally, these models are validated on various publicly available benchmark datasets.
12226-59
Author(s): Alexey Ruchay, Konstantin Dorofeev, Federal Research Ctr. of Biological Systems and Agrotechnologies (Russian Federation)
Curve skeleton extraction is a common task in many computer graphics applications, for instance obtaining 3-D model measurements. It is usually performed by exact algorithms; however, this is a rather time-consuming process and does not always guarantee a good result in the case of missing data or cloud distortions. In this paper, we propose a fast algorithm for curve skeleton extraction from point clouds of livestock. Computer simulation results for the proposed algorithm, in terms of accuracy and speed of computation, are presented and discussed.
12226-60
Author(s): Konstantin Dorofeev, Alexey Ruchay, Federal Research Ctr. of Biological Systems and Agrotechnologies (Russian Federation)
Robust face recognition is an urgent task, and many researchers have offered various approaches to solve it. Systems that use only 2-D RGB images have known problems, and recent attempts to use a 3-D approach based on depth maps also suffer from limited recognition accuracy. In this paper, we propose a new approach combining 2-D and 3-D face recognition algorithms: the VGG-Face network analyzes 2-D RGB images, and the IRIS-Face RGBD network analyzes depth maps. The decision algorithm is based on a fusion of the results of the two networks, yielding a more robust face recognition system. Computer simulation results for the proposed approach, in terms of accuracy and speed of computation, are presented and discussed.
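The fusion step at the decision stage can be sketched generically. The abstract does not specify its fusion rule, so the weighted sum of per-identity similarity scores below, including the weight `w`, is purely an assumed illustration of how two branches' outputs might be combined.

```python
import numpy as np

# Hypothetical score-level fusion of a 2-D branch (e.g. VGG-Face) and a
# 3-D depth branch; the rule and weight are illustrative assumptions.

def fuse(scores_2d, scores_3d, w=0.6):
    return w * scores_2d + (1.0 - w) * scores_3d

def identify(scores_2d, scores_3d, w=0.6):
    return int(np.argmax(fuse(scores_2d, scores_3d, w)))

s2d = np.array([0.70, 0.60, 0.20])   # 2-D branch slightly favours identity 0
s3d = np.array([0.10, 0.90, 0.30])   # 3-D branch strongly favours identity 1
who = identify(s2d, s3d)             # fused decision picks identity 1
```

The example shows the point of fusion: a confident 3-D score can overturn a marginal 2-D decision, which is where the robustness gain comes from.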
12226-61
Author(s): Alexey Ruchay, Konstantin Dorofeev, Federal Research Ctr. of Biological Systems and Agrotechnologies (Russian Federation)
With the rapid modernization of animal husbandry, individual identification of animals has become an important task. We propose a novel 3-D cow identification system based on point clouds, using a deep convolutional neural network and a 3-D augmentation technique, with the pre-trained VGG-Face model as the backbone. In the training phase, we align 3-D point clouds with a reference cow model, augment the point clouds, and convert them to 2-D depth maps, which are resized to fit the input size of VGG-Face. In the testing phase, a probe scan is preprocessed and resized, and a cow representation is extracted from the fine-tuned CNN. After feature normalization and a Principal Component Analysis transform, the cow's identity is determined in the matching step. The performance of the proposed 3-D cow identification system is compared in terms of recognition accuracy and processing time.
12226-62
Author(s): Alexey Ruchay, Alexey Gladkov, Federal Research Ctr. of Biological Systems and Agrotechnologies (Russian Federation)
Convolutional neural networks (CNN) have proven effective for a wide range of 2-D tasks. Unfortunately, comparable results have not been achieved for three-dimensional tasks, mainly because no sufficiently large labeled 3-D database exists. Therefore, a deep neural network is often developed on 2-D images and then extended to 3-D data. To pass 3-D data to a 2-D trained CNN, the point cloud must be projected onto a 2-D image plane with an orthographic projection. In this paper, we propose a fast algorithm for orthographically projecting a point cloud onto a 2-D image to generate a depth map. The 3-D point clouds are scaled to produce depth maps of equal size, each pixel value is computed as a mean with bilinear interpolation, and classical median filtering is applied to the depth images to generate the final results. Computer simulation results for the proposed algorithm, in terms of accuracy and speed of computation, are presented and discussed.
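The core of the described pipeline, scaling a cloud into image coordinates and averaging the depths of points that land in the same pixel, can be sketched as below. This is a simplified stand-in for the paper's algorithm: nearest-pixel rounding replaces bilinear interpolation, and the median filtering step is omitted.

```python
import numpy as np

# Simplified orthographic projection of a point cloud to a depth map:
# per-pixel mean of z values (bilinear weighting and median filter omitted).

def depth_map(points, size=8):
    xy = points[:, :2]
    lo, hi = xy.min(axis=0), xy.max(axis=0)
    # scale x, y into [0, size-1] pixel coordinates and round to nearest
    pix = np.floor((xy - lo) / (hi - lo) * (size - 1) + 0.5).astype(int)
    acc = np.zeros((size, size))
    cnt = np.zeros((size, size))
    np.add.at(acc, (pix[:, 1], pix[:, 0]), points[:, 2])  # sum z per pixel
    np.add.at(cnt, (pix[:, 1], pix[:, 0]), 1.0)
    return np.where(cnt > 0, acc / np.maximum(cnt, 1.0), 0.0)

pts = np.array([[0.0, 0.0, 1.0],
                [1.0, 0.0, 3.0],
                [1.0, 0.0, 5.0],   # same pixel as previous point -> mean 4
                [0.0, 1.0, 2.0],
                [1.0, 1.0, 6.0]])
dm = depth_map(pts, size=2)
```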
Panel Discussion on Advanced Video Compression and Applications
Moderator:
Pankaj Topiwala, FastVDO Inc. (United States)

Panelists:
Benjamin Bross, Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut, HHI (Germany)
Fabien Racapé, InterDigital, Inc. (France)
Ioannis Katsavounidis, Meta (United States)
Mathias Wien, RWTH Aachen Univ. (Germany)
Simone Ferrera, V-Nova Ltd. (United Kingdom)
Michael Horowitz, Google (United States)
Gary J. Sullivan, Microsoft Corp. (United States)

The massive $200B video services industry, comprising broadcast, streaming, and other services, currently has the widest breadth of video codecs in its history: MPEG-2, MPEG-4, AVC, HEVC, VVC, VP8, VP9, AV1, EVC, and LC-EVC. While these codecs compete in the marketplace for share of streams, the consumer surely benefits from having advanced services at lower rates. Is 4K HDR HEVC going to become the new norm for broadcast and streaming? At the same time, this is a challenging environment for developers and service providers. In this panel, we explore the breadth of consumer services enabled by these technologies, including high resolutions (4K, 8K, and beyond) as well as HDR and AR/VR: will these finally take off and fulfill their promise? And is 8K the end of the line for consumer devices such as TVs, and even computers, tablets, and smartphones?
Conference Chair
AGT Associates (United States)
Conference Chair
Ecole Polytechnique Fédérale de Lausanne (Switzerland)
Program Committee
Qualcomm Inc. (United States)
Program Committee
intoPIX s.a. (Belgium)
Program Committee
Comcast Corp. (United States)
Program Committee
Ben-Gurion Univ. of the Negev (Israel)
Program Committee
Facebook Inc. (United States)
Program Committee
The Univ. of Southern California (United States)
Program Committee
Tencent America, LLC (United States)
Program Committee
Andre J. Oosterlinck
KU Leuven Association (Belgium)
Program Committee
Instituto de Telecomunicações (Portugal)
Program Committee
Brightcove, Inc. (United States)
Program Committee
Thomas Richter
Fraunhofer-Institut für Integrierte Schaltungen IIS (Germany)
Program Committee
California Polytechnic State Univ., San Luis Obispo (United States)
Program Committee
Vrije Univ. Brussel (Belgium)
Program Committee
Microsoft Corp. (United States)
Program Committee
The Univ. of New South Wales (Australia)
Program Committee
FastVDO Inc. (United States)