21 - 25 April 2024
National Harbor, Maryland, US
2024 Keynote Speaker:

Anthony Hoogs, Kitware Inc., New York (United States)

The goal of this conference is to establish a strong presence for those seeking to publish or consume Synthetic Data research in the SPIE DCS community. The conference will expedite the dissemination of information and advances in synthetic data for Artificial Intelligence and Machine Learning, and will offer the larger research community increased collaboration opportunities. It also facilitates the development of tools and processes for generating Synthetic Data and the dissemination of meaningful, evidence-based guidance for its use.

Panel session
This conference will host a session on the unique utility of synthetic data: what can it do for you? A panel of academic and government experts will discuss the unique advantages conferred by synthetic data regarding its use for artificial intelligence and machine learning, including training, testing, sensitivity analysis, and improved resilience. The panel will touch on current success stories, applications that stand to benefit the most from synthetic data capabilities in the near future, current and looming challenges in the field, and more.

Joint sessions
This conference will hold a joint session:
  1. Synthetic Data for Unmanned Systems Technology Applications, held jointly with DCS305: Unmanned Systems Technology XXVI

Awards
We are pleased to announce two awards for this conference:
  1. Best Oral Presentation Award: presentations will be evaluated in terms of scientific content, audience accessibility, and quality of visual aids.
  2. Best Poster Award: posters will be evaluated in terms of scientific content, audience accessibility, and quality of visual aids.
Award winners will receive an award certificate.
All accepted presentations in this conference are automatically eligible.
Conference 13035

Synthetic Data for Artificial Intelligence and Machine Learning: Tools, Techniques, and Applications II

22 - 25 April 2024 | Potomac 6
  • Opening Remarks
  • 1: Virtual Asset Creation
  • 2: Synthetic Data Generation Tools I
  • 3: Synthetic Data Generation Tools II
  • 4: Pose and Gesture Recognition
  • Symposium Plenary
  • Symposium Panel on Microelectronics Commercial Crossover
  • 5: Panel Discussion: Generative Models
  • 6: Data Management
  • 7: Unmanned Systems
  • 8: Generative Models
  • Poster Session
  • Symposium Plenary on AI/ML + Sustainability
  • 9: Multi-Domain Operations
  • 10: Integrated Machine Learning and Synthesis Pipelines
  • 11: Fidelity and Sensitivity Analysis I
  • 12: Fidelity and Sensitivity Analysis II
  • 13: Fidelity and Sensitivity Analysis III
Opening Remarks
22 April 2024 • 8:20 AM - 8:30 AM EDT | Potomac 6
Session Chair: Kimberly E. Manser, DEVCOM C5ISR (United States)
Opening remarks for Synthetic Data for Artificial Intelligence and Machine Learning: Tools, Techniques, and Applications II.
Session 1: Virtual Asset Creation
22 April 2024 • 8:30 AM - 10:30 AM EDT | Potomac 6
Session Chair: Vincent J. Velten, Air Force Research Lab. (United States)
13035-1
Author(s): Anthony J. Hoogs, Kitware, Inc. (United States)
22 April 2024 • 8:30 AM - 9:10 AM EDT | Potomac 6
The use of generated data for training visual AI models has been increasing rapidly in recent years as the quality of AI-generated imagery has drastically improved. Since the beginning of the deep learning revolution about 10 years ago, deep learning methods have also relied on data augmentation to expand the effective size and diversity of training datasets without collecting or generating additional images. In both cases, however, the realism of the generated data is difficult to assess. Many studies have shown that generated and augmented data improve accuracy on real data, but when real test data has a significant domain shift from the training data, it can be difficult to predict whether data generation and augmentation will help to improve robustness. This talk will cover recent advances in realistic data augmentation, for applications where a target domain is known but has little test or training data. Under the CDAO JATIC program, Kitware is developing the Natural Robustness Toolkit (NRTK), an open-source toolkit for generating realistic image augmentations and perturbations that correspond to specified sensor and scene parameters. NRTK enables the significant expansion of test and training datasets to both previously-unseen scene conditions and realistic emulation of new imaging sensors that have different optical properties. Our results demonstrate that NRTK dataset augmentation is more effective than typical methods based on random pixel-level perturbations and AI-generated images.
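For orientation, a minimal sketch of the kind of sensor-motivated augmentation the talk describes; this is a generic OpenCV/NumPy illustration under stated assumptions, not the NRTK API, and the blur/noise parameters and placeholder frame are invented.
```python
import cv2
import numpy as np

def perturb_for_sensor(image: np.ndarray, psf_sigma_px: float, noise_std: float) -> np.ndarray:
    """Emulate a coarser imaging sensor: Gaussian PSF blur plus additive shot-like noise."""
    blurred = cv2.GaussianBlur(image, (0, 0), sigmaX=psf_sigma_px)  # kernel size derived from sigma
    noisy = blurred.astype(np.float32) + np.random.normal(0.0, noise_std, image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

# Placeholder frame standing in for a labeled training image; its labels carry over unchanged,
# so each perturbation adds a "new sensor" variant to the training or test set.
frame = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
variants = [perturb_for_sensor(frame, sigma, noise) for sigma, noise in [(1.5, 4.0), (3.0, 8.0)]]
```
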
13035-2
Author(s): Brett S. Sicard, Quade Butler, Yuandi Wu, Sepehr M. Abdolahi, McMaster Univ. (Canada); Youssef Ziada, Ford Motor Co. (United States); Stephen A. Gadsden, McMaster Univ. (Canada)
On demand | Presented live 22 April 2024
Machine tools (MT) are critical to modern manufacturing. To best achieve reliability and high performance, it is necessary to implement condition monitoring, fault detection, and predictive maintenance. One solution for implementing these is to utilize data-driven methods such as neural networks. One issue with any data-driven method is that it requires large quantities of labeled data, which are difficult to obtain for fault detection applications as faults tend to be rare. One emerging technology that can be implemented to solve this issue is the digital twin (DT). DTs provide a solution for data collection, modeling, simulation, and smart services. One way that DTs can be used is to generate synthetic data for various data-driven methods. Synthetic data generated from the DT model can be used to create a dataset for various condition monitoring DT services. This study used simulation software to generate synthetic data, which was then used to implement a fault detection algorithm for preload loss monitoring. This method shows promise for improving reliability and performance in MTs.
13035-3
Author(s): Kevin McKenzie, Eddie Jacobs, Alf Ramirez, Joseph Conroy, Thomas P. Watson, The Univ. of Memphis (United States)
22 April 2024 • 9:30 AM - 9:50 AM EDT | Potomac 6
This research presents an in-depth investigation into the application of Convolutional Neural Networks (CNN) for acoustic remote sensing on multi-rotor UAVs, with a specific focus on detecting large vehicles on the ground. We used a multi-rotor UAV equipped with a custom audio recorder, calibrated microphones, and uniquely designed microphone mounts for data collection. We explored optimal features for training our CNN, experimented with different normalization techniques, and examined their synergy with various activation functions. The study further explores the fine-tuning of model parameters to enhance detection performance and reliability. The outcome was a CNN model, trained with a combination of both real-world and synthetic data, demonstrating a proficient capability in target detection.
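A minimal illustration of one common acoustic feature pipeline of the sort compared in such studies (log-mel spectrogram plus normalization); this is an assumed sketch using librosa, not the authors' code, and the synthetic tone stands in for UAV audio.
```python
import numpy as np
import librosa

def logmel_features(y: np.ndarray, sr: int, n_mels: int = 64) -> np.ndarray:
    """Log-mel spectrogram with zero-mean / unit-variance normalization, a typical CNN input."""
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    logmel = librosa.power_to_db(mel)
    return (logmel - logmel.mean()) / (logmel.std() + 1e-8)

sr = 22050
t = np.linspace(0, 2.0, 2 * sr, endpoint=False)
y = 0.1 * np.sin(2 * np.pi * 120 * t).astype(np.float32)  # placeholder "engine hum" in place of field audio
features = logmel_features(y, sr)
print(features.shape)  # (n_mels, time_frames) -> image-like input for a CNN
```
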
13035-4
Author(s): Nicolas Hueber, Alexander Pichler, Institut Franco-Allemand de Recherches de Saint-Louis (France)
22 April 2024 • 9:50 AM - 10:10 AM EDT | Potomac 6
Vision-based object detection remains an active research area in both civilian and military domains. While the state of the art relies on deep learning techniques, these demand large multi-context datasets. Given the rarity of open-access datasets for military applications, alternative methods for data collection and training dataset creation are essential. This paper presents a novel vehicle signature acquisition method based on indoor 3D scanning of miniature military vehicles. By using 3D projections of the scanned vehicles as well as off-the-shelf computer-aided design models, relevant image signatures are generated showing the vehicle from different perspectives. The resulting context-independent signatures are enhanced with data augmentation techniques and used for object detection model training. The trained models are evaluated by means of aerial test sequences showing real vehicles and situations. Results are compared to state-of-the-art methodologies. Our method is shown to be a suitable indoor solution for training a vehicle detector for real situations.
13035-5
Author(s): Rachel Kinard, Igor Ternovskiy, Air Force Research Lab. (United States); Robert Schueler, Riverside Research (United States); Brandon Kinard, Air Force Research Lab. (United States); James Graham, Matthew Rustad, Riverside Research (United States)
On demand | Presented live 22 April 2024
In Machine Learning (ML) based autonomous technology research (ATR), it is crucial to have large and reliable data sets to train deep learning-based classifiers and implement object detection methods. For air-to-ground ATR, the gold standard, obtained by limited and expensive controlled field collections, is measured data. However, carefully curated research data intended to test or isolate specific qualities of object detection (low-light, heavy shadow, cloud cover, obscurations, and other operational use cases) is still difficult to obtain. For advanced research problems, synthetic data generated in simulated environments meets both quantity and quality requirements. Most synthetic data is generated in a software simulated environment using various rendering techniques, limited by available computational resources. Among the many types of synthetic data is scale model data, generated by 3D printing and imaging the same 3D Computer-Aided Design (CAD) models at a reduced scale (1:285 or 1:125) on a turntable in controlled environmental conditions. We present a workflow for the rapid generation of ATR Training Data customized to isolate and identify data features of interest.
Break
Coffee Break 10:30 AM - 11:00 AM
Session 2: Synthetic Data Generation Tools I
22 April 2024 • 11:00 AM - 12:00 PM EDT | Potomac 6
Session Chair: Kimberly E. Manser, DEVCOM C5ISR (United States)
13035-6
Author(s): Keith F. Prussing, Christopher E. Cordell, Daniel Levy, Georgia Tech Research Institute (United States)
On demand | Presented live 22 April 2024
Development of novel search and track algorithms needs to account for measurements that can arise from both radio frequency and electro-optical infrared sensors. Historically, the information needed was provided to the tracking algorithm from measured data or from synthetically generated data. In the case of synthetic data, these models were frequently developed independently of one another and did not share a common sense of "truth" about the environment or the objects in the simulation. To address this problem, the Georgia Tech Research Institute has developed the General High-fidelity Omni-Spectral Toolbox, a plug-and-play architecture that allows for algorithm development within a fully integrated electro-optical infrared and radio frequency environment.
13035-8
Author(s): Jeffrey Kerley, Derek T. Anderson, Andrew Buck, Brendan Alvey, Univ. of Missouri (United States)
On demand | Presented live 22 April 2024
We explore the use of Large Language Models (LLMs) as an intermediate between the nuanced, syntactical programming language and the natural (human) way of describing the world. Our formal language LSCENE is a way to procedurally generate realistic synthetic scenes in the Unreal Engine. This tool is useful because artificial intelligence (AI) typically requires large volumes of labeled data with variety. To generate such data for training and evaluating AI, we employ an LLM to interpret and sample LSCENEs that are compatible with user input. Through this approach, we demonstrate a reduction in abstract complexity, elimination of syntax complexity, and the ability to tackle complex tasks in LSCENE using natural language. To illustrate our findings, we present three experiments with quantitative results focused on spatial reasoning, along with a more intricate qualitative example of automatically generating an environment for a specific biome.
13035-9
Author(s): Friso G. Heslinga, Thijs A. Eker, Ella P. Fokkinga, Jan Erik van Woerden, Frank A. Ruis, Richard J. M. den Hollander, Klamer Schutte, TNO (Netherlands)
On demand | Presented live 22 April 2024
Deep learning-based object detection offers potential for various military applications, but large amounts of realistic training data are typically lacking. In this study, we address this challenge by using simulated data as well as foundation models for military vehicle detection. We compare finetuning of a Mask R-CNN with few real samples with finetuning of a foundation model. In addition, we evaluate the added value of the real data with respect to simulation variation, highlighting the importance of both development strategies.
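For context, a hedged sketch of few-sample Mask R-CNN finetuning with the torchvision detection API; the class count, the single placeholder sample, and the hyperparameters are invented for illustration, and the paper's foundation-model comparison is not reproduced here.
```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

NUM_CLASSES = 2  # background + "military vehicle" (illustrative)
model = maskrcnn_resnet50_fpn(weights="DEFAULT")  # COCO-pretrained backbone and heads

# Swap the box and mask heads for the new class count, then finetune on the few real samples.
in_feat = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_feat, NUM_CLASSES)
in_ch = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_ch, 256, NUM_CLASSES)

# One placeholder "real sample" to show the training step; a real run would loop over the
# handful of annotated frames (optionally mixed with simulated ones).
images = [torch.rand(3, 512, 512)]
masks = torch.zeros(1, 512, 512, dtype=torch.uint8)
masks[0, 60:220, 50:200] = 1
targets = [{"boxes": torch.tensor([[50.0, 60.0, 200.0, 220.0]]),
            "labels": torch.tensor([1]),
            "masks": masks}]

optimizer = torch.optim.SGD(model.parameters(), lr=5e-3, momentum=0.9)
model.train()
loss = sum(model(images, targets).values())  # dict of detection/mask losses
optimizer.zero_grad(); loss.backward(); optimizer.step()
```
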
Break
Lunch Break 12:00 PM - 1:40 PM
Session 3: Synthetic Data Generation Tools II
22 April 2024 • 1:40 PM - 3:00 PM EDT | Potomac 6
Session Chair: Kimberly E. Manser, DEVCOM C5ISR (United States)
13035-11
Author(s): Michael S. Lee, Gail Vaucher, Michael S. D'Arcy, Robert Jane, Morris Berman, DEVCOM Army Research Lab. (United States)
On demand | Presented live 22 April 2024
Optical whole sky imaging (WSI) is a valuable tool for atmospheric intelligence across a diverse array of applications including solar radiation prediction and microenvironment characterization. In this work, we introduce standalone algorithms and software to render clouds of different sizes, shapes, and base heights with the goal of developing datasets suitable for machine learning applications such as cloud position and base height estimation, and resultant ground shadow prediction. Three-dimensional voxel cloud textures are generated with thresholded fractal noise and rendered with two-step ray tracing. We compare real and synthetic imagery for fisheye camera views and predict what two-camera stereo pairs might look like when the WSI cameras become operational at the Multipurpose Sensor Array (MSA) in White Sands, NM.
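An illustrative sketch, under stated assumptions, of the thresholded fractal-noise idea described above for building voxel clouds; the octave weighting and threshold are invented, and the two-step ray-traced rendering is omitted. This is not the authors' software.
```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def fractal_noise_3d(shape=(64, 64, 32), octaves=4, seed=0) -> np.ndarray:
    """Sum upsampled, smoothed random fields at several scales ("fractal" noise)."""
    rng = np.random.default_rng(seed)
    field = np.zeros(shape)
    for o in range(octaves):
        coarse = rng.random([max(2, s // 2 ** (octaves - o)) for s in shape])
        up = zoom(coarse, [s / c for s, c in zip(shape, coarse.shape)], order=1)
        field += gaussian_filter(up, sigma=1.0) / 2 ** o  # finer octaves contribute less
    return field

noise = fractal_noise_3d()
cloud_voxels = noise > np.quantile(noise, 0.8)  # threshold controls overall cloud cover
print(cloud_voxels.shape, cloud_voxels.mean())  # voxel grid ready for a renderer
```
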
13035-12
Author(s): Mark D. Klein, Zachary J. Edel, Corey D. Packard, Jacob N. Hendrickson, Audrey C. Levanen, Peter L. Rynes, ThermoAnalytics, Inc. (United States)
On demand | Presented live 22 April 2024
Deep learning for processing image-based scenes and detecting/recognizing embedded targets has been demonstrated, but performant algorithms must be robustly trained. This typically requires a large, varied set of training data on which to base statistical predictions. Acquiring such a diverse image set from measured sources can be a challenge in thermal infrared wavebands, but variation in clothing ensembles, pose, season, times of day, sensor platform perspectives, scene backgrounds and weather conditions can be included in synthetic imagery. Suitable performance requires a careful methodology to be followed if robust training is to be accomplished. In this work, MuSES and CoTherm are used to generate synthetic EO/IR remote sensing imagery of various human dismounts with a range of clothing, poses and environmental factors. The performance of a YOLO deep learning algorithm is studied, and sensitivity conclusions are discussed.
13035-60
Author(s): Huong Ninh, Doan Thinh Vo, Hai Tran Tien, Viettel Aerospace Institute (Vietnam)
22 April 2024 • 2:40 PM - 3:00 PM EDT | Potomac 6
In this paper, we propose a method to produce synthetic thermal infrared (TIR) images using a generation-based image-to-image (I2I) translation model. The model translates the abundantly available RGB images to synthetic TIR data closer to the domain of authentic TIR images. For this purpose, we explore the usage of an unpaired image translation neural model based on Schrödinger Bridge algorithms. Additionally, the visual characteristic of the object in the image is an important consideration in generating results. Thus, we take advantage of a segmentation module ahead of the I2I translation model to discriminate between background and object regions. In practice, we train the model with a self-proposed dataset comprising unpaired realistic RGB-TIR images. We evaluate the model's performance in synthesizing thermal images by comparing them to our original thermal dataset, achieving a Fréchet Inception Distance (FID) score of approximately 80, indicative of high-quality image generation. Notably, our model's synthesized images boost classification accuracy by 15% compared to using only the realistic TIR images.
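A brief, assumed illustration of the FID evaluation mentioned above, using the torchmetrics implementation; random placeholder batches stand in for the real TIR images and the RGB-to-TIR translations.
```python
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

fid = FrechetInceptionDistance(feature=2048)
# Placeholder batches (uint8, shape (N, 3, H, W), as the torchmetrics default expects).
real = torch.randint(0, 256, (16, 3, 299, 299), dtype=torch.uint8)   # stand-in for real TIR frames
fake = torch.randint(0, 256, (16, 3, 299, 299), dtype=torch.uint8)   # stand-in for synthesized TIR frames
fid.update(real, real=True)
fid.update(fake, real=False)
print("FID:", fid.compute().item())  # the abstract reports roughly 80 on its own dataset
```
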
Break
Coffee Break 3:00 PM - 3:30 PM
Session 4: Pose and Gesture Recognition
22 April 2024 • 3:30 PM - 4:50 PM EDT | Potomac 6
Session Chair: Christopher L. Howell, DEVCOM C5ISR (United States)
13035-14
Author(s): Shuhong Lu, Zhangyu Jin, USC Institute for Creative Technologies (United States); Vickram Rajendran, Michal Harari, Applied Intuition, Inc. (United States); Andrew Feng, USC Institute for Creative Technologies (United States); Celso M. De Melo, DEVCOM Army Research Lab. (United States)
On demand | Presented live 22 April 2024
We propose to enhance action recognition accuracy by leveraging synthetic data and domain adaptation. Specifically, we achieve this through the creation of a synthetic dataset mimicking the Multi-View Extended Video with Activities (MEVA) dataset and the introduction of a multi-modal model for domain adaptation. This synthetic-to-real adaptation approach leverages the synthetic data to enhance model generalization. We created the synthetic datasets through a high-fidelity, physically-based rendering system and sensor simulation to effectively address the challenges of real data scarcity in action recognition. Complementing the synthetic dataset generation, we leverage multi-modal models in synthetic-to-real adaptation experiments that utilize RGB images and skeleton features. The experimental results highlight the effectiveness of the approach and its practical applications across various domains, including surveillance systems, threat identification, and disaster response.
13035-15
Author(s): Xiaoyu Zhu, Wenhe Liu, Carnegie Mellon Univ. (United States); Celso M. De Melo, DEVCOM Army Research Lab. (United States); Alexander Hauptmann, Carnegie Mellon Univ. (United States)
On demand | Presented live 22 April 2024
Effectively recognizing human actions from variant viewpoints is crucial for successful collaboration between humans and robots. Deep learning approaches have achieved promising performance in action recognition given sufficient well-annotated data from the real world. However, collecting and annotating real-world videos can be challenging, particularly for rare or violent actions. Synthetic data, on the other hand, can be easily obtained from simulators with fine-grained annotations and variant modalities. To learn domain-invariant feature representations, we propose a novel method to distill the pseudo labels from the strong mesh-based action recognition model into a light-weighted I3D model. In this way, the model can leverage robust 3D representations and maintain real-time inference speed. We empirically evaluate our model on the Mixamo->Kinetics dataset. The proposed model achieves state-of-the-art performance compared to the existing video domain adaptation methods.
13035-16
Author(s): Arun Reddy, Johns Hopkins Univ. Applied Physics Lab. (United States); Ketul Shah, Johns Hopkins Univ. (United States); Corban Rivera, William Paul, Johns Hopkins Univ. Applied Physics Lab., LLC (United States); Celso M. De Melo, DEVCOM Army Research Lab. (United States); Rama Chellappa, Johns Hopkins Univ. (United States)
On demand | Presented live 22 April 2024
In this work, we explore the possibility of using synthetically generated data for video-based gesture recognition with large pre-trained models. We consider whether these models have sufficiently robust and expressive representation spaces to enable "training-free" classification. Specifically, we utilize various modern video encoders to extract features for use in k-nearest neighbors classification, where the training data points are derived from synthetic videos only. We compare these results with another training-free approach – zero-shot classification using text descriptions of each gesture.
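A minimal sketch of the "training-free" k-nearest-neighbors idea described above, under stated assumptions: the random arrays are placeholders for frozen video-encoder features (one vector per clip), with synthetic clips as the reference set and real clips as queries.
```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X_syn = rng.normal(size=(200, 512))    # placeholder embeddings of synthetic gesture clips
y_syn = rng.integers(0, 7, size=200)   # placeholder gesture class labels
X_real = rng.normal(size=(50, 512))    # placeholder embeddings of real test clips

knn = KNeighborsClassifier(n_neighbors=5, metric="cosine")
knn.fit(X_syn, y_syn)                  # no gradient-based training involved
pred = knn.predict(X_real)             # classes assigned by nearest synthetic neighbors
```
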
13035-17
Author(s): Christopher Liberatore, Air Force Research Lab. (United States); Corey Marrs, Univ. of Missouri-Kansas City (United States), Wright State Univ. (United States); John Bielas, Applied Research Solutions, Inc. (United States); Amanda Baxter, Richard Borth, National Air and Space Intelligence Ctr. (United States); Ian Matejka, Yuki Adams, Applied Research Solutions, Inc. (United States); Patrick Benasutti, Applied Research Solutions (United States); Rachel Kinard, Air Force Research Lab. (United States)
On demand | Presented live 22 April 2024
We present an application of synthetic datasets to a pose estimation problem called "Microwave Dish Mensuration", i.e., the task of determining a dish's pointing angle from photogrammetry. Pose estimation presents a difficult case for machine learning, as it is onerous to collect a measured dataset capturing all possible configurations of an object; however, the ease of generating synthetic data may make the pose estimation problem tractable. Additionally, dish mensuration has a well-known geometric invariance, which will help the synthetic training regime generalize to measured data and present a path forward to generalized models trained on synthetic datasets. For this research, we generated a dataset of 86,400 images of 5 different microwave dish models taken at 6 different times of day, producing both rendered image chips and component masks to facilitate pose estimation. We discuss the methods for generating the synthetic dataset, difficulties associated with generating sufficient variance, and a method for performing dish mensuration with a deep learning regression model. We conclude by addressing next steps and ways to generalize further to other pose estimation problems.
Symposium Plenary
22 April 2024 • 5:00 PM - 6:30 PM EDT | Potomac A
Session Chairs: Tien Pham, The MITRE Corp. (United States), Douglas R. Droege, L3Harris Technologies, Inc. (United States)

View Full Details: spie.org/dcs/symposium-plenary

Chair welcome and introduction
22 April 2024 • 5:00 PM - 5:05 PM EDT

DoD's microelectronics for the defense and commercial sensing ecosystem (Plenary Presentation)
Presenter(s): Dev Shenoy, Principal Director for Microelectronics, Office of the Under Secretary of Defense for Research and Engineering (United States)
22 April 2024 • 5:05 PM - 5:45 PM EDT

NATO DIANA: a case study for reimagining defence innovation (Plenary Presentation)
Presenter(s): Deeph Chana, Managing Director, NATO Defence Innovation Accelerator for the North Atlantic (DIANA) (United Kingdom)
22 April 2024 • 5:50 PM - 6:30 PM EDT

Symposium Panel on Microelectronics Commercial Crossover
23 April 2024 • 8:30 AM - 10:00 AM EDT | Potomac A

View Full Details: spie.org/dcs/symposium-panel

The CHIPS Act Microelectronics Commons network is accelerating the pace of microelectronics technology development in the U.S. This panel discussion will explore opportunities for crossover from commercial technology into DoD systems and applications, discussing what emerging commercial microelectronics technologies could be most impactful on photonics and sensors and how the DoD might best leverage commercial innovations in microelectronics.

Moderator:
John Pellegrino, Electro-Optical Systems Lab., Georgia Tech Research Institute (retired) (United States)

Panelists:
Shamik Das, The MITRE Corporation (United States)
Erin Gawron-Hyla, OUSD (R&E) (United States)
Carl McCants, Defense Advanced Research Projects Agency (United States)
Kyle Squires, Ira A. Fulton Schools of Engineering, Arizona State Univ. (United States)
Anil Rao, Intel Corporation (United States)

Break
Coffee Break 10:00 AM - 10:30 AM
Session 5: Panel Discussion: Generative Models
23 April 2024 • 10:30 AM - 11:30 AM EDT | Potomac 6
Generative AI promises nearly endless possibilities for myriad use cases, with potential applications cutting across nearly every business and research sector. From advertising and art to algorithm training, generative AI is making waves and becoming more and more popular. In this panel discussion, we invite panel members to explore the current state of generative AI, its shortcomings and strengths, and to look to the future by positing how we might make improvements in the field and how the technology may be put to further beneficial use. We will be accepting some audience questions, so please come curious!

Moderator:
Kimberly E. Manser, DEVCOM C5ISR (United States)

Panelists:
Raghuveer Rao, Army Research Laboratory
Colin Reinhardt, Naval Intelligence Warfare Center
Corban Rivera, Johns Hopkins University, Applied Physics Laboratory
Sek Chai, Latent AI
Alex Hauptmann, Carnegie Mellon University
Session 6: Data Management
23 April 2024 • 11:30 AM - 12:00 PM EDT | Potomac 6
Session Chair: Christopher L. Howell, DEVCOM C5ISR (United States)
13035-18
CANCELED: A data management approach to enable efficient AI/ML training (Invited Paper)
Author(s): Michael F. Finch, Mark Jeiran, DEVCOM C5ISR (United States)
23 April 2024 • 11:30 AM - 12:00 PM EDT | Potomac 6
This paper describes ongoing work being conducted at the U.S. Army Combat Capabilities Development Command (DEVCOM) C5ISR Center related to data curation and data management. It describes the repurposing of the DEVCOM C5ISR Center's Common Data Format (CDF), which was developed for sharing datasets for AI algorithm testing and development, into a data model or schema that can be used for data warehousing with a relational database, thus reaping the benefits that databases provide, including fast data querying with SQL (Structured Query Language), ACID (Atomicity, Consistency, Isolation, and Durability) transactions, and CRUD (Create, Read, Update, and Delete) operations. In addition, efforts to establish and leverage existing tools for data processing and Extract, Transform, Load (ETL) pipelines using frameworks like Apache Spark will be described.
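As an illustration of the data-warehousing idea only (not the actual CDF schema; the table and column names below are invented), a small relational sketch using Python and SQLite:
```python
import sqlite3

conn = sqlite3.connect("cdf_warehouse.db")  # hypothetical warehouse file
conn.executescript("""
CREATE TABLE IF NOT EXISTS datasets (
    dataset_id   INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    sensor       TEXT,
    collected_on DATE
);
CREATE TABLE IF NOT EXISTS frames (
    frame_id   INTEGER PRIMARY KEY,
    dataset_id INTEGER REFERENCES datasets(dataset_id),
    file_path  TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS annotations (
    annotation_id INTEGER PRIMARY KEY,
    frame_id      INTEGER REFERENCES frames(frame_id),
    class_label   TEXT NOT NULL,
    x_min REAL, y_min REAL, x_max REAL, y_max REAL
);
""")
# Fast SQL querying then replaces scanning flat files, e.g. pulling every frame with a vehicle label:
rows = conn.execute(
    "SELECT f.file_path FROM frames f JOIN annotations a ON a.frame_id = f.frame_id "
    "WHERE a.class_label = ?", ("vehicle",)
).fetchall()
```
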
Break
Lunch/Exhibition Break 12:00 PM - 2:10 PM
Session 7: Unmanned Systems
23 April 2024 • 2:10 PM - 3:10 PM EDT | Potomac 6
Session Chair: Vincent J. Velten, Air Force Research Lab. (United States)
13035-19
Author(s): Ruiqi Xian, Univ. of Maryland, College Park (United States); Bryan I. Vogel, Booz Allen Hamilton Inc. (United States); Celso M. De Melo, Andre V. Harrison, DEVCOM Army Research Lab. (United States); Dinesh Manocha, Univ. of Maryland, College Park (United States)
On demand | Presented live 23 April 2024
In this paper, we propose a novel approach for real-time human action recognition (HAR) on resource-constrained UAVs. Our approach tackles the limited availability of labeled UAV video data (compared to ground-based datasets) by incorporating synthetic data augmentation to improve the performance of a lightweight action recognition model. This combined strategy offers a robust and efficient solution for UAV-based HAR. We evaluate our method on the RoCoG v2 and UAV-Human datasets, showing a notable increase in top-1 accuracy across all scenarios on RoCoG: 9.1% improvement when training with synthetic data only, 6.9% with real data only, and the highest improvement of 11.8% with a combined approach. Additionally, using an X3D backbone further improves accuracy on the UAV-Human dataset by 5.5%. Our models deployed on a Qualcomm Robotics RB5 platform achieve real-time predictions at approximately 10 frames per second (fps) and demonstrate a superior trade-off between performance and inference rate on both low-power edge devices and high-end desktops.
13035-20
Author(s): Christopher T. Goodin, Daniel Carruth, Lalitha Dabbiru, Lucas Cagle, Nicholas Harvel, Mississippi State Univ. (United States); John G. Monroe, Michael W. Parker, U.S. Army Engineer Research and Development Ctr. (United States)
On demand | Presented live 23 April 2024
Previous work on large-scale simulation of snow accumulation is not relevant for simulations of autonomous ground vehicle (AGV) performance, for which the relevant length scales are a few meters to a few hundred meters. In this work, we present a physics-based simulation of the accumulation of falling snow that is implemented using smoothed-particle hydrodynamics (SPH) to represent snow mass-elements. We show how the SPH simulation output can be combined with a rendering simulation to create synthetic images for training and testing snow detection algorithms that use machine learning. DISTRIBUTION A: APPROVED FOR PUBLIC RELEASE.
13035-21
Author(s): James Uplinger, U.S. Army Research Lab. (United States); Adam Goertz, Johns Hopkins Univ. Applied Physics Lab., LLC (United States); Vickram Rajendran, Nikhil Dev Deshmudre, Applied Intuition, Inc. (United States); Celso M. De Melo, Philip Osteen, U.S. Army Research Lab. (United States); Frits van Paasschen, Applied Intuition (United States)
On demand | Presented live 23 April 2024
Semantic segmentation of 2D images is a critical capability for Unmanned Ground Vehicle (UGV) navigation. A significant amount of work has been performed in data collection for road-rated civilian UGVs, but Army applications are more challenging, requiring algorithms to identify a wider range of terrain and conditions. Acquiring sufficient off-road data is challenging, time intensive, and expensive due to the vast amount of variation in factors, such as off-road terrain, lighting conditions, and weather, that are not present in on-road applications. Simulators can rapidly synthesize imagery appropriate to target environments that can be used to re-train models for environments with sparse datasets. Here we show that synthetic off-road data generated in simulation improved the performance of a scene segmentation algorithm deployed on a UGV. We discuss solutions to optimize the generation of synthetic data, as well as mixing with real data, for autonomous navigation in rough terrain.
Break
Coffee Break 3:10 PM - 3:40 PM
Session 8: Generative Models
23 April 2024 • 3:40 PM - 5:00 PM EDT | Potomac 6
Session Chair: Celso De Melo, DEVCOM Army Research Lab. (United States)
13035-23
Author(s): Alexander Pichler, Nicolas Hueber, Institut Franco-Allemand de Recherches de Saint-Louis (France)
On demand | Presented live 23 April 2024
Deep neural network based military vehicle detectors pose particular challenges due to the scarcity of relevant images and limited access to vehicles in this domain, particularly in the infrared spectrum. To address these issues, a novel drone-based bi-modal vehicle acquisition method is proposed, capturing 72 key images from different view angles of a vehicle in a fast and automated way. By overlaying vehicle patches with relevant background images and utilizing data augmentation techniques, synthetic training images are obtained. This study introduces the use of AI-generated synthetic background images compared to real video footage. Several models were trained and their performance compared in real-world situations. Results demonstrate that the combination of data augmentation, context-specific background samples, and synthetic background images significantly improves model precision while maintaining Mean Average Precision, highlighting the potential of utilizing Generative AI (Stable Diffusion) and drones to generate training datasets for object detectors in challenging domains.
13035-25
Author(s): Boyang Deng, Yuzhen Lu, Michigan State Univ. (United States)
On demand | Presented live 23 April 2024
Robust weed recognition relies on curating large-scale, diverse datasets, which are, however, practically difficult to come by. Deep generative modeling has received widespread attention in synthesizing visually realistic images beneficial for wide-ranging applications. This study investigates the efficacy of state-of-the-art deep learning-based diffusion models as an image augmentation technique for synthesizing weed images towards enhanced weed detection performance. A 10-weed-class dataset was created as a testbed for image generation and weed detection tasks. A ControlNet-added Stable Diffusion model was trained to generate weed images with broad intra-class variations of targeted weed species and diverse backgrounds to adapt to changing field conditions. The quality of generated images was assessed using metrics including the Fréchet Inception Distance and Inception Score. The generated images had an average FID score of 0.98 and an IS score of 3.63. YOLOv8l was trained for weed detection. Combining the generated with real images yielded a 1.2% mAP@50:95 improvement in weed detection, compared to modeling using real images alone. Further research is needed to exploit controllable image generation.
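A hedged sketch of ControlNet-conditioned Stable Diffusion generation with the Hugging Face diffusers API; the checkpoints, prompt, and conditioning image below are placeholders, not the study's fine-tuned weed model, and a CUDA GPU is assumed.
```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from PIL import Image

# Example public checkpoints; the paper trains its own ControlNet on weed imagery.
controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny",
                                             torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16).to("cuda")  # assumes a CUDA device is available

condition = Image.open("weed_edge_map.png")  # hypothetical structural guidance image
synthetic = pipe("a broadleaf weed in a soybean field, overcast light",
                 image=condition, num_inference_steps=30).images[0]
synthetic.save("synthetic_weed_0001.png")    # would then be mixed with real images for YOLO training
```
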
13035-59
Author(s): Jonathan Christian, Max Bright, Jason Summers, Applied Research in Acoustics LLC (United States); Ashley Olson, Timothy C. Havens, Michigan Technological Univ. (United States)
On demand | Presented live 23 April 2024
In this work, we demonstrate the utility of a conditionally generative, multi-scale vision transformer that learns the spatial and spectral structures and the interactions between them in an unsupervised manner in order to accurately synthesize near-infrared (NIR) and short-wave infrared (SWIR) from 3-band RGB. This synthesis is performed over a diverse set of target objects observed over multiple seasons, at multiple look angles, over varying topographies, with images sampled globally from multiple satellites. For both training and inference, the model is provided no context or metadata as input. Compared to using RGB alone, the average precision (AP) of an off-the-shelf object detection model trained with the additional synthesized IR data improves by up to 48% on a target class that is difficult for an analyst to identify. In conjunction with RGB data, using synthetic instead of true IR data for object detection provides higher AP values over all target classes.
13035-57
Author(s): Prasanna Reddy Pulakurthi, Rochester Institute of Technology (United States); Celso M. De Melo, Raghuveer Rao, DEVCOM Army Research Lab. (United States); Majid Rabbani, Rochester Institute of Technology (United States)
On demand | Presented live 23 April 2024
Deep Neural Networks (DNNs) have emerged as a powerful tool for human action recognition, yet their reliance on vast amounts of high-quality labeled data poses significant challenges. A promising alternative is to train the network on generated synthetic data. However, existing synthetic data generation pipelines require complex simulation environments. Our novel solution bypasses this requirement by employing Generative Adversarial Networks (GANs) to generate synthetic data from only a small existing real-world dataset. Our training pipeline extracts the motion from each training video and augments it across various subject appearances within the training set. This approach increases the diversity in both motion and subject representations, thus significantly enhancing the model's performance. A rigorous evaluation of the model's performance is presented under diverse scenarios, including ground and aerial views. Moreover, an insightful analysis of critical factors influencing human action recognition performance, such as gesture motion diversity and subject appearance, is presented.
Poster Session
23 April 2024 • 6:00 PM - 7:30 PM EDT | Potomac C
Conference attendees are invited to attend the symposium-wide poster session on Tuesday evening. Come view the SPIE DCS posters, enjoy light refreshments, ask questions, and network with colleagues in your field. Authors of poster papers will be present to answer questions concerning their papers. Attendees are required to wear their conference registration badges to the poster session.

Poster Setup: Tuesday 12:00 PM - 5:30 PM
Poster authors, view poster presentation guidelines and set-up instructions at http://spie.org/DCSPosterGuidelines.
13035-44
Author(s): Kimmy Chang, U.S. Space Force (United States); Alex Cabello, Jeff Houchard, EO Solutions (United States); Jonathan Gazak, Justin Fletcher, U.S. Space Force (United States)
On demand | Presented live 23 April 2024
Aperture photometry is a critical method for estimating visual magnitudes of stars and satellites, essential in Space Domain Awareness (SDA) for tasks like collision avoidance. Traditional methods have fixed aperture shapes, limiting accuracy and adaptability. We introduce a novel approach that defines pixel-specific regions for the aperture and annulus, significantly improving accuracy. Nevertheless, conventional aperture photometry is constrained by predefined equations, leading to errors and sensitivity to image conditions. To overcome these limitations, we propose a learned photometry pipeline that combines aperture photometry with machine learning. Our approach demonstrates remarkable effectiveness for both stars and satellites across diverse image conditions. We rigorously tested it on three datasets, including a custom synthetic dataset and real imagery. Our results showcase outstanding performance, with a 1.44% error in star visual magnitude estimation and a 0.64% error in satellite visual magnitude estimation.
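For comparison with the learned pipeline described above, a conventional fixed-shape aperture photometry sketch using photutils on a synthetic point source; all positions, radii, and brightness values are invented placeholders.
```python
import numpy as np
from photutils.aperture import CircularAperture, CircularAnnulus, aperture_photometry

# Synthetic frame: flat background plus one Gaussian point source (placeholder data).
yy, xx = np.mgrid[0:256, 0:256]
image = 50.0 + 5000.0 * np.exp(-(((xx - 120.0) ** 2 + (yy - 90.0) ** 2) / (2 * 2.0 ** 2)))

positions = [(120.0, 90.0)]                          # source centroid (x, y)
aperture = CircularAperture(positions, r=5.0)        # fixed circular source aperture
annulus = CircularAnnulus(positions, r_in=8.0, r_out=12.0)  # fixed background annulus

src = aperture_photometry(image, aperture)["aperture_sum"][0]
bkg_per_px = aperture_photometry(image, annulus)["aperture_sum"][0] / annulus.area
flux = src - bkg_per_px * aperture.area
instrumental_mag = -2.5 * np.log10(flux)             # still needs photometric calibration
print(instrumental_mag)
```
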
13035-45
Author(s): Sourabh Yadav, Thanh Le, Shaohua Dong, Heng Fan, Qing Yang, Chenxi Qiu, Yan Huang, Univ. of North Texas (United States)
On demand | Presented live 23 April 2024
Recent AI advancements hold significant promise for enhancing Radio Frequency (RF) tracking capabilities, enabling the detection, localization, and tracking of highly directional signals through coordinated swarms. However, these advancements also bring new challenges, such as the need for comprehensive training datasets that consider various environmental factors affecting RF signal propagation. This paper introduces a new simulation platform specifically for evaluating the performance of RF tracking methods and, more importantly, generating comprehensive signal map training datasets for reinforcement learning-based RF tracking algorithms. By leveraging the MATLAB toolbox, the simulator can model RF signal propagation and swarm mobility, accounting for free space loss, diffraction loss, and environmental factors like terrain and weather conditions. Additionally, the platform can simulate the trajectories of different types of moving transmitters and receivers and offers users flexibility to incorporate their own mobility models into the simulator, enabling the training of reinforcement learning for RF tracking in complex scenarios generated by the platform.
13035-46
Author(s): RyeAnne Ricker, National Institutes of Health (United States); Nestor Perea, The Pennsylvania State Univ. (United States); Elodie Ghedin, National Institutes of Health (United States); Murray Loew, The George Washington Univ. (United States)
On demand | Presented live 23 April 2024
A Generative Adversarial Network was used to produce Raman spectra of Influenza A virus in culture and then used to train a virus detection classification model. Dimensionality reduction plotting using t-Distributed Stochastic Neighbor Embedding (t-SNE) demonstrated overlap between the real and synthetic spectra but not complete blending, which can be attributed to the subtle differences between the real and synthetic data. Nevertheless, the real and synthetic spectra also exhibited similar Raman peak patterns. Moreover, the inclusion of synthetic spectra into the training set was able to increase the virus classification accuracy from 83.5% to 91.5%. This indicates that the GANs were able to synthesize spectra closely related to virus-positive spectra yet distinctly different from virus-negative spectra, which appear visually similar. We conclude that the synthetic spectra produced by the GANs were similar to the real data but not an exact replacement.
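A minimal sketch of the t-SNE overlap check described above; the random arrays are placeholders standing in for the real and GAN-generated Raman spectra.
```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

rng = np.random.default_rng(1)
real_spectra = rng.normal(size=(120, 900))        # placeholder: 120 spectra x 900 wavenumbers
synthetic_spectra = rng.normal(size=(120, 900))   # placeholder: GAN outputs

X = np.vstack([real_spectra, synthetic_spectra])
labels = np.array([0] * len(real_spectra) + [1] * len(synthetic_spectra))

emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
plt.scatter(emb[labels == 0, 0], emb[labels == 0, 1], s=8, label="real")
plt.scatter(emb[labels == 1, 0], emb[labels == 1, 1], s=8, label="synthetic")
plt.legend(); plt.title("Real vs. GAN-generated Raman spectra (t-SNE)")
plt.show()
```
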
13035-47
Author(s): Kenneth Witham, Kostas Research Institute, Northeastern Univ. (United States); Nishanth Marer Prabhu, Aly Sultan, Northeastern Univ. (United States); Marius Necsoiu, DEVCOM ARL, San Antonio, TX (United States); Chad Spooner, NorthWest Research Associates (United States); Gunar Schirner, Northeastern Univ. (United States)
On demand | Presented live 23 April 2024
Automatic Modulation Recognition (AMR) is an important part of spectrum management. Existing work and datasets focus on variety in the modulations transmitted and apply only rudimentary channel effects. We propose a new dataset for AMR tasks that focuses on only a few common modulations but introduces large variation in the propagation channel. Simple scenarios with rural and urban areas are randomly generated using Simplex noise, and a receiver/transmitter pair is placed in the scenario. The 3GPP model is combined with the propagation vector from the scenario generator to simulate a signal propagating across the generated terrain. This dataset brings more realism to the AMR task and will allow machine learning models to adapt to changing environments.
13035-48
Author(s): Indranil Sinharoy, Aditya Dave, SAMSUNG Research America (United States); Gaurav Duggal, Virginia Polytechnic Institute and State Univ. (United States); Vutha Va, Lianjun Li, Hao Chen, Abhishek Sehgal, SAMSUNG Research America (United States)
On demand | Presented live 23 April 2024
Recently, there has been growing interest in utilizing wireless signals for human gesture recognition and activity recognition. At the same time, the scarcity and lack of diversity of radar echo signature datasets of human gestures and activities are well recognized. This work demonstrates a framework for synthetically generating a vast and diverse set of radar echo signatures starting from a small set of optical motion capture (MoCap) trajectories. The captured trajectories are perturbed using a pool of composable spatial and temporal transformation functions assembled by a data augmentation pipeline builder. The transformed trajectories, combined with a simple radar cross-section (RCS) modeling process, are used to simulate radar CIR signals. Features extracted from this synthetic dataset show a strong correlation with the features obtained from simultaneously collected real radar data. Furthermore, we demonstrate that the synthetically generated radar echo signals can improve the performance of ML-based wireless gesture and activity recognition systems, especially where the availability of real data is limited.
13035-49
Author(s): Rachel Kinard, Air Force Research Lab. (United States); Nathan Jones, The Univ. of Oklahoma (United States); Brandon Kinard, Elizabeth Sudkamp, Air Force Research Lab. (United States); Alexander Mattingly, Univ. of Maryland, College Park (United States); Joshua Rice, Applied Research Solutions, Inc. (United States)
On demand | Presented live 23 April 2024
The reconstruction of a watertight surface mesh from point clouds is a difficult problem. Constructing a watertight model from a polygonal mesh is just as difficult, since there can be many issues in these models, such as intersecting surfaces and non-manifold geometry. We first describe a complete repair process for a single CAD object, resulting in a repaired static model. Next, we implement a novel workflow that can be used to repair local issues on almost every model, allowing one to use global repair methods on local areas of the model. This workflow can be applied to an assembly of CAD objects to retain articulations in the final repaired dynamic model. We introduce methods from Topological Data Analysis (TDA) to show that topological features can be used in the definition of robust mesh metrics, to characterize and determine the quality of meshes, and to implement fully automated watertight repair of CAD meshes.
13035-50
Author(s): Emily Kenul, Margaret Black, Drew Massey, Zachary Havelka, Mawia Henkai, Kyle Gavin, Luke Shellhorn, Booz Allen Hamilton Inc. (United States)
On demand | Presented live 23 April 2024
Acquiring representative data samples is pivotal to the process of creating machine learning models. However, gathering real-world imagery often presents challenges related to privacy concerns, regulatory constraints, financial resources, and accessibility limitations. Synthetic imagery offers an opportunity to augment real-world computer vision datasets while bypassing these obstacles. Yet, a fundamental challenge in working with synthetic imagery is ensuring that the generated data closely resembles its real-world counterpart. Further, it can be difficult to generate synthetic imagery with the same features and quality required to train well-generalized computer vision models. This research paper introduces and evaluates our custom-built Replicant framework – a novel synthetic data generation framework integrated into Booz Allen’s Vision AI Stack. We leverage this framework to produce synthetic imagery that closely resembles a real-world maritime dataset and utilize this data to train object detection models, demonstrating how synthetic data benefits model performance. Additionally, we employ similarity metrics, including perceptual hashing (pHash), Optimal Transport Dataset Distance (OTDD) metric, and Fréchet Inception Distance (FID) to assess the likeness of these real and synthetic datasets. Finally, we explore the applicability and effectiveness of explainable AI (XAI) techniques, such as Eigen Class Activation Mapping (Eigen CAM) and Shapley Additive Explanation (SHAP), to gain insights into the performance of our deep learning models and the utility of our synthetic data. Our findings underscore the vast potential of synthetic data to benefit deep learning model performance while overcoming challenges associated with real-world data acquisition.
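One of the named similarity checks, perceptual hashing, illustrated in a short assumed sketch; the file names are hypothetical, and OTDD and FID would each be computed with their own libraries.
```python
import imagehash
from PIL import Image

# Placeholder file names standing in for a real frame and its synthetic counterpart.
real_img = Image.open("real_maritime_0001.jpg")
synth_img = Image.open("replicant_maritime_0001.png")

# imagehash overloads '-' as the Hamming distance between hashes.
hamming = imagehash.phash(real_img) - imagehash.phash(synth_img)
print(f"pHash Hamming distance: {hamming} (smaller means more perceptually similar)")
```
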
13035-51
Author(s): John B. Peace, Benjamin S. Riggan, Univ. of Nebraska-Lincoln (United States)
On demand | Presented live 23 April 2024
In the realm of facial recognition, three-dimensional (3D) textured meshes are pivotal for comprehensive identification across different viewpoints. However, challenges such as data scarcity and domain shifts pose significant hurdles. Addressing these, our proposed methodology synthesizes textured 3D facial meshes from standard two-dimensional images. This synthesis not only bolsters pose invariance but also optimizes the fusion of both synthetic and real 3D facial data, enhancing recognition accuracy. By employing a 2D-to-3D domain adaptation technique, we have fine-tuned the Adaface framework to discern 3D facial traits, adapted from Pointnet++. Using denoising diffusion probabilistic models (DDPMs), our approach successfully crafts 3D textured meshes from 2D representations. This novel method, when compared to traditional techniques, underscores the potential of 2D systems in decoding 3D features, setting the stage for groundbreaking advancements in facial recognition.
13035-63
Author(s): Hannah Lensing, Thomson Reuters Special Services, LLC (United States)
23 April 2024 • 6:00 PM - 7:30 PM EDT | Potomac C
Synthetic data introduces a unique solution to overcome the hurdles of gathering or purchasing large quantities of data for use-cases where the realistic representation of that data is important – not the real-world data itself. While extremely useful, the generation of synthetic data can be difficult, especially when modeling highly complex environments, specifically social networks and online communities. To address this challenge, we developed an approach using statistical methodologies and graph analysis that captures and stores the patterns of real-world data for the generation of synthetic data. This way, the data itself is real, the underlying patterns/logic in which the data presents itself is real, but the outputted aggregation of that data into a new object is synthetic. Once the synthetic objects are generated, training environments can further be enriched using multiple generative AI models to generate content consistent with the synthetic objects and the needs of the environment.
Symposium Plenary on AI/ML + Sustainability
24 April 2024 • 8:30 AM - 10:00 AM EDT | Potomac A
Session Chairs: Latasha Solomon, DEVCOM Army Research Lab. (United States), Ann Marie Raynal, Sandia National Labs. (United States)

Welcome and opening remarks
24 April 2024 • 8:30 AM - 8:40 AM EDT

Army intelligence data and AI in modern warfare (Plenary Presentation)
Presenter(s): David Pierce, U.S. Army Intelligence (United States)
24 April 2024 • 8:40 AM - 9:20 AM EDT

FUTUR-IC: A three-dimensional optimization path towards building a sustainable microchip industry (Plenary Presentation)
Presenter(s): Anu Agarwal, Massachusetts Institute of Technology, Microphotonics Ctr. and Materials Research Lab. (United States)
24 April 2024 • 9:20 AM - 10:00 AM EDT

Break
Coffee Break 10:00 AM - 10:30 AM
Session 9: Multi-Domain Operations
24 April 2024 • 10:30 AM - 12:10 PM EDT | Potomac 6
Session Chair: Raghuveer M. Rao, DEVCOM Army Research Lab. (United States)
13035-26
Author(s): Stuart W. Card, Critical Technologies Inc. (United States)
On demand | Presented live 24 April 2024
In many applications, it is important that data mining and Machine Learning (ML) systems discriminate, among discovered statistical associations, those that are at least plausibly causal from those that are mere coincidences. A graph of causal relationships may be complex, with fan-in, fan-out, transitive, and various combinations of dependencies. To test a system’s power to filter out non-causal associations and untangle the causal web, suitable synthetic data is needed. We report the development, in Wolfram Mathematica, of code that synthesizes data with subtle, complex, causal dependencies among some but not all of the generated observable variables. Problem difficulty is tunable. A set of generated data and code for generating more are both released openly on-line.
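Although the released code is in Wolfram Mathematica, the idea can be illustrated with a small structural-causal-model sketch in Python; the graph, coefficients, and noise scales below are invented for illustration and are far simpler than the tunable benchmarks described.
```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000
X = rng.normal(size=n)                          # root cause
Y = 2.0 * X + rng.normal(scale=0.5, size=n)     # fan-out: X -> Y
Z = -1.5 * X + rng.normal(scale=0.5, size=n)    # fan-out: X -> Z
W = rng.normal(size=n)                          # independent "coincidence" variable

# A sound causal-discovery method should flag the Y-Z association as non-causal
# (confounded by X) and report no association with W at all.
print(np.corrcoef([X, Y, Z, W]).round(2))
```
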
13035-27
Author(s): Clint Morris, Jason Zutty, Georgia Tech Research Institute (United States)
On demand | Presented live 24 April 2024
In this study, we introduce a depth prediction model, transcending traditional applications, emphasizing absolute accuracy, especially at longer ranges. Using the AirSim Unreal Engine simulator, we crafted a dataset with 2.7 million images, capturing diverse scenes and environments. Additional images from 14 RGB and depth sensor pairs on a drone further enhance the model's versatility. Key features of our model, such as the overlap patch embedding block and the Mixed-Feed Forward Network, facilitate depth prediction up to 1900 meters with a MAPE of 5-10%. Beyond this, performance dips, indicating areas for improvement. Real-world data analysis was qualitative due to supervisory constraints, yielding promising results. Overall, our work showcases potential strides in depth prediction, supported by robust simulation data.
13035-28
Author(s): Jonathan S. Kent, Lockheed Martin Corp. (United States)
On demand | Presented live 24 April 2024
Current standard practices in computational military simulation, especially the simulation of historical battles, result in fundamental epistemic error that significantly reduces its evidentiary power, the usefulness of any generated synthetic data for machine learning systems, as well as its capacity to develop meaningful and general results which might be applied to contemporary affairs. This paper lays out this criticism by analogizing military simulation to the numerical approximation of dynamical systems, via which we demonstrate the limitations associated with attempting to model a single battle. We end with a discourse on the nature of the results that should be expected from high quality computational military simulation, and its role in military doctrine, both from a Clausewitzian perspective.
13035-29
Author(s): Guangkun Li, Wayne Shanks, Jovan Barac, Pedro Rodriguez, Johns Hopkins Univ. Applied Physics Lab., LLC (United States)
On demand | Presented live 24 April 2024
Seeing through walls is a much-needed capability for special operations and security forces, and a centimeter-wave (CMW) imaging system operating at around 5 GHz provides a low-power solution with good range and penetration performance. In this work, we aim to design a scene reconstruction system using 5 GHz WiFi signals across obstacles and develop a deep learning (DL) based algorithm for real-time 3D reconstruction. The DL model is based on an encoding-decoding type of neural network. Our approach includes the integration of self-attention modules to transform the position-encoded RF signal into a latent space representation effectively. Furthermore, to address the memory consumption challenges in 3D reconstruction and enhance performance, our decoder employs sparse tensors and sparse convolutions via the Minkowski Engine. Our results showcase the system's capability to reconstruct scenes with a resolution nearing the Rayleigh criterion.
13035-52
Author(s): Terry Traylor, North Dakota State Univ. (United States)
24 April 2024 • 11:50 AM - 12:10 PM EDT | Potomac 6
Inspired by learning theory, cognitive science, psychological descriptions of experience and memory, unsupervised labeling, and computer vision, Terry Traylor - a retired military information and artificial intelligence professional - borrows techniques from both the social and natural sciences to identify processes that enable experimental AI learning from cybersecurity videos. Specifically, he uses mixed-methods theory development techniques from qualitative science to study students learning cybersecurity processes and to develop a biologically-inspired synthetic framework to bootstrap machine learning or other generalized synthetic learning processes. Using the case of learning cybersecurity tradecraft from videos, he exposes processes and challenges associated with handling multi-modal information that enables generalized synthetic learning. Special attention is paid to Sensory AI challenges, synthetic perception, and multi-modal processing. The session will expose attendees to a synthetic structure for multi-signal/multi-modal learning, a proposed language for synthetic experience memory structures, and a biologically-inspired structure for the multi-modal learning problem.
Break
Lunch/Exhibition Break 12:10 PM - 1:40 PM
Session 10: Integrated Machine Learning and Synthesis Pipelines
24 April 2024 • 1:40 PM - 2:20 PM EDT | Potomac 6
Session Chair: Celso De Melo, DEVCOM Army Research Lab. (United States)
13035-32
Author(s): Edgar A. Bernal, FLX AI, Inc. (United States); Rohan Sharma, Univ. at Buffalo (United States); Shanmukha Yenneti, FLX AI, Inc. (United States); Ian Mackey, FLX AI (United States); Javier Malave, Derek J. Walvoord, Bernard Brower, L3Harris Technologies, Inc. (United States)
On demand | Presented live 24 April 2024
Training state-of-the-art image classifiers and object detectors remains an extremely data-intensive process to this day. The significant data needs in turn impose strict requirements on the data acquisition, curation, and labelling stages that typically precede the learning process. This poses a particularly significant challenge for military and defense applications where the availability of high-quality labeled data is often scarce. What is needed are methods that can effectively learn from sparse amounts of labeled, real-world data. In this paper, we propose a novel framework that incorporates a synthetic data generator into a supervised learning pipeline in order to enable end-to-end co-optimization of the discriminability and realism of the synthetic data, as well as the performance of the supervised engine. We demonstrate, via extensive empirical validation on image classification and object detection tasks, that the proposed framework is capable of learning from a small fraction of the real-world data required to train traditional, standalone supervised engines, while matching or even outperforming its off-the-shelf counterparts.
13035-31
Author(s): Andrii Soloviov, Derek T. Anderson, Jeffrey Kerley, Brendan Alvey, Univ. of Missouri (United States)
On demand | Presented live 24 April 2024
This paper presents MizSIM, a novel open-source framework utilizing Unreal Engine (UE) to generate synthetic datasets for AI training and evaluation. Overcoming the challenges of data reliance and model opacity, MizSIM enables detailed performance analysis and failure diagnosis through manipulation of agent and environment parameters. Leveraging UE's open-source distribution and high-quality graphics, along with tools like AirSim and ROS, MizSIM offers cost-effective, user-friendly design for seamless data extraction. Demonstrated workflows include single-life computer vision tasks and object detector evaluations across simulated lives. MizSIM aims to establish a closed-loop environment, enhancing AI effectiveness and transparency.
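To give a sense of what data extraction from an Unreal Engine simulation can look like, the sketch below uses the public AirSim Python client to pull paired RGB and segmentation frames. The camera name, image types, channel layout, and output paths are illustrative assumptions, and this is not the MizSIM codebase.

```python
# Illustrative AirSim capture loop (not MizSIM itself): grab paired RGB and
# segmentation frames from an Unreal Engine scene for dataset building.
import numpy as np
import airsim

client = airsim.MultirotorClient()
client.confirmConnection()

requests = [
    airsim.ImageRequest("0", airsim.ImageType.Scene, False, False),         # RGB
    airsim.ImageRequest("0", airsim.ImageType.Segmentation, False, False),  # labels
]

for frame in range(100):
    responses = client.simGetImages(requests)
    for name, resp in zip(("rgb", "seg"), responses):
        img = np.frombuffer(resp.image_data_uint8, dtype=np.uint8)
        img = img.reshape(resp.height, resp.width, 3)  # 3-channel assumption
        np.save(f"frame_{frame:04d}_{name}.npy", img)  # assumed output location
```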
Session 11: Fidelity and Sensitivity Analysis I
25 April 2024 • 8:30 AM - 10:00 AM EDT | Potomac 6
Session Chair: Kimberly E. Manser, DEVCOM C5ISR (United States)
13035-61
Author(s): Colin N. Reinhardt, Naval Information Warfare Ctr. Pacific (United States); Sarah Brockman, Kitware (United States); Rusty Blue, Kitware, Inc. (United States); Brian Clipp, Kitware (United States); Anthony Hoogs, Kitware, Inc. (United States)
On demand | Presented live 25 April 2024
Synthetically generated imagery holds the promise of being a panacea for the challenges of real-world datasets. Yet it is frequently observed that deep learning models do not perform as well when trained with synthetic data as when trained with real measured imagery. In this study we present analyses and illustrations of several statistical metrics, measures, and visualization tools based on the distance and similarity between the empirical distributions of real and synthetic data in the latent feature embedding space, which provide a quantitative understanding of the image-domain distribution discrepancies that hamper the generation of performant simulated datasets. We also demonstrate the practical application of these tools and techniques in a novel study comparing the latent space embedding distributions of real imagery, pristine synthetic imagery, and synthetic imagery modified by physics-based degradation models. The results may assist deep learning practitioners and synthetic imagery modelers in evaluating latent space distributional dissimilarity and improving model performance when using simulation tools to generate synthetic training imagery.
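One widely used member of this family of metrics is the Fréchet distance between Gaussians fitted to the two sets of embedding vectors (the basis of FID). The snippet below is a generic illustration of that computation, not the paper's specific tooling; `real_feats` and `synth_feats` are assumed (N, D) arrays of latent feature vectors.

```python
# Generic Frechet distance between two embedding distributions (as in FID);
# illustrative of latent-space discrepancy metrics, not the paper's exact
# toolset. Inputs are (N, D) arrays of feature vectors.
import numpy as np
from scipy import linalg

def frechet_distance(real_feats, synth_feats):
    mu_r, mu_s = real_feats.mean(axis=0), synth_feats.mean(axis=0)
    cov_r = np.cov(real_feats, rowvar=False)
    cov_s = np.cov(synth_feats, rowvar=False)
    covmean = linalg.sqrtm(cov_r @ cov_s)
    if np.iscomplexobj(covmean):   # numerical noise can leave tiny imaginary parts
        covmean = covmean.real
    diff = mu_r - mu_s
    return diff @ diff + np.trace(cov_r + cov_s - 2.0 * covmean)
```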
13035-33
Author(s): Gregory P. Spell, Michael Tran, Peter Torrione, CoVar, LLC (United States); Mark Jeiran, Bassam Bahhur, Kimberly Manser, DEVCOM C5ISR (United States)
On demand | Presented live 25 April 2024
Convolutional neural networks (CNNs) achieve state-of-the-art performance on infrared (IR) detection and identification (e.g., classification) problems. Training such algorithms, however, requires a tremendous quantity of labeled data, which is less available in the IR domain than for “natural imagery” and scarcer still for CV-related tasks. In this work, we train deep models on a combination of real IR data and synthetic IR data, the latter being a cheap and attractive alternative to real data, and we evaluate model performance on real IR data. We focus on the tasks of vehicle and person detection, object identification, and vehicle parts segmentation. We find that for both detection and object identification, training on a combination of real and synthetic data performs better than training only on real data. This improvement demonstrates an advantage of using synthetic data for computer vision. Furthermore, we believe that the utility of synthetic data, when combined with real data, will only increase as the realism gap closes.
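A straightforward way to realize this kind of real/synthetic mix in a training pipeline is to concatenate the two datasets and reweight the sampling so scarce real data is not drowned out. The sketch below is a generic PyTorch illustration under that assumption, with placeholder tensors standing in for the IR imagery; it is not the authors' training code.

```python
# Generic real + synthetic training mix (placeholder data, not the authors'
# pipeline). Sampling weights keep batches roughly half real, half synthetic
# despite the size imbalance.
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset, WeightedRandomSampler

# Placeholder tensors standing in for real and synthetic IR chips + labels.
real_ds = TensorDataset(torch.randn(200, 1, 64, 64), torch.randint(0, 5, (200,)))
synth_ds = TensorDataset(torch.randn(2000, 1, 64, 64), torch.randint(0, 5, (2000,)))
combined = ConcatDataset([real_ds, synth_ds])

weights = torch.cat([
    torch.full((len(real_ds),), 0.5 / len(real_ds)),
    torch.full((len(synth_ds),), 0.5 / len(synth_ds)),
])
sampler = WeightedRandomSampler(weights, num_samples=len(combined))
loader = DataLoader(combined, batch_size=32, sampler=sampler)
```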
13035-34
Author(s): Nicholas Hamilton, Adam Webb, Michigan Technological Univ. (United States); Matt Wilder, Michigan Tech Research Institute (United States); Ben Hendrickson, Matthew Blanck, Erin Nelson, Wiley Roemer, Signature Research, Inc. (United States); Timothy C. Havens, Michigan Technological Univ. (United States)
On demand | Presented live 25 April 2024
Geospatial intelligence is a subject with many opportunities for machine automation, and object detection is one desirable application. However, a lack of high-volume relevant datasets can make this task difficult. To combat this issue, we introduced a spin-set augmentation technique to generate synthetic training data and used these synthetic datasets to augment the training of an object detection deep network, focusing on visible-band imagery. We have continued our efforts by further testing this method on long-wave infrared imagery, including results from the YOLO, SSD, and Faster R-CNN algorithms. We also introduce another synthetic augmentation technique that involves generating physics-based, fully rendered images of 3D synthetic scenery and targets, and we compare the rendered-image performance to that of spin-sets. This paper analyzes both the spin-set and rendered-image augmentation techniques in terms of object detection performance, complexity, generalizability, and explainability.
13035-35
Author(s): Ashley Dale, William Reindl, Edwin Sanchez, Albert William, Lauren Christopher, Indiana Univ.-Purdue Univ. Indianapolis (United States)
On demand | Presented live 25 April 2024
Synthetic data are frequently used to supplement a small set of real images and create a dataset with diverse features, but this may not improve the equivariance of a computer vision model. Our work answers the following questions: First, what metrics are useful for measuring a domain gap between real and synthetic data distributions? Second, is there an effective method for bridging an observed domain gap? We explore these questions by presenting a pathological case in which the inclusion of synthetic data did not improve model performance, then presenting measurements of the difference between the real and synthetic distributions in the image space, the latent space, and the model prediction space. We find that pixel-level augmentation of the dataset effectively reduces the observed domain gap and improves the model F1 score to 0.95, compared to 0.43 for unaugmented data. We also observe that an increase in the average cross entropy of the latent space feature vectors is positively correlated with increased model equivariance and the closing of the domain gap. The results are explained using a framework of model regularization effects.
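Pixel-level augmentation in this sense typically means photometric and noise perturbations applied per image during training. The snippet below is a generic torchvision example of such a pipeline, offered only to illustrate the concept; the specific transforms and parameter values are assumptions, not those used in the paper.

```python
# Generic pixel-level augmentation pipeline (illustrative parameters only,
# not the paper's configuration): photometric jitter, blur, additive noise.
import torch
from torchvision import transforms

pixel_aug = transforms.Compose([
    transforms.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.2),
    transforms.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0)),
    transforms.ToTensor(),
    transforms.Lambda(lambda x: (x + 0.02 * torch.randn_like(x)).clamp(0.0, 1.0)),
])
# Applied to each PIL image during training, e.g. aug_img = pixel_aug(img)
```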
Break
Coffee Break 10:00 AM - 10:30 AM
Session 12: Fidelity and Sensitivity Analysis II
25 April 2024 • 10:30 AM - 11:50 AM EDT | Potomac 6
Session Chair: Kimberly E. Manser, DEVCOM C5ISR (United States)
13035-39
Author(s): Marilyn Esposito, Jing Lin, Renea Young, Keefa Nelson, Air Force Research Lab. (United States)
On demand | Presented live 25 April 2024
Robust and resilient machine learning is critical to leading the world in cutting-edge technology for defense, but to achieve it, we need large amounts of representative data. Unfortunately, collecting and labeling real world data can be expensive and time-consuming. Computer generated data, often referred to as synthetic data, has made it possible to exponentially increase the amount of labeled data available with methods of creation such as generative models. Despite this growing trend to dedicate money and resources to produce synthetic data via simulated environments, it remains undetermined if training algorithms on synthetic data is an advantage for mission critical object detection tasks. In this paper, we propose a unique data quality metric that will support or counter the hypothesis that synthetic data is a viable alternative to using real world data.
13035-40
Author(s): Frank A. Ruis, Alma Liezenga, Friso G. Heslinga, Luca Ballan, Thijs A. Eker, Richard J. M. den Hollander, Martin C. van Leeuwen, Judith Dijk, Wyke Huizinga, TNO (Netherlands)
On demand | Presented live 25 April 2024
Collecting and annotating real-world data for the development of object detection models, in particular in the military domain, is time-consuming, expensive, and sometimes unfeasible. Training models on synthetic data may provide a solution. However, bridging the reality gap between synthetic and real data remains a challenge. Existing methods usually build on top of baseline CNNs and ignore best practices from object detection on real data. In this paper we propose a methodology for improving the performance of a pre-trained object detector when training on synthetic data. Our approach focuses on extracting the salient information from synthetic data without forgetting useful features learned from pre-training on real images. Based on the state of the art, we incorporate data augmentation methods and a Transformer backbone. We show that our methods improve the state of the art on synthetic data trained object detection for the RarePlanes and DGTA-VisDrone datasets, and reach near-perfect performance on an in-house vehicle detection dataset.
13035-41
Author(s): Lalitha Dabbiru, Christopher T. Goodin, Daniel Carruth, Mississippi State Univ. (United States); Zachary Aspin, Justin Carrillo, U.S. Army Engineer Research and Development Ctr. (United States); John Kaniarz, U.S. Army Combat Capabilities Development Command (United States)
On demand | Presented live 25 April 2024
Machine learning algorithms require datasets that are both massive and varied in order to train and generalize effectively. However, preparing semantically labeled real-world datasets is a very time-consuming and cumbersome task. The performance and generalization gap caused by limited quantities of real-world data can be narrowed with the help of synthetic datasets generated with real-world features in mind. In this work, a combination of synthetic and real-world datasets is used to demonstrate and assess the performance of simulated-to-real-world transfer learning, in which training is performed on synthetic data and testing on real-world data. Performance is further evaluated on mixtures of real and synthetic datasets. Finally, a variety of synthetic scene fidelities are considered for the training data in order to evaluate the effectiveness of low-fidelity synthetic data for training neural networks.
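The evaluation protocol described here (train on synthetic, test on real, then sweep real/synthetic mixtures) can be pictured as a loop over mixture fractions. The sketch below is a hypothetical harness for that kind of sweep; `train_model`, `evaluate`, and the subsampling are assumed placeholders, not the authors' code.

```python
# Hypothetical sim-to-real sweep (not the authors' harness): train on mixes of
# synthetic and real samples at several ratios, always evaluating on held-out
# real data, to chart how much real data the synthetic imagery can replace.
import random

def subsample(dataset, n):
    return random.sample(dataset, min(n, len(dataset)))

def sweep_mixtures(synthetic, real_train, real_test, train_model, evaluate,
                   real_fractions=(0.0, 0.1, 0.25, 0.5, 1.0)):
    results = {}
    for frac in real_fractions:
        n_real = int(frac * len(real_train))
        train_set = list(synthetic) + subsample(list(real_train), n_real)
        model = train_model(train_set)              # assumed training routine
        results[frac] = evaluate(model, real_test)  # e.g. mAP on real imagery
    return results
```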
13035-42
Author(s): Justin T. Carrillo, Barbara Pilate, Andrew Trautz, Matthew Bray, Jonathan D. Sherburn, Madeline S. Karr, Orie Cecil, Matthew Farthing, U.S. Army Engineer Research and Development Ctr. (United States)
On demand | Presented live 25 April 2024
The rising use of AI in crucial sectors underscores the need for advancements in explainable AI (XAI) to maintain transparency and trust in AI decisions. This paper introduces a new method that combines the Virtual Environmental Simulation for Physics-based Analysis (VESPA) with Randomized Input Sampling for Explanation (RISE) to enhance AI model explainability, especially in complex simulations. VESPA is recognized for its high-fidelity, physics-based simulations, creating extensive datasets under varied conditions, which include different sensor setups and environmental and material changes. This data is essential for employing RISE, a model-agnostic technique that creates pixel-level importance maps by testing the AI model with masked input images. This integration provides a systematic method to visualize and comprehend how various environmental factors affect AI decisions. Our method clarifies AI decision-making and offers a scalable framework to assess AI model robustness and reliability in diverse simulated environments.
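RISE itself is a published, model-agnostic saliency method: sample random low-resolution binary masks, upsample them, score the model on masked copies of the image, and average the masks weighted by those scores. The sketch below is a minimal generic implementation of that idea, not the VESPA integration described here; the mask count, grid size, and scoring function are illustrative.

```python
# Minimal RISE-style importance map (generic illustration, not the paper's
# VESPA integration). score_fn(image) should return the model's confidence
# for the class of interest on an (H, W, C) float image in [0, 1].
import numpy as np
import cv2

def rise_saliency(image, score_fn, n_masks=500, grid=7, p_keep=0.5, seed=0):
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    saliency = np.zeros((h, w), dtype=np.float64)
    for _ in range(n_masks):
        coarse = (rng.random((grid, grid)) < p_keep).astype(np.float32)
        mask = cv2.resize(coarse, (w, h), interpolation=cv2.INTER_LINEAR)
        score = score_fn(image * mask[..., None])   # occlude with the soft mask
        saliency += score * mask
    return saliency / n_masks
```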
Break
Lunch/Exhibition Break 11:50 AM - 1:20 PM
Session 13: Fidelity and Sensitivity Analysis III
25 April 2024 • 1:20 PM - 2:40 PM EDT | Potomac 6
Session Chair: Kimberly E. Manser, DEVCOM C5ISR (United States)
13035-43
Author(s): Peter Rizzi, Michael Gormish, Jacob Kovarskiy, Aaron Reite, Matthew Zeiler, Clarifai, Inc. (United States)
On demand | Presented live 25 April 2024
This talk presents a study on how to use AI models and synthetic data for target recognition from overhead images. The main challenge is that AI models need a large amount of labeled data, which is costly and scarce for some objects. The authors explore different ways of generating AI models with synthetic data to augment the real data. They also introduce the Clarifai platform, a tool that allows users to create and use AI models for visual content using existing and synthetic data. The main findings are:
  • Synthetic data can help improve AI model performance, but not as much as real data.
  • Physics-based synthetic data is more realistic, but also more time-consuming to produce, than AI-generated synthetic data.
  • The Clarifai platform is a convenient and flexible tool for synthetic data generation and AI model development.
13035-37
Author(s): Oliver Pierson, Georgia Tech Research Institute (United States)
On demand | Presented live 25 April 2024
Clouds are a persistent and ubiquitous source of clutter in ground- and air-based imagery. Reproducing realistic cloud clutter in simulated imagery is therefore a useful tool for analyzing both sensor and algorithm performance. Moreover, as AI-based processing algorithms become more prevalent, the need to produce accurate synthetic imagery will only increase. Unfortunately, the generation of accurate cloud imagery is computationally difficult. In this talk, we review the relevant theory and discuss our work to precompute cloud radiance, along with its assumptions, limitations, and applicability. In particular, we build on previous work in radiative transport and physics-based rendering and demonstrate background and foreground clouds in imagery with reasonable render times.
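A core quantity in any such radiative-transport computation is the transmittance along a ray through the cloud's extinction field (the Beer-Lambert law). The snippet below is only a toy raymarching illustration of that single ingredient, under an assumed extinction-field function; it is not the authors' precomputation scheme.

```python
# Toy Beer-Lambert transmittance along a ray through a heterogeneous cloud
# extinction field, by simple raymarching. Illustrates one ingredient of
# radiative-transport precomputation; not the authors' method.
import numpy as np

def transmittance(origin, direction, sigma_t, t_max=1000.0, n_steps=256):
    """sigma_t(point) -> extinction coefficient [1/m] at a 3D position."""
    ts = np.linspace(0.0, t_max, n_steps)
    dt = ts[1] - ts[0]
    optical_depth = sum(sigma_t(origin + t * direction) * dt for t in ts)
    return np.exp(-optical_depth)   # T = exp(-integral of sigma_t along the ray)

# Example with a single spherical puff of cloud (assumed density model):
center = np.array([0.0, 0.0, 500.0])
puff = lambda p: 0.05 * np.exp(-np.dot(p - center, p - center) / 1.0e4)
T = transmittance(np.zeros(3), np.array([0.0, 0.0, 1.0]), puff)
```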
13035-38
Author(s): Anna X. Mason, Ryan Connal, Jacob A. Irizarry, Byron K. Eng, Michael G. Saunders, Adam A. Goodenough, Scott D. Brown, Carl Salvaggio, Rochester Institute of Technology (United States)
On demand | Presented live 25 April 2024
In active research conducted by the Digital Imaging and Remote Sensing Laboratory in the Chester F. Carlson Center for Imaging Science at Rochester Institute of Technology, researchers are focusing on volume estimation of the condensed water vapor plumes generated by mechanical-draft cooling towers at a variety of facilities. Remote sensing data of various modalities from different imaging platforms have been used to address the automatic segmentation and volume estimation question. Because real imagery of sporadically occurring events or anomalous targets cannot be easily or reliably obtained, such targets are imbalanced or under-represented in machine learning datasets. Prior research has supported the use of machine learning for plume segmentation and of multi-view geometry techniques for 3D reconstruction and subsequent volume estimation from real imagery. This research focuses on training a U-Net model to mask and segment synthetically generated condensed water vapor plume imagery from other objects in the scene, similar to the way the approach has previously been applied successfully to real imagery.
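For readers unfamiliar with this setup, binary plume segmentation with a U-Net is typically trained with a pixel-wise objective such as Dice or BCE over the predicted mask. The snippet below is a generic illustration of that loss and a single training step; the model interface and loss combination are assumptions, not the authors' configuration.

```python
# Generic Dice + BCE training step for binary plume segmentation (illustrative
# only; not the authors' U-Net configuration). `model` maps (B, C, H, W)
# images to (B, 1, H, W) mask logits.
import torch
import torch.nn.functional as F

def dice_loss(logits, target, eps=1.0):
    probs = torch.sigmoid(logits)
    inter = (probs * target).sum(dim=(1, 2, 3))
    union = probs.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    return 1.0 - ((2.0 * inter + eps) / (union + eps)).mean()

def train_step(model, optimizer, images, masks):
    optimizer.zero_grad()
    logits = model(images)
    loss = dice_loss(logits, masks) + F.binary_cross_entropy_with_logits(logits, masks)
    loss.backward()
    optimizer.step()
    return loss.item()
```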
13035-36
Author(s): Friso G. Heslinga, Miguel Caro Cuenca, Rob J. Knight, Faruk Uysal, TNO (Netherlands)
On demand | Presented live 25 April 2024
Deep learning-based image analysis offers opportunities for space domain awareness, in which radar techniques can be used to monitor the fast-growing population of satellites. Current techniques focus on detection and tracking, but characterizing a satellite's capabilities requires more detailed information. In this study, we present a deep learning-based pipeline for automated segmentation of ISAR images of a satellite. We use synthetic data and a domain adaptation technique that requires only a few samples from the target domain. Our results show that synthetic datasets are invaluable for training segmentation models for ISAR images, especially when combined with domain adaptation techniques.
Conference Chair
DEVCOM C5ISR (United States)
Conference Chair
DEVCOM C5ISR (United States)
Conference Chair
DEVCOM Army Research Lab. (United States)
Conference Co-Chair
DEVCOM Army Research Lab. (United States)
Program Committee
Univ. of Missouri (United States)
Program Committee
Johns Hopkins Univ. (United States)
Program Committee
U.S. Navy (United States)
Program Committee
Stanford Univ. (United States)
Program Committee
Johns Hopkins Univ. Applied Physics Lab., LLC (United States)
Program Committee
Univ. of Maryland, College Park (United States), The Univ. of North Carolina at Chapel Hill (United States)
Program Committee
Nevada National Security Site (United States)
Program Committee
DEVCOM Army Research Lab. (United States)
Program Committee
Georgia Tech Research Institute (United States)
Program Committee
Naval Information Warfare Ctr. Pacific (United States)
Program Committee
Univ. of California, Los Angeles (United States)
Program Committee
Massachusetts Institute of Technology (United States)
Program Committee
Air Force Research Lab. (United States)
Program Committee
Air Force Research Lab. (United States)
Additional Information

View call for papers

What you will need to submit:

  • Presentation title
  • Author(s) information
  • Speaker biography (1000-character max including spaces)
  • Abstract for technical review (200-300 words; text only)
  • Summary of abstract for display in the program (50-150 words; text only)
  • Keywords used in search for your paper (optional)
  • Check the individual conference call for papers for additional requirements (e.g., extended abstract PDF upload for review or instructions for award competitions)
Note: Only original material should be submitted. Commercial papers, papers with no new research/development content, and papers with proprietary restrictions will not be accepted for presentation.