
Proceedings Paper

Robust photometric stereo endoscopy via deep learning trained on synthetic data (Conference Presentation)

Paper Abstract

Colorectal cancer is the second leading cause of cancer deaths in the United States and causes over 50,000 deaths annually. The standard of care for colorectal cancer detection and prevention is optical colonoscopy and polypectomy. However, over 20% of polyps are typically missed during a standard colonoscopy procedure, and 60% of colorectal cancer cases are attributed to these missed polyps. Surface topography plays a vital role in the identification and characterization of lesions, but topographic features often appear subtle to a conventional endoscope. Chromoendoscopy can highlight topographic features of the mucosa and has been shown to improve the lesion detection rate, but it requires dedicated training and increases procedure time. Photometric stereo endoscopy captures this topography but is qualitative because the working distance from each point on the mucosa to the endoscope is unknown. In this work, we use deep learning to estimate a depth map from an endoscope camera with four alternating light sources. Since endoscopy videos with ground truth depth maps are challenging to obtain, we generated synthetic data using graphical rendering from an anatomically realistic 3D colon model and a forward model of a virtual endoscope with alternating light sources. We propose an encoder-decoder style deep network, where the encoder is split into four branches of sub-encoder networks that simultaneously extract features from each of the four sources and fuse these feature maps as the network goes deeper. This is complemented by skip connections, which maintain spatial consistency when the features are decoded. We demonstrate that, when compared to monocular depth estimation, this setup can reduce the average normalized root-mean-square (NRMS) error for depth estimation in a silicone colon phantom by 38% and in a pig colon by 31%.
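To make the reported metric concrete, the following is a minimal sketch of how an NRMS error and a relative error reduction could be computed. It is not the authors' evaluation code. The abstract does not state which normalization was used, so dividing by the ground-truth depth range is an assumption here, and the function names (`nrmse`, `relative_reduction`) and all depth values are hypothetical.

```python
import math


def nrmse(predicted, ground_truth):
    """Root-mean-square error divided by the ground-truth depth range.

    Assumption: normalization by (max - min) of the ground truth. The
    abstract does not say which normalization the authors used.
    """
    if len(predicted) != len(ground_truth) or not ground_truth:
        raise ValueError("inputs must be non-empty and of equal length")
    depth_range = max(ground_truth) - min(ground_truth)
    if depth_range == 0:
        raise ValueError("ground-truth depth range is zero")
    mse = sum((p - g) ** 2 for p, g in zip(predicted, ground_truth)) / len(ground_truth)
    return math.sqrt(mse) / depth_range


def relative_reduction(baseline_error, new_error):
    """Fractional reduction of new_error relative to baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Made-up depth samples in arbitrary units, for illustration only.
truth = [10.0, 20.0, 30.0, 40.0]
monocular = [14.0, 16.0, 34.0, 36.0]    # hypothetical single-image estimate
multi_light = [12.0, 18.0, 32.0, 38.0]  # hypothetical four-light estimate

baseline = nrmse(monocular, truth)
improved = nrmse(multi_light, truth)
print(f"baseline NRMSE: {baseline:.4f}")
print(f"improved NRMSE: {improved:.4f}")
print(f"relative reduction: {relative_reduction(baseline, improved):.0%}")
```

Under this reading, a "38% reduction" would mean that the multi-light NRMS error is 0.62 times the monocular NRMS error, averaged over the test data.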

Paper Details

Date Published: 4 March 2019
Proc. SPIE 10871, Multimodal Biomedical Imaging XIV, 108710N (4 March 2019); doi: 10.1117/12.2509878
Faisal Mahmood, Johns Hopkins Univ. (United States)
Daniel Borders, Johns Hopkins Univ. (United States)
Richard Chen, Johns Hopkins Univ. (United States)
Jordan Sweer, Johns Hopkins Univ. (United States)
Steven Tilley II, Johns Hopkins Univ. (United States)
Norman S. Nishioka, Massachusetts General Hospital (United States)
J. Webster Stayman, Johns Hopkins Univ. (United States)
Nicholas J. Durr, Johns Hopkins Univ. (United States)

Published in SPIE Proceedings Vol. 10871:
Multimodal Biomedical Imaging XIV
Fred S. Azar; Xavier Intes; Qianqian Fang, Editor(s)

© SPIE.