Proceedings Paper

A prospective randomized clinical trial for measuring radiology study reporting time on Artificial Intelligence-based detection of intracranial hemorrhage in emergent care head CT

Paper Abstract

The quantitative evaluation of Artificial Intelligence (AI) systems in a clinical context is a challenging endeavor, where the development and implementation of meaningful performance metrics is still in its infancy. Here, we propose a scientific concept, Artificial Intelligence Prospective Randomized Observer Blinding Evaluation (AI-PROBE), for quantitative clinical performance evaluation of radiology AI systems within prospective randomized clinical trials. Our evaluation workflow encompasses a study design and a corresponding radiology Information Technology (IT) infrastructure that randomly blinds radiologists with regard to the presence of positive reads as provided by AI-based image analysis systems. To demonstrate the applicability of our AI-evaluation framework, we present a first prospective randomized clinical trial investigating the effect of automatic identification of Intra-Cranial Hemorrhage (ICH) in emergent care head CT scans on radiology study Turn-Around Time (TAT) in a clinical environment. Here, we acquired 620 consecutive non-contrast head CT scans from CT scanners used for inpatient and emergency room patients at a large academic hospital over a time period of 14 consecutive days. Immediately following image acquisition, scans were automatically analyzed for the presence of ICH using commercially available software (Aidoc, Tel Aviv, Israel). Cases identified as positive for ICH by AI (ICH-AI+) were automatically flagged in the radiologists' reading worklists, where flagging was randomly switched off with a probability of 50%. Study TAT was measured automatically as the time difference between study completion and first clinically communicated study reporting, with time stamps for these events automatically retrieved from various radiology IT systems.
TATs for flagged cases (73 ± 143 min) were significantly lower than TATs for non-flagged cases (132 ± 193 min) (p<0.05, one-sided t-test), where 105 of the 122 ICH-AI+ cases were true positive reads. Total sensitivity, specificity, and accuracy over all analyzed cases were 95.0%, 96.7%, and 96.4%, respectively. We conclude that automatic identification of ICH reduces study TAT for ICH in emergent care head CT settings, which carries the potential for improving clinical management of ICH by accelerating clinically indicated therapeutic interventions. In a broader context, our results suggest that our AI-PROBE framework can contribute to a systematic quantitative evaluation of AI systems in a clinical workflow environment with regard to clinically meaningful performance measures, such as TAT or diagnostic accuracy metrics.
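
As an illustration only, the randomized blinding step and the TAT measurement described in the abstract could be sketched as follows. This is not the authors' implementation: the function names `assign_flag` and `tat_minutes`, the fixed random seed, and the example timestamps are all hypothetical, and the sketch assumes that flagging is decided independently per AI-positive case.

```python
import random
from datetime import datetime, timedelta


def assign_flag(ai_positive: bool, rng: random.Random, p_off: float = 0.5) -> bool:
    """Return True if the case is flagged in the reading worklist.

    Hypothetical helper: AI-negative cases are never flagged; for
    AI-positive cases the flag is switched off with probability p_off.
    """
    if not ai_positive:
        return False
    return rng.random() >= p_off


def tat_minutes(study_completed: datetime, first_report: datetime) -> float:
    """Turn-around time in minutes between study completion and first report."""
    return (first_report - study_completed).total_seconds() / 60.0


# Simulate the randomization for 1000 AI-positive cases (synthetic, seeded).
rng = random.Random(0)
flags = [assign_flag(True, rng) for _ in range(1000)]
flagged_fraction = sum(flags) / len(flags)

# Arbitrary example timestamps, not taken from the study data.
completed = datetime(2020, 1, 1, 8, 0)
example_tat = tat_minutes(completed, completed + timedelta(minutes=90))
```

With a switch-off probability of 50%, roughly half of the AI-positive cases end up flagged, which gives the two groups whose TATs are compared in the abstract.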

Paper Details

Date Published: 28 February 2020
PDF: 7 pages
Proc. SPIE 11317, Medical Imaging 2020: Biomedical Applications in Molecular, Structural, and Functional Imaging, 113170M (28 February 2020);
Axel Wismüller, Univ. of Rochester Medical Ctr. (United States); Univ. of Rochester (United States); Ludwig Maximilian Univ. (Germany)
Larry Stockmaster, Univ. of Rochester Medical Ctr. (United States)

Published in SPIE Proceedings Vol. 11317:
Medical Imaging 2020: Biomedical Applications in Molecular, Structural, and Functional Imaging
Andrzej Krol; Barjor S. Gimi, Editors
