SPIE hosts grand challenges to advance CAD in medical imaging

SPIE is collaborating with other organizations to host grand challenge events aimed at solving problems in medical imaging.

07 February 2018
Samuel G. Armato III

Computer-aided diagnosis (CAD) has demonstrated value to the decision-making process of radiologists and other physicians. When computers are used to quantitatively evaluate medical images and objectively combine extracted image features with clinical data, genomics data, and other patient-specific information into decision models, there is great potential for patient benefit.

The complex algorithms needed to achieve such success, however, demand years of effort from dedicated research groups around the world. Working independently, such groups typically suffer from limited local resources, such as patient data and access to the expert opinion or "ground truth" required to properly train and test their algorithms.

When these research groups report their CAD methods in the literature, it is difficult for the medical imaging research community to compare the relative merits of different approaches, since the performance of these methods can greatly depend on factors such as database composition, subtlety of the target lesions, "truth" definition, and the performance-evaluation metric.

Grand challenges are valuable because they allow for a direct comparison of different algorithms designed for a specific radiologic task with all algorithms following the same set of rules, operating on a common set of images, and being evaluated with a uniform performance assessment approach. In effect, challenges serve to level the differences across the various factors that make comparisons of different CAD methods so difficult.


2018 SPIE President Maryellen Giger asks a question from the audience during presentations by LUNGx Challenge participants during the 2015 SPIE Medical Imaging conference.

The resulting head-to-head comparison among the methods of participating groups can serve to steer the field in the direction of approaches that are the most promising for a specific clinical task. Grand challenges can also shape the landscape of CAD research by encouraging advancement of techniques for specific clinically relevant tasks.

Ultimately, grand challenges advance the concept of "open science" by providing the resources necessary for friendly competition among research groups with the overall goal of generating interest and collaborations among investigators.

LUNGx was 1st challenge
SPIE recently joined with the American Association of Physicists in Medicine (AAPM) and the US National Cancer Institute (NCI) to sponsor the development, design, and conduct of grand challenges in medical imaging. The first SPIE-AAPM-NCI Challenge was held in conjunction with the 2015 SPIE Medical Imaging symposium.

Dubbed the "LUNGx Challenge," this initial challenge tasked participants with the development of computerized methods for the classification of lung nodules on diagnostic computed tomography (CT) scans as benign or malignant. Ten groups participated from around the world, and the two best-performing groups presented their methods at a special panel session within the CAD conference during the symposium.

SPIE-AAPM-NCI Lung Nodule Classification Challenge

A team led by Lyndsey Pickup of Mirada Medical (UK) achieved the best performance result. A team led by Yoganand Balagurunathan of the Moffitt Cancer Center (USA) had the second-best result.

The success of the LUNGx Challenge encouraged the continuation of the SPIE-AAPM-NCI collaboration, and efforts to identify a unique, clinically relevant radiologic task resulted in the crafting of a two-part challenge in 2017. The first part of that challenge was held in conjunction with SPIE Medical Imaging 2017, and the second was held in conjunction with the 2017 AAPM Annual Meeting. The challenge drew on a database of multiparametric magnetic resonance imaging (MRI) scans of the prostate from Radboud University Nijmegen (The Netherlands).

PROSTATEx was 2-part challenge
The key to a two-part challenge was to identify two related challenge tasks that could make use of the same set of cases in such a way that the conduct of Part 1 would not compromise Part 2, either by biasing the results of Part 2 or giving groups that participated in Part 1 an unfair advantage during Part 2.

The database from Radboud University contained MRI scans, and for 538 prostate lesions, it also indicated whether the lesions were "clinically significant" or "not clinically significant." For 182 prostate lesions, it also included the pathology-derived Gleason Grade Group (a 1-5 designation obtained from the more familiar Gleason score).
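As a rough illustration of how the 1-5 Gleason Grade Group relates to the Gleason score, the short Python sketch below applies the commonly used grade-group definitions (3+3 is group 1, 3+4 is group 2, 4+3 is group 3, a score of 8 is group 4, and 9-10 is group 5). The function name is hypothetical, and restricting the patterns to 3-5 is a simplifying assumption; this is not code from the challenge organizers.

```python
def gleason_grade_group(primary, secondary):
    """Map a Gleason score, given as primary + secondary pattern, to a Grade Group (1-5).

    Assumption for this sketch: both patterns are 3, 4, or 5.
    """
    if primary not in (3, 4, 5) or secondary not in (3, 4, 5):
        raise ValueError("this sketch only handles Gleason patterns 3, 4, and 5")
    score = primary + secondary
    if score == 6:
        return 1                          # 3+3
    if score == 7:
        return 2 if primary == 3 else 3   # 3+4 -> group 2, 4+3 -> group 3
    if score == 8:
        return 4                          # 4+4, 3+5, 5+3
    return 5                              # score of 9 or 10


if __name__ == "__main__":
    for patterns in [(3, 3), (3, 4), (4, 3), (4, 4), (4, 5)]:
        print(patterns, "->", gleason_grade_group(*patterns))
```

Note that the two ways of reaching a score of 7 land in different groups, which is one reason the grade group is reported separately from the summed score.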

This information became the basis for the two-part prostate MRI challenge. In the "PROSTATEx Challenge" (more formally known as the SPIE-AAPM-NCI Prostate MR Classification Challenge), participants were asked to develop quantitative image-analysis methods for the diagnostic classification of clinically significant prostate lesions. Participants in the "PROSTATEx-2 Challenge" (the SPIE-AAPM-NCI Prostate MR Gleason Grade Group Challenge) were asked to develop quantitative MRI biomarkers for the determination of Gleason Grade Group in prostate cancer.

Both challenges explored the use of imaging alone (along with advanced computer algorithms) to achieve what is currently accomplished clinically with biopsy specimens invasively obtained from patients.


The PROSTATEx Challenge released a training set of cases in November 2016 that contained MRI scans of 330 prostate lesions along with spatial location coordinates, anatomic zone location, and known clinical significance of each lesion. Three weeks later, the test set of cases was made available; the test set contained MRI scans of 208 prostate lesions, again with spatial location and anatomic zone, but the clinical significance information for these lesions was not included.

After five weeks, the results from each participant were submitted to the organizers in the form of a single real number on a 0-1 range for each lesion, with the number representing the computer-determined likelihood of the lesion being clinically significant. Receiver operating characteristic (ROC) analysis was used to assess performance. Thirty-two groups submitted results from a total of 71 methods.

The performance-assessment metric, the area under the ROC curve (AUC), ranged from 0.45 to 0.87 across the submitted methods. The two best-performing groups presented their methods at a special session during SPIE Medical Imaging in February 2017.
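For readers less familiar with ROC analysis, the area under the ROC curve can be read as the probability that a randomly chosen clinically significant lesion receives a higher score than a randomly chosen non-significant one, so 0.5 corresponds to chance and 1.0 to perfect separation. The Python sketch below computes it that way from per-lesion likelihoods. The function name and the toy data are illustrative assumptions; the organizers' actual scoring software is not described in this article.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney statistic).

    labels: 1 for a clinically significant lesion, 0 otherwise.
    scores: computer-determined likelihoods on a 0-1 range, one per lesion.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one lesion of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # correctly ranked pair
            elif p == n:
                wins += 0.5   # tied scores count as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical truth labels and submitted likelihoods for five lesions.
    truth = [1, 0, 1, 0, 1]
    submitted = [0.9, 0.2, 0.6, 0.7, 0.4]
    print(round(roc_auc(truth, submitted), 3))
```

Because only the ranking of the scores matters, a method is not penalized under this metric for likelihoods that are poorly calibrated, as long as significant lesions tend to score higher than non-significant ones.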

The first-place winner was Silvio Moreto Pereira of Albert Einstein Hospital (Brazil). Tied for second place were Jarrel Chen Yi Seah of Alfred Health (Australia) and Saifeng Liu of the MRI Institute for Biomedical Research (Canada), whose papers can be found in the SPIE Digital Library.

The PROSTATEx-2 Challenge released training cases in May 2017 that contained MRI scans of 112 prostate lesions and, for each lesion, included spatial location coordinates, anatomic zone location, and known Gleason Grade Group.

Three weeks later, the test cases were made available with MRI scans of 70 prostate lesions along with spatial location and anatomic zone, but the Gleason Grade Group information for these lesions was not included.

After eight weeks, each participant submitted their results to the organizers as a single ordinal value ranging from 1 to 5 for each lesion representing the computer-determined Gleason Grade Group. The quadratic-weighted kappa statistic was used to assess performance.

Twenty-one groups submitted results from a total of 43 methods, with kappa values ranging from -0.24 to 0.28. The two best-performing groups presented their methods at a special session during the 2017 AAPM Annual Meeting.
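The quadratic-weighted kappa statistic rewards predictions that are close to the true grade group and penalizes large misses more heavily than small ones. A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance. The Python sketch below shows the standard form of the calculation. The function name and example values are hypothetical, and this is not the organizers' evaluation code.

```python
def quadratic_weighted_kappa(truth, predicted, min_grade=1, max_grade=5):
    """Quadratic-weighted kappa between true and predicted ordinal grades."""
    if len(truth) != len(predicted) or not truth:
        raise ValueError("truth and predicted must be non-empty and equal in length")
    k = max_grade - min_grade + 1
    n = len(truth)

    # Observed confusion matrix: rows are true grades, columns are predictions.
    observed = [[0] * k for _ in range(k)]
    for t, p in zip(truth, predicted):
        observed[t - min_grade][p - min_grade] += 1

    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2   # quadratic disagreement weight
            weighted_observed += weight * observed[i][j]
            # Count expected by chance, from the marginal totals.
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when all grades fall in one category")
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Hypothetical true and predicted Gleason Grade Groups for five lesions.
    print(quadratic_weighted_kappa([1, 2, 3, 4, 5], [1, 2, 3, 4, 4]))
```

With quadratic weights, predicting group 1 for a group 5 lesion costs sixteen times as much as being off by a single group.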

Taking first place in the PROSTATEx-2 Challenge were Bejoy Abraham and Madhu Nair of Kerala University (India). Liu, who also placed in the first PROSTATEx Challenge, was the second-place winner.

3rd challenge in 2019?
SPIE has taken a leadership role, along with other scientific organizations, to host challenges to explore similarities and differences in the ability of different algorithms to perform the same task under the same conditions.

The team of volunteer organizers for the PROSTATEx Challenges came from the University of Chicago, Radboud University, University of Michigan, the US Food & Drug Administration, and the NCI. Their contributions and the effort of all the participating groups resulted in the success of these challenges.

A panel discussion will be held 13 February at SPIE Medical Imaging 2018 in Houston, TX (USA), to review the lessons learned from the PROSTATEx and LUNGx Challenges. The session will also provide an overview of the proposed 2019 SPIE Medical Imaging Joint CAD/Pathology Challenge.

SPIE Fellow Samuel G. Armato III is an associate professor at the University of Chicago (USA), where he is chair of the Committee on Medical Physics and director of the Medical Physics Graduate Program. He is a member of the program committee and past chair for the computer-aided diagnosis (CAD) conference at SPIE Medical Imaging. His PhD in medical physics was centered around CAD.

Jan-Mar cover of SPIE Journal of Medical Imaging

The Journal of Medical Imaging has published two open-access papers on the LUNGx Challenge and has a special section on quantitative imaging and the pioneering efforts of the late Laurence Clarke.

Papers from the first PROSTATEx Challenge can be found in the SPIE Digital Library.

This article was originally published in the January 2018 edition of SPIE Professional magazine.
