AI based diagnosis of Anterior Cruciate Ligament Tears performs at levels similar to subspecialized musculoskeletal radiologists. New study from the University Hospital Balgrist published in the Journal Investigative Radiology assess AI Algorithm Scandiags by Balzano on MR examinations of 59 different institutions – different vendors, different scanners and different field strength.
The aim of this study was to clinically validate a Deep Convolutional Neural Network (DCNN) for the detection of surgically proven anterior cruciate ligament (ACL) tears in a large patient cohort and to analyze the effect of magnetic resonance examinations from different institutions, varying protocols, and field strengths.
MATERIALS AND METHODS:
After ethics committee approval, this retrospective analysis of prospectively collected data was performed on 512 consecutive subjects, who underwent knee magnetic resonance imaging (MRI) in a total of 59 different institutions followed by arthroscopic knee surgery at our institution. The DCNN and 3 fellowship-trained full-time academic musculoskeletal radiologists evaluated the MRI examinations for full-thickness ACL tears independently. Surgical reports served as the reference standard. Statistics included diagnostic performance metrics, including sensitivity, specificity, area under the receiver operating curve („AUC ROC“), and kappa statistics. P values less than 0.05 were considered to represent statistical significance.
Anterior cruciate ligament tears were present in 45.7% (234/512) and absent in 54.3% (278/512) of the subjects. The DCNN had a sensitivity of 96.1%, which was not significantly different from the readers (97.5%-97.9%; all P ≥ 0.118), but significantly lower specificity of 93.1% (readers, 99.6%-100%; all P < 0.001) and „AUC ROC“ of 0.935 (readers, 0.989-0.991; all P < 0.001) for the entire cohort. Subgroup analysis showed a significantly lower sensitivity, specificity, and „AUC ROC“ of the DCNN for outside MRI (92.5%, 87.1%, and 0.898, respectively) than in-house MRI (99.0%, 94.4%, and 0.967, respectively) examinations (P = 0.026, P = 0.043, and P < 0.05, respectively). There were no significant differences in DCNN performance for 1.5-T and 3-T MRI examinations (all P ≥ 0.753, respectively).
Deep Convolutional Neural Network performance of ACL tear diagnosis can approach performance levels similar to fellowship-trained full-time academic musculoskeletal radiologists at 1.5 T and 3 T; however, the performance may decrease with increasing MRI examination heterogeneity.