Apr 10, 2025
WCO-IOF-ESCEO
Abstract
OSTEO
Enhancing Osteoporosis Detection in Chest X-Ray with a Large-Scale Vision Foundation Model: A Multi-Institutional Study
Saerom Park, Minje Kim, Minjee Kim, Hyun-Jin Bae*
Objective
FRAX is a widely used tool for predicting fracture risk, but it is limited by its reliance on dual-energy X-ray absorptiometry (DXA) and its inability to assess short-term fracture risk. This study aimed to develop a deep learning (DL)-based model using chest radiography (CXR) and to combine it with FRAX to enhance predictive performance.
Materials and Methods
This multicenter study included 42,014 patients from Institution A (2008–2019) for DL model development and 10,523 patients from Institution B (2003–2022) for external testing. CXRs were preprocessed using localized energy-based normalization, and convolutional neural network-based DL models were trained separately on the original and normalized images. Ensemble outputs of the DL model and FRAX (DL-FRAX) were used for final predictions. A logistic hazard loss function was employed to directly estimate survival functions. Performance was assessed using the concordance index (C-index) and the area under the receiver operating characteristic curve (AUROC) in the internal (5,000 cases) and external (10,523 cases) test sets, comparing DL, DL-FRAX, and FRAX.
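As a rough illustration of how a logistic hazard loss yields a survival function, the sketch below computes the per-patient negative log-likelihood of a discrete-time hazard model and the resulting survival curve. It is a minimal sketch and not the study's implementation: the function names, the use of one logit per follow-up interval, and the interval discretization are all assumptions made for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def logistic_hazard_nll(logits, interval_idx, event):
    """Negative log-likelihood for one patient (hypothetical helper).

    logits: one raw model output per follow-up interval (assumed layout)
    interval_idx: index of the interval containing the event or censoring time
    event: 1 if a fracture was observed, 0 if the patient was censored
    """
    hazards = [sigmoid(z) for z in logits]
    # The patient survived every interval before interval_idx.
    log_lik = sum(math.log(1.0 - h) for h in hazards[:interval_idx])
    h_last = hazards[interval_idx]
    # In the final interval the patient either had the event or survived it.
    log_lik += math.log(h_last) if event else math.log(1.0 - h_last)
    return -log_lik

def survival_curve(logits):
    """Survival function S(t_j) = prod over i <= j of (1 - h_i)."""
    surv, s = [], 1.0
    for z in logits:
        s *= 1.0 - sigmoid(z)
        surv.append(s)
    return surv
```

Under this formulation, a time-specific risk such as a 2-year fracture probability would be read off as one minus the survival value at the matching interval.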
Results
Mean ages were 59.3 and 61.4 years, and 79.8% and 67.9% of patients were female, in the development and external test sets, respectively. For major osteoporotic fractures in the internal test set, the DL model achieved a C-index of 0.867 and 2-, 3-, and 5-year AUROCs of 0.878, 0.887, and 0.886, outperforming FRAX (C-index: 0.800; AUROCs: 0.805, 0.804, and 0.805; all P<0.001). The DL-FRAX ensemble model achieved a C-index of 0.847 and AUROCs of 0.858, 0.873, and 0.868, which were also significantly higher than those of FRAX (all P<0.001). Similarly, in the external test set, C-indices were 0.763 and 0.752 for the DL and DL-FRAX models, respectively, both significantly higher than that of FRAX (C-index of 0.737; both P<0.001). For vertebral, nonvertebral, and hip fractures, the DL model achieved C-indices of 0.871, 0.852, and 0.923, respectively. The corresponding 2-, 3-, and 5-year AUROCs were 0.873, 0.886, and 0.888 for vertebral; 0.934, 0.907, and 0.874 for nonvertebral; and 0.931, 0.928, and 0.936 for hip fractures.
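For readers less familiar with the C-index reported here, the following is a minimal sketch of Harrell's concordance index for right-censored data. The abstract does not state which C-index variant was used, so this choice, the function name, and the tie handling (half credit for tied risk scores) are assumptions for illustration only.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index (hypothetical helper, O(n^2) for clarity).

    times: follow-up time per patient
    events: 1 if a fracture was observed, 0 if censored
    risks: predicted risk score per patient (higher = earlier expected event)
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only comparable if the earlier time is an event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied scores get half credit
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by time to fracture.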
Conclusion
Combining a DL-based model using CXR with FRAX significantly improved fracture risk prediction compared to FRAX alone. This approach may provide a more accessible, effective tool for clinical fracture risk assessment.