Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.11851/10982
Title: Can the generalizability issue of artificial intelligence be overcome? Pneumothorax detection algorithm
Authors: Verdi, Elvan Burak
Yılmaz, Muhammed
Mulazimoglu, Deniz Dogan
Türker, Abdüssamet
Gürün Kaya, Aslıhan
Işık, Özlem
Bostanoğlu Karacin, Aslı
Keywords: Artificial intelligence
chest radiograph
chest X-ray
generalizability
pneumothorax
Issue Date: 2024
Publisher: Sage Publications Ltd
Abstract: The generalizability of artificial intelligence (AI) models is a major issue in the field of AI applications. Therefore, we aimed to overcome the generalizability problem of an AI model developed for a particular center for pneumothorax detection using a small dataset for external validation. Chest radiographs of patients diagnosed with pneumothorax (n = 648) and those without pneumothorax (n = 650) who visited the Ankara University Faculty of Medicine (AUFM; center 1) were obtained. A deep learning-based pneumothorax detection algorithm (PDA-Alpha) was developed using the AUFM dataset. For implementation at the Health Sciences University (HSU; center 2), PDA-Beta was developed through external validation of PDA-Alpha using 50 radiographs with pneumothorax obtained from HSU. Both PDA algorithms were assessed using the HSU test dataset (n = 200) containing 50 pneumothorax and 150 non-pneumothorax radiographs. We compared the results generated by the algorithms with those of physicians to demonstrate the reliability of the results. The areas under the curve for PDA-Alpha and PDA-Beta were 0.993 (95% confidence interval (CI): 0.985-1.000) and 0.986 (95% CI: 0.962-1.000), respectively. Both algorithms successfully detected the presence of pneumothorax on 49/50 radiographs; however, PDA-Alpha had seven false-positive predictions, whereas PDA-Beta had one. The positive predictive value increased from 0.525 to 0.886 after external validation (p = 0.041). The physicians' sensitivity and specificity for detecting pneumothorax were 0.585 and 0.988, respectively. The performance scores of the algorithms were increased with a small dataset; however, further studies are required to determine the optimal amount of external validation data to fully address the generalizability issue.
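Note on the reported metrics: the abstract does not say how the positive predictive value (PPV) was calculated. The sketch below recomputes sensitivity, specificity, and PPV from the counts given above (49/50 true positives for both algorithms; 7 and 1 false positives among 150 negatives). The raw test-set PPV comes out higher than the reported figures. The reported 0.525 and 0.886 are reproduced only when PPV is derived from sensitivity and specificity at a 5% pneumothorax prevalence. That prevalence is an assumption made for this illustration and is not stated in the record; the function names are likewise hypothetical.

```python
# Minimal sketch: diagnostic metrics from the confusion-matrix counts in the abstract.
# ASSUMPTION: the 5% prevalence below is inferred for illustration only;
# the record does not state how PPV was computed.

def sensitivity(tp: int, fn: int) -> float:
    """Fraction of pneumothorax radiographs that were flagged."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """Fraction of non-pneumothorax radiographs that were cleared."""
    return tn / (tn + fp)

def raw_ppv(tp: int, fp: int) -> float:
    """PPV computed directly from test-set counts."""
    return tp / (tp + fp)

def ppv_at_prevalence(sens: float, spec: float, prevalence: float) -> float:
    """PPV via Bayes' rule for a given disease prevalence."""
    true_pos_rate = sens * prevalence
    false_pos_rate = (1.0 - spec) * (1.0 - prevalence)
    return true_pos_rate / (true_pos_rate + false_pos_rate)

# Counts from the HSU test dataset (50 pneumothorax, 150 non-pneumothorax).
COUNTS = {
    "PDA-Alpha": {"tp": 49, "fn": 1, "fp": 7, "tn": 143},
    "PDA-Beta": {"tp": 49, "fn": 1, "fp": 1, "tn": 149},
}

ASSUMED_PREVALENCE = 0.05  # hypothetical value, not taken from the record

def summarize(counts: dict) -> dict:
    """Return rounded metrics for one algorithm."""
    sens = sensitivity(counts["tp"], counts["fn"])
    spec = specificity(counts["tn"], counts["fp"])
    return {
        "sensitivity": round(sens, 3),
        "specificity": round(spec, 3),
        "raw_ppv": round(raw_ppv(counts["tp"], counts["fp"]), 3),
        "ppv_at_assumed_prevalence": round(
            ppv_at_prevalence(sens, spec, ASSUMED_PREVALENCE), 3
        ),
    }

RESULTS = {name: summarize(c) for name, c in COUNTS.items()}

for name, metrics in RESULTS.items():
    print(name, metrics)
```

Because PPV depends on prevalence, the same one-in-fifty miss rate and a drop from seven false positives to one can move PPV substantially when the condition is rare, which is consistent with the size of the change the abstract reports.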
URI: https://doi.org/10.1177/10815589231208479
https://hdl.handle.net/20.500.11851/10982
ISSN: 1081-5589
1708-8267
Appears in Collections: PubMed İndeksli Yayınlar Koleksiyonu / PubMed Indexed Publications Collection
Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.