Can the Generalizability Issue of Artificial Intelligence Be Overcome? Pneumothorax Detection Algorithm

Verdi, Elvan Burak; Yılmaz, Muhammed; Mulazimoglu, Deniz Dogan; Türker, Abdüssamet; Gürün Kaya, Aslıhan; Işık, Özlem; Özbayoğlu, Ahmet Murat

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.11851/10982

Title:	Can the Generalizability Issue of Artificial Intelligence Be Overcome? Pneumothorax Detection Algorithm
Authors:	Verdi, Elvan Burak; Yılmaz, Muhammed; Mulazimoglu, Deniz Dogan; Türker, Abdüssamet; Gürün Kaya, Aslıhan; Işık, Özlem; Özbayoğlu, Ahmet Murat
Keywords:	Artificial intelligence; chest radiograph; chest X-ray; generalizability; pneumothorax
Publisher:	Sage Publications Ltd
Abstract:	The generalizability of artificial intelligence (AI) models is a major issue in the field of AI applications. Therefore, we aimed to overcome the generalizability problem of an AI model developed for a particular center for pneumothorax detection using a small dataset for external validation. Chest radiographs of patients diagnosed with pneumothorax (n = 648) and those without pneumothorax (n = 650) who visited the Ankara University Faculty of Medicine (AUFM; center 1) were obtained. A deep learning-based pneumothorax detection algorithm (PDA-Alpha) was developed using the AUFM dataset. For implementation at the Health Sciences University (HSU; center 2), PDA-Beta was developed through external validation of PDA-Alpha using 50 radiographs with pneumothorax obtained from HSU. Both PDA algorithms were assessed using the HSU test dataset (n = 200) containing 50 pneumothorax and 150 non-pneumothorax radiographs. We compared the results generated by the algorithms with those of physicians to demonstrate the reliability of the results. The areas under the curve for PDA-Alpha and PDA-Beta were 0.993 (95% confidence interval [CI]: 0.985-1.000) and 0.986 (95% CI: 0.962-1.000), respectively. Both algorithms successfully detected the presence of pneumothorax on 49/50 radiographs; however, PDA-Alpha had seven false-positive predictions, whereas PDA-Beta had one. The positive predictive value increased from 0.525 to 0.886 after external validation (p = 0.041). The physicians' sensitivity and specificity for detecting pneumothorax were 0.585 and 0.988, respectively. The performance scores of the algorithms improved with a small dataset; however, further studies are required to determine the optimal amount of external validation data to fully address the generalizability issue.
URI:	https://doi.org/10.1177/10815589231208479 ; https://hdl.handle.net/20.500.11851/10982
ISSN:	1081-5589; 1708-8267
Appears in Collections:	PubMed İndeksli Yayınlar Koleksiyonu / PubMed Indexed Publications Collection; Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection; WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection; Yapay Zeka Mühendisliği Bölümü / Department of Artificial Intelligence Engineering
