Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.11851/10982
Title: | Can the Generalizability Issue of Artificial Intelligence Be Overcome? Pneumothorax Detection Algorithm |
Authors: | Verdi, Elvan Burak; Yılmaz, Muhammed; Mulazimoglu, Deniz Dogan; Türker, Abdüssamet; Gürün Kaya, Aslıhan; Işık, Özlem; Bostanoğlu Karacin, Aslı |
Keywords: | Artificial intelligence; chest radiograph; chest X-ray; generalizability; pneumothorax |
Publisher: | Sage Publications Ltd |
Abstract: | The generalizability of artificial intelligence (AI) models is a major issue in the field of AI applications. Therefore, we aimed to overcome the generalizability problem of an AI model developed for a particular center for pneumothorax detection using a small dataset for external validation. Chest radiographs of patients diagnosed with pneumothorax (n = 648) and those without pneumothorax (n = 650) who visited the Ankara University Faculty of Medicine (AUFM; center 1) were obtained. A deep learning-based pneumothorax detection algorithm (PDA-Alpha) was developed using the AUFM dataset. For implementation at the Health Sciences University (HSU; center 2), PDA-Beta was developed through external validation of PDA-Alpha using 50 radiographs with pneumothorax obtained from HSU. Both PDA algorithms were assessed using the HSU test dataset (n = 200) containing 50 pneumothorax and 150 non-pneumothorax radiographs. We compared the results generated by the algorithms with those of physicians to demonstrate the reliability of the results. The areas under the curve for PDA-Alpha and PDA-Beta were 0.993 (95% confidence interval (CI): 0.985-1.000) and 0.986 (95% CI: 0.962-1.000), respectively. Both algorithms successfully detected the presence of pneumothorax on 49/50 radiographs; however, PDA-Alpha had seven false-positive predictions, whereas PDA-Beta had one. The positive predictive value increased from 0.525 to 0.886 after external validation (p = 0.041). The physicians' sensitivity and specificity for detecting pneumothorax were 0.585 and 0.988, respectively. The performance scores of the algorithms were increased with a small dataset; however, further studies are required to determine the optimal amount of external validation data to fully address the generalizability issue. |
URI: | https://doi.org/10.1177/10815589231208479 ; https://hdl.handle.net/20.500.11851/10982 |
ISSN: | 1081-5589; 1708-8267 |
Appears in Collections: | PubMed İndeksli Yayınlar Koleksiyonu / PubMed Indexed Publications Collection; Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection; WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection |
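The abstract reports detection counts on the HSU test set (50 pneumothorax and 150 non-pneumothorax radiographs; 49/50 detected by both algorithms; seven false positives for PDA-Alpha and one for PDA-Beta). As a minimal sketch, assuming those counts describe a single operating point per algorithm, the standard confusion-matrix metrics can be derived as follows. The class `Confusion` and the helper `from_counts` are hypothetical names introduced for illustration and are not part of the study. The count-derived positive predictive values need not match the figures quoted in the abstract, which may have been computed at a different threshold or on a different basis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """2x2 confusion matrix for a binary detector (illustrative helper)."""
    tp: int  # pneumothorax radiographs flagged as pneumothorax
    fp: int  # non-pneumothorax radiographs flagged as pneumothorax
    fn: int  # pneumothorax radiographs missed
    tn: int  # non-pneumothorax radiographs correctly cleared

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)


def from_counts(positives: int, negatives: int,
                detected: int, false_positives: int) -> Confusion:
    """Build a confusion matrix from the kind of counts given in the abstract."""
    return Confusion(
        tp=detected,
        fp=false_positives,
        fn=positives - detected,
        tn=negatives - false_positives,
    )


# Counts taken from the abstract; treating them as one operating point
# per algorithm is an assumption of this sketch.
alpha = from_counts(positives=50, negatives=150, detected=49, false_positives=7)
beta = from_counts(positives=50, negatives=150, detected=49, false_positives=1)

for name, cm in (("PDA-Alpha", alpha), ("PDA-Beta", beta)):
    print(f"{name}: sensitivity={cm.sensitivity:.3f} "
          f"specificity={cm.specificity:.3f} ppv={cm.ppv:.3f}")
```

Because both algorithms detect the same 49 cases, the difference between them in this sketch comes entirely from the false-positive count, which is why specificity and positive predictive value move while sensitivity stays fixed.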
Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.